I hadn't really thought about it, but I start a ton of my prompts with "generate me a single C++ code file" or similar, and the response always comes with 2-3 paragraphs of prose. Why is it spending output tokens on prose? I just wanted the code.
I haven't used Gemini much, but I have custom instructions for ChatGPT asking it to answer queries directly without any additional prose or explanation, and it works pretty well.
Didn't expect C++ code generation to be as bad as recipe websites.