And yet 300+140=460. A very jagged surface indeed. https://gemini.google.com/share/c2a187275e26
Was that part of a bigger prompt?
Flash 3.5 fails exactly like in your sample: https://gemini.google.com/share/97521a8752d9
Flash 3.1 Lite, on the other hand, fails at first but then corrects itself: https://gemini.google.com/share/dc0889ec85ba
Why would you use an LLM for this? They are non-deterministic models.
This was also probably part of an extended prompt that disallowed coding. Gemini always does calculations with a little Python snippet, because that is deterministic and accurate.
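For illustration, the kind of snippet meant here could be as small as the sketch below. This is not Gemini's actual tool output; the function name `add` is hypothetical, and the numbers are simply the ones from the example at the top of the thread.

```python
def add(a: int, b: int) -> int:
    """Exact integer addition: the same inputs always give the same result."""
    return a + b


# The sum from the example above, computed instead of predicted token by token.
total = add(300, 140)
print(total)  # 440
```

Running it any number of times gives 440, not 460, which is the whole point of handing arithmetic to code.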