Hacker News

benbencodes yesterday at 6:20 PM (6 replies)

Pricing is now live on ai.google.dev/pricing:

Gemini 3.5 Flash: $0.75 input / $4.50 output per 1M tokens, 1M context window. The output price explicitly "includes thinking tokens" — which is why it's higher than a typical flash-class model.

For comparison within the Gemini lineup:
- Gemini 2.5 Flash: $0.30 / $2.50
- Gemini 3.1 Flash-Lite: $0.25 / $1.50
- Gemini 3.1 Pro Preview: $2.00 / $12.00

So 3.5 Flash is ~2.5x more expensive on input than 2.5 Flash. The pricing and the "including thinking tokens" framing position it as a reasoning-capable flash model rather than a pure speed optimization.
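
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The prices are only the figures quoted in this thread (both the numbers in this post and the $1.50 / $9 standard rate reported in the replies), not verified against the pricing page. The model labels, the `request_cost` helper, and the 10,000-in / 2,000-out workload are made up for illustration.

```python
# USD per 1M tokens as (input, output). Figures are copied from this thread
# and are NOT verified against the live pricing page.
PRICES = {
    "3.5-flash-as-posted": (0.75, 4.50),   # figure quoted in the post
    "3.5-flash-standard": (1.50, 9.00),    # figure reported in the replies
    "2.5-flash": (0.30, 2.50),
}


def request_cost(input_tokens: int, output_tokens: int, price: tuple) -> float:
    """Cost in USD of one request at a given (input, output) per-1M price."""
    in_price, out_price = price
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def input_ratio(model_a: str, model_b: str) -> float:
    """How many times more expensive model_a's input price is than model_b's."""
    return PRICES[model_a][0] / PRICES[model_b][0]


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens, 2,000 output tokens.
    for name, price in PRICES.items():
        print(f"{name}: ${request_cost(10_000, 2_000, price):.4f} per request")
    print(f"as-posted vs 2.5 Flash input: {input_ratio('3.5-flash-as-posted', '2.5-flash'):.1f}x")
    print(f"standard vs 2.5 Flash input: {input_ratio('3.5-flash-standard', '2.5-flash'):.1f}x")
```

Because output tokens are priced several times higher than input tokens in every row, the output side dominates the bill for anything that generates long responses, so the input ratio alone understates the difference for output-heavy workloads.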


Replies

lyjackal yesterday at 6:29 PM

You're quoting the batch pricing. On-demand is $1.50 per 1M input tokens and $9 per 1M output tokens. That is effectively comparable in cost to Gemini 2.5 Pro, but in a flash-tier model.

conorh yesterday at 6:26 PM

I think you have your pricing wrong there. Gemini 3.5 Flash is $1.50 input and $9 output.

ls_stats yesterday at 6:32 PM

You are seeing batch inference pricing; standard inference is $1.50/$9. I was excited until I saw that price.

jpau yesterday at 6:26 PM

Standard pricing is showing for me as $1.50 / $9.

(I suspect you're viewing the "flex" pricing).

Tiberium yesterday at 6:46 PM

Please delete/edit your AI-written and factually wrong post.

MallocVoidstar yesterday at 7:40 PM

In addition to people pointing out your LLM got the pricing wrong,

> The pricing and "including thinking tokens" framing position it as a reasoning-capable flash model rather than just a pure speed optimization

Every Gemini model starting with 2.5 has been a reasoning model.