himata4113 yesterday at 6:07 PM

I don't think input/output pricing matters; 90% of the cost is cache. $0.15 is pretty good, but still very expensive.


wolttam yesterday at 6:17 PM

It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience). But not when the model reasons for 200k+ tokens before answering a complex problem.

simonw yesterday at 7:24 PM

Gemini caching is confusing though:

  $0.15 / million tokens
  $1.00 / 1,000,000 tokens per hour (storage price)
I much prefer the OpenAI/DeepSeek way of pricing caching where you don't have to think about storage price at all - you pay for cached tokens if you reuse the same prefix within a (loosely defined) time period.
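To make the two-part pricing concrete, here is a minimal sketch of the arithmetic using only the two rates quoted above. The function name, the 100k-token prefix, the 50 reuses, and the 2-hour lifetime are illustrative assumptions, not figures from any provider's documentation.

```python
def cached_prefix_cost(prefix_tokens, reuses, hours_stored,
                       read_price_per_m=0.15, storage_price_per_m_hour=1.00):
    """Split the cost of a cached prefix into read and storage parts (USD).

    Rates default to the two figures quoted in the comment above;
    the workload numbers passed in are hypothetical.
    """
    millions = prefix_tokens / 1_000_000
    # Each reuse bills the whole prefix at the cached-token rate.
    read_cost = millions * read_price_per_m * reuses
    # Storage is billed per million tokens for every hour the cache is kept.
    storage_cost = millions * storage_price_per_m_hour * hours_stored
    return read_cost, storage_cost


# Hypothetical workload: a 100k-token prefix reused 50 times over 2 hours.
read, storage = cached_prefix_cost(100_000, reuses=50, hours_stored=2)
print(f"reads ${read:.2f}, storage ${storage:.2f}, total ${read + storage:.2f}")
```

Under these assumed numbers the storage charge is about a fifth of the total. It grows with how long the cache is kept rather than with how often it is reused, which is the extra variable the prefix-reuse pricing model avoids.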

__jl__ yesterday at 6:19 PM

In our experience, caching is not very reliable with Google. We always get random cache misses that don't happen with other providers. We find OpenAI, Anthropic and Fireworks (which we use a lot) all have higher cache hit rates. So it's not only about the cost of cached tokens but also what kind of cache hit rate you get.
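A rough sketch of why the hit rate matters as much as the cached price: the effective input price is a blend of the cached and uncached rates. The $0.15 cached rate comes from this thread. The $1.50 uncached rate is an assumption inferred from the "10% of input pricing" remark elsewhere in the thread, and the two hit rates are made-up examples.

```python
def effective_input_price(hit_rate, cached_price_per_m=0.15,
                          uncached_price_per_m=1.50):
    """Blended USD price per million input tokens for a given cache hit rate.

    cached_price_per_m is the rate quoted in the thread; uncached_price_per_m
    is an assumed value (10x the cached rate), not a published price.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached_price_per_m + (1.0 - hit_rate) * uncached_price_per_m


# Hypothetical hit rates for two providers.
for rate in (0.95, 0.80):
    price = effective_input_price(rate)
    print(f"hit rate {rate:.0%}: ${price:.4f} per million input tokens")
```

With these assumed prices, dropping from a 95% to an 80% hit rate roughly doubles the blended input cost, even though the headline cached price is unchanged.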

minimaxir yesterday at 6:07 PM

10% of input pricing is standard, especially compared to the competition.
