himata4113 • yesterday at 6:07 PM • 4 replies

I don't think input/output pricing matters; 90% of the cost is cache. $0.15 is pretty good, but still very expensive.

Replies

It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience), but not when the model reasons for 200k+ tokens before answering a complex problem.

simonw • yesterday at 7:24 PM

Gemini caching is confusing though:

  $0.15 / 1,000,000 tokens
  $1.00 / 1,000,000 tokens per hour (storage price)

I much prefer the OpenAI/DeepSeek way of pricing caching where you don't have to think about storage price at all - you pay for cached tokens if you reuse the same prefix within a (loosely defined) time period.
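
To make the difference concrete, here is a rough sketch of the arithmetic behind the two pricing styles. The function names and the example workload are hypothetical. The $0.15 and $1.00 figures are the ones quoted above, and the same read price is reused for the prefix-style case purely for comparison; it is not the actual OpenAI or DeepSeek price.

```python
def gemini_style_cache_cost(prefix_tokens, reuses, hours_stored,
                            read_price_per_m=0.15,
                            storage_price_per_m_hour=1.00):
    """Cost in USD when cached reads and cache storage are billed separately.

    Prices are per 1,000,000 tokens, taken from the comment above.
    """
    millions = prefix_tokens / 1_000_000
    read_cost = millions * read_price_per_m * reuses
    storage_cost = millions * storage_price_per_m_hour * hours_stored
    return read_cost + storage_cost


def prefix_style_cache_cost(prefix_tokens, reuses, read_price_per_m=0.15):
    """Cost in USD when only cached reads are billed (no storage line item).

    The read price here is an assumption reused for comparison only.
    """
    millions = prefix_tokens / 1_000_000
    return millions * read_price_per_m * reuses


# Hypothetical workload: a 100k-token prefix reused 20 times within one hour.
with_storage = gemini_style_cache_cost(100_000, reuses=20, hours_stored=1)
reads_only = prefix_style_cache_cost(100_000, reuses=20)
print(f"with storage: ${with_storage:.2f}, reads only: ${reads_only:.2f}")
```

Under these assumptions the storage term adds $0.10 on top of $0.30 of cached reads, and it keeps growing with every hour the cache is kept alive, whether or not the prefix is reused.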

__jl__ • yesterday at 6:19 PM

In our experience, caching is not very reliable with Google. We always get random cache misses that don't happen with other providers. We find OpenAI, Anthropic and Fireworks (which we use a lot) all have higher cache hit rates. So it's not only about the cost of cached tokens but also what kind of cache hit rate you get.
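
A quick way to see why hit rate matters as much as the cached price is to compute a blended input price. This is only a sketch: the function name is hypothetical, the $0.15 cached price is the figure from this thread, and the $1.50 uncached price is an assumption (it is what $0.15 would imply if cached tokens cost 10% of normal input). Storage fees are ignored.

```python
def effective_input_price(hit_rate, cached_price_per_m=0.15,
                          uncached_price_per_m=1.50):
    """Blended USD price per 1,000,000 input tokens at a given cache hit rate.

    hit_rate is the fraction of input tokens served from cache (0.0 to 1.0).
    The uncached price is an assumed value, not a quoted one.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached_price_per_m + (1 - hit_rate) * uncached_price_per_m


for rate in (0.95, 0.80):
    price = effective_input_price(rate)
    print(f"hit rate {rate:.0%}: ${price:.4f} per million input tokens")
```

With these assumed prices, dropping from a 95% to an 80% hit rate takes the blended price from about $0.22 to $0.42 per million tokens, so a provider with a cheaper cached rate but more misses can end up costing more.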

minimaxir • yesterday at 6:07 PM

10% of input pricing is standard, especially compared to the competition.
