Hacker News

solidasparagus, today at 6:08 PM

What do you mean? Costs spiked with the introduction of the 1M context window, I believe because of larger average cached input token counts, which dominate cost.


Replies

TomGarden, today at 8:19 PM

Nah, there are apparently a few caching bugs, one involving --resume, plus some noisy tool use. I have a little app that monitors the context window and resets it at 70% usage, based on 200k tokens, and I'm about to run out of my weekly allowance after just a couple of days. That's never happened before.
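
For anyone wondering what a check like that might look like, here is a minimal sketch. It is not the commenter's actual app: the function names are hypothetical, and how the current token count is obtained is left out entirely. Only the 200k limit and the 70% cutoff come from the comment above.

```python
# Hypothetical sketch of a "reset at 70% of a 200k-token window" check.
# Only the two numbers below come from the thread; everything else is assumed.
CONTEXT_LIMIT_TOKENS = 200_000
RESET_PERCENT = 70


def reset_threshold(limit: int = CONTEXT_LIMIT_TOKENS,
                    percent: int = RESET_PERCENT) -> int:
    """Token count at which the context should be reset (integer math)."""
    return limit * percent // 100


def should_reset(used_tokens: int,
                 limit: int = CONTEXT_LIMIT_TOKENS,
                 percent: int = RESET_PERCENT) -> bool:
    """Return True once usage reaches the reset threshold."""
    return used_tokens >= reset_threshold(limit, percent)
```

With these numbers the reset would trigger at 140,000 tokens, so a session should never get near the 1M window, which is why running through a weekly allowance in a couple of days looks abnormal.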