What do you mean? Costs spiked with the introduction of the 1M context window, I believe due to a larger average number of cached input tokens, which dominate cost.
Nah, there are apparently a few caching bugs: one with --resume, plus some noisy tool use. I have a little app that monitors the context window and resets it at 70% usage (based on 200k tokens), and I'm about to run out of my weekly allowance after just a couple of days. That's never happened before.
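The reset logic described above is simple enough to sketch. Here's a minimal illustration of the 70%-of-200k check; `used_tokens` would come from however the app actually reads session usage, and the constants are assumptions based on the figures in the message, not the real app's code:

```python
# Sketch of the "reset at 70% of a 200k context window" check.
# The 200k limit and 0.70 fraction are taken from the message above;
# how the app obtains used_tokens or performs the reset is not shown.

CONTEXT_LIMIT = 200_000   # assumed base context window in tokens
RESET_FRACTION = 0.70     # reset once usage crosses 70%

def should_reset(used_tokens: int,
                 limit: int = CONTEXT_LIMIT,
                 fraction: float = RESET_FRACTION) -> bool:
    """Return True once the session has consumed the threshold share."""
    return used_tokens >= limit * fraction

if __name__ == "__main__":
    print(should_reset(120_000))  # below the 140k threshold -> False
    print(should_reset(150_000))  # above the threshold -> True
```

A monitoring loop would poll usage periodically and trigger whatever reset mechanism the tool exposes when `should_reset` flips to `True`.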