From what I understand, you shouldn't wait more than 5 minutes between prompts without compacting or clearing, or you'll pay for reinitializing the cache. With compaction you still pay, but for fewer input tokens. (Is compaction itself free?)
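To put rough numbers on that, here's a back-of-the-envelope sketch. The price multipliers are assumptions (roughly the shape of published API prompt-caching pricing, not anything confirmed for subscription usage accounting), and `turn_cost` is a made-up helper. It also ignores whatever the compaction step itself costs.

```python
# Assumed relative prices, in units of "one ordinary uncached input token".
CACHE_WRITE = 1.25  # assumption: cost to write a token into the cache
CACHE_READ = 0.10   # assumption: cost to read a token from a warm cache


def turn_cost(context_tokens: int, new_tokens: int, cache_warm: bool) -> float:
    """Relative input cost of one prompt sent on top of an existing context."""
    if cache_warm:
        # Existing context is read from cache; only the new tokens get written.
        return context_tokens * CACHE_READ + new_tokens * CACHE_WRITE
    # Cache expired: the entire prefix is processed and written again.
    return (context_tokens + new_tokens) * CACHE_WRITE


warm = turn_cost(150_000, 500, cache_warm=True)       # prompt again quickly
cold = turn_cost(150_000, 500, cache_warm=False)      # prompt after expiry
compacted = turn_cost(20_000, 500, cache_warm=False)  # compact first, then prompt

print(f"warm={warm:.0f} cold={cold:.0f} compacted={compacted:.0f}")
```

Under these assumptions the cold turn costs about twelve times the warm one, and compacting first lands in between: you still pay a cache write, just on a much smaller prefix.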
>pay for reinitializing the cache
Why can't they save the KV cache to disk and reload it into memory later?
Yeah, the caching change is probably behind 90% of the “I run out of usage so fast now!” issues.
Ah, I can see how my phrasing might be misleading, but these prompts were made within 5 minutes of each other; the timings I mentioned were how long Claude spent working.
Is it 5 minutes between constant prompting/work, or 5 minutes as in: if I step away from the computer for 5 minutes, come back, and prompt again, I'm now subject to the reinit?
If it's the latter, that's crazy. I don't even know what to do there; compactions already feel like a memory wipe.
Cache TTL on Max subscriptions is 1h, FYI.
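Here's a tiny sketch of what that difference means for the "step away and come back" case. It assumes the timer runs from the last request that touched the cache, which is my reading rather than something confirmed in this thread; `cache_is_warm` is a hypothetical helper, and the 5-minute and 1-hour values are just the ones mentioned above.

```python
from datetime import datetime, timedelta


def cache_is_warm(last_use: datetime, now: datetime, ttl: timedelta) -> bool:
    """True if the cached prefix would still be alive at `now`.

    Assumption: the TTL is counted from the last request that used the cache.
    """
    return now - last_use < ttl


last_prompt = datetime(2025, 1, 1, 12, 0)
back_at_desk = last_prompt + timedelta(minutes=10)  # a 10-minute break

short_ttl_hit = cache_is_warm(last_prompt, back_at_desk, timedelta(minutes=5))
long_ttl_hit = cache_is_warm(last_prompt, back_at_desk, timedelta(hours=1))

print(short_ttl_hit, long_ttl_hit)
```

With a 5-minute TTL the 10-minute break is a miss; with a 1-hour TTL it's still a hit.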