Hacker News

_blk, yesterday at 6:05 PM, 5 replies

From what I understand, you shouldn't wait more than 5 min between prompts without compacting or clearing, or you'll pay for reinitializing the cache. With compaction you still pay, but for fewer input tokens. (Is compaction itself free?)
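
To make the trade-off concrete, here is a rough sketch of the arithmetic, not actual pricing. The `CacheModel` class, the `input_cost` helper, and both cost factors are hypothetical placeholders chosen for illustration; only the 5 min TTL comes from the comment above.

```python
from dataclasses import dataclass


@dataclass
class CacheModel:
    """Hypothetical cost model; every number here is an assumption."""
    ttl_seconds: float         # cache lifetime, e.g. 300 for 5 min
    cached_read_factor: float  # assumed cost of a cached input token vs. an uncached one
    cache_write_factor: float  # assumed cost of (re)writing a token into the cache


def input_cost(context_tokens: int, idle_seconds: float, model: CacheModel) -> float:
    """Relative cost of resending the context, in uncached-token units."""
    if idle_seconds <= model.ttl_seconds:
        # Cache still warm: the whole context is read at the cheaper rate.
        return context_tokens * model.cached_read_factor
    # Cache expired: the whole context has to be written into the cache again.
    return context_tokens * model.cache_write_factor


# Illustrative factors only, not taken from any price list.
model = CacheModel(ttl_seconds=300, cached_read_factor=0.1, cache_write_factor=1.25)

warm = input_cost(100_000, idle_seconds=60, model=model)       # prompt within the TTL
cold = input_cost(100_000, idle_seconds=600, model=model)      # prompt after the TTL
compacted = input_cost(20_000, idle_seconds=600, model=model)  # after TTL, smaller context

print(warm, cold, compacted)
```

Under these assumed factors, a cold prompt costs several times more than a warm one, and compacting first shrinks the cold cost in proportion to the context size. The cost of running the compaction itself is not modeled here.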


Replies

gck1, yesterday at 8:20 PM

Cache TTL on Max subscriptions is 1h, FYI.

krackers, yesterday at 10:03 PM

>pay for reinitializing the cache

Why can't they save the KV cache to disk and later reload it into memory?

conception, yesterday at 6:39 PM

Yeah, the caching change is probably behind 90% of the “I run out of usage so fast now!” issues.

hgoel, yesterday at 6:15 PM

Ah, I can see how my phrasing might be misleading, but these prompts were made within 5 minutes of each other; the times I mentioned were how long Claude spent working.

trueno, yesterday at 8:01 PM

is it 5 mins between constant prompting/work, or 5 mins as in if i step away from the computer for 5 mins, come back, and prompt again, i'm subject to reinit?

if it's the latter, that's crazy. i don't even know what to do there; compactions already feel like a memory wipe