Hacker News

bcherny · last Sunday at 3:19 PM · 4 replies

Claude Code is the most prompt cache-efficient harness, I think. The issue is more that the larger the context window, the higher the cost of a cache miss.
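Rough, illustrative arithmetic for the point above (the per-million-token rates here are assumptions for the sketch, not Anthropic's actual pricing): the relative discount for a cache hit is constant, so the absolute dollar cost of a miss grows linearly with the context you carry.

```python
# Illustrative only: assumed $/M-token input rates, substitute real pricing.
BASE_INPUT = 3.00   # assumed rate for uncached input tokens
CACHE_READ = 0.30   # assumed rate for a prompt-cache hit

def request_cost(context_tokens: int, cache_hit: bool) -> float:
    """Input cost of one request at a given context size."""
    rate = CACHE_READ if cache_hit else BASE_INPUT
    return context_tokens / 1_000_000 * rate

for ctx in (50_000, 200_000, 1_000_000):
    hit, miss = request_cost(ctx, True), request_cost(ctx, False)
    # The miss penalty (miss - hit) scales linearly with context size.
    print(f"{ctx:>9} ctx: hit ${hit:.3f}  miss ${miss:.3f}  penalty ${miss - hit:.3f}")
```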


Replies

simsla · last Sunday at 7:13 PM

I do wonder if it's fair to expect users to absorb cache-miss costs when using Claude Code, given how opaque those costs are.

beacon294 · last Sunday at 4:22 PM

Politely, no.

- I wrote an extension in Pi to warm my cache with a heartbeat.

- I wrote another to block submission after the cache expired (when heartbeats were disabled or had run out).

- I wrote a third to hard limit my context window.

- I wrote a fourth to handle cache control placement before forking context for fan out.

- My initial prompt was 1,000 tokens, improving cache efficiency.

Anthropic is STOMPING on the diversity of use cases for their universal tool. See you when you recover.

yummytummy · last Sunday at 3:33 PM

That might be, but the argument was that poor cache utilization in other harnesses was costing Anthropic too much money. If cache usage is factored into rate limits, it doesn't matter from a cost perspective; you'll just hit your rate limits faster in harnesses that don't optimize for caching.

eastbound · last Sunday at 3:48 PM

I'm sorry, but when you wake up in the morning with 12% of your session already used, "it's the cache" is not an acceptable answer.

And I'm using Claude on a small module of my project; the automations that read extra files and take up more context are a scam.