The way coding agents work is fantastically wasteful. All the megabytes of code are processed over an...

killerstorm • yesterday at 5:55 PM • 2 replies

The way coding agents work is fantastically wasteful. All the megabytes of code are processed over and over and over, sometimes within just one session.

There are papers describing KV cache precomputation for commonly used documents (e.g. KVLink), but of course it's not a priority for model providers: they'd rather sell you more tokens, and they'd rather get to AGI/ASI first than optimize usage of existing models...

Replies

brookst • yesterday at 7:45 PM

Claude Code gets >98% KV cache hits. It’s not reprocessing unless you let the cache go cold (5 minutes, which is annoyingly short).
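
To make the cold-cache point concrete, here is a toy simulation of prefix caching with a time-to-live. It is only a sketch under made-up assumptions: `PrefixCache`, `simulate`, the 10,000-token base context, and the 200 tokens added per turn are all hypothetical, and this is not how any provider actually implements its KV cache. The only number taken from the comment above is the 5-minute window.

```python
from dataclasses import dataclass, field

CACHE_TTL_SECONDS = 300  # the 5-minute window mentioned in the comment


@dataclass
class PrefixCache:
    """Toy model: remembers the last token sequence seen, with an expiry
    that is refreshed on every request. Hypothetical, for illustration."""
    ttl: float = CACHE_TTL_SECONDS
    cached: list = field(default_factory=list)
    expires_at: float = float("-inf")

    def process(self, tokens, now):
        """Return (cached_tokens, recomputed_tokens) for one request."""
        if now >= self.expires_at:
            self.cached = []  # cache went cold, nothing can be reused
        hit = 0
        for old, new in zip(self.cached, tokens):
            if old != new:
                break  # only a shared prefix can be reused
            hit += 1
        self.cached = list(tokens)
        self.expires_at = now + self.ttl
        return hit, len(tokens) - hit


def simulate(turn_times, base_tokens=10_000, tokens_per_turn=200):
    """Replay an agent session where each turn resends the whole context
    plus a few new tokens. Returns the fraction of tokens served from cache."""
    cache = PrefixCache()
    context = list(range(base_tokens))  # stand-in for code + system prompt
    total_hit = total = 0
    for now in turn_times:
        hit, miss = cache.process(context, now)
        total_hit += hit
        total += hit + miss
        start = len(context)
        context.extend(range(start, start + tokens_per_turn))  # new turn output
    return total_hit / total


if __name__ == "__main__":
    busy = simulate([i * 30 for i in range(50)])    # a turn every 30 seconds
    idle = simulate([i * 301 for i in range(50)])   # just past the TTL each time
    print(f"busy session hit rate: {busy:.1%}")
    print(f"idle session hit rate: {idle:.1%}")
```

With these invented numbers, the busy session reuses the large majority of its tokens, while the session that pauses just over 5 minutes between turns recomputes everything on every turn. Longer sessions with small per-turn additions push the hit rate higher still.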
