Hacker News

redhale yesterday at 11:55 PM

I feel like caching should be mentioned in tradeoffs, right? If you change the tool list frequently, that's a cache bust. In long sessions that seems like it could significantly affect costs.


Replies

azurewraith today at 12:18 AM

Great question... and there are two answers depending on what you were originally referring to:

re: Claude Code... we don't filter or modify the tool list, so all tools stay visible -- disallowed calls get blocked at execution time with an error message. No cache busts on transitions, since the model always sees the full tool set. The cost there is prompt-caching dollars rather than latency, I suppose.
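To make the execution-time approach concrete, here is a minimal sketch. The state names, tool names, and the `execute` helper are all hypothetical and chosen for illustration; this is not the actual gateway code. The point is that the tool list sent to the model never changes, and only the dispatcher consults the current state.

```python
# Hypothetical state -> allowed-tool mapping; every name here is illustrative.
ALLOWED = {
    "plan": {"read_file", "search"},
    "edit": {"read_file", "write_file"},
}

# The full tool list is sent to the model unchanged in every state.
ALL_TOOLS = ["read_file", "search", "write_file"]


def visible_tools(state: str) -> list[str]:
    """Return the tools shown to the model; the state is deliberately ignored."""
    return list(ALL_TOOLS)


def execute(state: str, tool: str) -> dict:
    """Block disallowed calls at execution time instead of hiding the tool."""
    if tool not in ALLOWED.get(state, set()):
        return {"ok": False, "error": f"tool '{tool}' is not allowed in state '{state}'"}
    return {"ok": True, "result": f"ran {tool}"}
```

Because `visible_tools` returns the same list for every state, the prompt prefix stays identical across transitions, which is why this variant should not invalidate a prefix cache.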

re: The research (Rust agent + Ollama)... the model only receives tool schemas for the current state's allowed tools. Ollama does have a KV cache reuse facility, so changing the tool list busts that cache. Depending on your workflow, this happens once per state transition until completion; for simple workflows that's 3-5 times. Within each state the tool list is stable and the cache operates normally. Presenting a few tools instead of dozens on every agent processing step reduces input tokens and decision complexity, which is where the measurable gains come from.
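A rough sketch of the schema-level variant, and of how you might count cache busts for a given run. It is written in Python for brevity rather than Rust, the state and tool names are hypothetical, and the hash comparison is only a stand-in for "the tool portion of the prompt changed"; it does not model Ollama's actual cache internals.

```python
import hashlib
import json

# Hypothetical tool schemas and per-state allow-lists, for illustration only.
TOOL_SCHEMAS = {
    "read_file": {"name": "read_file", "parameters": {"path": "string"}},
    "search": {"name": "search", "parameters": {"query": "string"}},
    "write_file": {"name": "write_file", "parameters": {"path": "string", "text": "string"}},
    "run_tests": {"name": "run_tests", "parameters": {}},
}
STATE_TOOLS = {
    "plan": {"read_file", "search"},
    "edit": {"read_file", "write_file"},
    "test": {"run_tests"},
}


def tools_for_state(state: str) -> list[dict]:
    """Only the current state's allowed tool schemas are sent to the model."""
    return [TOOL_SCHEMAS[name] for name in sorted(STATE_TOOLS[state])]


def tool_list_key(tools: list[dict]) -> str:
    """Stable fingerprint of the tool list (a proxy for the cached prompt prefix)."""
    return hashlib.sha256(json.dumps(tools, sort_keys=True).encode()).hexdigest()


def count_cache_busts(steps: list[str]) -> int:
    """Count the steps whose tool list differs from the previous step's."""
    busts = 0
    previous = None
    for state in steps:
        key = tool_list_key(tools_for_state(state))
        if previous is not None and key != previous:
            busts += 1
        previous = key
    return busts


# Six agent steps across three states: the tool list changes twice.
print(count_cache_busts(["plan", "plan", "edit", "edit", "edit", "test"]))  # 2
```

Steps within one state share a fingerprint, so the number of busts tracks the number of state transitions rather than the number of agent steps.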

Both enforce the same constraints; which one applies depends on the execution interface. The schema-level filtering in the research is the S-tier approach. Adding tools/list filtering to the MCP gateway would be beneficial if possible (it looks like we could only filter MCP tools, not core ones), which could still provide tangible benefit. I've added this evaluation to the roadmap.
