bob1029 • today at 8:59 AM • 2 replies

> Furthermore, we observe that input tokens consistently constitute the largest share of consumption for an average of 53.9%

I'm seeing an input-to-output ratio of around 10:1 in my usage. The vast majority of the tokens consumed are on the input side. The agent will often read a million tokens just to patch one line of code.

I think if you are seeing something closer to 1:1 or more on the output side, there is either a problem with the agent or the codebase is new / empty.
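For comparing the two figures, here is a rough sketch that converts between an input:output ratio and an input share. It assumes tokens split into only two buckets, input and output; the quoted study may count additional categories, in which case its 53.9% is not directly comparable. The function names are made up for illustration.

```python
def input_share(ratio_in_to_out: float) -> float:
    """Fraction of all tokens that are input, given an input:output ratio.

    Assumes only two token categories (input and output).
    """
    return ratio_in_to_out / (ratio_in_to_out + 1.0)


def ratio_from_share(share: float) -> float:
    """Input:output ratio implied by an input share (0 < share < 1).

    Same two-category assumption as above.
    """
    return share / (1.0 - share)


if __name__ == "__main__":
    # A 10:1 ratio means roughly 91% of tokens are input.
    print(f"10:1 ratio -> input share {input_share(10):.1%}")
    # A 53.9% share would correspond to roughly 1.17:1 under this assumption.
    print(f"53.9% share -> ratio {ratio_from_share(0.539):.2f}:1")
```

Under that two-bucket assumption, a 53.9% share is much closer to the 1:1 case than to 10:1.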

Replies

zozbot234 • today at 9:20 AM

If input tokens dominate the cost to that extent, this implies that major gains are possible by making better use of caching. You could basically ask the model to do a one-time "compaction" step that includes a dump of the relevant portions of the code, and use that as the cached prefix for a large number of "swarm" subagent calls.
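A back-of-the-envelope sketch of why a shared cached prefix could help. Everything here is an assumption for illustration: the function name, the token counts, and especially `cached_rate` (the fraction of the normal input price charged for cached prefix tokens), which varies by provider. It also ignores any cache-write surcharge and cache expiry.

```python
from typing import Optional


def billed_input_tokens(
    prefix_tokens: int,
    suffix_tokens: int,
    n_calls: int,
    cached_rate: Optional[float] = None,
) -> float:
    """Input tokens billed, expressed in full-price token equivalents.

    prefix_tokens: shared "compacted" context sent with every call
    suffix_tokens: per-call instructions unique to each subagent
    cached_rate:   hypothetical price multiplier for cached prefix tokens;
                   None means no caching at all
    """
    if n_calls <= 0:
        return 0.0
    if cached_rate is None:
        # Every call pays full price for prefix and suffix.
        return float(n_calls * (prefix_tokens + suffix_tokens))
    # The first call pays full price for the prefix (simplification: no
    # write surcharge); later calls pay the discounted rate for it.
    first = prefix_tokens + suffix_tokens
    rest = (n_calls - 1) * (prefix_tokens * cached_rate + suffix_tokens)
    return float(first + rest)


if __name__ == "__main__":
    uncached = billed_input_tokens(100_000, 2_000, 50)
    cached = billed_input_tokens(100_000, 2_000, 50, cached_rate=0.1)
    print(f"uncached: {uncached:,.0f}  cached: {cached:,.0f}")
```

With these made-up numbers, 50 subagent calls drop from about 5.1M to about 0.69M full-price input-token equivalents. The real saving depends on the provider's actual cache pricing and on how much of each call the prefix covers.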

kolinko • today at 9:29 AM

Did you experiment with giving the agent better tools to navigate and document the codebase? ASTs, language servers and so on?

A million tokens (not cached) sounds like a lot.
