Hacker News

kmeisthax · yesterday at 2:37 PM

I thought we were going to hit token saturation years ago, but they keep inventing new ways to use tokens. Like, instead of asking a chat model to write something and getting ~1000 tokens out of it, you now have an agent producing ~10,000 tokens - or, worse, spawning 10 subagents that collectively burn ~100,000 tokens. All for marginally better answers with significantly higher compute usage.

Personally, I would have used all those tokens to generate synthetic data for IDA (iterated distillation and amplification), so that the more efficient ~1,000-token-per-answer chat model could answer more questions. But apparently that doesn't justify an insane datacenter buildout.
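For anyone unfamiliar, the IDA loop is roughly: spend the expensive agent tokens once to label data, then distill the cheap model on the result, and repeat. A minimal sketch in Python, where every name (amplify, distill, etc.) is a hypothetical stand-in rather than any real API:

    from typing import Callable, List, Tuple

    Answerer = Callable[[str], str]

    def ida_round(
        questions: List[str],
        amplify: Answerer,  # expensive: agent + subagents, ~100k tokens/answer
        distill: Callable[[List[Tuple[str, str]]], Answerer],
    ) -> Answerer:
        # Spend the tokens once to label data with the amplified system...
        synthetic = [(q, amplify(q)) for q in questions]
        # ...then train a cheap model on those pairs; it approximates the
        # amplified system at ~1k tokens per answer instead of ~100k.
        return distill(synthetic)

    # Iterating: wrap the distilled model in the agent scaffolding again as
    # the next round's amplify, so quality compounds without paying the
    # per-query token burn every time.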


Replies

azinman2 · yesterday at 3:46 PM

Everyone is interested in using fewer tokens to accomplish the same task.

user34283 · yesterday at 3:58 PM

Marginally better answers?

Claude Code and co. can now analyze an enterprise codebase to debug issues in a system spanning multiple services.

I don't see how that would have been possible at all in the past.