I feel like what is needed is not compression, but aggressive context management with subagents.

SubiculumCode • yesterday at 5:54 PM • 4 replies

Replies

So burn more tokens to save more tokens, so that we can spend more on X tokens but save on Y tokens?

Now the question is: which X tokens and which Y tokens? And since the output is non-deterministic, how do you validate this?

LLMs aren't random, and that reinforces something people are too dumb to realize: randomness can be normally distributed, but LLM output has no reason to be normally distributed or to follow any sort of curve of understanding.

They are non-deterministic but biased, so their output might just be worse under a transformation T' for the class of problems A is solving, yet work great for B, or vice versa.

You can't reproducibly test LLMs, and that allows all sorts of benchmarks to exist that can make any model look as good or as bad as we want. Enlightening stuff.

Not much different from sociological or psychological sciences where with enough bias in data you can prove anything.

lackoftactics • yesterday at 5:57 PM

I am the author of the text.

What do you mean by aggressive context management with subagents? Would you add a loop that would trim the context?

Both of those tasks seem even more difficult
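To make the "loop that would trim the context" idea concrete, here is a minimal sketch of one way it could look. Everything in it is an assumption for illustration: `count_tokens` is a crude whitespace stand-in for a real tokenizer, and `trim_context`, `budget`, and `keep_first` are hypothetical names, not anything from the article.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count as a rough proxy for tokens.
    # A real tokenizer would give different numbers.
    return len(text.split())


def trim_context(messages: List[str], budget: int, keep_first: int = 1) -> List[str]:
    """Drop the oldest messages after the first `keep_first` until the
    total fits in `budget`, or until only the pinned messages remain."""
    trimmed = list(messages)  # work on a copy, leave the caller's list alone
    while (
        sum(count_tokens(m) for m in trimmed) > budget
        and len(trimmed) > keep_first
    ):
        del trimmed[keep_first]  # oldest message that is not pinned
    return trimmed
```

The hard part is not the loop but deciding what is safe to drop. This version just drops the oldest turns, which is exactly the kind of blind trimming that can hurt some classes of problems.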


arcanemachiner • yesterday at 6:09 PM

Use the right tool for the job.

If you need a piece of information that is buried somewhere, or a high-level summary/distillation of a larger body of info, then subagents may be the right tool for the job.

If you need all the gathered context for later use (i.e. distilled context is insufficient), then subagents probably are not the right tool for the job.
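A rough sketch of the distinction, under stated assumptions: `ModelFn` is a hypothetical stand-in for whatever LLM call you use (prompt in, text out), and `run_subagent` and `gather_with_subagent` are made-up names. The point is only that the subagent reads the full material in a throwaway context and the main context receives just the distillation.

```python
from typing import Callable, List

# Hypothetical model interface: takes a prompt, returns a completion.
ModelFn = Callable[[str], str]


def run_subagent(model: ModelFn, task: str, documents: List[str]) -> str:
    """Run one task in a fresh context and return only the model's answer."""
    prompt = task + "\n\n" + "\n---\n".join(documents)
    return model(prompt)


def gather_with_subagent(
    model: ModelFn,
    question: str,
    documents: List[str],
    main_context: List[str],
) -> List[str]:
    """Return a new main context extended with the distilled summary only.
    The raw documents never enter the main context."""
    summary = run_subagent(
        model, f"Summarize what is relevant to: {question}", documents
    )
    return main_context + [summary]
```

If later steps need the raw documents rather than the summary, this pattern loses them, which is the second case above.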


nullbio • today at 1:44 AM

I feel like what is needed is more local tool usage and small local model usage that does the heavy lifting, rather than the paid for LLM burning tokens at all.
