
naasking

I agree LLMs are inefficient, but I don't think they are as inefficient as you imply. Human brains use a lot less power, sure, but they're also a lot slower and worse at parallelism. An LLM can write an essay in a few minutes that would take a human days. If you aggregate all the power the human consumes over those days, you're looking at kWh, an order of magnitude or more above what the LLM used. And this doesn't even count batch parallelism, which further reduces power use per request.
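To make that concrete, here's a back-of-envelope sketch in Python. Every figure in it (whole-body metabolic power, hours spent on the essay, energy per LLM request) is an assumed ballpark I'm plugging in for illustration, not a measurement:

    # Back-of-envelope energy comparison. Every figure below is an
    # assumed ballpark, not a measurement.
    BODY_POWER_W = 100       # whole-body metabolic draw (~2000 kcal/day)
    ESSAY_HOURS = 16         # assumed working time, spread over days
    LLM_REQUEST_WH = 5.0     # assumed energy for one long LLM generation

    human_wh = BODY_POWER_W * ESSAY_HOURS      # 1600 Wh = 1.6 kWh
    ratio = human_wh / LLM_REQUEST_WH          # ~320x

    print(f"human: {human_wh/1000:.1f} kWh, LLM: {LLM_REQUEST_WH} Wh, "
          f"ratio: ~{ratio:.0f}x")

Even if you only count the brain's ~20 W instead of the whole body, or assume a much costlier LLM request, the gap stays at an order of magnitude or more under these assumptions.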

But I do think there is further underlying structure that can be exploited. Recent work on geometric and latent interpretations of reasoning, on geometric approaches to accelerating grokking, and on linear replacements for attention points in promising directions (see the sketch below), and multimodal training will further improve semantic synthesis.
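On the linear-attention point, the core trick is just associativity: softmax(QK^T)V costs O(N^2) in sequence length, but with a kernel feature map phi you can compute phi(Q) @ (phi(K)^T V) instead, which is O(N). A minimal generic sketch in numpy (the feature map here is a stand-in assumption, not taken from any particular paper):

    import numpy as np

    def phi(x):
        # Simple positive feature map (an illustrative assumption;
        # real linear-attention variants choose this carefully).
        return np.maximum(x, 0) + 1e-6

    N, d = 1024, 64
    Q, K, V = (np.random.randn(N, d) for _ in range(3))

    kv = phi(K).T @ V                    # (d, d): fixed-size summary of keys/values
    z = phi(Q) @ phi(K).sum(axis=0)      # (N,): per-query normalizer
    out = (phi(Q) @ kv) / z[:, None]     # (N, d): attention in O(N*d^2)

The (d, d) summary is what makes this attractive: it doubles as a recurrent state, so generation can run with constant memory per step rather than growing with context length.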