Hacker News

segmondy · yesterday at 5:58 PM

You do realize Claude Opus/GPT-5 are probably 1000B-2000B-parameter models? So expecting a model that's <60B to offer the same level of performance will be a miracle...


Replies

jrop · yesterday at 6:10 PM

I don't buy this. I've long wondered whether the larger models, while exhibiting more useful knowledge, are also more wasteful, as we greedily push the frontier of "bigger is getting us better results, so make it bigger". Qwen3-Coder-Next seems to be a point in favor of that thought: we need to spend some time exploring what smaller models are capable of.

Perhaps I'm grossly wrong -- I guess time will tell.

regularfry · yesterday at 10:14 PM

There is (there must be, by information theory) a size/capacity efficiency frontier. There is no particular reason to think we're anywhere near it right now.

epolanski · yesterday at 11:33 PM

Aren't both the latest Opus and Sonnet smaller than the previous versions?