Hacker News

qeternity yesterday at 1:54 PM

Number of parameters is at least a proxy for model capability.

You can achieve incredible tok/dollar or tok/sec with Qwen3 0.6b.

It just won't be very good for most use cases.


Replies

janalsncm yesterday at 5:19 PM

Model capability is the other axis on their chart. So they could have put Qwen 0.6b there; it would be in the bottom-right corner.

I know what they are trying to do. They are attempting to show a kind of Pareto frontier, but it’s a little awkward.
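
For anyone unsure what a Pareto frontier means here, a rough sketch follows. It assumes two axes where higher is better on both (tokens per dollar and some capability score). The model names and numbers are made up purely for illustration and are not taken from the chart being discussed.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    tok_per_dollar: float  # higher is better
    capability: float      # higher is better


def pareto_frontier(models):
    """Return the models that no other model beats or ties on both axes
    while being strictly better on at least one."""
    frontier = []
    for m in models:
        dominated = any(
            o.tok_per_dollar >= m.tok_per_dollar
            and o.capability >= m.capability
            and (o.tok_per_dollar > m.tok_per_dollar or o.capability > m.capability)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.tok_per_dollar)


# Hypothetical data points, invented for this example only.
models = [
    Model("tiny", 100.0, 10.0),      # very cheap, weak
    Model("mid", 40.0, 50.0),
    Model("mid-slow", 30.0, 45.0),   # worse than "mid" on both axes
    Model("large", 5.0, 90.0),       # expensive, strong
]

frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)
```

Note that under these assumptions the tiny model still lands on the frontier: nothing else is cheaper, so nothing dominates it. It just sits at the cheap, low-capability end of the curve.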