Hacker News

halJordan · yesterday at 7:57 PM

Qwen isn't directing the forward progress of LLMs. SOTA LLMs have been MoE since GPT-4. The OG 4.

Out of context, but I honestly hate how HN let itself get so far behind the times that this is the sort of inane commentary we get on AI.


Replies

refulgentis · yesterday at 9:17 PM

I would venture that reading it as "Qwen made MoEs in toto, or first, or better than anyone else" is reductive. Merely, the expert counts and parameter numbers here are quite novel (70B total... inferencing only 3B!?). I sometimes kick around the same take, but thought I'd stand up for this one. And I know what I'm talking about: I maintain a client that wraps llama.cpp plus ~20 models on inference APIs.
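
For context on how a model can store tens of billions of parameters yet touch only a few billion per token: each MoE layer holds many expert MLPs, but a router sends each token through only the top-k of them. Below is a minimal sketch of top-k expert routing; all sizes, names, and the moe_forward helper are toy illustrations, not Qwen's (or anyone's) actual configuration.

    # Toy sketch of top-k mixture-of-experts routing, showing why total
    # parameter count can far exceed the parameters active per token.
    # Sizes are illustrative only, NOT any real model's configuration.
    import numpy as np

    d_model = 64      # hidden size (toy)
    d_ff = 256        # expert feed-forward size (toy)
    n_experts = 64    # experts stored in the layer
    top_k = 2         # experts activated per token

    rng = np.random.default_rng(0)
    # Each expert is a small 2-layer MLP: d_model -> d_ff -> d_model.
    experts = [
        (rng.standard_normal((d_model, d_ff)) * 0.02,
         rng.standard_normal((d_ff, d_model)) * 0.02)
        for _ in range(n_experts)
    ]
    router = rng.standard_normal((d_model, n_experts)) * 0.02  # gating weights

    def moe_forward(x):
        """Route token x to its top-k experts and mix their outputs."""
        logits = x @ router                    # one score per expert
        top = np.argsort(logits)[-top_k:]      # indices of the chosen experts
        gates = np.exp(logits[top])
        gates /= gates.sum()                   # softmax over chosen experts only
        out = np.zeros_like(x)
        for g, i in zip(gates, top):
            w1, w2 = experts[i]
            out += g * (np.maximum(x @ w1, 0.0) @ w2)  # ReLU MLP expert
        return out

    y = moe_forward(rng.standard_normal(d_model))

    per_expert = d_model * d_ff + d_ff * d_model
    print(f"stored expert params: {n_experts * per_expert:,}")  # all in memory
    print(f"active per token:     {top_k * per_expert:,}")      # only top-k run

With 64 experts but top_k = 2, this toy layer stores 32x more expert parameters than any single token activates; scale that ratio up across layers and you get the headline pattern of a ~70B-total model running with only a few billion active parameters per token.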