Hacker News

jug, yesterday at 3:42 PM

> The Qwen3-Next-80B-A3B-Instruct performs comparably to our flagship model Qwen3-235B-A22B-Instruct-2507

I'm skeptical about these claims. How can this be? Wouldn't there be massive loss of world knowledge? I'm particularly skeptical because a recent trend in Q2 2025 has been benchmaxxing.


Replies

dragonwriter, yesterday at 3:45 PM

> I'm skeptical about these claims. How can this be?

More efficient architecture.

> Wouldn't there be massive loss of world knowledge?

If you assume equally efficient architecture and no other salient differences, yes, that’s what you’d expect from a smaller model.
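
For a sense of the sizes being compared: the model names appear to encode total and active (per-token) parameter counts, e.g. 80B total with 3B active. Here is a rough sketch of the arithmetic, assuming that naming convention holds. `parse_moe_name` is a hypothetical helper written for illustration, not part of any Qwen tooling.

```python
import re


def parse_moe_name(name):
    """Return (total, active) parameter counts in billions.

    Assumes the name contains a '<total>B-A<active>B' segment,
    as in 'Qwen3-Next-80B-A3B-Instruct'.
    """
    match = re.search(r"-(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", name)
    if match is None:
        raise ValueError(f"no '<total>B-A<active>B' segment in {name!r}")
    return float(match.group(1)), float(match.group(2))


small_total, small_active = parse_moe_name("Qwen3-Next-80B-A3B-Instruct")
large_total, large_active = parse_moe_name("Qwen3-235B-A22B-Instruct-2507")

# Ratios of the smaller model to the larger one.
total_ratio = small_total / large_total
active_ratio = small_active / large_active

print(f"total parameters:  {small_total:g}B vs {large_total:g}B ({total_ratio:.0%})")
print(f"active parameters: {small_active:g}B vs {large_active:g}B ({active_ratio:.0%})")
```

Under that reading, the smaller model has roughly a third of the total parameters and about a seventh of the active ones, so the world-knowledge question is mostly about the 80B vs 235B gap.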
