> The Qwen3-Next-80B-A3B-Instruct performs comparably to our flagship model Qwen3-235B-A22B-Instruct-2507
I'm skeptical about these claims. How can this be? Wouldn't there be massive loss of world knowledge? I'm particularly skeptical because a recent trend in Q2 2025 has been benchmaxxing.
> I'm skeptical about these claims. How can this be?
More efficient architecture.
> Wouldn't there be massive loss of world knowledge?
If you assume an equally efficient architecture and no other salient differences, yes, that’s what you’d expect from a smaller model.
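
For a sense of scale, here is a rough sketch that reads the sizes off the two model names. It assumes the usual Qwen naming, where `80B-A3B` means about 80B total parameters with about 3B active per token; `parse_params` is a hypothetical helper written for this illustration, not part of any library. The ratios describe size only and say nothing about how much knowledge either model retains.

```python
import re

def parse_params(name: str) -> tuple[float, float]:
    """Return (total, active) parameter counts in billions parsed from a model name.

    Assumes the name contains a segment shaped like '-80B-A3B'.
    """
    m = re.search(r"-(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", name)
    if m is None:
        raise ValueError(f"no '<total>B-A<active>B' segment in {name!r}")
    return float(m.group(1)), float(m.group(2))

small_total, small_active = parse_params("Qwen3-Next-80B-A3B-Instruct")
large_total, large_active = parse_params("Qwen3-235B-A22B-Instruct-2507")

# Size ratios of the smaller model relative to the flagship.
total_ratio = small_total / large_total
active_ratio = small_active / large_active

print(f"total parameters:  {total_ratio:.0%} of the flagship")
print(f"active parameters: {active_ratio:.0%} of the flagship")
```

So the smaller model has roughly a third of the total parameters and a much smaller fraction of the active ones, which is why the "comparable performance" claim rests entirely on the architecture being more efficient.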