What? 35B-A3B is not nearly as smart as 27B.

Miraste • yesterday at 2:51 PM

Replies

One interesting thing about Qwen3 is that looking at the benchmarks, the 35B-A3B models seem to be only a bit worse than the dense 27B ones. This is very different from Gemma 4, where the 26B-A4B model is much worse on several benchmarks (e.g. Codeforces, HLE) than 31B.


ekianjo • yesterday at 2:56 PM

Yeah, the 27B feels like something completely different. If you use it on long-context tasks, it performs WAY better than 35B-A3B.


zkmon • yesterday at 2:55 PM

Yes.
