Hacker News

greenavocado, today at 5:16 PM

Qwen 3.6 27B still curb stomps Deepseek V4 in coding


Replies

epolanski, today at 5:50 PM

1. Deepseek V4 is still in preview (training is not finished)

2. Qwen is much more demanding and borderline unusable on consumer hardware because it's a dense model: all 27B parameters are active for every token. It's not a MoE architecture, where a router activates only a subset of them.

3. Qwen doesn't like quantization at all.
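
To put rough numbers on point 2, here is a back-of-the-envelope sketch. It assumes the common rule of thumb of about 2 FLOPs per active parameter per generated token, and it counts weight memory only (no KV cache or activations). The MoE figures are made up purely for illustration and are not the specs of Deepseek V4 or any real model; the function names are hypothetical too.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


def flops_per_token(active_params: float) -> float:
    """Rule of thumb (an assumption): ~2 FLOPs per active parameter per token."""
    return 2 * active_params


# Dense model: every parameter takes part in every token.
dense_total = 27e9
dense_active = dense_total

# Hypothetical MoE model. These numbers are invented for illustration only.
moe_total = 100e9   # all experts must still be stored somewhere
moe_active = 10e9   # but the router only uses a subset per token

for name, total, active in [
    ("dense 27B", dense_total, dense_active),
    ("hypothetical MoE", moe_total, moe_active),
]:
    print(name)
    print(f"  compute per token: {flops_per_token(active) / 1e9:.0f} GFLOPs")
    for bits in (16, 8, 4):
        print(f"  weights at {bits}-bit: {weight_memory_gb(total, bits):.1f} GB")
```

Under these assumptions the dense 27B model needs about 54 GB for weights at 16 bits and about 13.5 GB at 4 bits, and it pays the full 27B of compute on every token. The made-up MoE stores more weights in total but does far less work per token, which is the trade-off described above.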
