I realize my SpongeBob post came off flippant, and that wasn't the intent. The SpongeBob ASCII test (picked up from Qwen's own Twitter) is explicitly a rote-memorization probe; bigger dense models usually ace it because sheer parameter count can store the sequence.
With Qwen3's sparse MoE, though, the path to that memory is noisier: there are two extra stochastic draws, (a) which expert(s) fire, and (b) which token gets sampled from them. Add the new gated attention and multi-token prediction heads, and you've got a pipeline where a single routing flake or a dud expert can break vertical alignment halfway down the picture.
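To make that concrete, here's a toy Python sketch of those two draws. The router, expert heads, and weights are all made up for illustration and have nothing to do with Qwen3's actual architecture or code; the point is only that the top-k routing pick and the token sample are two separate chances for the same prompt to produce a different character at some row of the picture.

    import numpy as np

    rng = np.random.default_rng(0)

    VOCAB = 8       # tiny toy vocabulary
    N_EXPERTS = 4   # hypothetical expert count, not Qwen3's
    TOP_K = 2       # experts activated per token

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def moe_step(hidden, w_router, expert_heads, temperature=1.0):
        # draw (a): top-k experts chosen from jittery router scores
        # (small Gumbel noise stands in for routing instability)
        router_logits = w_router @ hidden + 0.1 * rng.gumbel(size=N_EXPERTS)
        experts = np.argsort(router_logits)[-TOP_K:]
        gates = softmax(router_logits[experts])

        # mix the chosen experts' token logits, weighted by the gate values
        token_logits = sum(g * (expert_heads[e] @ hidden) for g, e in zip(gates, experts))

        # draw (b): sample the next token from the mixed distribution
        probs = softmax(token_logits / temperature)
        return rng.choice(VOCAB, p=probs), sorted(experts.tolist())

    # made-up weights; a "dud" expert here is just one whose head ranks the wrong token highest
    hidden = rng.normal(size=16)
    w_router = rng.normal(size=(N_EXPERTS, 16))
    expert_heads = rng.normal(size=(N_EXPERTS, VOCAB, 16))

    for trial in range(3):
        token, experts = moe_step(hidden, w_router, expert_heads)
        print(f"trial {trial}: experts {experts} -> token {token}")

Run it a few times and either draw alone can flip the emitted token; over a few hundred characters of ASCII art those flips compound.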
Anyway, I think Qwen3-Coder was specifically trained on this one, so it's not a fair comparison. Here are some other Qwen3 models:
Model: chutes/Qwen/Qwen3-235B-A22B
/~\
( * * )
( o o o )
\ - /
\ /\ /
\ /
\/
/|||\
/|||||\
/||||||||\
( o o o )
\ W /
\___/
Model: chutes/Qwen/Qwen3-235B-A22B-Instruct-2507
/\_/\
( o.o )
> ^ <
/ \
| |
| |
\ /
'-'-'
Model: chutes/Qwen/Qwen3-235B-A22B-Thinking-2507
.-----------.
| [] [] |
| |
| __ __ |
| | || | |
| |__||__| |
| |
'-----------'
Model: chutes/Qwen/Qwen3-Next-80B-A3B-Instruct
__
/ \
/ \
/ \
/ \
/__________\
| o o |
| __ |
| \____/ |
| |
| ________ |
| \ / |
| \____/ |
| |
|____________|
___________
/ \
/ \
| _______ |
| | | |
| | ___ | |
| | | | | |
| | |___| | |
| |_______| |
| |
|_______________|
Model: chutes/Qwen/Qwen3-Next-80B-A3B-Thinking
.-.
/ \
| o o|
| > |
| ---|
\___/
Model: chutes/Qwen/Qwen3-30B-A3B-Instruct-2507
_________________________
/ \
| _ _ _ _ |
| / \ / \ / \ / \ |
| | | | | | | | | |
| \_/ \_/ \_/ \_/ |
| |
| _ _ _ _ |
| / \ / \ / \ / \ |
| | | | | | | | | |
| \_/ \_/ \_/ \_/ |
| |
| SpongeBob SquarePants |
|_________________________|