Gemma4 is still power hungry since it tends to activate pretty much every weight for each token.
qwen3-coder-next uses a lot less since it seems to activate only ~3B parameters per token.
My guess is that this is still close to a tech demo, and a lot of performance is left on the table.
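
A rough back-of-the-envelope sketch of the comparison above, assuming compute per token scales at about 2 FLOPs per activated weight (a common rule of thumb, not a measurement). The 30B dense figure is a made-up placeholder, not Gemma4's actual size; the 3B figure is the approximate active count mentioned above. Real power draw also depends on memory traffic and hardware, so treat the ratio as indicative only.

```python
def flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs (one multiply, one add) per active weight."""
    return 2.0 * active_params


# Hypothetical numbers for illustration only.
dense_active = 30e9   # placeholder: a dense model that touches every weight
sparse_active = 3e9   # ~3B parameters activated per token

ratio = flops_per_token(dense_active) / flops_per_token(sparse_active)
print(f"dense vs. sparse compute per token: ~{ratio:.0f}x")
```

Under these assumptions the dense model does roughly ten times the arithmetic per token, which is the direction of the power difference described above, if not its exact size.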