This is true, and we have the numbers to back it up at https://gertlabs.com/rankings?mode=oneshot_coding (check out the efficiency chart too).
GPT 5.5/5.4 are the smartest models, but at a steep cost in tokens and code bloat. Qwen 3.6 Max strikes a good balance. Gemma 4 26B, though, writes some really efficient code, with great results considering the model size. Things do start falling apart at longer context lengths.