This is genuinely very helpful. I'm planning a MacBook Pro purchase with local inference in mind, and now I see I'll have to aim for a higher memory option, because the Gemma A4 26B MoE is not all that!
I upgraded my M4 Pro (24GB) to an M5 Pro (48GB) yesterday. The same Gemma 4 MoE model (4-bit, I don't remember which version) runs about 8x faster on the M5 Pro and loads into memory about 2x faster (rough timing sketch below).
So yes, do purchase that new MacBook Pro.
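If anyone wants to sanity-check numbers like that on their own machine, here's a minimal timing sketch using mlx-lm on Apple Silicon. The model repo name, prompt, and token count are placeholders, not the exact model from the comment above:

    # Rough load-time and tokens/sec measurement with mlx-lm (Apple Silicon).
    # MODEL is a placeholder; swap in whichever 4-bit model you actually run.
    import time
    from mlx_lm import load, generate

    MODEL = "mlx-community/<some-4bit-model>"  # placeholder repo name

    t0 = time.perf_counter()
    model, tokenizer = load(MODEL)            # loads weights into unified memory
    print(f"load time: {time.perf_counter() - t0:.1f}s")

    prompt = "Explain MoE vs dense transformers in two sentences."
    t0 = time.perf_counter()
    text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
    gen_s = time.perf_counter() - t0

    # Approximate generation speed from the emitted token count.
    n_tokens = len(tokenizer.encode(text))
    print(f"~{n_tokens / gen_s:.1f} tok/s over {gen_s:.1f}s")

Run it once on each machine with the same model and prompt to get a comparable before/after number.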
If you're doing it specifically for inference (or in most other situations), a Mac(Book) represents a very low return on investment.
Pretty sure an Nvidia GPU is better bang for the buck because of the usable inference speed.