Not for inference. The M3 Ultra runs big LLMs twice as fast as an RTX 5090.
https://creativestrategies.com/mac-studio-m3-ultra-ai-workst...
The RTX 5090 has only 32 GB of VRAM. The M3 Ultra has up to 512 GB of unified memory with 819 GB/s bandwidth. It can run models that will not fit on an RTX card.
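Back-of-the-envelope (my numbers, not from the article): single-stream decode is memory-bandwidth-bound, so tokens/sec tops out at roughly bandwidth divided by the weight bytes read per token. A sketch with a hypothetical dense 70B model at 4-bit:

    # Rough decode-speed ceiling for a memory-bound dense model.
    # Real throughput lands well below this; compute, KV cache,
    # and MoE routing all change the picture.
    bandwidth_gb_s = 819           # M3 Ultra memory bandwidth
    weights_gb = 70 * 0.5          # hypothetical 70B params at 4-bit ~= 35 GB
    ceiling = bandwidth_gb_s / weights_gb
    print(f"~{ceiling:.0f} tok/s upper bound")  # ~23 tok/s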
EDIT: The benchmark may not be properly utilizing the 5090. But the M3 Ultra is way more capable at LLM inference than an entry-level RTX card.
It can run models that will not fit on TEN RTX 5090s (yes, it runs DeepSeek V3/R1, quantized at 4 bits, at an honest 18-19 tok/s, and that is a model you cannot fit into ten 5090s).
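The capacity math is easy to check (a rough sketch; it ignores KV cache and quantization metadata, which only make the fit worse):

    # DeepSeek V3/R1 has 671B parameters; 4-bit = half a byte per weight.
    weights_gb = 671 * 0.5          # ~335 GB of weights alone
    print(weights_gb <= 10 * 32)    # ten RTX 5090s, 320 GB total: False
    print(weights_gb <= 512)        # maxed-out M3 Ultra:          True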
My little $599 Mac mini does inference about 15-20% slower than the 5070 in my kids' gaming rig. They cost about the same, so it's like I got a free computer.
Nvidia makes an incredible product, but Apple's different market-segmentation strategy might make it a real player in the long run.