logoalt Hacker News

balnazzarlast Saturday at 11:03 PM1 replyview on HN

It can run models that cannot fit on TEN rtx 5090s (yes, it can run DeepSeek V3/R1, quantized at 4 bit, at a honest 18-19 tok/s, and that's a model you cannot fit into 10 5090s..).


Replies

voidsparklast Saturday at 11:22 PM

Right, that's the $9500 Mac Studio with 512GB RAM and 80-core GPU.

16x the RAM of RTX 5090.

There are two versions of the M3 Ultra

28-core CPU, 60-core GPU

32-core CPU, 80-core GPU

Both have a 32-core Neural Engine.