Hacker News

voidspark, last Saturday at 7:16 PM

M3 Ultra has a big GPU with 819 GB/sec bandwidth.

LLM performance is twice as fast as RTX 5090

https://creativestrategies.com/mac-studio-m3-ultra-ai-workst...


Replies

behnamoh, last Saturday at 7:22 PM

> LLM performance is twice as fast as RTX 5090

Your tests are wrong. You used MLX for the Mac Studio (optimized for Apple Silicon) but you didn't use vLLM for the 5090. There's no way a machine with half the bandwidth of the 5090 delivers twice the tok/s.
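To put rough numbers on the bandwidth argument, here is a back-of-envelope sketch. It assumes single-stream decoding is memory-bandwidth bound and that every generated token reads all the weights once, so the tok/s ceiling is about bandwidth divided by model size. The 819 GB/s figure is from the parent comment; the 1792 GB/s figure for the 5090 is its published spec and the 20 GB model size is a hypothetical, neither is from the linked tests. It also assumes the weights fit in memory on both machines.

```python
def max_decode_tok_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if each token streams all weights once."""
    return bandwidth_gb_s / model_size_gb


M3_ULTRA_GB_S = 819.0    # from the parent comment
RTX_5090_GB_S = 1792.0   # assumption: published spec, not from the thread
MODEL_SIZE_GB = 20.0     # hypothetical quantized model, for illustration only

m3_ceiling = max_decode_tok_per_sec(M3_ULTRA_GB_S, MODEL_SIZE_GB)
rtx_ceiling = max_decode_tok_per_sec(RTX_5090_GB_S, MODEL_SIZE_GB)

print(f"M3 Ultra ceiling: {m3_ceiling:.1f} tok/s")
print(f"RTX 5090 ceiling: {rtx_ceiling:.1f} tok/s")
print(f"M3 Ultra / RTX 5090: {m3_ceiling / rtx_ceiling:.2f}x")
```

Under these assumptions the ratio doesn't depend on the model size at all, only on the two bandwidth numbers, so the Mac's ceiling comes out at a bit under half of the 5090's. Real throughput will be below both ceilings and depends on the software stack, which is the point about MLX vs vLLM.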
