avidphantasm • yesterday at 10:59 PM • 1 reply

Not sure where 40 tokens per second is coming from. I’ve seen 95-100 tokens per second on M5 Max 128GB running Gemma 4 31B. I’ve done experiments where it is faster than Claude Opus 4.5 for the same prompts.

Replies

dhiraj_bhakta_ • today at 4:52 AM

Can you share your configuration, please?
