Hacker News

avidphantasm, yesterday at 10:59 PM

Not sure where 40 tokens per second is coming from. I’ve seen 95-100 tokens per second on an M5 Max with 128GB running Gemma 4 31B. I’ve run experiments where it was faster than Claude Opus 4.5 on the same prompts.


Replies

dhiraj_bhakta_, today at 4:52 AM

Can you share your configuration, please?