Hacker News

breput today at 12:46 AM

That's good information. I couldn't even begin to run DeepSeek Flash on my system, but if you're assuming multiple GPUs, that's going to affect the napkin math.


Replies

martinald today at 9:43 AM

The point is that tok/s/GPU stays roughly stable. So you need, say, 4 GB200s minimum to fit the model, but that gives you 4x the tok/s of a single GPU.
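
To put the napkin math in code form, here is a rough sketch of that scaling argument. It assumes tok/s/GPU really does stay constant as GPUs are added (the claim above, not something measured here), and the function names, the 1000 tok/s/GPU figure, and the $10/GPU-hour price are all made-up placeholders, not real GB200 numbers.

```python
def aggregate_tok_s(per_gpu_tok_s: float, n_gpus: int) -> float:
    """Total throughput, assuming tok/s per GPU is constant across GPU counts."""
    if n_gpus < 1:
        raise ValueError("need at least one GPU")
    return per_gpu_tok_s * n_gpus


def cost_per_million_tokens(n_gpus: int, gpu_hour_cost: float,
                            per_gpu_tok_s: float) -> float:
    """Cost per 1M tokens for a cluster of n_gpus (hypothetical hourly price)."""
    cluster_cost_per_hour = n_gpus * gpu_hour_cost
    tokens_per_hour = aggregate_tok_s(per_gpu_tok_s, n_gpus) * 3600
    return cluster_cost_per_hour / tokens_per_hour * 1_000_000


# Placeholder inputs, purely for illustration.
PER_GPU_TOK_S = 1000.0   # assumed, not a benchmark
GPU_HOUR_COST = 10.0     # assumed, not a real price

for n in (1, 4, 8):
    total = aggregate_tok_s(PER_GPU_TOK_S, n)
    cost = cost_per_million_tokens(n, GPU_HOUR_COST, PER_GPU_TOK_S)
    print(f"{n} GPUs: {total:.0f} tok/s, ${cost:.2f} per 1M tokens")
```

Under that assumption the GPU count cancels out of the cost per token: needing 4 GPUs to fit the model raises the minimum buy-in, but not the per-token cost, as long as you can keep all 4 busy.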