> Running DeepSeek V3 (685B) requires 8×H100 GPUs which is about $14k/month. Most developers only need 15-25 tok/s.
> deepseek-v3.2-685b, $40/mo/slot for ~20 tok/s, 465 slots total
> 465 users × 20 tok/s = 9,300 tok/s needed
> The node peaks at ~3,000 tok/s total. So at full capacity they can really only serve:
> 3,000 ÷ 20 = 150 concurrent users at 20 tok/s
> That's only 32% of the cohort being active simultaneously.
People presumably only work about 8 hours a day, so I guess they are banking on most slots being idle at any given moment.
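
Quick sanity check of the quoted numbers, as a rough sketch. The slot count, price, and throughput figures are taken from the quote; the "8 active hours out of 24, spread evenly around the clock" part is purely my own assumption, and the variable names are just illustrative.

```python
# Figures taken from the quoted listing
slots = 465            # total slots sold
price_per_slot = 40    # USD per month per slot
per_user_tps = 20      # advertised tok/s per slot
node_peak_tps = 3000   # approximate peak throughput of the node

# Throughput needed if every slot were active at once
demand_tps = slots * per_user_tps                # 9300

# Users the node can serve at the advertised speed
max_concurrent = node_peak_tps // per_user_tps   # 150
active_share = max_concurrent / slots            # ~0.32

# Assumption (not from the listing): each user is active 8 of every
# 24 hours, and usage is spread evenly across the day.
hours_active = 8
expected_concurrent = slots * hours_active / 24  # 155.0

# Revenue at full subscription, for comparison with the ~$14k/month node cost
monthly_revenue = slots * price_per_slot         # 18600

print(f"demand if all active: {demand_tps} tok/s")
print(f"max concurrent at {per_user_tps} tok/s: {max_concurrent} ({active_share:.0%} of slots)")
print(f"expected concurrent under the 8h assumption: {expected_concurrent:.0f}")
print(f"monthly revenue at full subscription: ${monthly_revenue}")
```

Under that even-spread assumption the expected load (about 155 users) sits right at the node's limit (150), so it only works if usage really is spread out. If most subscribers share a time zone and overlap during the same working hours, per-user speed would drop below 20 tok/s.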