Hacker News

sgc today at 12:34 AM | 3 replies

As far as I can tell, this type of model requires 640GB+ of memory at FP8, so it could likely be run in 320GB+ of memory at FP4 or similar. That would be 3 Nvidia DGX Sparks (128GB each), or about $12k of hardware. Is that correct? If so, it could make perfect sense for a small business.
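A rough sketch of that arithmetic, in case it helps check the numbers. It assumes the 640GB FP8 figure above is weights-only, that memory scales linearly with bits per weight, and that each Spark has 128GB; the `usable_fraction` parameter is a hypothetical knob for KV cache, activations, and OS overhead, not a measured value.

```python
import math

def scaled_memory_gb(fp8_gb: float, bits: int) -> float:
    """Scale a weights-only FP8 (8-bit) footprint to another bit width."""
    return fp8_gb * bits / 8

def devices_needed(required_gb: float, device_gb: float,
                   usable_fraction: float = 1.0) -> int:
    """Smallest device count whose usable memory covers required_gb.

    usable_fraction is an assumed share of each device's memory that is
    actually available for weights (the rest goes to cache and overhead).
    """
    return math.ceil(required_gb / (device_gb * usable_fraction))

fp4_gb = scaled_memory_gb(640, 4)         # FP8 estimate from the comment, at 4 bits
print(fp4_gb)                             # 320.0
print(devices_needed(fp4_gb, 128))        # 3 with no headroom at all
print(devices_needed(fp4_gb, 128, 0.8))   # 4 if only 80% is usable (assumed)
```

With zero headroom the count is three; under the assumed 80% usable memory it becomes four, so the answer depends heavily on how much room the runtime needs beyond the weights.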


Replies

SwellJoe today at 5:12 AM

The performance would be abysmal spread across four Sparks, I'd think, though I guess MoE mitigates that somewhat. Still better to just pay for it in the cloud. (Though I've spent about $4k on local compute for AI experimentation, I don't think it pays for itself; I just like tinkering.)

Tepix today at 3:07 AM

You probably need four of them in practice.

wgd today at 3:17 AM

[dead]