Now we need someone to try running Kimi K2.6 on an old Xeon with DDR3. After all, these platforms do support up to 768GB of RAM.
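Capacity is one thing, bandwidth is the other. A rough back-of-envelope sketch of the decode ceiling, assuming generation is purely memory-bandwidth bound. The only hard number here is DDR3-1600 at 12.8 GB/s per 64-bit channel. The 4 channels, the 32B active parameters, and the 4-bit weights are placeholders I picked for illustration, not published specs for this model or any particular board:

```python
def peak_bandwidth_gbs(mt_per_s, channels, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (64-bit channel = 8 bytes per transfer)."""
    return mt_per_s * bus_bytes * channels / 1000.0


def max_tokens_per_s(bandwidth_gbs, active_params, bytes_per_param):
    """Upper bound on decode speed if every active weight is read once per token."""
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token


# Assumed platform: one socket, 4 channels of DDR3-1600 (hypothetical config).
bw = peak_bandwidth_gbs(mt_per_s=1600, channels=4)

# Assumed model: 32e9 active parameters per token at 4-bit (0.5 bytes each).
tps = max_tokens_per_s(bw, active_params=32e9, bytes_per_param=0.5)

print(f"peak bandwidth: {bw:.1f} GB/s, decode ceiling: {tps:.1f} tokens/s")
```

Real throughput would land below this ceiling, since sustained bandwidth is lower than peak and NUMA, KV cache reads, and compute all take their cut. Swap in your own numbers.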
> The argument for speculative decoding is stronger on CPU than on GPU.
Uh. Uuuh.
No?
___
Also
> While a GPU has a massive pool of ultra-fast High-Bandwidth Memory (HBM), a CPU relies on small, lightning-fast “caches” (L1, L2, L3) built directly onto the processor chip.
What purpose do the scare quotes around "caches" serve there? Was this written by AI, by that very model running on that host?
Might consider going for even older CPUs, which don't have the Intel ME "ring -3" thing that is full of backdoors.