Hacker News

all2 · yesterday at 8:18 PM

I've been watching the drizzle of LLM papers come through, and I think we're going to hit a 1T param MoE on consumer hardware before this year is out. It'll still be behind the bigco models, but it'll be a force multiplier. Ideally, we'd get these models to run on a CPU. MS BitNet is one way to do this. You can already run ternary LLMs on consumer CPUs with a decent tps.
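For a rough sense of scale, here is a back-of-envelope sketch of how much space the weights of a 1T-parameter model would take at different precisions. The helper name `model_size_gb` is made up for illustration. The ternary figure uses the information-theoretic log2(3) ≈ 1.58 bits per weight, which is an assumption: real packings may use slightly more, and the estimate ignores activations and KV cache.

```python
import math

def model_size_gb(num_params, bits_per_weight):
    """Weight storage in GB (10^9 bytes); ignores activations and KV cache."""
    return num_params * bits_per_weight / 8 / 1e9

# Assumption: an ideal ternary packing at log2(3) bits per weight.
TERNARY_BITS = math.log2(3)

for label, bits in [("fp16", 16), ("int4", 4), ("ternary", TERNARY_BITS)]:
    print(f"{label:8s} {model_size_gb(1e12, bits):7.0f} GB")
```

Under these assumptions, ternary weights come to roughly 198 GB, versus 2,000 GB at fp16. That is still a lot of RAM for a desktop, but it is within reach of high-memory consumer builds.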


Replies

nl · today at 4:26 AM

I mean you can run a 1T model on consumer hardware now by doing things like layer offloading and streaming from SSD. It's just too slow to be useful.
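To put a number on "too slow", here is a minimal sketch of the upper bound on decode speed when weights are streamed from SSD for every token. All figures are illustrative assumptions, not measurements: 4-bit weights, a 7 GB/s NVMe drive, nothing cached in RAM, and a hypothetical MoE that activates 32B parameters per token.

```python
def weight_bytes(num_params, bits_per_weight):
    """Bytes needed to store the given number of weights."""
    return num_params * bits_per_weight / 8

def max_tokens_per_second(bytes_read_per_token, bandwidth_bytes_per_s):
    """Upper bound on decode speed when every token re-reads its weights."""
    return bandwidth_bytes_per_s / bytes_read_per_token

SSD_BANDWIDTH = 7e9  # assumed: 7 GB/s sequential read
BITS = 4             # assumed: 4-bit quantization

# Dense case: all 1T parameters are read for each token.
dense = max_tokens_per_second(weight_bytes(1e12, BITS), SSD_BANDWIDTH)

# MoE case: only the active experts are read (hypothetical 32B active).
moe = max_tokens_per_second(weight_bytes(32e9, BITS), SSD_BANDWIDTH)

print(f"dense: {dense:.4f} tok/s, MoE: {moe:.4f} tok/s")
```

With these numbers the dense case works out to about 71 seconds per token, while the MoE case stays under half a token per second. Both are bandwidth ceilings, so real throughput would be lower.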

mattmanser · yesterday at 10:55 PM

Though what is consumer hardware right now?

Can we still classify 5090s as consumer hardware given how expensive they are? They're £3k at the moment, and it looks like it's only going to get worse unless the AI bubble pops.
