RedCinnabar • yesterday at 5:38 PM • 4 replies

Call me back when you can run these models on 16GB of RAM and any recent i5/i7. Until then, there’s no point in using these toy models.

Replies

guax • yesterday at 7:48 PM

It's so funny, these "toy models" would have been the wet dreams of researchers not 5 years ago.

Progress marches without mercy.

giancarlostoro • yesterday at 5:39 PM

You need it to run in about 8 GB so you have extra space for the context window.

Catloafdev • yesterday at 5:40 PM

Hello, it's the internet calling, today is that day.

https://github.com/ikawrakow/ik_llama.cpp

Edit: it's gonna be slow if you're not using any VRAM. But it's possible. Software isn't going to speed that up anytime soon, it's just a hardware bandwidth limit.
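
As a back-of-envelope illustration of that bandwidth limit (a sketch, not a benchmark): each generated token needs roughly every active weight read from memory once, so memory bandwidth divided by bytes read per token gives an upper bound on tokens per second. The function name and the example figures (50 GB/s of RAM bandwidth, 3B active parameters, 4-bit weights) are assumptions for illustration, not numbers from this thread.

```python
def max_tokens_per_second(bandwidth_gb_s, active_params_billions, bits_per_weight):
    """Upper bound on decode speed when generation is memory-bandwidth bound.

    Assumes every active weight is read once per token and ignores
    KV-cache reads, compute time, and CPU cache effects.
    """
    # GB of weights that must be streamed from memory for one token.
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Hypothetical figures: 50 GB/s system RAM, 3B active parameters, 4-bit weights.
print(round(max_tokens_per_second(50, 3, 4), 1))
```

Real throughput will land below this bound, but it shows why faster software alone can't get past the memory bus.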

jboss10 • yesterday at 8:08 PM

They can be run on 32GB with 8GB VRAM. I don't think these will be on 16GB for a while. (35B MoE)
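
Rough arithmetic behind that estimate, as a sketch: the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The helper name and the quantization levels below are assumptions for illustration, and the figure ignores the KV cache, quantization metadata, and runtime overhead.

```python
def weight_footprint_gb(params_billions, bits_per_weight):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


# 35B parameters at a few common precisions (hypothetical choices).
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_footprint_gb(35, bits)} GB")
```

Even at 4 bits per weight, a 35B model needs about 17.5 GB for weights alone, which is already more than 16 GB before any context is allocated.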
