Hacker News

3836293648, last Thursday at 9:08 PM (2 replies)

How much VRAM do you need for that?


Replies

canpan, last Friday at 7:05 AM

Not OP, but I ran the 122B model successfully with offloading to normal system RAM. You don't need all that much VRAM, which is super expensive. I used 96 GB of RAM plus a GPU with 16 GB of VRAM. It's not very fast in that setup, maybe 15 tokens per second. Still, you can give it a task, come back later, and it's done. (Disclaimer: I built that PC before stuff got expensive.)

seemaze, last Thursday at 10:59 PM

I squeeze Qwen3.5-122B-A10B at Q6 into 128GB. It's a great model.
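
For a rough sense of why a 122B model at Q6 fits in 128 GB, here is a back-of-the-envelope sketch. It assumes "Q6" averages about 6.5625 bits per weight (the figure for llama.cpp's Q6_K format; the comment does not say which format was used), counts decimal gigabytes, and ignores the KV cache and runtime overhead, so the real footprint will be somewhat higher.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 122e9    # total parameter count, taken from the model name
Q6_BITS = 6.5625    # assumption: average bits per weight for a Q6_K-style quant
BUDGET_GB = 128     # memory budget mentioned in the comment

size_gb = weight_footprint_gb(N_PARAMS, Q6_BITS)
print(f"weights: ~{size_gb:.0f} GB, headroom: ~{BUDGET_GB - size_gb:.0f} GB")
# prints: weights: ~100 GB, headroom: ~28 GB
```

Under these assumptions the weights alone take roughly 100 GB, so whatever is left of the 128 GB has to cover context and everything else running on the machine.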
