logoalt Hacker News

mschildtoday at 7:16 AM1 replyview on HN

Running models locally is surprisingly easy and possible even on older hardware.

Obviously not the largest, up-to-date models but for what I expect most people use them for, even on hn, there are some shockingly good models that dont require €4k machines.

I have a desktop with an AMD 6900XT and 5600 with 32GB ram. Obviously no slouch but its several years old at this point. I can comfortably run qwen 3.5 9b and get a speedy 60 token/sec output with decent results.


Replies

mock-possumtoday at 7:19 AM

idk I can barely field a 14b on my desktop, and it’s rough trying to replicate the agentic pair programming experience I’m accustomed to with Claude. And I don’t mean it doesn’t work as well, I mean it doesn’t work.

Is there some secret I’m missing? I’ve tried rolling my own harness, and tried a few of the ones the cool kids use - I think pi was the most recent. Not quite my tempo, I’m afraid.

show 2 replies