beej71 • yesterday at 5:03 PM • 8 replies

News like this always makes me wonder about running my own model, something I've never done. A couple thousand bucks can get you some decent hardware, it looks like, but is it good for coding? What is your all's experience?

And if it's not good enough for coding, what kind of money, if any, would make it good enough?

Replies

arcanemachiner • yesterday at 5:23 PM

I want to give you realistic expectations: Unless you spend well over $10K on hardware, you will be disappointed, and you will spend a lot of time getting there. For sophisticated coding tasks, at least. (For simple agentic work, you can get workable results with a 3090 or two, or even a couple of 3060 12GBs for half the price. But they're pretty dumb, and it's a tease. Hobby territory, lots of dicking around.)

Do yourself a favor: Set up OpenCode and OpenRouter, and try all the models you want to try there.

Other than the top performers (e.g. GLM 5.1, Kimi K2.5, where required hardware is basically unaffordable for a single person), the open models are more trouble than they're worth IMO, at least for now (in terms of actually Getting Shit Done).
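For a rough sense of why the card sizes above matter (a 3090 has 24 GB of VRAM, a 3060 has 12 GB), here is a back-of-envelope sketch of how much memory the weights alone need at different quantization levels. The function names, the parameter counts, and the 1.2x overhead factor for KV cache and runtime buffers are all assumptions for illustration, not measurements of any specific model mentioned in this thread.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    One billion parameters at 8 bits per weight is about 1 GB.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead: float = 1.2) -> bool:
    """True if the weights, scaled by an ASSUMED overhead factor, fit in vram_gb.

    The overhead factor is a placeholder for KV cache and runtime buffers;
    real usage depends on context length and the inference runtime.
    """
    return weight_gb(params_billion, bits_per_weight) * overhead <= vram_gb


if __name__ == "__main__":
    # Hypothetical model sizes, in billions of parameters.
    for params in (8, 30, 70):
        for bits in (16, 8, 4):
            size = weight_gb(params, bits)
            print(f"{params}B @ {bits}-bit: ~{size:.1f} GB weights, "
                  f"fits 12 GB: {fits(params, bits, 12)}, "
                  f"fits 24 GB: {fits(params, bits, 24)}")
```

Under these assumptions, a 4-bit 30B model squeezes onto one 24 GB card while a 4-bit 70B model does not, which is roughly why the jump from "hobby" hardware to something that runs the larger open models gets expensive quickly.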

efficax • yesterday at 9:43 PM

gemma4 and qwen3.6 are pretty capable but will be slower and wrong more often than the larger models. But you can connect gemma4 to opencode via ollama and it... works! It really can write and analyze code. It's just slow. You need serious hardware to run these fast, and even then, they're too small to beat the "frontier" models right now. But it's early days.

mfro • yesterday at 5:35 PM

Not sure why all the other commenters are failing to mention that you can spend considerably less money on an Apple silicon machine to run decent local models.

Fun fact: AWS offers Apple silicon EC2 instances you can spin up to test.

__mharrison__ • yesterday at 6:41 PM

Here's my anecdotal experience with a recent project (a Python library implemented and released to PyPI).

I took the plan that I used from Codex and handed it to opencode with Qwen 3.5 running locally.

It created a library very similar to the one Codex produced but took 2x longer.

I haven't tried Qwen 3.6 but I hear it's another improvement. I'm confident with my AI skills that if/when cheap/subsidized models go away, I'll be fine running locally.

bakugo • yesterday at 5:18 PM

You should be aware that any model you can run on less than $10k worth of hardware isn't going to be anywhere close to the best cloud models on any remotely complex task.

Many providers out there host open-weights models for cheap. Try them out and see what you think before actually investing in hardware to run your own.

hleszek • yesterday at 5:08 PM

The latest Qwen3.6 model is very impressive for its size. Get an RTX 3090 and go to https://www.reddit.com/r/LocalLLaMA/ to see the latest news on how to run models locally. Totally fine for coding.

aray07 • yesterday at 5:04 PM

i think the new qwen models are supposed to be good based on some of the articles that i read

DeathArrow • yesterday at 6:15 PM

Unless you use an H100 or 4x 5090 you won't get decent output.

The best bang for the buck now is subscribing to token plans from Z.ai (GLM 5.1), MiniMax (MiniMax M2.7) or Alibaba Cloud (Qwen 3.6 Plus).

Running quantized models won't give you results comparable to Opus or GPT.
