SwellJoe • today at 12:53 AM • 0 replies • view on HN

I don't think "large projects" is realistic with a model that fits in ~8GB (I'm assuming you run stuff other than the model). Gemma 4 12B QAT at 4 bits is surely the smartest model in its size class, but it shines at vision tasks rather than agentic ones. It's a good tool user and can do stuff like research, but it's obviously not aimed at code.
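
For a rough sense of why ~8GB is tight, here's the back-of-envelope arithmetic for the weights alone. It ignores the KV cache, activations, and runtime overhead, and real quantized files usually carry a bit extra for scales and unquantized layers, so treat the number as a lower bound. The function name is just mine for illustration.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at `bits_per_weight` bits.
    Real quantized files are usually somewhat larger than this.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# 12B parameters at 4 bits per weight
print(weight_footprint_gb(12, 4))   # 6.0
# Same model at 16 bits, for comparison
print(weight_footprint_gb(12, 16))  # 24.0
```

So a 12B model at 4 bits already takes most of an 8GB budget before any context is loaded, which is why there isn't much room left for long agentic sessions.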

You can almost always find free models on OpenRouter. Google AI Studio also has free usage of Gemma 4. Both are rate- and usage-limited, and agentic use will probably chew through those limits pretty quickly, but you can usually find some pretty powerful models for free. If you rotate through different providers, I think that avoids the cap. Currently several Nemotron models, North Mini Code, Laguna models, Gemma 4 31b and MoE, Qwen 3 Next, and gpt-oss 120b are all available free on OpenRouter, and they're all better than anything you can run locally in ~8GB.
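
The rotation idea could look something like the sketch below. It's only the fallback logic: `send` is a stand-in for whatever client call you actually use, `RateLimited` is a hypothetical exception your wrapper would raise when a provider reports a limit, and the model names are placeholders, not real IDs. Whether rotating really gets you around the caps is my guess, not something I've verified.

```python
from typing import Callable, Sequence, Tuple


class RateLimited(Exception):
    """Hypothetical: raised by `send` when a model reports a rate or usage limit."""


def first_available(
    models: Sequence[str],
    send: Callable[[str, str], str],
    prompt: str,
) -> Tuple[str, str]:
    """Try each model in order and return (model, reply) from the first
    one that is not rate limited."""
    for model in models:
        try:
            return model, send(model, prompt)
        except RateLimited:
            continue  # this one is capped, move on to the next
    raise RuntimeError("every model in the list is currently rate limited")


# Stub client for demonstration only; swap in a real API call.
def fake_send(model: str, prompt: str) -> str:
    if model == "placeholder-model-a":
        raise RateLimited(model)
    return f"{model} answered: {prompt}"


model, reply = first_available(
    ["placeholder-model-a", "placeholder-model-b"], fake_send, "hello"
)
print(model)  # placeholder-model-b
```

Passing `send` in as a function keeps the rotation separate from any one provider's client, so the same loop works whether the list mixes OpenRouter models, AI Studio, or something else.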