Gemma4, in my view, is good enough to do things similar to Gemini 2.5 Flash: if I point it at code and ask for help, and there's a problem with the code, it'll answer correctly in terms of suggestions. But it's not great at using all the tools, or at one-shotting things that require a lot of context or "expert knowledge".
If a couple more iterations of this, say gemma6, is as good as current opus and runs completely locally on a Mac, I won't really bother with the cloud models.
That’s a problem.
For the others anyway.
The economy is, more or less, a competition.
If someone gets a really great axe and is happy with it, that's great for them.
But then, other people will be on bulldozers.
They can say they're happy with the axe, but at that point they're no longer in the competition.
> it’s not great at using all tools
Glad it wasn't just me. I was impressed with the quality of Gemma4; it just couldn't write the changes to file 9/10 times when using it with opencode.
Yep, and to be honest we don't really need local models for intensive tasks. At least not yet. You can use openrouter (and others) to consume a wide variety of open models which are capable of using tools in an agentic workflow, close to the SOTA models. These open models are essentially commodities: many providers, each serving the same model and competing with each other on uptime, throughput, and price. At some point we will be able to run them on commodity hardware, but for now the fact that we can have competition between providers is enough to ensure that rug pulls aren't possible.
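The "commodity" point is concrete because these providers expose the same OpenAI-style chat completions API, so switching between them is just a URL and model-string change. A minimal sketch (the model name and API key are placeholders, and the request is built but not sent):

```python
import json
import urllib.request

# OpenRouter exposes an OpenAI-compatible chat completions endpoint.
# Any other OpenAI-compatible provider (or a local server) would only
# require changing this URL and the model string.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,  # e.g. an open model served by several providers
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Placeholder model tag and key; urllib.request.urlopen(req) would send it.
req = build_request("some-open-model", "Review this diff for bugs", "sk-...")
```

Because the request shape is identical across providers, an agent harness built against this payload isn't locked to any one of them.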
Plus having Gemma on my device for general chat ensures I will always have a privacy-respecting offline oracle which fulfils all of the non-programming tasks I could ever want. We are already at the point where the moat for these hyperscalers has basically dissolved for the general public's use case.
If I was OpenAI or Anthropic I would be shitting my pants right now and trying every unethical dark pattern in the book to lock in my customers. And they are trying hard. It won't work. And I won't shed a single tear for them.
Local models seem somewhere between 9 and 24 months behind. I'm not saying I won't be impressed with what online models will be able to do in two years, but I'm pretty satisfied with the prediction that I won't really need them in a couple of years.
> if I point it code and ask for help and there is a problem with the code it’ll answer correctly in terms of suggestions
could I ask how you do that? I installed openclaw and set it to use Gemma 4, but it didn't act in agent mode at all: it only responded in the chat window while doing nothing, and didn't read any files or do any of the things you describe (though I see you do mention that it's not great at using all tools). What are you using exactly?
But that difference, at the moment, is the difference between it being OK on its own with a team of subagents (given good enough feedback/review mechanisms) and having to babysit it prompt by prompt.
By the time gemma6 allows you to do the above, the proprietary models will supposedly already be on the next step change. It just depends on whether you need to ride the bleeding edge, but especially because it's "intelligence", there's an obvious advantage to using the best version, and it's easy to hype it up and generate FOMO.
When that happens, you'll have FOMO from not using opus 5.x. The numbers they showed for Mythos suggest the frontier is still steadily moving (and maybe even at a faster pace than before).
There is a cognitive ceiling for what you can do with smaller models. Animals with simpler neural pathways often outperform what we think they are capable of, but there's no substitute for scale. I don't think you'll ever get a 4B or 8B model equivalent to Opus 4.6. Maybe just for coding tasks, but certainly not Opus' breadth.
I agree. At first I was really turned off by the Gemma 4 line of models because they didn’t function with coding agents as well as the qwen3.5 line of models. However, I found that for other use cases Gemma 4 was very good.
EDIT: I just saw this: "Ollama 0.20.6 is here with improved Gemma 4 tool calling!" I will rerun my tests after breakfast.
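For anyone unfamiliar with what "tool calling" means in these tests: the harness registers OpenAI-style function schemas with the model, and a capable model replies with a structured tool call instead of dumping the edit into chat text. A sketch of such a payload (the `write_file` tool and the `gemma4` model tag are made-up placeholders for what a harness like opencode might register):

```python
import json

# Hypothetical tool an agent harness might register so the model can
# actually write changes to disk instead of just describing them.
write_file_tool = {
    "type": "function",
    "function": {
        "name": "write_file",
        "description": "Write content to a file at the given path",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"},
            },
            "required": ["path", "content"],
        },
    },
}

# Chat request payload with the tool attached; model tag is a placeholder.
payload = {
    "model": "gemma4",
    "messages": [{"role": "user", "content": "Fix the typo in README.md"}],
    "tools": [write_file_tool],
    "stream": False,
}

# POSTing this as JSON to a local model server's chat endpoint should,
# with solid tool calling, come back with a tool_calls entry naming
# write_file rather than prose describing the edit.
body = json.dumps(payload)
```

The "couldn't write the changes to file 9/10 times" failure mode above is exactly the model answering in prose instead of emitting that structured call.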