Flat wrong. A Q6 Gemma 31b feels a lot like Opus 4.5 to me when run in a harness so it can retrieve information and ground itself. The gap is not that big for a lot of use cases. Qwen MoE is fast as fuck locally for things that are one-shottable. I have subscriptions to all the major providers right now, and since Gemma 4 and Qwen 3.6 came out I haven't hit limits a single time. I'm actually super surprised by the number of things I try with Gemma 4 intending to see how it fails (so Claude can redo them), only to come away with something perfectly usable from the local model.
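For anyone wondering what "harness" means here, a minimal sketch in Python: retrieve relevant context first, then stuff it into the prompt of a local model. This assumes an Ollama-style endpoint on localhost:11434; search_docs() is a hypothetical stand-in for whatever retrieval you have (grep, embeddings, a wiki dump), and the model tag is a placeholder for whatever you've pulled locally.

    import json
    import urllib.request

    def search_docs(query):
        # Hypothetical retrieval step: return text snippets relevant to the
        # query. Swap in grep, an embedding index, etc.
        return ["...snippet 1...", "...snippet 2..."]

    def ask(model, question):
        # Ground the model by prepending retrieved context to the prompt.
        context = "\n\n".join(search_docs(question))
        prompt = f"Answer using only this context:\n\n{context}\n\nQ: {question}"
        req = urllib.request.Request(
            "http://localhost:11434/api/chat",
            data=json.dumps({
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "stream": False,
            }).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["message"]["content"]

    print(ask("your-local-model", "How does the auth middleware work?"))

Most of the win comes from the retrieval step, not the loop itself; a small model with the right context in front of it fails far less often than the same model answering cold.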
What harness are you using?
I'm going to switch to local LLMs for most stuff soon.
Sorry, but you're just seeing what you want to see. The idea that a 31b model is anywhere in the ballpark of something like Opus 4.5 is absurd on its face.
I'm guessing Qwen 3.6 for agentic coding and Gemma 4 for non-coding stuff?
Your n=1 might not be very relevant outside your personal use. On less-contaminated benchmarks, Gemma 4 scores way below Sonnet 4.5, let alone the Opus models: https://swe-rebench.com/