Hacker News

thot_experiment | yesterday at 5:47 AM | 5 replies

Flat wrong. Q6 Gemma 31b feels a lot like Opus 4.5 to me when run in a harness so it can retrieve information and ground itself. The gap is not that big for a lot of use cases. Qwen MoE is fast as fuck locally for things that are one-shottable. I have subscriptions to all the major providers right now, and since Gemma 4 and Qwen 3.6 came out I haven't hit limits a single time. I'm actually super surprised by the number of things I try with Gemma 4, intending to see how it fails and then have Claude do it, only to come away with something perfectly usable from the local model.


Replies

cbg0 | yesterday at 6:13 AM

Your n=1 might not be very relevant outside your personal use. In less contaminated benchmarks Gemma 4 is way below Sonnet 4.5, let alone Opus models: https://swe-rebench.com/

stuaxo | yesterday at 8:27 AM

What harness are you using?

I'm going to switch to local LLMs for most stuff soon.

root_axis | yesterday at 6:20 AM

Sorry, but you're just seeing what you want to see. The idea that a 31b model is even in the ballpark of something like Opus 4.5 is absurd on its face.

alfiedotwtf | yesterday at 6:10 AM

I'm guessing Qwen 3.6 for agentic coding and Gemma 4 for non-coding stuff?

KurSix | yesterday at 10:22 AM

[flagged]