briga • yesterday at 3:23 PM • 10 replies

I was getting dangerously close to my weekly Claude Code limit last night so I had Claude set up Qwen3.6 with llama.cpp and OpenCode. Honestly it's a great (free!) alternative to Claude Code--certainly more than good enough for a lot of smaller less complex tasks. I'm excited to try this new version. The fact that open-source models are so close to the frontier is very impressive.

Replies

pixelesque • yesterday at 5:16 PM

Out of interest, what machine and model are you running it on?

I tried the qwen3.6-27b Q6_K GGUF in llama.cpp and LM Studio on my M2 MacBook Pro 32GB machine last week, and I barely get a token a second with either.

What sort of speed should I be expecting?

I tried some of the Llama 3 34b (nous-capybara?) models two years ago with llama.cpp, and I seem to remember getting a few tokens a second then, so not sure if I've got something completely mis-configured, or I just have unreasonable expectations.

Or maybe qwen 3.x is slower for some reason? (Is it mixture of experts?)

I'm not expecting it to be instant, but what I'm currently seeing is not really usable.
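A rough sanity check for the speed question, as a sketch only. It assumes the model is dense (every weight is read once per generated token), that decoding is limited by memory bandwidth, and that Q6_K averages about 6.56 bits per weight. The bandwidth values are illustrative assumptions, not measurements of any particular machine, and the function names are made up for this example.

```python
def gguf_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def decode_tokens_per_sec_upper_bound(weight_gb: float, bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling: each token needs one full pass over the weights."""
    return bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    # Assumption: 27B parameters at roughly 6.56 bits per weight (Q6_K).
    weights = gguf_weight_gb(27, 6.5625)
    print(f"weights: ~{weights:.1f} GB")

    # Assumed memory bandwidths in GB/s; substitute the figure for your hardware.
    for bandwidth in (100, 200, 400):
        ceiling = decode_tokens_per_sec_upper_bound(weights, bandwidth)
        print(f"{bandwidth} GB/s -> at most ~{ceiling:.1f} tokens/s")
```

This is a ceiling, not a prediction: real throughput is lower, and it ignores the KV cache and prompt processing. If the weights plus context do not fit in the memory available to the GPU, the runtime may fall back to CPU or swap, and speed can drop far below this bound.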

plufz • yesterday at 3:33 PM

Which exact model are you using? And with which parameters and quant? And on what hardware? Are you using any specific MCPs or other tools to optimize performance, like context-mode or dynamic context pruning? I’ve used local models a reasonable amount before, but I’m just starting out with opencode. I haven’t had great results yet but really want this to work for simpler tasks. My freshly installed opencode also keeps iTerm at 100% CPU when idle. :/

leonidasv • yesterday at 3:32 PM

Qwen Max models are usually closed, unfortunately.

wuliwong • yesterday at 7:18 PM

Do you have a feel for how Qwen 3.6 compares to Sonnet 4.6? Because in reality, that's what we use a lot. If we just used Opus 4.7 for everything code-related, we'd have a monthly bill 10-20 times higher than using Sonnet where we can.

ecshafer • yesterday at 4:41 PM

Qwen3.6 with Claude Code works great. I get much better results with that than with opencode and Qwen3.6. Claude Code is a great harness, and good harness/tool integration makes a big difference. You just need a settings.json with your ollama setup and the qwen model, and you can use it.

kolinko • yesterday at 6:00 PM

As an Opus maximalist ;) I was very surprised by the quality of Qwen3.6-27B - trying to figure out how to get it going on RTX 90k now to offload some lighter tasks :)

aembleton • yesterday at 8:44 PM

> Today we introduce Qwen3.7-Max, our latest proprietary model

This is not an open model

ttoinou • yesterday at 7:20 PM

Which agentic coding tool, and how do you make sure you have prefix consistency?

wouldbecouldbe • yesterday at 5:21 PM

This one doesn't seem to be open source though, sadly. Using Chinese servers is a step too far for me personally.

par • yesterday at 5:42 PM

Do you have an opinion on OpenCode vs Aider?
