doctoboggan • today at 3:57 AM • 3 replies

Is it honestly better than Opus 4.6 or just benchmaxxed? Have you done any coding with an agent harness using it?

If its coding abilities are better than Claude Code with Opus 4.6 then I will definitely be switching to this model.

Replies

Apparently glm5.1 and the latest qwen coder are as good as opus 4.6 on benchmarks. So I tried both seriously for a week (glm Pro using CC, and qwen using qwen companion). Thought I could save $80 a month. Unfortunately, after 2 days I had switched back to Max. The problems were the speed (slower on both, although qwen is much faster), the errors (stupid layout mistakes, inserting 2 footers then refusing to remove one, not seeing obvious problems in screenshots, and major f-ups of functionality), not being able to view URLs properly, etc. I'll give deepseek a go but I suspect it will be similar. The model is only half the story. I've also been testing gpt5.4 with codex and it is very nearly as good as CC, and better on long-running tasks running in the background. I'm not keen on the ChatGPT codex 'personality', so I will stick to CC for the most part.

madagang • today at 4:16 AM

Their Chinese announcement says that, based on internal employee testing, it is not as good as Opus 4.6 Thinking, but is slightly better than Opus 4.6 without Thinking enabled.

