logoalt Hacker News

taspeotistoday at 8:16 AM3 repliesview on HN

The harness is super important, what tools are available and the system prompts vary from harness to harness.

Anthropic seems to have a modest lead on their harness and models, so it’s a best-of-both-worlds scenario.

> I'm not sure what Microsoft is doing behind the scenes

It’s probably the exact same model, but the tools and the prompts around it are worse, so you get worse results.


Replies

irthomasthomastoday at 11:28 AM

Claude in Claude code has been shown to perform persistently worse in evals than claude + a minimal harness.

kilburntoday at 9:55 AM

The harness was absolutely not an issue in my case.

The new pricing model where I got banned from using Opus entirely and half a day of work (with weaker models) consumed the 10$ plan was.

I'm now using a Claude Max subscription and I can get close to the daily limits but I'm fairly happy with the overall plan consumption.

Vinnltoday at 8:35 AM

So if you use Claude via Copilot in Zed... You use Zed's harness, I think? What does Copilot do, at that point?

show 2 replies