You're surprised? I think harnesses are almost as important as the underlying model. Folks have been able to improve benchmark results by nearly 2x based on harness alone.
Harnesses are quickly becoming critical components of the "model" itself imo. Not shocking to me at all that a company that spots a revenue opportunity is keeping its harness closed source.
Source? On the most trusted benchmark right now (deepSWE), models score at least as well with their minimal harness as with CC or codex.
I'm a neophyte. What makes one harness special, or all that different from another? I've had a reasonable experience with Zed and local models, but could be persuaded to put something else in the mix if there is a measurable benefit to be had.