logoalt Hacker News

isityettimetoday at 3:08 PM1 replyview on HN

> i bet 99% of people here, if presented with a test where i gave 5 models but all of the results came from one, would not be able to discern this. Just vibes all the way down.

This is complicated by the way that the coding agents inject prompts that preempt and potentially undermine user instructions. I suspect that one of the reasons Codex works way better for me than Claude Code in certain projects is that the latter adds some garbage like "go ahead and write repetitive copy/paste code, keep it simple, take shortcuts" to every session. A fair test would have to hide but more or less still use the harnesses, not just the models.


Replies