Hacker News

josephg today at 5:23 AM (2 replies)

Me too. Its output was fabulous. And it acted like a senior engineer - actually coding up hypotheses, testing them, finding problems and presenting good, usable recommendations backed by solid evidence and wisdom. It can probably do most of my job, which gave me a bit of an existential crisis.

I’ve paused my Claude subscription until they bring it back. Opus makes mistakes constantly, on every level of abstraction. Every time I look closely at its work I find problems.


Replies

zeristor today at 6:36 AM

Opus 4.8 works like that for me. I have it writing ADRs, then my main architect worker challenging it.

gwerbin today at 11:44 AM

Even Sonnet does that if you gently prompt it to.