Hacker News

mikkupikku yesterday at 2:34 PM

Absolutely. These models still need a lot of this sort of hand-holding, so they work best in experienced hands. I'm also skeptical of those very long runs; letting it go so long without active oversight must surely produce at least some objectionable design or implementation details, right? So I guess the people claiming those sorts of results care less about these sorts of qualities.


Replies

KellyCriterion yesterday at 4:58 PM

Yes, even Claude Opus 4.6 still runs into accidents on longer chats that last 3-4 days. But it's getting better and better.