Hacker News

irthomasthomas, yesterday at 10:07 AM

In my own tests, I have found Opus to be very good at writing plans but terrible at executing them. It typically ignores half of the constraints. https://x.com/xundecidability/status/2019794391338987906?s=2... https://x.com/xundecidability/status/2024210197959627048?s=2...


Replies

Sammi, yesterday at 10:51 AM

1. Don't implement too much at a time.

2. Have the agent review whether it followed the plan and the relevant skills accurately.
