Hacker News

teucris, yesterday at 7:04 PM

But agents do keep task lists and check the tasks off as they go. Of course it’s not perfect either, but it’s MUCH better than what an LLM can offer on its own.

If you are seeing an agent miss tasks, work with it to write down the task list first and then hold it accountable for completing them all. A spec is not a plan.


Replies

mathisfun123, yesterday at 7:27 PM

bro do you really not understand that that's a game played for your sake? it checks boxes, yes, but you have no idea what effect checking the boxes actually has. like, do you not realize that Anthropic/OpenAI are baking this kind of stuff into models/UI/UX to give the sensation of rigor?
