Hacker News

sparin9 · today at 12:50 PM · 5 replies

I think the real value here isn’t “planning vs not planning,” it’s forcing the model to surface its assumptions before they harden into code.

LLMs don’t usually fail at syntax. They fail at invisible assumptions about architecture, constraints, invariants, etc. A written plan becomes a debugging surface for those assumptions.
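One way to picture that "debugging surface" (a minimal sketch; `ask_llm` is a hypothetical stand-in for whatever model API you use, stubbed here so the flow runs):

```python
# Sketch of a plan-first workflow: phase 1 extracts assumptions,
# phase 2 only writes code after those assumptions are on the record.
# ask_llm is a hypothetical stand-in for a real model call.
def ask_llm(prompt: str) -> str:
    # Stubbed responses; a real call would return model output.
    if "List every assumption" in prompt:
        return ("- storage layer is append-only\n"
                "- keys are unique per tenant\n"
                "- writes are under 1 MB")
    return "def put(key, value): ..."

def plan_then_code(task: str) -> tuple[list[str], str]:
    plan = ask_llm(f"Task: {task}\nList every assumption you are making "
                   "about architecture, constraints, and invariants. "
                   "Do not write code yet.")
    assumptions = [line.lstrip("- ").strip()
                   for line in plan.splitlines() if line.strip()]
    # The assumption list is the debugging surface: a human (or a
    # reviewer agent) vets it before any code exists.
    code = ask_llm(f"Task: {task}\nAgreed assumptions:\n{plan}\nNow implement.")
    return assumptions, code

assumptions, code = plan_then_code("add an s3-style backend")
print(assumptions)
```

The point is that the plan's assumptions become inspectable data you can reject before they harden into code.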


Replies

remify · today at 1:58 PM

Sub-agents also help a lot in that regard. Have one agent do the planning, an implementation agent write the code, and another one do the review. Clear responsibilities help a lot.

There's also the blue team / red team split, which works well.

The idea is always the same: help the LLM reason properly with fewer, clearer instructions.
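That split can be sketched as three role-scoped prompts over one model call (names and canned outputs are hypothetical, not any specific agent framework; swap `model` for a real API client in practice):

```python
# Planner / implementer / reviewer pipeline with clear responsibilities.
# Each "agent" is just a role-scoped prompt; the model call is stubbed
# so the pipeline is runnable as a sketch.
def model(role: str, prompt: str) -> str:
    canned = {
        "planner": "1. validate input 2. write record 3. return id",
        "implementer": "def save(rec): validate(rec); return db.put(rec)",
        "reviewer": "OK: matches plan, no unhandled errors",
    }
    return canned[role]

def pipeline(task: str) -> dict:
    plan = model("planner", f"Plan only, no code: {task}")
    code = model("implementer", f"Implement exactly this plan:\n{plan}")
    review = model("reviewer",
                   f"Does this code follow the plan?\n{plan}\n{code}")
    return {"plan": plan, "code": code, "review": review}

result = pipeline("save a record")
print(result["review"])
```

Each role sees only its own narrow instruction, which is the "fewer, clearer instructions" point in miniature.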

asdxrfx · today at 2:54 PM

It's also great to describe the full use-case flow in the instructions, so you can be confident the LLM won't do something stupid on its own.

hun3 · today at 1:37 PM

Except that merely surfacing them changes their behavior, like when you add that one printf() call and your heisenbug suddenly vanishes.

maccard · today at 1:41 PM

> LLMs don’t usually fail at syntax?

Really? My experience has been that it's incredibly easy to get them stuck in a loop on a hallucinated API and burn through credits before I've even noticed what they've done. I have a small rust project that stores stuff on disk, and I wanted to add an s3 backend to it. Claude Code burned through my $20 in about 30 minutes, looping on a very simple syntax issue without any awareness of what it was doing.

MagicMoonlight · today at 2:48 PM

Did you just write this with ChatGPT?
