I think the real value here isn’t “planning vs not planning,” it’s forcing the model to surface its assumptions before they harden into code.
LLMs don’t usually fail at syntax. They fail at invisible assumptions about architecture, constraints, invariants, etc. A written plan becomes a debugging surface for those assumptions.
It also helps to describe the full use-case flow in the instructions, so you can verify the LLM won't go off and do something stupid on its own.
Except that merely surfacing them changes their behavior, like how you add that one printf() call and now your heisenbug is suddenly nonexistent
> LLMs don’t usually fail at syntax?
Really? My experience has been that it's incredibly easy to get them stuck in a loop on a hallucinated API and burn through credits before I've even noticed what they've done. I have a small Rust project that stores stuff on disk and wanted to add an S3 backend to it; Claude Code burned through my $20 in about 30 minutes, looping on a very simple syntax issue without any awareness of what it was doing.
Sub-agents also help a lot in that regard: have one agent do the planning, an implementation agent write the code, and another do the review. Clear responsibilities help a lot.
There's also the blue team / red team pattern, which works well.
The idea is always the same: help the LLM reason properly with fewer, clearer instructions.
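The planner / implementer / reviewer split above can be sketched as a simple pipeline. This is a hypothetical illustration, not any particular framework's API: `call_llm` is a stand-in for whatever model call you use, and the system prompts are made up to show how narrow each agent's responsibility can be.

```python
# Hypothetical sketch of the planner / implementer / reviewer split.
# call_llm is a stub standing in for a real model API; each "agent" is
# just the same model invoked with a narrow, role-specific system prompt.

def call_llm(system: str, user: str) -> str:
    # Stub for illustration only; swap in a real API call here.
    role = system.split(":")[0]
    return f"[{role}] response to: {user[:60]}"

def run_pipeline(task: str) -> str:
    # 1. Planning agent: produces a plan, never code.
    plan = call_llm("planner: produce a numbered step-by-step plan only", task)
    # 2. Implementation agent: sees only the plan, writes the code.
    code = call_llm("implementer: write code following the given plan", plan)
    # 3. Review agent: critiques the code against the plan.
    review = call_llm(
        "reviewer: critique the code against the plan",
        f"plan:\n{plan}\n\ncode:\n{code}",
    )
    return review

print(run_pipeline("add an S3 backend to the storage layer"))
```

Because each agent only sees what its role needs, the review step can catch the implementer drifting from the plan, which is exactly where the invisible-assumption failures tend to show up.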