Hacker News

overgard, today at 4:57 PM

In my case, I see it most often when the LLM has to rework something multiple times and the feedback loop is vague (especially when all I have to give it is "no error messages, but it's still broken"). It seems like after the third or fourth try it just kinda goes off the rails. I find that the one-shot quality tends to be a little better, if the slot machine happened to work correctly that time.


Replies

verdverm, today at 8:30 PM

You shouldn't be using an LLM directly (web chat style). A proper harness allows an agent to see the errors itself and correct as needed. You can then correct it at higher, more meaningful levels.
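
To make "see the errors itself" concrete, here is a rough sketch of the kind of loop a harness runs. It is not any particular tool's implementation: `run_check`, `repair_loop`, and the `propose_fix` callback are hypothetical names, and `propose_fix` stands in for whatever agent call edits the files after reading the output.

```python
import subprocess
import sys
from typing import Callable, List, Tuple


def run_check(cmd: List[str]) -> Tuple[bool, str]:
    """Run a check command (tests, build, linter) and capture its output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Combine stdout and stderr so the agent sees everything a human would.
    return result.returncode == 0, result.stdout + result.stderr


def repair_loop(
    cmd: List[str],
    propose_fix: Callable[[str], None],  # hypothetical agent hook
    max_attempts: int = 3,
) -> bool:
    """Feed real check output back to the agent until the check passes."""
    for _ in range(max_attempts):
        passed, output = run_check(cmd)
        if passed:
            return True
        # The agent gets the actual error text, not a paraphrase.
        propose_fix(output)
    # One final check after the last attempted fix.
    return run_check(cmd)[0]


if __name__ == "__main__":
    # Trivial demo: the check already passes, so no fix is requested.
    print(repair_loop([sys.executable, "-c", "print('ok')"], lambda out: None))
```

The cap on attempts is deliberate: if the check still fails after a few rounds, the loop stops and hands control back so a person can redirect at a higher level instead of letting it keep retrying.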