Wasn't the best practice to run one model/coding agent that writes the code and another that reviews it? E.g. Claude Code for writing the code, GPT Codex for reviewing/critiquing it. Different reward functions.
Even in one agent, a different starting prompt will have you tracing a very different path through the model.
Maybe it still sends you to the same valley, but there are so many parameters and dimensions that I don't think that's very likely without the answer also being correct.
It’s superstition that using one brand of slop generator to “review” the slop from another brand of slop generator somehow makes things better. It’s slop all the way down.
I think people are misunderstanding reward functions and LLMs.
Unlike some other ML systems, an LLM has no live reward function at inference time: a reward model is only used during RLHF training and is set aside afterward. At generation time the model just samples the next token.
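To make that concrete, here is a minimal sketch (in plain NumPy, with toy logits I made up) of all that happens at generation time: logits get turned into a probability distribution and a token is sampled. Note there is no reward computation anywhere in this loop; any reward model exists only during RLHF training and never runs at inference.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample the next token id from raw logits.

    This is the whole inference-time decision: scale logits by
    temperature, softmax them into probabilities, and sample.
    No reward function is evaluated here.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

# Toy vocabulary of 4 tokens; token 2 has the largest logit.
logits = [0.1, 0.5, 3.0, -1.0]
token = sample_next_token(logits, temperature=0.7)
```

At very low temperature the distribution collapses onto the argmax, so sampling becomes effectively greedy; nowhere in either regime is a "reward" consulted.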