> Honestly, the model that works best for me is treating agents like junior devs working under a senior lead. The expert already knows the architecture and what they want. The agents help crank through the implementation, but you're reviewing everything and holding them to passing tests. That's where the productivity gain actually is. When non-developers try to use agents to produce entire systems with no oversight, that's where things fall apart.
I tried to approach it that way as well, but I am realizing that when I let the agent do the implementation, even with clear instructions, I might miss all the “wrong” design decisions it makes, because if I only review and do not implement, I never discover the “right” way to build something. This is especially true in areas I am less familiar with, and those are exactly the places where it is most tempting to rely on an agent.
With Claude Code I live in plan mode, and ask it to hand me the implementation plan at a low level, with alternatives. It rarely fails to give me good ones, better than a junior dev would. By then the plan is constrained enough that, combined with its ability to match the existing code style, the code I get is very similar to what I would have written. I keep a couple of instructions in the .md file that nudge it toward the steps I would take on naming, shrinking the diff, and refactoring to remove duplication. It's not going to go quite as fast as trusting it all to work at a large scale, but it sure looks like my code in the end.