But then, the ownership is clear. And no team would like to be told that their 5th iteration is also broken and can’t be relied on for production use. That’s the difference with AI code. LLMs are not aligned with your goals. Any trust in them doing the right thing is very misguided.
That's why you have them write tons of tests. Way more than you generally would for human-written code. And the agent writing and maintaining the tests is not the agent fixing the bugs.
I've personally had an LLM write an image resizing library for me. It's a fairly basic one; I didn't need anything fancy. I could have used something off the shelf, but it was at a time when I was testing what Claude could do. And to be honest, it just worked. One shot, if I recall correctly, or at least one session with a few tweaks, and never touched again. It's been embedded in a larger app for several months and I don't recall hitting a single bug with that library specifically. So I'm not sure your complaints about "the 5th iteration" being broken have much ground here.