Nothing of what you write here matches my experience with AI.
Specification is worth writing (and spending a lot more time on than implementation) because it's the part that you can still control, fully read, understand etc. Once it gets into the code, reviewing it will be a lot harder, and if you insist on reviewing everything it'll slow things down to your speed.
> If the cost of writing code is approaching zero, there's no point investing resources to perfect a system in one shot.
THe AI won't get the perfect system in one shot, far from it! And especially not from sloppy initial requirements that leave a lot of edge (or not-so-edge) cases unadressed. But if you have a good requirement to start with, you have a chance to correct the AI, keep it on track; you have something to go back to and ask other AI, "is this implementation conforming to the spec or did it miss things?"
> five different versions of the thing you're building and simply pick the best one.
Problem is, what if the best one is still not good enough? Then what? You do 50? They might all be bad. You need a way to iterate to convergence
This. Waterfall never worked for a reason. Humans and agents both need to develop a first draft, then re-evaluate with the lessons learned and the structure that has evolved. It’s very very time consuming to plan a complex, working system up front. NASA has done it, for the moon landing. But we don’t have those resources, so we plan, build, evaluate, and repeat.