I’ve been playing around with these kinds of prompts. My experience is that the prompts need a lot of iteration to truly one-shot something that is halfway usable. If it’s under-spec’d it’ll just return after 15-20 minutes with something that’s not even half baked. If I give it an extremely detailed spec it’ll start dropping requirements and then finish around the 60-70 minute mark, but I needed 20 minutes to write the prompt and I need to hunt for the things it didn’t bother to do.
I’ve gotten some success iterating on the one-shot prompt until it’s less work to productionize the newest artifact than to start over, and it does have some learning benefits to iterate like this. I’m not sure if it’s any faster than just focusing on the problem directly though.
The dropping requirements problem is real. What's helped us is breaking the spec into numbered ACs and having the verification run per-criterion. If AC-3 fails you know exactly what got dropped.