Randomness is not a problem by itself. Algorithms in BQP are probabilistic too. Different prompts might have different probabilities of successful generation, so refinement could be possible even for stochastic generation.
And provably correct one-shot program synthesis from an unrestricted natural-language prompt is obviously an oxymoron. So it's not as if we are clearly missing the target here.
>Different prompts might have different probabilities of successful generation, so refinement could be possible even for stochastic generation.
Yes, but that requires a formal specification of what counts as "success".
In my view, LLM-based programming has to become more structured. There has to be a clear distinction between the human-written specification and the LLM-generated code.
If LLMs are a high-level programming language, it has to be clear what the source code is and what the object code is.
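To make that concrete, here is a rough sketch of what I mean, not a real pipeline. The human-written `meets_spec` plays the role of the source code and defines "success"; the generated candidates are the object code. `fake_generate` is a hypothetical stand-in for an LLM call (no real model API is used), and `p_correct` is an assumed per-prompt success probability. With a checkable spec, the success rate of a prompt becomes something you can estimate by sampling:

```python
import random
from typing import Callable

SortFn = Callable[[list[int]], list[int]]

# Human-written specification (the "source code" in this framing).
# It defines success for a candidate sort function on fixed inputs.
def meets_spec(candidate: SortFn) -> bool:
    cases = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]
    try:
        return all(candidate(list(c)) == sorted(c) for c in cases)
    except Exception:
        return False

# Hypothetical stand-in for an LLM: yields a correct candidate with
# probability p_correct, otherwise a buggy one. Purely illustrative.
def fake_generate(p_correct: float, rng: random.Random) -> SortFn:
    if rng.random() < p_correct:
        return lambda xs: sorted(xs)
    return lambda xs: xs  # buggy: returns the input unchanged

# Estimate how often a "prompt" (modeled only by p_correct) produces
# object code that satisfies the specification.
def estimate_success_rate(p_correct: float, samples: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    passed = sum(meets_spec(fake_generate(p_correct, rng)) for _ in range(samples))
    return passed / samples
```

Comparing `estimate_success_rate` across prompts is only meaningful because `meets_spec` is fixed and written by a human. Without it there is nothing to refine against.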