Anyone benchmarking managed runtimes or serverless workloads knows it isn't quite true.
Which is exactly one of the arguments the AOT-only, no-GC crowd uses as an example of why their approach is better.
Reproducible builds exist. AOT/JIT and GC just aren't very relevant to this issue; not sure why you brought them up.
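To make that concrete: a build is reproducible when the same input yields a bit-identical artifact, which you can check by hashing two independent builds. A rough sketch, assuming gcc on PATH and a trivial hello.c (in practice you'd also pin the toolchain and normalize timestamps and embedded paths):

```python
import hashlib
import subprocess

def build_and_hash(out_path: str) -> str:
    # Compile the same source into a separate output file, then hash the binary.
    subprocess.run(["gcc", "-O2", "hello.c", "-o", out_path], check=True)
    with open(out_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

h1 = build_and_hash("build1.out")
h2 = build_and_hash("build2.out")
print("reproducible" if h1 == h2 else "not reproducible", h1, h2)
```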
Even those are way more predictable than LLMs, given the same input. But more importantly, LLMs aren’t stateless across executions, which is a huge no-no.
But there is functional equivalence. While I don't want to downplay the importance of performance, we're talking about something categorically different when comparing LLMs to compilers.
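To illustrate the distinction: functional equivalence only cares that two artifacts agree on observable behavior, not that they are byte-identical, so a compiler change can pass a check like the rough sketch below, while an LLM rewrite gives you no such guarantee. The program names and test inputs here are made up:

```python
import subprocess

def run(prog: str, stdin_text: str) -> str:
    # Feed one test input to the program and capture its stdout.
    result = subprocess.run([prog], input=stdin_text,
                            capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical test inputs; a real check would use a much larger suite.
test_inputs = ["1 2\n", "40 2\n", "-7 7\n"]

equivalent = all(run("./prog_a", t) == run("./prog_b", t) for t in test_inputs)
print("functionally equivalent on this test set:", equivalent)
```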