I think the interesting part here is that the black box gives us a starting point. And it's not a true black box: if a model is generating Lean, we can examine the entire source, iteratively refine by refactoring into smaller self-contained units, and the compiler gives us the certainty that our decomposition is correct.