Interesting trade-off: build a minimal interpreter that's "good enough" for AI-generated code rather than trying to match CPython feature-for-feature.
The security angle is probably the most compelling part. Running arbitrary AI-generated Python in a full CPython runtime is asking for trouble, because the attack surface is enormous. Stripping it down to a minimal subset at least constrains what the generated code can do.
The bet here seems to be that AI-generated code can be nudged to use a restricted subset through error feedback loops, which honestly seems reasonable for most tool-use scenarios. You don't need metaclasses and dynamic imports to parse JSON or make API calls.
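To make the feedback-loop idea concrete, here's a rough sketch of how I'd picture it. This is not how the project does it: it uses CPython's own `ast` module as a stand-in checker, and the deny-list, `find_violations`, `generate_until_valid`, and the `generate` callback (standing in for the model call) are all names I made up for illustration.

```python
import ast

# Hypothetical deny-list; a real restricted interpreter defines its own subset.
DISALLOWED = {
    ast.Import: "import statements",
    ast.ImportFrom: "import statements",
    ast.ClassDef: "class definitions",
    ast.Global: "global declarations",
}


def find_violations(source: str) -> list[str]:
    """Return readable messages for constructs outside the allowed subset."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"line {exc.lineno}: syntax error: {exc.msg}"]
    messages = []
    for node in ast.walk(tree):
        reason = DISALLOWED.get(type(node))
        if reason is not None:
            messages.append(f"line {node.lineno}: {reason} are not supported")
    return messages


def generate_until_valid(generate, max_attempts=3):
    """Call generate(feedback) until the code passes the check.

    `generate` is an assumed callback wrapping the model: it takes the
    previous error text (empty on the first try) and returns new source.
    Returns the accepted source, or None if every attempt was rejected.
    """
    feedback = ""
    for _ in range(max_attempts):
        source = generate(feedback)
        violations = find_violations(source)
        if not violations:
            return source
        # The error messages are the "nudge" sent back to the model.
        feedback = "\n".join(violations)
    return None
```

The point is only that the error text is specific enough ("line 1: import statements are not supported") for the model to rewrite around it on the next attempt. A static check like this is not a sandbox on its own; the restricted interpreter is what actually enforces the limits at runtime.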