Traditionally, large corporations have taken very conservative legal stances on integrating, e.g., A/GPL code, even when there's almost no risk.
If my license explicitly says "any LLM output trained on this code is legally tainted," I feel like BigAICorp would be foolish to ignore it. Maybe I couldn't sue them today, but are they confident this will remain the case 5, 10, 20 years from now? Everywhere in the world?
GitHub has posted that they will now train on everyone's data (even private) unless you opt out (until they change their mind on that). Anthropic has been training on your data on certain tiers already. Meta torrented books to train their models.
Surely if your license says "LLM output trained on this code is legally tainted", it is going to dissuade them.