>> * Moving code off of GitHub doesn't change any of this: AI companies are free to download your git repo no matter where it is hosted, just like they can any other content on a publicly accessible website.
C'mon, I'm not even a part of the movement to move away from GitHub, but that's not really a valid argument. Sure, they CAN download the source code, but it's not nearly as automatic. They don't get to download it all, en masse, by copying hard drives/databases they already own. They have to go over the internet. They don't get automatic notifications when new code gets pushed. And finally, if you wanted, you could make it harder for bots.
I certainly believe that these companies do get away with a lot more than the average Joe - see: Facebook downloading Anna's Archive, every pirated eBook - but that doesn't mean you have to hand it to them on a silver platter.
Plus, even if your code is private on GitHub, you can't guarantee that they won't train their models on it anyway, unlike if you host it yourself or somewhere else.
Does anyone else find it ironic when closed-source GitHub claims to be some superhero for open source?