It's interesting to me that OpenAI considers scraping to be a form of abuse.
Quite sure even literal thieves would consider thievery a form of abuse.
They don't want anyone to take that which they have rightfully stolen.
The levels of irony that shouldn't be possible...
The irony is thick.
Seriously. The hypocrisy is staggering!
Churches, politicians, moralists: the ones who want to teach you something are always the biggest hypocrites.
"Integrity at OpenAI ... protect ... abuse like bots, scraping, fraud"
Did you mean to use the word hypocrisy? If not, I'm happy to have said it.
I just want to note that it is well documented how good the support is for actual malware...
"You're trying to kidnap what I've rightfully stolen!"
And they have absolutely no reservations about making such an obvious statement on a public forum.
I read "scraping" in the context of this:
> we want to keep free and logged-out access available for more users
I have no doubt that many people see the free ChatGPT access as a convenient target for browser automation to get their own free ChatGPT pseudo-API.
This
It's not scraping they're concerned about, it's abusing free GPU resources to (anonymously) generate (abusive) content.
Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.
The former relies on fairly controversial ideas about copyright and fair use to qualify as abuse, whereas the latter is direct financial damage, inflicted by your own direct competitors, no less.
It's fun to poke at a seeming hypocrisy of the big bad, but the similarity in this case is quite superficial.
I think the distinction is less about scraping itself, and more about marginal cost.
Scraping static pages is cheap for both sides. Scraping an LLM-backed service effectively externalizes compute costs onto the provider.
Same behavior, very different economics.
It’s funny because the first AI scraper I remember blocking was OpenAI’s: it got stuck in a loop somehow and was hurting the performance of a wiki I run. All to violate every clause of the CC BY-NC-SA license of the content it was scraping :)