It's interesting to me that OpenAI considers scraping to be a form of abuse.

Imnimo • yesterday at 10:38 PM • 15 replies

Replies

It’s funny because the first AI scraper I remember blocking was OpenAI’s: it got stuck in a loop somehow and was hurting the performance of a wiki I run. All to violate every clause of the CC BY-NC-SA license of the content it was scraping :)

raincole • today at 5:07 AM

Quite sure even literal thieves would consider thievery a form of abuse.

jordanb • today at 2:37 PM

They don't want anyone to take that which they have rightfully stolen.

axegon_ • today at 7:36 AM

The levels of irony that shouldn't be possible...

ProofHouse • yesterday at 11:44 PM

The irony is thick

sabedevops • yesterday at 10:45 PM

Seriously. The hypocrisy is staggering!

wiseowise • today at 9:28 AM

Church, politicians, moralists are all the biggest hypocrites that want to teach you something.

zer00eyz • yesterday at 11:00 PM

" Integrity at OpenAI .. protect ... abuse like bots, scraping, fraud "

Did you mean to use the word hypocrisy? If not, I'm happy to have said it.

I just want to note that it is well covered how good the support is for actual malware...

RobotToaster • today at 9:41 AM

"You're trying to kidnap what I've rightfully stolen!"

gib444 • today at 10:02 AM

And have absolutely no reservations about making such an obvious statement on a public forum

Aurornis • today at 12:25 AM

I interpreted scraping in the context of this:

> we want to keep free and logged-out access available for more users

I have no doubt that many people see the free ChatGPT access as a convenient target for browser automation to get their own free ChatGPT pseudo-API.

rsrsrs86 • today at 3:18 PM

This

miki123211 • today at 8:43 AM

It's not scraping they're concerned about, it's abusing free GPU resources to (anonymously) generate (abusive) content.

nikitaga • yesterday at 11:53 PM

Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.

The former relies on fairly controversial ideas about copyright and fair use to qualify as abuse, whereas the latter is direct financial damage – by your own direct competitors no less.

It's fun to poke at a seeming hypocrisy of the big bad, but the similarity in this case is quite superficial.

heyethan • today at 2:14 AM

I think the distinction is less about scraping itself, and more about marginal cost.

Scraping static pages is cheap for both sides. Scraping an LLM-backed service effectively externalizes compute costs onto the provider.

Same behavior, very different economics.
