Hacker News

Imnimo yesterday at 10:38 PM

It's interesting to me that OpenAI considers scraping to be a form of abuse.


Replies

DrinkyBird today at 12:49 PM

It’s funny because the first AI scraper I remember blocking was OpenAI’s, as it got stuck in a loop somehow and was impacting the performance of a wiki I run. All to violate every clause of the CC BY-NC-SA license of the content it was scraping :)

raincole today at 5:07 AM

Quite sure even literal thieves would consider thievery a form of abuse.

jordanb today at 2:37 PM

They don't want anyone to take that which they have rightfully stolen.

axegon_ today at 7:36 AM

The levels of irony that shouldn't be possible...

ProofHouse yesterday at 11:44 PM

The irony is thick

sabedevops yesterday at 10:45 PM

Seriously. The hypocrisy is staggering!

wiseowise today at 9:28 AM

The church, politicians, and moralists are all the biggest hypocrites who want to teach you something.

zer00eyz yesterday at 11:00 PM

"Integrity at OpenAI .. protect ... abuse like bots, scraping, fraud"

Did you mean to use the word hypocrisy? If not, I'm happy to have said it.

I just want to note that it is well covered how good the support is for actual malware...

RobotToaster today at 9:41 AM

"You're trying to kidnap what I've rightfully stolen!"

gib444 today at 10:02 AM

And have absolutely no reservations about making such an obvious statement on a public forum

Aurornis today at 12:25 AM

I interpreted scraping in the context of this:

> we want to keep free and logged-out access available for more users

I have no doubt that many people see the free ChatGPT access as a convenient target for browser automation to get their own free ChatGPT pseudo-API.

rsrsrs86 today at 3:18 PM

This

miki123211 today at 8:43 AM

It's not scraping they're concerned about, it's abusing free GPU resources to (anonymously) generate (abusive) content.

nikitaga yesterday at 11:53 PM

Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.

The former relies on fairly controversial ideas about copyright and fair use to qualify as abuse, whereas the latter is direct financial damage – by your own direct competitors no less.

It's fun to poke at a seeming hypocrisy of the big bad, but the similarity in this case is quite superficial.

heyethan today at 2:14 AM

I think the distinction is less about scraping itself, and more about marginal cost.

Scraping static pages is cheap for both sides. Scraping an LLM-backed service effectively externalizes compute costs onto the provider.

Same behavior, very different economics.
