jdiff • last Thursday at 6:13 PM • 4 replies • view on HN

Genuine question, why have you chosen to phrase this scraping and distillation as an attack? I'm imagining you're doing it because that's how Anthropic prefers to frame it, but isn't scraping and distillation, with some minor shuffling of semantics, exactly what Anthropic and co did to obtain their own position? And would it be valid to interpret that as an attack as well?

Replies

DrammBA • last Thursday at 7:01 PM

> I'm imagining you're doing it because that's how Anthropic prefers to frame it

Correct.

> would it be valid to interpret that as an attack as well?

Yup.

irthomasthomas • last Thursday at 6:22 PM

If you ask Claude in Chinese, it thinks it's DeepSeek.

typ • last Friday at 2:22 AM

I don't think that learning from textbooks to take an exam and learning from the answers of another student taking the exam are the same.

Joking aside, I also don't believe that maximum access to raw Internet data, or its sheer quantity, is why some models are doing better than Google. It seems that these SoTA models gain more power from synthetic data and from how they discard garbage.

fragmede • last Thursday at 8:50 PM

Firehosing Anthropic to exfiltrate their model seems materially different to me from Anthropic downloading all of the Internet to create the model in the first place. But maybe that's just me?
