Hacker News

Anthropic announces proof of distillation at scale by MiniMax, DeepSeek, Moonshot

75 points by Jimmc414 today at 6:33 PM | 68 comments

Comments

logicprog today at 7:03 PM

Just reading the headline, I say good.

A) These models are trained by ignoring IP. It is hypocritical and absurd to then try to assert IP over them. And I am for the destruction of IP on all ends.

B) What this essentially means is that the Chinese labs are taking the work of these mega corporations and making it freely accessible to other labs and businesses to serve inference on, fine-tune, and host privately on-prem. That's clearly a good thing for competition in the market as a whole.

C) I don't see why we should have to duplicate the massive energy and infrastructure investment of building foundation models over and over forever just because we want to preserve the IP rights of a few companies. That seems a shame; it seems better to me for everyone to learn from everyone else, so the whole ecosystem improves by labs topping and building off each other. That's also why publishing research into the architecture and training of these models is so much better than what the proprietary labs do (keeping everything a secret), although tbf Anthropic's interpretability research is cool.

D) These Chinese models give 90% of the performance of frontier proprietary models at a tenth or a twentieth of the cost. That seems like a win for everyone. Not to mention that this distilling also allows them to make much smaller local models that everyone can run. This is a win for actual democratization, decentralization, and accessibility for the little guy.

impulser_ today at 6:44 PM

Why would anyone care about this at all?

MiniMax, DeepSeek, and Moonshot are all releasing models for the public to use for free.

Anthropic, OpenAI, Google, etc. have been scraping information they had no right to scrape in order to train their models, yet when these companies pay them to scrape data we are supposed to be worried?

Labs like Anthropic always preach that they're trying to build AI for everyone while releasing expensive, closed-source models.

The only reason AI is affordable at all is because of these Chinese AI labs.

paxys today at 7:12 PM

It's crazy for their official account to post this when Anthropic itself is fighting multiple high-profile lawsuits over its unauthorized use of proprietary content to train its models. Did no one run this by legal?

MiSeRyDeee today at 7:27 PM

Kudos to them then, for doing such a good job at distillation. Only 16 million chats (shared across multiple labs/models) needed to get mostly on-par performance at 1/10th to 1/50th of the cost. Keep on keeping up!

cs702 today at 6:47 PM

It's been known for a long while that a model's outputs can be used as training data for another model to copy the original model's behavior, a technique known as distillation.

What I didn't know is that the three groups mentioned "created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models." There's some irony in that, given that Anthropic and all other established AI shops have been criticized for using copyrighted materials without permission to train their own models. I wouldn't be shocked if we subsequently find out that every major AI shop has secretly engaged in distillation at some point in the past.

Still, wow, 24,000 accounts. I can't help but wonder, how many other AI shops have surreptitious accounts with other AI shops right now?
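
For anyone unfamiliar with the mechanics, here is a toy sketch of the textbook formulation: a small "student" network trained to match a frozen "teacher" network's output distribution. Everything in it (model sizes, data, temperature) is made up for illustration; what the report describes is closer to supervised fine-tuning on sampled chat transcripts, since an outside lab only ever sees the teacher's text, not its logits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: the "teacher" is larger and frozen, the "student" is smaller.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 1000))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1000))
teacher.eval()  # only the teacher's outputs are used; its weights never change

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # softmax temperature; higher T exposes more of the teacher's distribution

for step in range(100):
    x = torch.randn(32, 128)             # stand-in for real inputs (prompts)
    with torch.no_grad():
        teacher_logits = teacher(x)       # query the teacher (think: API calls)
    student_logits = student(x)
    # KL divergence between the softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()
```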

armcat today at 7:25 PM

This is such an insane rabbit hole. AI labs distill weights from the entirety of the internet's knowledge, (mostly) without anyone's consent, which (technically) amounts to theft. However, the Chinchilla scaling law dictates that you need to expend X amount of energy to make this knowledge useful. Then the data law dictates that you need to shift the weights into a more useful latent space by paying maths, coding, and domain experts lots of money. So you have "stolen" the data, but then paid billions to make it useful. And useful it is!
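
(For a sense of scale, a back-of-the-envelope using the Chinchilla rule of thumb: roughly 20 training tokens per parameter, and about 6·N·D FLOPs of training compute. The 70B parameter count below is a made-up example, not any particular model.)

```python
params = 70e9                 # hypothetical model size, for illustration only
tokens = 20 * params          # Chinchilla-optimal training tokens: ~1.4e12
flops = 6 * params * tokens   # training compute: ~5.9e23 FLOPs
print(f"{tokens:.1e} tokens, {flops:.1e} FLOPs")
```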

Then another lab comes along and "steals" from you - that beautiful, refined data-oil - by distilling your weights using inferior equipment but with a toolbox of ingenuity and low-level hacking tricks. They reach 90% of your performance at a 20x cost reduction.

What happens when another lab distills from the distilled lab?

Who is the thief? How far will Alice go?

mudkipdev today at 7:44 PM

Don't throw stones from glass houses. Ask Anthropic about the proxies they use for scraping. They're well-versed on the topic.

StarterPro today at 7:43 PM

> Distillation is a widely used and legitimate training method.

Oh ok, so you can steal from everyone, but when they do it to you, it's bad.

falcor84 today at 6:42 PM

Interesting, and my main takeaway is that ~16 million sessions is enough to distill Claude. That's extremely doable - obviously, as it's been done repeatedly - but it just looks very feasible in general.

If I think of the number of lessons and educational conversations a human needs to acquire their lifetime knowledge, I would hazard to say that AI-to-AI learning no longer requires many orders of magnitude beyond that.

osiris970 today at 7:14 PM

It's not illegal, just against their TOS. Your job to deal with that, Anthropic lol

aquir today at 7:02 PM

Not nice, but the frontier labs "distilled the whole internet" using Common Crawl.

throwfaraway4 today at 6:46 PM

Company that rips off creators to build its product complains that other companies are doing the same to it.

xanthor today at 7:14 PM

Ironic phrasing used here. China is the only country that actually has the capacity to deeply integrate AI into industrial manufacturing in a way that will reduce the cost of goods. They already have lights-out autonomous factories without AI.

devnonymous today at 7:45 PM

> These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude

What exactly makes these accounts *fraudulent*... did they not pay Anthropic for the service?

oncallthrow today at 7:04 PM

Live by the sword, die by the sword

iamsaitam today at 7:27 PM

At least they paid you for it... unlike you.

iagorodriguez today at 7:14 PM

I was not emotionally prepared for this level of humor today, it's Monday, please!

karmasimida today at 7:22 PM

Unless they stop selling APIs to the public, this can't be stopped.

Mind you, nuclear weapons can be regulated not because the tech itself is secret, but because refining is a nation-state effort that cannot go unnoticed.

Realistically, the more tokens they are selling, the harder it is for them to control this.

kgeist today at 7:23 PM

Were those 16 million sessions used only for alignment, chat format, reasoning, etc.? Or is it possible to train a base model too? If a single session is at least 32k tokens, then that's already ~0.5 trillion tokens to train on (16M × 32k ≈ 5×10^11), interesting.

m_ke today at 6:58 PM

We should probe Anthropic for what accounts they made to access third-party data, or which proxies they use to circumvent scraping blockers.

veselin today at 7:23 PM

I think they will do two things:

* Likely they will seek regulation that would ban some models. Not sure this can work, but they will certainly try.

* Likely they will not release some of their next models in the API.

lousken today at 7:22 PM

Good. If you don't release open weights, someone else will.

ks2048 today at 7:24 PM

I wonder how much American labs do the same.

int32_64 today at 7:03 PM

The company that claims all knowledge workers are going to be wiped out by their technology is asking these future disenfranchised workers to care about the Chinese ripping off their tech. That seems like a hard no.

sidgarimella today at 7:24 PM

Would sure be nice if the effort spent fighting their karma was pointed at a better frontier model.

gregman1 today at 7:03 PM

Do we need to re-announce proof of dirty practices by Anthropic?

ralph84 today at 7:29 PM

Human knowledge belongs to humanity. Of course the people who want to paywall it and extract rent will try to concoct some ethical basis for their rent seeking. Anthropic appears to be choosing the xenophobic route.

zb3 today at 7:21 PM

> But foreign labs that illicitly distill American models can remove safeguards

I hope so, I don't need their "safeguards".

eagleinparadise today at 7:22 PM

world's smallest violin meme

gostsamo today at 7:07 PM

"You are trying to kidnap what I have rightfully stolen, and I think it quite ungentlemanly."

rsynnott today at 6:56 PM

Oh, now we care about IP, do we?

bakugo today at 7:24 PM

Anthropic leadership once again showing off a remarkable level of immaturity.

Of course they don't want anyone else to use the precious outputs from the model they created by scraping data from the millions of fleshbag programmers they're now trying to put out of a job. They're just another corporation with the standard goal of making as much money as possible with little regard for anything else, so that much is expected.

But to actually write up a public announcement like this, loudly and proudly announcing to the world that they're crying at the daycare because their precious toy has been stolen by some kid, even though everyone around them knows they themselves originally stole that toy from another kid too, takes a special kind of corporate shamelessness that seems to be becoming more prevalent by the day.

stefan_ today at 7:18 PM

Anthropic, of course, ran an industrial-scale distillation attack on the combined works of humankind. So, uh... kindly go fuck yourself? Who asked?
