onetrickwolf • today at 1:13 PM • 13 replies

“Distillation attack” are we joking here.

If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.

It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

I generally really like Anthropic’s work and models but stuff like this scares me for the future. We are positioning these companies to have too much power. The public’s life is getting worse while these companies consolidate power using data they stole from the public.

Replies

_fat_santa • today at 2:45 PM

> If anything these models should be compelled to be public since they have been trained off public data

I'm starting to come around to this idea TBH. For a while my position was: "these companies have invested billions into training these models, therefore they should be able to control them and profit off them" but looking deeper at where they got their training data, my view is starting to shift.

IMHO I feel like we need new laws around AI, specifically training data. Something like: "you can train an AI model and ignore copyright laws, BUT you must then make the model open weight". A company can still develop closed weight models, but then they must acquire permission to use training data.

But it gets murky because if something like that was on the books then AI labs would just train open weight models and then distill them into their closed weight models.

rafram • today at 2:33 PM

The core of the training data is public, but the part that actually makes these models smart came from (pretty highly-paid) experts via platforms like Mercor. Claude didn't magically learn to write good code by reading all of GitHub - humans trained it in that, more or less manually.

slibhb • today at 2:51 PM

> If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.

> It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

If all that is required to train these models is public data, why can't Alibaba just use that?

The fact that Alibaba has to resort to scraping Claude suggests there already is a moat...

flowerlad • today at 3:02 PM

Should Google search index be forced to be public too?

zobzu • today at 2:21 PM

it's mainly just a lot cheaper. copying is always cheaper anyway, very little r&d - ai or no ai.

petilon • today at 2:59 PM

> If anything these models should be compelled to be public since they have been trained off public data.

Isn't that a bit like saying if you read books in a public library to pick up a new skill you should work for free?

> What an absurd overreach to call this an attack.

Would it be an attack to take your meal by force if you used a public recipe to prepare the meal?

rapind • today at 3:08 PM

> It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

They are also fear mongering (and getting shills to as well) with the idea that once open weight (Chinese) models catch up to Mythos we're all doomed. Maybe I'd be a bit less cynical if they weren't prepping for an IPO?

Wasn't OpenAI spreading similar FUD back when GPT-2 came out?

Guys... AGI is right around the corner. Pinky swear. Now buy our stock.

Keep in mind that the entire US economy is currently propped up by AI spending, so a lot of people (banks, government) are incentivized to make sure these companies succeed. Expect this propaganda to ratchet up a notch if / when the economy starts to nose dive.

cma • today at 2:35 PM

Since they hide their thinking traces it really doesn't make too much sense. We know one of the degradations they fixed, which they talked about in a recent blog post, was that if you left Claude Code idle for too long they would rehydrate it without the thinking traces in the context, and that degraded performance. So direct forms of distillation wouldn't be expected to get results as good as they are getting.

However, they could have used it as a judge etc. during training.

msabalau • today at 5:27 PM

There's probably a 10-15% chance of a war between the US and China over the next 10 years. Maybe a better than even chance of a militarized crisis that might have led to war, but somehow de-escalates.

Regardless of how sad late stage capitalism makes you, or how outrageous one claims to find "hypocrisy", any national security argument about limiting Chinese AI capability stands on its own, at least for nations likely to be drawn into a war.

Also, all the local model enthusiasts who assume Chinese firms are going to be allowed to endlessly release models if they have the disruptive potential attributed to Mythos are probably in for a rude awakening. Just because the PRC is content with what has happened in the past doesn't mean that they would tolerate an open model that could be truly destabilizing.

coliveira • today at 2:34 PM

What they're trying to do under the umbrella of "national security" is to legislate how we can use the results we pay for when accessing these models. This way they will control the "intellectual property" that was acquired illegally.

TZubiri • today at 1:57 PM

Two wrongs don't make a right

rayiner • today at 3:17 PM

> The public’s life is getting worse while these companies consolidate power using data they stole from the public

How can you “steal” public information?
