simonw • today at 3:56 AM • 12 replies

News just broke in this Wired story: "Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude" https://www.wired.com/story/anthropic-responds-to-backlash-o...

> “We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”

Sounds like the widespread condemnation worked.

Replies

Grimblewald • today at 7:48 AM

Corporate America never backs down. It simply rallies and tries again later until people are too fatigued to care. The only solution is to abandon ship, which I am doing. MS walked back in-OS ads the first few times, but ultimately we still ended up on the exact trajectory everyone was outraged at. OpenAI still ended up on its path to closed AI despite initial walk-backs. The story repeats itself over and over again, so once the bad behavior starts, you leave. Their apologies are as hollow as their moral posturing.

h6d_100c • today at 4:16 AM

Too late. I canceled my Max subscription. The idea that they would even do this has destroyed any remaining trust. Why would I pay them 1000s of dollars in extra usage per month for something they could still be doing behind the scenes? Any errors previously chalked up to thinking effort or other backend changes? Maybe it was intentional prompt injection the entire time.

hedgehog • today at 4:05 AM

The "tradeoff" warning implies they stand by their thinking and don't think there was anything qualitatively wrong with it, which, if nothing else, is helpful so potential customers can know how they think. I think the core lesson is that if you want reliable infrastructure to build into an application, you should use a different provider. (edit: I'm not specifically an Anthropic hater, but having just spent some time adding complexity to an app to deal with the existing refusal behavior in Sonnet... I understand why they might want this in an end-user chatbot, but for an API it's really not acceptable)

consumer451 • today at 11:02 AM

The other major thing is almost as bad, and actually maybe even worse for trust of AI features in b2b apps:

> Anthropic requires 30 day data retention for Fable and Mythos

https://news.ycombinator.com/item?id=48464258

I used to be able to tell my enterprise customers something simple, that I really believe: "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models."

That simple blanket statement is no longer true. Also, most normal people/customers only read headlines, and this is a huge story. From my point of view, as someone deploying LLMs in my apps, trust comms with my clients just got set back two years.

pseudosavant • today at 5:09 AM

They are still downgrading. They just aren't doing it silently. I don't know how big of a win that is? They still trained on everyone else's data without license or attribution but want to prevent someone else from doing the same thing to them.

Some pretty audacious hypocrisy from Anthropic this week.

selicos • today at 7:24 AM

If any work is blocked/etc, refund all credits from that session/last X minutes. Minimum.

bostik • today at 4:52 AM

They need to walk back a lot more.

Unilaterally revoking zero-data retention, even for enterprise contracts that explicitly require that? Nope.

Fable is utterly unusable for any kind of security work. I tripped the safeguards yesterday - using Fable to dig into a complex (& annoying) security bug that has so far resisted both human and Opus 4.8 level investigation. "Sorry Dave, I can't let you do that."

For the time being we are requesting Anthropic disable Fable for our enterprise and turn ZDR back on. The two may be interlinked so that one will always get neither or both. ZDR is a contractual obligation. Fable in its current form is useless. Might as well flip the old behaviour on and avoid burning money for no reason while this mess is being sorted out.

Aperocky • today at 5:42 AM

I don't think it's the widespread condemnation, I think it's some high paying customer and potential investor telling them to stick it.

nl • today at 6:40 AM

This is different to the cyber limitations though.

To be precise - it makes the "won't work on frontier machine learning" refusal the same as the "won't work on cyber security" refusal (previously it would work on frontier machine learning problems but give sub-optimal answers without informing the user).

rafram • today at 4:31 AM

The mitigations against distillation are separate, and not what the OP is about at all.

AussieWog93 • today at 6:41 AM

Non-paywalled: https://archive.md/yxYhU
