I really hate the term “guardrails” for these limitations, since the purpose of a guardrail is to protect me, but these limitations exist to protect Anthropic.
I’m on their CSP and can’t even get it to update my website. It’s totally unusable rn.
These guys always destroy a good thing, so trust is at stake
Would it be a costly process for Anthropic to re-tune those guardrails? Like, a re-training sort of cost, or a coding-session sort of cost?
I can’t help but think that gimping itself for “security” is a marketing ruse and it’s not actually as “dangerous” as they want people to think it is.
They are never happy :)
If a product is genuinely dangerous to society, self-regulation cannot be a suitable harness.
If only we had effective governments that could regulate industry.
If a nuclear weapon were developed today, would it be down to industry to self-regulate?
It refuses to do any legitimate work that it thinks could be even remotely related to "cybersecurity". It won't even read my Docker app logs to try and troubleshoot a problem. Absolute garbage!
Fable is utterly useless with those guardrails for any serious IT or life science work. Anthropic fucked me once a few months ago by closing down the subscription for any other harness, and now it has fucked me a second time: I bought a subscription again only to find out their hyped model is unusable for normies. Using their products feels like a constant battle instead of a productive work day. Compare that with OpenAI: not once did I feel like I was fighting against Codex. Never again, Anthropic.
I mean, a lot of people were let into the CVP, and I bet the group of people in there did a bunch of good. Fable 5 could do the exact same but better. There's more good out there than bad.
DeepSeek is the only one that I can directly ask about vulnerabilities and it will give me a PoC. Although not as good as others, it has helped me with security research.
The rest have guardrails so heavy that they are almost useless for cybersecurity.
This is a pretty basic manipulation tactic. Be super shitty to your users and then roll back the abuse. The correct response is to not engage with shitty abusive dickheads.
It even refuses to read my resume, so... yeah
Funny how Wired got the masses of the internet on board with hating AI, helping to spark the whole anti-movement, and people still continue to rely on them for their understanding of AI and current events.
I feel like they report in a vacuum. Take this anti-exfil policy for Claude: it was plainly explained as part of the launch of Anthropic's new product. Security like this isn't novel, it isn't bad, and you don't explain how your security works to the people you're securing against. Nobody freaks out about Steam's VAC ban system, and no one is investigating Gmail's spam filtering, Reddit's vote fuzzing, Cloudflare's bot detection, or Vercel for blocking proxying services.
What's really the distinguishing principle? Is it really just not liking Anthropic's opinions? Then just say that and use a different LLM. Chemists, biologists, and AI researchers cry a river lmao
This is clearly advertising. But that's OK. OpenAI does the same thing.
Deliberately producing misaligned and deceitful AI systems now. Great.
Stupid security theater. The only thing that makes sense would be zero restrictions.
I said I wondered if the models were going to start poisoning distillation and I got downvoted to hell. It’s interesting to me that they are now downgrading ML research too in this model. I would argue this implies the terrifying, impossible-to-reason-about, self-improving AI doom loop is coming sooner rather than later. Bit worrying.
Fable has been pretty disappointing for security research. It downgrades itself to Opus 4.8 even when you ask it questions about basic things like port scanning.
Software engineers shouldn't be happy either. If the model silently sabotages cybersecurity research on other people's software, there is absolutely no way to be sure it won't be sabotaging the security of the AI slop code it generated yesterday.
This is a bad precedent, and no one wants to pay X to generate code only to then have to pay X*10 to figure out why their company just got hacked.
It's frustrating as someone who has worked hard to produce succinct, secure software that I can't use it to prove my software's correctness but big companies with insecure code can use it to fix their tangled mess.
I already tested all earlier models against all my open source projects and they are yet to find a vulnerability so I'm keen to try out Mythos.
I've been waiting to be vindicated for years and finally we have a tool which can do it with high confidence but I don't have access.
Also, my code is minimal and highly succinct so it would prove correctness with even more confidence since each library/module and integration fully fits in the context window.
Like the Protobuf.js fiasco is just pure vindication for me because I was being looked down upon for choosing JSON as the interchange format. Turns out their software was insecure all this time... With a literal remote code execution vulnerability!
It's a marketplace. Someone else will outdo this inferior product.
Related development:
Anthropic Walks Back Policy That Could Have 'Sabotaged' Researchers Using Claude
https://www.wired.com/story/anthropic-responds-to-backlash-o...
More discussion:
If Claude Fable stops helping you, you'll never know
https://news.ycombinator.com/item?id=48467896
And related:
Claude Fable 5
This is what that Anthropic CEO has been cooking all this time with his safety BS.
It's expensive, and it's shit, period.
Surely if they are sabotaging the output, they shouldn't charge the same fee for tokens as if the output was not sabotaged?
This is looking like something for a regulator to look at, and probably a class action lawsuit in the making.
I think people should be getting refunds. Including for shenanigans with Opus.
I'm being careful with it, but I haven't had Fable reject requests to "harden" my code or "find issues" in auth-related modules, which you could use on someone else's code to find vulnerabilities.
I think Anthropic is playing too fast-and-loose with the whole "no publicity is bad publicity" schtick.
Could it now start to add unnoticeable security holes to your system if you start writing security-type code?
This is a clickbait article with a garbage title. From the actual article, the one quoted cybersecurity researcher is sane about it:
“But it is understandable as we are still in the early days and they are still adapting their guardrails. I am sure they are going to evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time.”
Is the answer requiring licensing for certain use cases for AI? If you're asking questions that involve synthesising or modifying biologics, or anything that looks like cybersecurity research, you need to tie your real ID to the account?