andai • today at 8:10 AM • 7 replies

If they succeed here, won't they have to gate access to this feature, too? For the same reasons as, "if I so much as mention mitochondria, it downgrades me to Opus."

Step 1. Make it so Claude can do anything — the whole point of AGI

Step 2. Wait, if the user can do Anything, that would be Very Bad!

Step 3. Err on the safe side with blanket bans of entire fields

The latter actually seems to me a sensible reaction to e.g. the compartmentalization used in the large-scale cyber attack using Claude last year, where the attackers were able to do Bad Thing by dividing it into many, many Small, Seemingly Harmless Things.

Gated access sounds bad (and I agree it sounds bad!) but it might actually be the only sensible response to such a set of conditions. I'm not sure though.

I saw some studies recently which showed LLMs provide much more detailed information to expert users. So we can distinguish between competence and incompetence based on use of language, and that is a reasonable metric for harm reduction.

But I don't think we can reliably detect "user has harmful intentions", at least not at a sufficient level of sophistication of the attacker.

Replies

MiracleRabbit • today at 8:16 AM

I'll just wait until an open-source model has inhaled all the chemistry books and papers. Lol.

They are following closely, and the best ones offer 80-90% of the performance at a very small fraction of the cost.

brookst • today at 2:58 PM

Maybe? It seems like their strategy is to accept that safety is imperfect, err on the side of over-triggering, and iterate.

I don’t think it’s a black-and-white case of “if Fable 5 over-triggers on bio safety in 2026, that’s the final pattern we should expect to see from post-Mythos 20 in 2036.”

inigyou • today at 2:30 PM

It's a good marketing strategy though, since nobody can disprove that Claude is the best chemist :)

NewsaHackO • today at 8:29 AM

I think I once asked Opus about copyfail when it had just come out, and it did treat me like some sort of criminal. But are there really people who run into this on a regular basis, other than cybersecurity experts (who cannot be a big enough group to generate all of this criticism)?

siva7 • today at 2:27 PM

"Making Claude a chemist" is a punch in the face to all of us here after we got a taste of Claude Fable blocking almost any STEM-related topic. So I don't know what story Anthropic wants to tell us with this blog. Probably more like "Hey, look at what awesome tricks Claude can do, which of course you will never experience because you're not American, not friends with the CEO, not rich, or simply because fuck you."

UltraSane • today at 9:47 AM

Fable wouldn't even explain what an amino acid is.

chaostheory • today at 2:05 PM

It would make more sense to have specialized versions of Claude instead of a universal one, i.e. BioClaude, ProgrammerClaude, NuclearClaude, CyberSecurityClaude, etc.

This would be both safer and less annoying to use.

