bs7280 • today at 4:51 PM • 4 replies

I think the reasonable middle ground Anthropic is trying to achieve is this: let the organizations that make the most important and critical software get a head start on cybersecurity before Anthropic inevitably gives everyone else the same access.

Other commenters have made good points that these guardrails are counterproductive for well-intentioned cybersecurity work, because I can't use the model to test and harden my own software.

Replies

sciencejerk • today at 4:56 PM

Claude Opus 4.6 and 4.8 find vulns in source code just fine, and 4.6 will pentest without source for you, given a proper harness, WITHOUT jailbreaking. WITH jailbreaks, you can probably imagine what they are capable of.

Anthropic's guardrails seem to be more about protecting their business (distillation) than about public safety.

ryandrake • today at 4:56 PM

I wonder who gets to decide which companies make important and critical software and which ones get the scraps later.

wouldbecouldbe • today at 5:00 PM

I asked it to analyse my architecture and find any security issues, and it did so perfectly: it first identified the issues and then fixed them. Not sure why my prompt managed to get through the guardrails.

notrealyme123 • today at 4:56 PM

Exactly. For cybersecurity the failure was visible. It was not visible for "Frontier" ML research. The head-start argument for IT security is not feasible here.
