Calif (Thai Duong's firm) did a writeup on this, which should probably be the link here; it includes the prompts they used:
https://blog.calif.io/p/mad-bugs-claude-wrote-a-full-freebsd
A reminder: this bug was also found by Claude (specifically, by Nicholas Carlini at Anthropic).
The talk "Black-Hat LLMs" just came out a few days ago:
https://www.youtube.com/watch?v=1sd26pWhfmg
Looks like LLMs are getting good at finding and exploiting these.
> It's worth noting that FreeBSD made this easier than it would be on a modern Linux kernel: FreeBSD 14.x has no KASLR (kernel addresses are fixed and predictable) and no stack canaries for integer arrays (the overflowed buffer is int32_t[]).
What about FreeBSD 15.x then? I didn't see anything in the release notes or the mitigations(7) man page about KASLR. Is it being worked on?
NetBSD apparently has it: https://wiki.netbsd.org/security/kaslr/
I could see that being an incremental time save (perhaps not worth the token spend except for the dev team, since it's not a high-value bug). But nobody finds this kind of bug "by hand", and nobody has for a long time now. Do people here really care about kernel security or testing automation? Or are they just talking about it because it's Claude? Everything on HN is people doing unpaid promotional work for Anthropic, just talking about all the promise Claude holds and all the various ways you could be spending more money on Claude. Bored, aimless vibes.
The most difficult part is always to find the vulnerability, not to fix it. And most people who are spending their days finding them are heavily incentivized to not disclose.
Automatic discovery can be a huge benefit, even if the transition period is scary.
Thanks for sharing the prompts: https://github.com/califio/publications/blob/main/MADBugs/CV...
the finding vs exploiting distinction matters a lot here. writing an exploit for a documented CVE is a well-scoped task - the vulnerability is defined, the target is known. what's harder to quantify is the inverse - the same model writing production code that introduces new vulnerabilities it could also theoretically exploit. the offensive capability is visible and alarming. the code generation risk is distributed quietly across every PR it opens, which is why the second problem gets less attention.
> "Claude wrote"
I am hoping that quite soon we will have general acceptance of the fact that "Claude can write code" and we will switch focus to how good / not good that code is.
https://github.com/califio/publications/tree/main/MADBugs/CV... would have been a better link
I find it more concerning that this is still considered newsworthy. Frontier LLMs in the hands of anyone determined and willing to learn can be a blessing or a curse.
Errrr the headline makes it sound like a bad thing.
This is what Claude is meant to be able to do.
Preventing it doing so is just security theater.
Does this require SSH to be available?
Is it possible to pwn without SSH listening?
The MADBugs work is solid, but what's sticking with me is the autonomy angle — not just finding a vuln but chaining multiple bugs into a working remote exploit without a human in the loop. FreeBSD kernel security research has always been thinner on the ground than Linux, which makes this feel both more impressive and harder to put in context. What's the actual blast radius here — is this realistically exploitable on anything with default configs, or does it need very specific conditions?
You do not need Claude to find FreeBSD vulns. Just plain eyes. Pick a file and you can find one.
Another Claude glazing spam post. Can I get paid to post these? What is the URL to sign up for the Claude Glazing affiliate program?
I'm just gonna assume it was asked to fix some bug and it wrote an exploit instead.
Running into a meeting, so won't be able to review this for a while, but exciting. I wonder how much it cost in tokens, and what the prompt/validator/iteration loop looked like.
Key point is that Claude did not find the bug it exploits. It was given the CVE writeup[1] and was asked to write a program that could exploit the bug.
That said, given how things are I wouldn't be surprised if you could let Claude or similar have a go at the source code of the kernel or core services, armed with some VMs for the try-fail iteration, and get it pumping out CVEs.
If not now, then surely in the not-too-distant future.
[1]: https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08...