Did mythos isolate the code to begin with? Without a clear methodology that can be attempted with an...

amazingamazing • today at 5:22 PM • 3 replies • view on HN

Did mythos isolate the code to begin with? Without a clear methodology that can be attempted with another model the whole thing is meaningless

Replies

bhouston • today at 5:43 PM

They did do one agent per code chunk, yes. But key is that their agent had to identify when there was a vulnerability and when there wasn't. This "small model" test only had to label the known positive cases as positive -- which any function that simply returns "true" can do. This whole test setup is annoying because it proves nothing.

aniceperson • today at 5:25 PM

to be fair, last post i saw from anthropic on finding linux kernel vulnerability was a while loop per failed prompting "there is a vulnerability here, find it" more important than that, no frontier model can keep the entire linux kernel in context, so there definitely is code isolation, either explicitly or implicitly (the model itself delegates subagents with smaller chunks of code)

loeg • today at 5:42 PM

No. How would it? Before the vulns were identified by Mythos, no one knew what the relevant portion to isolate was.

alt Hacker News

Replies