
no-name-here • today at 4:22 AM • 5 replies • view on HN

In pretty much every single HN post on this topic, there are a number of commenters claiming it’s false. Continued quantifiable data like this seems very important for hopefully resolving the ongoing disagreement about the facts.

Replies

moomin • today at 5:24 PM

Yes, there’s been a very popular narrative that Mythos’ abilities are just marketing fluff. I think it’s clear that there’s a real capability here, even if Anthropic’s communications have been heavily influenced by PR concerns.

wodenokoto • today at 11:28 AM

My read of the zeitgeist on HN is that these new LLMs bring with them such a torrent of false or useless security reports that whatever may be true simply drowns.

The end result is both that there are more critical CVEs and that there aren’t.

sometimelurker • today at 5:31 PM

In pretty much every single HN post ~~on this topic~~, there are a number of commenters that are probably bots. Anyway, it makes sense from a theoretical standpoint that LLMs should eventually be able to find flaws in code better/faster than humans, and it's reasonable to think that time has come.

qarl2 • today at 3:37 PM

AI has driven many people into denial. It's excruciating to watch otherwise smart individuals embrace terrible thinking, over and over and over.

cperciva • today at 5:43 AM

I've seen plenty of people saying "Mythos isn't all that exceptional, lots of LLMs can find security vulnerabilities" -- and indeed there is some evidence for that; it sounds like Anthropic was taken somewhat by surprise at how easily a simple prompt managed to get Mythos to deliver exploits and didn't distinguish immediately between the effectiveness of Mythos and the effectiveness of the prompt.

But the claim of "LLMs aren't making a difference in vulnerability discovery" has been laughable to anyone who has been reading security advisories for the past 3 months. Just look at the Credits lines.

