It looks like proof of work because:
> Worryingly, none of the models given a 100M budget showed signs of diminishing returns. “Models continue making progress with increased token budgets across the token budgets tested,” AISI notes.
So, the author infers a durable correlation between token spend and attack success. The implication is that you will need to spend more tokens than your attackers to find your vulnerabilities first.
However, it's worth noting that this study was of a 32-step network intrusion, which only one model (Mythos) was able to complete at all. That's an incredibly complex task. Would the same hold for pointing Mythos at a relatively simple single code library? My intuition is that there is probably a point of diminishing returns, and that it arrives sooner for simpler tasks.
In this world, popular open source projects will probably see higher aggregate token spend by both defenders and attackers. And thus they might approach the point of diminishing returns faster. If there is one.
Knowing nothing about cybersecurity, maybe the question is whether it costs the defender more tokens to extend an intrusion path from 32 steps to 33, or the attacker more to complete that 33rd step? If adding steps is the cheaper side, or if defensive cost doesn't scale with token spend while offensive cost does, it's not as bad as the article makes it seem.
For instance, if failing any step locks the attacker out, the probability of completing all N steps is p^N, which becomes functionally impossible with enough layers.
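To make that concrete, here's a minimal sketch of how p^N collapses. The per-step success rates and the 32-step count are illustrative assumptions, not figures from the study:

```python
def chain_success(p: float, n: int) -> float:
    """Probability of completing all n steps of an attack chain,
    assuming each step independently succeeds with probability p
    and any single failure locks the attacker out."""
    return p ** n

# Even a strong per-step success rate compounds away over 32 layers
# (p values are made up for illustration):
for p in (0.99, 0.95, 0.90):
    print(f"p={p}: chain success ~ {chain_success(p, 32):.3f}")
```

Even at 95% per-step reliability, the attacker completes the full 32-step chain less than a fifth of the time; at 90% it drops to a few percent. This is the sense in which enough layers make success "functionally impossible", assuming the steps really are independent and failure really does lock you out.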
Worth pointing out that, impressive as the 32-step network takeover is, Mythos wasn't able to achieve it on every attempt, and the network itself lacked the usual defence systems.
I wouldn't use those as excuses to dismiss AI though. Even if this model doesn't break your defences, give it 3 months and see where the next model lands.