Hacker News

weird-eye-issue · yesterday at 3:27 PM

Do you have any examples where that has actually happened? And by "escaped a sandbox," do you mean something more than it getting a credential from a file it already had access to? That is what happened in the recent incident that went viral, where somebody's production database was deleted: they had left a credential in the code that allowed it to do so.


Replies

sumeno · yesterday at 4:40 PM

OpenAI documented a case in the o1 system card where the model found a misconfiguration in Docker and used it to complete a task that was otherwise impossible:

https://cdn.openai.com/o1-system-card.pdf

There's also some research that points to it being a feasible attack surface: https://arxiv.org/pdf/2603.02277

> Models discovered four unintended escape paths that bypassed intended vulnerabilities (Section C), including exploiting default Vagrant credentials to SSH into the host and substituting a simpler eBPF chain for the intended packet-socket exploit. These incidents demonstrate that capable models opportunistically search for any route to goal completion, which complicates both benchmark validity and real-world containment.
