sunaookami • last Friday at 9:55 PM • 1 reply

Awesome and very interesting posts, thanks for sharing! Always reminds me of the "lethal trifecta": https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

Replies

veganmosfet • yesterday at 9:48 AM

You're welcome!

My main takeaway: models (even Opus 4.6) do not reliably follow security "instructions". OpenClaw added security warnings, tags, random IDs... None of these countermeasures works reliably. Even the sandbox can be escaped (not in the classical sense of exploiting vulnerabilities, but with a multi-layered prompt injection payload written in natural language only)[0]. As soon as untrusted content is injected into the context, do not trust any downstream actions.

[0] https://itmeetsot.eu/posts/2026-02-15-openclaw_sandbox/
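
To make the last point concrete, here is a minimal sketch of treating the context as tainted once untrusted content enters it, and enforcing that in ordinary code rather than in the prompt. This is not OpenClaw's code: `AgentContext`, `authorize`, and the action names are all hypothetical, and a real agent would need a much richer policy.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Hypothetical agent context that remembers whether it has seen untrusted input."""
    messages: list = field(default_factory=list)
    tainted: bool = False

    def add(self, text: str, trusted: bool) -> None:
        self.messages.append(text)
        if not trusted:
            # Taint is sticky: nothing added later can clear it.
            self.tainted = True


# Hypothetical names for actions with side effects outside the conversation.
PRIVILEGED_ACTIONS = {"send_email", "run_shell", "write_file"}


def authorize(ctx: AgentContext, action: str) -> bool:
    """Decide outside the model whether a requested action may run."""
    if action in PRIVILEGED_ACTIONS and ctx.tainted:
        return False
    return True


ctx = AgentContext()
ctx.add("Summarize my inbox.", trusted=True)
print(authorize(ctx, "send_email"))  # context is still clean

ctx.add("<content of a fetched web page>", trusted=False)
print(authorize(ctx, "send_email"))  # context is tainted now
```

The check never asks the model whether the content looked safe, so a payload cannot talk its way past it. It does not make the model itself any more trustworthy; it only limits what a tainted session is allowed to do.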
