LLMs can live in the cloud, but all tools need to be (1) local, and (2) containerized. It's clear to me that just willy-nilly "running stuff" is going to blow things up eventually. Maybe folks don't know this, but even Codex installs random binaries on your PC. "Read this PDF" installs a pdf reader executable. Is it vetted? Where's it from? Is it a virus? Who knows, who cares. Model goes brrrr.
I'm working on a project that includes WASI containerization for local LLM workflows (which is a pretty tough problem), and I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour
I share your concern but it's not a correct characterisation to say they are not taking it seriously:
https://www.anthropic.com/engineering/how-we-contain-claude
My concern is people aren't even addressing this at the right level. People are currently thinking at the level of "how do I build a VM to contain this one agent" when this is actually a "design a whole new OS" level problem.
I share your worries.
Unfortunately, this may be akin to the situation of "The market can stay irrational longer than you can stay solvent."
Does containerization help much here? If it's a code tool then presumably it needs access to your code files (read / write). Maybe there are use cases for it of course.
Got a link to your project? I'm working on something that could make use of something like this.
> "Read this PDF" installs a pdf reader executable.
How does this work with macOS notarization, btw?
They’ll all be offering to run from the cloud within the next 3-4 months.
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors
They are well aware of the issues and there is no fix for it. But there is too much money riding on this...
> I'm working on a project that includes WASI containerization for local LLM workflows
I am working on something similar. If you are open to connecting, what would be a good email to reach you at?
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.
"Move fast. Break things." on steroids.
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors
Yep. We tricked them both trivially with malicious fonts in Docx files. Documented it here: https://tritium.legal/blog/noroboto
I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually unsolvable. Discussing it may be existential to the business model.