I’ve said similar in another thread[1]:
Sandboxes will be left in 2026. We don't need to reinvent isolated environments; not even the main issue with OpenClaw - literally go deploy it in a VM* on any cloud and you've achieved all same benefits. We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc
——-
Unfortuently it’s been a pretty bad week for alignment optimists (meta lead fail, Google award show fail, anthropic safety pledge). Otherwise… Cybersecurity LinkedIn is all shuffling the same “prevent rm -rf” narrative, researchers are doing the LLM as a guard focus but this is operationally not great & theoretically redundant+susceptible to same issues.
The strongest solution right now is human in the loop - and we should be enhancing the UX and capabilities here. This can extend to eventual intelligent delegation and authorization.
[1] https://news.ycombinator.com/threads?id=ramoz&next=47006445
* VM is just an example. I personally have it running on a local Mac Mini & docker sandbox (obviously aware that this isnt a perfect security measure, but I couldnt install on my laptop which has sensitive work access).
> I’ve said similar in another thread[1]
Me too, at [1].
We need fine-grained permissions at online services, especially ones that handle money. It's going to be tough. An agent which can buy stuff has to have some constraints on the buy side, because the agent itself can't be trusted. The human constraints don't work - they're not afraid of being fired and you can't prosecute them for theft.
In the B2B environment, it's a budgeting problem. People who can spend money have a budget, an approval limit, and a list of approved vendors. That can probably be made to work. In the consumer environment, few people have enough of a detailed budget, with spending categories, to make that work.
Next upcoming business area: marketing to LLMs to get them to buy stuff.
> meta lead fail, Google award show fail
Can I get some links / context on this please
> literally go deploy it in a VM on any cloud
Sure, but now you're adding extra cost, vs just running it locally. RAM is also heavily inflated thanks to Sam Altman investment magic.
> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc
At the same time, let's not let the perfect be the enemy of good.
If you're piloting an aircraft, yeah, you should have perfection.
But if you're sending 34 e-mails and 7 hours of phone calls back and forth to fight a $5500 medical bill that insurance was supposed to pay for, I'd love for an AI bot to represent me. I'd absolutely LOVE for the AI bot to create so much piles of paperwork for these evil medical organizations so that they learn that I will fight, I'm hard to deal with, and pay for my stuff as they're supposed to. Threaten lawyers, file complaints with the state medical board, everything needs to be done. Create a mountain of paperwork for them until they pay that $5500. The next time maybe they'll pay to begin with.
What could "human in the loop" be here but just literally reading your own emails?
> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc
Isn’t this the whole point of the Claw experiment? They gave the LLMs permission to send emails on their behalf.
LLMs can not be responsibility-bearing structures, because they are impossible to actually hold accountable. The responsibility must fall through to the user because there is no other sentient entity to absorb it.
The email was supposed to be sent because the user created it on purpose (via a very convoluted process but one they kicked off intentionally).