logoalt Hacker News

ramoztoday at 6:30 PM7 repliesview on HN

I’ve said similar in another thread[1]:

Sandboxes will be left in 2026. We don't need to reinvent isolated environments; not even the main issue with OpenClaw - literally go deploy it in a VM* on any cloud and you've achieved all same benefits. We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

——-

Unfortuently it’s been a pretty bad week for alignment optimists (meta lead fail, Google award show fail, anthropic safety pledge). Otherwise… Cybersecurity LinkedIn is all shuffling the same “prevent rm -rf” narrative, researchers are doing the LLM as a guard focus but this is operationally not great & theoretically redundant+susceptible to same issues.

The strongest solution right now is human in the loop - and we should be enhancing the UX and capabilities here. This can extend to eventual intelligent delegation and authorization.

[1] https://news.ycombinator.com/threads?id=ramoz&next=47006445

* VM is just an example. I personally have it running on a local Mac Mini & docker sandbox (obviously aware that this isnt a perfect security measure, but I couldnt install on my laptop which has sensitive work access).


Replies

bee_ridertoday at 6:47 PM

> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

Isn’t this the whole point of the Claw experiment? They gave the LLMs permission to send emails on their behalf.

LLMs can not be responsibility-bearing structures, because they are impossible to actually hold accountable. The responsibility must fall through to the user because there is no other sentient entity to absorb it.

The email was supposed to be sent because the user created it on purpose (via a very convoluted process but one they kicked off intentionally).

show 1 reply
Animatstoday at 6:48 PM

> I’ve said similar in another thread[1]

Me too, at [1].

We need fine-grained permissions at online services, especially ones that handle money. It's going to be tough. An agent which can buy stuff has to have some constraints on the buy side, because the agent itself can't be trusted. The human constraints don't work - they're not afraid of being fired and you can't prosecute them for theft.

In the B2B environment, it's a budgeting problem. People who can spend money have a budget, an approval limit, and a list of approved vendors. That can probably be made to work. In the consumer environment, few people have enough of a detailed budget, with spending categories, to make that work.

Next upcoming business area: marketing to LLMs to get them to buy stuff.

[1] https://news.ycombinator.com/item?id=47132273

g_delgado14today at 6:39 PM

> meta lead fail, Google award show fail

Can I get some links / context on this please

show 4 replies
giancarlostorotoday at 6:44 PM

> literally go deploy it in a VM on any cloud

Sure, but now you're adding extra cost, vs just running it locally. RAM is also heavily inflated thanks to Sam Altman investment magic.

show 1 reply
HWR_14today at 7:25 PM

Why a cloud provider and not a local VM?

show 1 reply
dheeratoday at 6:46 PM

> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

At the same time, let's not let the perfect be the enemy of good.

If you're piloting an aircraft, yeah, you should have perfection.

But if you're sending 34 e-mails and 7 hours of phone calls back and forth to fight a $5500 medical bill that insurance was supposed to pay for, I'd love for an AI bot to represent me. I'd absolutely LOVE for the AI bot to create so much piles of paperwork for these evil medical organizations so that they learn that I will fight, I'm hard to deal with, and pay for my stuff as they're supposed to. Threaten lawyers, file complaints with the state medical board, everything needs to be done. Create a mountain of paperwork for them until they pay that $5500. The next time maybe they'll pay to begin with.

show 3 replies
beepbooptheorytoday at 7:10 PM

What could "human in the loop" be here but just literally reading your own emails?

show 1 reply