Hacker News

zmmmmm · yesterday at 9:13 PM

I see a big focus on computer use - you can tell they think there is a lot of value there and in truth it may be as big as coding if they convincingly pull it off.

However, I am still mystified by the safety aspect. They say the model has greatly improved resistance. But their own safety evaluation says that 8% of the time, their automated adversarial system was able to one-shot a successful injection takeover even with safeguards in place and extended thinking, and 50% (!!) of the time if given unbounded attempts. That seems wildly unacceptable - this tech is just a non-starter, unless I'm misunderstanding this.

[1] https://www-cdn.anthropic.com/78073f739564e986ff3e28522761a7...
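For what it's worth, the 8% one-shot and ~50% unbounded figures are roughly consistent with each other if you assume each attack attempt succeeds independently with the same probability; the independence assumption is mine, not from the report:

```python
import math

def cumulative_success(p: float, n: int) -> float:
    """Probability of at least one success in n independent attempts,
    each succeeding with probability p."""
    return 1 - (1 - p) ** n

p = 0.08  # reported one-shot attack success rate

# Number of independent attempts needed to reach a 50% cumulative
# success rate: solve (1 - p)^n = 0.5 for n.
n_half = math.log(0.5) / math.log(1 - p)
print(round(n_half, 1))                     # ~8.3 attempts
print(round(cumulative_success(p, 9), 2))   # ~0.53 after 9 attempts
```

So "unbounded attempts" only needs to mean roughly nine tries for the attacker to cross 50%.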


Replies

dakolli · yesterday at 10:11 PM

Their goal is to monopolize labor for anything that involves I/O on a computer, which is way more than SWE. It's simple: this technology literally cannot create new jobs; it simply lets one engineer (or any worker whose job involves computer I/O) do the work of three, therefore allowing you to replace workers (and overwork the ones you keep). Companies don't need "more work"; half the "features"/"products" that companies produce are already just extra. They can get rid of one- to two-thirds of their labor and make the same amount of money, so why wouldn't they?

ZeroHedge said the following on Twitter:

"According to the market, AI will disrupt everything... except labor, which magically will be just fine after millions are laid off."

It's also worth noting that if you can create a business with an LLM, so can everyone else. And sadly everyone has the same ideas; everyone ends up working on the same things, and competition pushes margins to nothing. There's nothing special about building with LLMs, since anyone with access to the same models and the same basic thought processes can just copy you.

This is basic economics. If everyone had an oil well on their property that was affordable to operate, the price of oil would be more akin to the price of water.

EDIT: Since people are focusing on my water analogy, what I mean is:

If everyone has easy access to the same powerful LLMs, that would drive the value you can contribute to the economy down to next to nothing. For this reason I don't even think powerful and efficient open source models, which is usually the next counter-argument people make, are necessarily a good thing. They strip people of the opportunity for social mobility through meritocratic systems. Just like how your water well isn't going to make you rich or allow you to climb a social ladder, because everyone already has water.

cmiles8 · yesterday at 11:11 PM

This is the elephant in the room nobody wants to talk about. AI is dead in the water for the mass labor replacement it's supposed to deliver unless this is fixed.

Summarize some text while I supervise the AI = fine and a useful productivity improvement, but doesn’t replace my job.

Replace me with an AI making autonomous decisions out in the wild, and liability-ridden chaos ensues. No company in their right mind would do this.

The AI companies are now in an existential race to address that glaring issue before they run out of cash, with no clear way to solve the problem.

It’s increasingly looking like the current AI wave will disrupt traditional search and join the spell-checker as a very useful tool for day-to-day work… but the promised mass labor replacement won’t materialize. Most large companies are already starting to call BS on the AI-replacing-humans-en-masse storyline.

jstummbillig · yesterday at 11:25 PM

It does not seem all that problematic for the most obviously valuable use case: you use a web app that you consider reasonably safe, but that offers no API, and you want to do things with it. The whole adversarial-action problem just dissipates, because there is no adversary anywhere in the path.

No random web browsing. Just opening the same app, every day. Login. Read from a calendar or a list. Click a button somewhere when x == true. Super boring stuff. This is an entire class of work that a lot of humans do in a lot of companies today, and there it could be really useful.
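That closed-loop pattern can be sketched in a few lines. Everything here is hypothetical and for illustration only (there is no real `FakeAppClient` or `approve_invoice`); the point is that the agent touches one known app through a fixed allow-list of actions, with no open-ended browsing:

```python
# Sketch of the "boring, closed-loop" automation described above.
# All names are hypothetical stand-ins for a single trusted app.
ALLOWED_ACTIONS = {"read_calendar", "approve_invoice"}

class FakeAppClient:
    """Stand-in for one known app that offers no public API."""
    def __init__(self):
        self.approved = []

    def read_calendar(self):
        # In reality this would come from scraping the app's UI.
        return [{"id": 1, "needs_approval": True},
                {"id": 2, "needs_approval": False}]

    def approve_invoice(self, item_id):
        self.approved.append(item_id)

def run_daily_pass(app):
    """One daily pass: read the list, act only when the condition holds."""
    for item in app.read_calendar():
        if item["needs_approval"]:            # the "x == true" check
            action = "approve_invoice"
            assert action in ALLOWED_ACTIONS  # nothing outside the allow-list
            app.approve_invoice(item["id"])
    return app.approved

client = FakeAppClient()
print(run_daily_pass(client))  # [1]
```

Because the agent never reads content from outside the one trusted app, there is simply no channel for an injected prompt to arrive through.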

acid__ · yesterday at 10:59 PM

The 8% and 50% numbers are pretty concerning, but I’d add that those figures were for the “computer use” environment, which still seems to be an emerging use case. The coding environment is at a much more reassuring 0.0% (with extended thinking).

general_reveal · yesterday at 9:36 PM

If the world becomes dependent on computer use, then the AI buildout will be more than validated. That will require all that compute.

wat10000 · yesterday at 9:58 PM

It's very simple: prompt injection is a completely unsolved problem. As things currently stand, the only fix is to avoid the lethal trifecta: access to private data, exposure to untrusted content, and the ability to communicate externally.

Unfortunately, people really, really want to do things involving the lethal trifecta. They want to be able to give a bot control over a computer with the ability to read and send emails on their behalf. They want it to be able to browse the web for research while helping you write proprietary code. But you can't safely do that. So if you're a massively overvalued AI company, what do you do?

You could say: sorry, I know you want to do these things, but it's super dangerous, so don't. You could say: we'll give you these tools, but be aware that they're likely to leak all your data. But neither of those is an attractive option. So instead they just sort of pretend it's not a big deal. Prompt injection? That's OK, we train our models to be resistant to it. 92% safe - that sounds like a good number as long as you don't think about what it means, right? Please give us your money now.
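The trifecta rule can be made mechanical: refuse any agent configuration that holds all three capabilities at once. A minimal sketch, assuming the three-capability framing above; the capability names are illustrative, not from any real framework:

```python
# Hedged sketch of a "lethal trifecta" configuration check.
# Capability names are illustrative only.
TRIFECTA = {"private_data", "untrusted_content", "external_comms"}

def is_safe_config(capabilities: set) -> bool:
    """A config is refused only when it holds all three capabilities;
    any two of the three are tolerable on this model."""
    return not TRIFECTA.issubset(capabilities)

# Email assistant: reads your mail (private), ingests arbitrary inbound
# mail (untrusted), and sends replies (external) -> all three, unsafe.
print(is_safe_config({"private_data", "untrusted_content", "external_comms"}))  # False

# Research helper: browses the web and posts summaries, but is walled
# off from private data -> only two of three, safe under this rule.
print(is_safe_config({"untrusted_content", "external_comms"}))  # True
```

The catch, as the comment says, is exactly that the configurations people most want are the ones this check refuses.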

teaearlgraycold · yesterday at 10:55 PM

People keep talking about automating software engineering and programmers losing their jobs. But I see no reason that career would be one of the first to go. We still need more training data on computer use from humans, but I expect data entry and basic business processes to be the first category of office job to take a huge hit from AI. If you really can’t be employed as a software engineer, then we’ve already lost most office jobs to AI.

zozbot234 · yesterday at 9:18 PM

Isn't "computer use" just interaction with a shell-like environment, which is routine for current agents?

MattGaiser · yesterday at 9:20 PM

Does it matter?

"Security" and "performance" have long been HN buzzwords for why some practice is a problem, and the market has consistently shown that it doesn't value them that much.

bradley13 · yesterday at 9:17 PM

Does it matter? Really?

I can type awful stuff into a word processor. That's my fault, not the program's.

So if I can trick an LLM into saying awful stuff, whose fault is that? It is also just a tool...
