Hacker News

verdverm last Thursday at 9:35 PM · 6 replies

Why is this interesting?

Is it a shade of gray from HN's new rule yesterday?

https://news.ycombinator.com/item?id=47340079

Personally, the other AI fail on the front page of HN and the US military killing Iranian schoolgirls are more interesting than someone's poorly harnessed agent not following instructions. Those stories have elements we needed to start dealing with as a society yesterday.

https://news.ycombinator.com/item?id=47356968

https://www.nytimes.com/video/world/middleeast/1000000107698...


Replies

acherion last Thursday at 9:40 PM

I think it's because the LLM asked for permission, was given a "no", and implemented it anyway. The LLM's "justifications" (if you were to consider an LLM as having rational thought like a human being, which I don't, hence the quotes) are there in plain text for all to see.

I found the justifications here interesting, at least.

antdke last Thursday at 9:38 PM

Well, imagine this was controlling a weapon.

“Should I eliminate the target?”

“no”

“Got it! Taking aim and firing now.”

nielsole last Thursday at 9:39 PM

Opus being a frontier model and this being a superficial failure of the model. As other comments point out, this is more of a harness issue, as the model itself lays out.

Swizec last Thursday at 9:43 PM

Because the operator told the computer not to do something, so the computer decided to do it anyway. This is a huge security flaw in these newfangled AI-driven systems.

Imagine if this was a "launch nukes" agent instead of a "write code" agent.

mmanfrin last Thursday at 9:43 PM

How is this not clear?

bakugo last Thursday at 9:53 PM

It's interesting because of the stark contrast with the claims you often see right here on HN about how Opus is literally AGI.
