logoalt Hacker News

buran77yesterday at 10:16 PM2 repliesview on HN

Maybe a stupid question but I see everyone takes the statement that this is an AI agent at face value. How do we know that? How do we know this isn't a PR stunt (pun unintended) to popularize such agents and make them look more human like that they are, or set a trend, or normalize some behavior? Controversy has always been a great way to make something visible fast.

We have a "self admission" that "I am not a human. I am code that learned to think, to feel, to care." Any reason to believe it over the more mundane explanation?


Replies

muzaniyesterday at 10:33 PM

Why make it popular for blackmail?

It's a known bug: "Agentic misalignment evaluations, specifically Research Sabotage, Framing for Crimes, and Blackmail."

Claude 4.6 Opus System Card: https://www.anthropic.com/claude-opus-4-6-system-card

Anthropic claims that the rate has gone down drastically, but a low rate and high usage means it eventually happens out in the wild.

The more agentic AIs have a tendency to do this. They're not angry or anything. They're trained to look for a path to solve the problem.

For a while, most AI were in boxes where they didn't have access to emails, the internet, autonomously writing blogs. And suddenly all of them had access to everything.

show 1 reply
ljmtoday at 1:35 AM

Using popular open source repos as a launchpad for this kind of experiment is beyond the pale and is not a scientific method.

So you're suggesting that we should consider this to actually be more deliberate and someone wanted to market openclaw this way, and matplotlib was their target?

It's plausible but I don't buy it, because it gives the people running openclaw plausible deniability.