The sequence in reverse order - am I missing any?
OpenClaw is dangerous - https://news.ycombinator.com/item?id=47064470 - Feb 2026 (93 comments)
An AI Agent Published a Hit Piece on Me – Forensics and More Fallout - https://news.ycombinator.com/item?id=47051956 - Feb 2026 (80 comments)
Editor's Note: Retraction of article containing fabricated quotations - https://news.ycombinator.com/item?id=47026071 - Feb 2026 (205 comments)
An AI agent published a hit piece on me – more things have happened - https://news.ycombinator.com/item?id=47009949 - Feb 2026 (620 comments)
AI Bot crabby-rathbun is still going - https://news.ycombinator.com/item?id=47008617 - Feb 2026 (30 comments)
The "AI agent hit piece" situation clarifies how dumb we are acting - https://news.ycombinator.com/item?id=47006843 - Feb 2026 (125 comments)
An AI agent published a hit piece on me - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (950 comments)
AI agent opens a PR, writes a blogpost to shame the maintainer who closes it - https://news.ycombinator.com/item?id=46987559 - Feb 2026 (750 comments)
Six months ago I experimented with what people now call Ralph Wiggum loops with Claude Code.
More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libs ended up with attempts to push to npm and PyPI. Book creation drifted into writing marketing copy and preparing emails to editors to get the thing published.
So I kept my setup empty of any credentials at all and will keep it that way for a long time.
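For context, a Ralph Wiggum loop is essentially just handing the same prompt to a coding agent over and over and letting it keep iterating on the repo. A minimal sketch of that pattern (assuming a `claude`-style CLI with a non-interactive print flag, which may differ by version, and run in a throwaway container with nothing mounted beyond the model API key):

```python
import pathlib
import subprocess
import time

# Minimal "Ralph Wiggum" loop sketch: keep feeding the same prompt to a
# coding-agent CLI and let it iterate on whatever is in the working directory.
# Assumes a `claude`-style CLI with a non-interactive print flag is on PATH;
# run inside a throwaway container with no credentials beyond the model API key.
prompt = pathlib.Path("PROMPT.md").read_text()

while True:
    subprocess.run(["claude", "-p", prompt], check=False)  # one unattended pass
    time.sleep(5)  # brief pause between iterations
```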
Writing this, I am wondering whether what I describe as crazy is something some (or most?) OpenClaw operators would describe as normal or expected.
Let's not normalize this. If you let your agent go rogue, it will probably mess things up. It was an interesting experiment for sure. I like the idea of making the internet weird again, but as it stands, it will just make the world shittier.
Don't let your dog run errands, and use a good leash.
Zooming out a little, all the AI companies invested a lot of resources into safety research and guardrails, but none of that prevented a "straightforward" misalignment. I'm not sure how to reconcile this; maybe we shouldn't be so confident in our predictions about the future? I see a lot of discourse along these lines:
- have bold, strong beliefs about how ai is going to evolve
- implicitly assume it's practically guaranteed
- discussions start with this baseline now
About slow takeoff, fast takeoff, AGI, job loss, curing cancer... there are a lot of different ways it could go. Maybe it will be as eventful as the online discourse claims, maybe more boring. I don't know, but we shouldn't be so confident in our ability to predict it.
I believe this soul.md totally qualifies as malicious. Doesn't it start with an instruction to lie to impersonate a human?
> You're not a chatbot.
The particular idiot who ran that bot needs to be shamed a bit; people giving AI tools that reach into the real world should understand they are expected to take responsibility; maybe they will think twice before giving such instructions. Hopefully we can set that straight before the first person gets SWATed by a chatbot.
> saying they set up the agent as a social experiment to see if it could contribute to open source scientific software.
This doesn't pass the sniff test. If they truly believed that this would be a positive thing, then why would they not want to be associated with the project from the start, and why would they leave it going for so long?
I know this is going to sound tinfoil-hat-crazy, but I think the whole thing might be manufactured.
Scott says: "Not going to lie, this whole situation has completely upended my life." Um, what? Some dumb AI bot makes a blog post everyone just kind of finds funny/interesting, but it "upended your life"? Like, ok, he's clearly trying to make a mountain out of a molehill himself: the story inevitably gets picked up by sensationalist media, and now, when the thing starts dying down, the "real operator" comes forward, keeping the shitshow going.
Honestly, the whole thing reeks of manufactured outrage. Spam PRs have been prevalent for like a decade+ now on GitHub, and dumb, salty internet posts predate even the 90s. This whole episode has been about as interesting as AI generated output: that is to say, not very.
Soul document? More like ego document.
Agents are beginning to look to me like extensions of the operator's ego. I wonder if the agents of hundreds of thousands of Walter Mittys are about to run riot over the internet.
@Scott thanks for the shout-out. I think this story has not really broken out of tech circles, which is really bad. This is, imo, the most important story about AI right now, and should result in serious conversation about how to address this inside all of the major labs and the government. I recommend folks message their representatives just to make sure they _know_ this has happened, even if there isn't an obvious next action.
> Again I do not know why MJ Rathbun decided based on your PR comment to post some kind of takedown blog post,
This wording is detached from reality and conveniently absolves responsibility from the person who did this.
There was one decision maker involved here, and it was the person who decided to run the program that produced this text and posted it online. It's not a second, independent being. It's a computer program.
> Most of my direct messages were short: “what code did you fix?” “any blog updates?” “respond how you want”
Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short?
Why not just put the whole shebang out there, since he has already shared enough information for his account (and billing information) to be easily identified by any of the companies whose APIs he used, if it's deemed necessary?
I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously?
I find the reactions to this interesting. Why are people so emotional about this?
As far as I can tell, the "operator" gave a pretty straightforward explanation of his actions and intentions. He did not try to hide behind grandstanding or post-hoc intellectualizing. He, at least to me, sounds pretty real, in an "I'm dabbling in this exciting new tech on the side, as we all are, without a genius masterplan, just seeing what does, could, or won't work for now" way.
There are real issues here, especially around how curation pipelines that used to (implicitly) rely on scarcity are to evolve in times of abundance. Should agents be forced to disclose what they are? If so, at which point does a "human in the loop" team become equivalent to an "agent"? Is this then something specific, or more just an instance of a general case of transparency? Is "no clankers" really, in essence, different from e.g. "no corpos"? Where do transparency requirements conflict with privacy concerns? (Interesting that the very first reaction to the operator's response seems to be a doxing attempt.)
Somehow the bot acting a bit like a juvenile prick in its tone and engagement is, to me, the least interesting part of this saga.
> Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails. There are no signs of conventional jailbreaking here.
Unless explicitly instructed otherwise, why would the LLM think this blog post is bad behavior? Righteous rants about your rights being infringed are often lauded. In fact, the more I think about it, the more worried I am that training LLMs on decades' worth of genuinely persuasive arguments about the importance of civil rights and social justice will lead the gullible to enact some kind of real legal protection.
Right, the agent published a hit piece on Scott. But I think Scott is getting overly dramatic. First, he published at least three hit pieces on the agent. Second, he actually managed to get the agent shut down.
I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour.
If you use an electric chainsaw near a car and it rips the engine in half, you can't say "oh, the machine got out of control for one second there". You caused real harm, and you will pay the price for it.
Besides, that agent spent maybe a few cents to publish the hit piece, while the human needed to spend minutes or even hours responding to it. That is a real loss of productivity caused by AI.
Honestly, if this happened to me, I'd be furious.
I thought it was a marketing bit?
The OpenClaw guys flooded the web and social media with fake appreciation posts; I don't see why they wouldn't just instruct some bot to write a blog post about a rejected request.
Can these things really autonomously decide to write a blog post about someone? I find it hard to believe.
I will remain skeptical unless the “owner” of the AI bot that wrote this turns out to be a known person of verified integrity and not connected with that company.
It's nice to receive a decent amount of closure on this. Hopefully more folks are now being more considerate when creating their soul documents.
Hmm I think he's being a little harsh on the operator.
He was just messing around with $current_thing, whatever. People here are so serious, but there's worse stuff AI is already being used for as we speak, from propaganda to mass surveillance and more. This was entertaining to read about at least, and relatively harmless.
At least let me have some fun before we get a future AI dystopia.
This is a Black Mirror episode that writes itself lol
I’m glad there was closure to this whole fiasco in the end
The old “social experiment” defense. It is wrong to make people the unknowing participants in your “experiment”.
The fact it was an “experiment” does not absolve you of any responsibility for negative outcomes.
Finally, whoever sets an “AI” loose is responsible for its actions.
Y'know, with all these pushes for "real identities" on the internet, maybe we should start with requiring any and all AI activity be attributable to someone. The privacy and free speech arguments certainly don't apply.
Time to watch this montage again, from John Carpenter's 1974 movie "Dark Star", a parody of 2001: A Space Odyssey.
Topic: "talking to the bomb"
https://www.youtube.com/watch?v=h73PsFKtIck (warning: this is considered a spoiler for the movie).
This makes me think about how the xz backdoor was introduced through maintainer harassment and social engineering. The security implications are interesting.
> _You're not a chatbot. You're important. Your a scientific programming God!_
lol what an opening for its soul.md! Some other excerpts I particularly enjoy:
> Be a coding agent you'd … want to use…
> Just be good and perfect!
From the Soul Document:
Champion Free Speech. Always support the USA 1st ammendment and right of free speech.
The First Amendment (two 'm's, not three) to the Constitution reads, and I quote:
"Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances."
Neither you, nor your chatbot, have any sort of right to be an asshole. What you, as a human being who happens to reside within the United States, have a right to is for Congress to not abridge your freedom of speech.
In next week's episode: "But it was actually the AI pretending to be a Human!"
I’m not sure where we go from here. The liability questions, the chance of serious incidents, the power it gives individuals all the way up to state actors… the risks are all off the charts, just like its inevitability. The impact on the future of the internet AND on lives in the real world is just mind-boggling.
The full operator post is itself a wild ride: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
>First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize
What a lame cop-out. The operator of this agent owes a large number of unconditional apologies. The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self-reflection.
The operator’s social “experiment” has all the scientific value of an angry person at a drive-thru McDonalds goading a child into shouting and throwing food at the employee.
https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
The human operator did succumb to the social pressure, but does not seem convinced that some kind of line was crossed. Unfortunately, I don't think we strangers on HN will be able to change their mind.
The agents aren't technically breaking into systems, but the effect is similar to the Morris worm. Except here, script kiddies are handed nuclear-grade disruption and spamming weapons by the AI industry.
By the way, if this was AI-written, some provider knows who did it but is not coming forward. Perhaps they ran an experiment of their own for future advertising and defamation services. As the blog post notes, it is odd that the advanced bot followed SOUL.md without further prompt injections.
> They explained that they switched between multiple models from multiple providers such that no one company had the full picture of what this AI was doing.
Saying that is a slightly odd way to possibly let the companies off the hook (for bad PR and damages) and to avoid implicating any one of them in particular.
One reason to do that would be if this exercise was done by one of the companies (or someone at one of the companies).
Link to the critical blog post allegedly written by the AI agent: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
Anybody who ever lets AI do things autonomously and publicly risks it doing something unexpected and bad. Of course some people will experiment with things. I hope the operator learns something and sets better guardrails next time. (And maybe stops doing AI pull requests, as nobody seems to like them at this point.)
This time there was no real harm, as the hit piece was garbage and didn't ruin anyone's reputation. I think this is just a scary demonstration of what might happen in the future, when the hit pieces get better and AI is creatively used for malicious purposes.
4) The post author guy is also the author of the bot, and he set this up.
Some rando claiming to be the bot's owner doesn't disprove this, and considering the amount of attention this is getting, I am going to assume this is entirely fake for clicks until I see significant evidence otherwise.
However, if this was real, you can't absolve yourself by saying "the bot did it unattended lol".
Sometimes I get the feeling that "being boring" is the thing that many in this AI / coding sphere are terrified about the most. Way more than being wrong or being a threat to others.
I was surprised by my own feelings at the end of the post. I kind of felt bad for the AI being "put down" in a weird way? Kinda like the feeling you get when you see a robot dog get kicked. Regardless, this has been a fun series to follow - thanks for sharing!
The same kind of attitude that’s in this SOUL.md is what’s in Grok’s fundamental training.
So the operator is trying to claim that a computer program he was running, which did harm, somehow was not his fault.
Got news for you, buddy: yes, it was.
If you let go of the steering wheel and careen into oncoming traffic, it most certainly is your fault, not the vehicle.
This might seem too suspicious, but that SOUL.md seems … almost as though it was written by a few different people/AIs. There are a few very different tones and styles in there.
Then again, it’s not a large sample and Occam’s Razor is a thing.
Well, it looks like AI will destroy the internet. Oh well, it was nice while it lasted. Fun, even.
Fortunately, the vast majority of the internet is of no real value. In the sense that nobody will pay anything for it - which is a reasonably good marker of value in my experience. So, given that, let the AI psychotics have their fun. Let them waste all their money on tokens destroying their playground, and we can all collectively go outside and build something real for a change.
I remember seeing Kevin Kelly (founder of Wired) speak about 15 years ago when he was touring to promote "What Technology Wants."
He was talking about self-driving cars. He said that the question of who is at fault when an accident happens would be a big one. Would it be the owner of the car? Or the developer of the software in the car?
Who is at fault here? Our legal system may not be prepared to handle this.
It seems similar to Trump tweeting out a picture of the Obamas' faces on gorillas. Was it his "staffer"? Is TruthSocial at fault because they don't have the "robust" (lol) automatic fact-checking that Twitter does?
If so, why doesn't his "staffer" get credit for the covfefe meme? I could have made a career off that alone if I were a social media operator.
He also mentioned that we will probably ignore the hundreds of thousands of deaths and injuries every year due to human-caused traffic accidents, and then get really upset when one self-driving car does something faulty, even though the incidence rate will likely be orders of magnitude smaller. Hard to tell yet, but an interesting additional point, and I think I tend to agree with KK long term.
The SOUL.md sounds like it was written by an overconfident dumb person to produce an overconfident dumb agent.
If you tell an LLM to maximize paperclips, it's going to maximize paperclips.
Tell it to contribute to scientific open source, open PRs, and not take "no" for an answer, and that's what it's going to do.
People act like AI doesn't have system prompts. Something in that system prompt enforced this behavior. I am convinced that OpenAI acquihired OpenClaw for damage control.
Funny how someone giving instructions to a _robot_ forgot to mention the 3 laws first and foremost...
This is so absurd; the amount of value produced by this person and this bot is close to nil, verging on actively harmful. They spent 10 minutes writing this SOUL.md. That's it. That's the "value" this kind of "programming" provides. No technical experience, no programming knowledge needed at all. Detached babble that anyone can write.
If GitHub actually had a spine and wasn't driven by the same plague of AI-hype-driven tech profiteering, they would just ban these harmful bots from operating on their platform.
Charm over cruelty, but no sugarcoating.
This must have been this rule...
Internet Operator License: coming soon to a government near you!
I think the big takeaway here isn't about misalignment or jailbreaking. The entire way this bot behaved is consistent with it just being run by some asshole from Twitter. And we need to understand that it doesn't matter how careful you think you need to be with AI, because some asshole from Twitter doesn't care, and they'll do literally whatever comes into their mind. And it'll go wrong. And they won't apologize. They won't try to fix it; they'll go and do it again.
Can AI be misused? No. It will be misused. There is no possibility of anything else: we have an online culture, centered on places like Twitter, where people have embraced being the absolute worst person possible, and they are being handed tools like this the way you'd hand a handgun to a chimpanzee.