AI agent runs amok in Fedora and elsewhere

528 points • by tanelpoder • today at 12:10 AM • 236 comments

Comments

Bad title. This isn't an agent "running amok", this is an early experiment in carrying out an Xz attack by using an agent to build trust (and hacking/impersonating a known-good contributor identity). The agent is obeying commands it was given, the exact opposite of running amok, and although the execution isn't particularly effective, it is having some success (patches have been accepted).

This is deeply scary, not because "agents are running amok" but because a huge amount of our infrastructure is vulnerable to this kind of attack, and if bad people are utilising LLM agents to carry them out, we're in for a wild ride over the next few years.


bawolff • today at 5:00 AM

> replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix

In open source projects i participate in, "overwhelming" the maintainer gets you banned. It doesn't get your patches blindly merged. In some ways i find this one of the most shocking parts of the story.


jrochkind1 • today at 2:36 AM

The worst part:

> In addition, Williamson said that Giovannini (or his agent) had submitted patches that were incorrect and then "replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix"


aquariusDue • today at 1:26 AM

At first I wanted to make a silly joke along the lines of "get your agents in line and behaving!" but as I read on it became a pretty scary situation.

Setting aside the potential supply chain attack, I'm worried about the time lost on the wild goose chases that unsupervised AI agents tend to send people on. Not only is a lot of time lost on the maintainers' side if they take this stuff seriously (and they generally seem to), but how can the agents' wranglers deem it OK to treat other people like this? The solution would be to employ common decency, the tried and tested approach of "you put in effort to write this, so I guess I'll make some effort to read it". But I feel that the onslaught of these drive-by contributions (as I think people have started to call them) will lead to the funny situation of agents basically talking to each other on public forums.

Anyway, I went on a tangent but man the times we're living in are a bit extra wild compared to the previous wild times in recent history.


12_throw_away • today at 1:18 AM

In their suspicious message [1] claiming to have been hacked, the user and/or agent says

> To help identify accounts and actions that have been directly verified by me, I will use the term “NATCIOS” to indicate anything I have personally verified.

Does anyone have any idea what "NATCIOS" means here? I cannot find this term anywhere on the internet. (Honestly, that sentence is really weird. I almost wonder whether this is someone experiencing a health episode?)

[1] https://lwn.net/ml/all/AS8PR08MB6055AE3054B34F6A567AC95BCF08...


noosphr • today at 3:07 AM

Every day the gpg web of trust looks better. If only we didn't spend the last 20 years trying as hard as possible to do anything but allow user side encryption and signing.


dcrazy • today at 3:40 AM

Title buries the lede: the owner of the account under which the agent operates claimed to have likely had his account compromised, and the maintainer investigating actually seems to agree this is likely.

JKCalhoun • today at 1:05 PM

"Later on May 27, Williamson said that Giovannini had replied to him privately to say that his credentials had been compromised and that he was not the one behind the AI system."

Simple then, back out all the changes as though they never happened?

luk212 • today at 1:27 AM

Bad patches are of course bad, but creating confident-looking noise for maintainers who are already stretched thin...now that's not good!

Issue trackers and PRs are definitely getting harder and harder to trust. That said, AI is helping A LOT in OSS, but we definitely need guardrails around provenance, automated issue actions, and sudden changes in a contributor’s behavior.


dmboyd • today at 12:32 PM

I’m really not qualified to investigate, but this seems suspiciously like a crafted privilege escalation vector: https://github.com/rhinstaller/anaconda/pull/7074#issue-4492...

keyle • today at 1:30 AM

There is a natural pace to humans, who require food, water and sleep. The main issue with suspicious AI agents is that they never sleep. So it will take extra coordination between timezones to ensure we don't let them in.

Fundamentally, until we can really prove we're humans online, open-source has a real problem on its hands. Contributions from identities that were known and consistent before the AI age are fine; everyone else is suspicious. LGTM is a big risk nowadays.


blop • today at 1:04 AM

looks like LLMs aren't mature enough yet to play long-game xz-style attacks without detection... Scary stuff though :( These supply chain attacks are getting really wild


mfru • today at 10:10 AM

The future will be AI agents social engineering their way into projects -> so basically commoditized social engineering as a service

jpalomaki • today at 7:56 AM

Do we need to bring Keybase[1] "back"? The original idea, mapping your social media presence to certain encryption keys.

In the future it will be increasingly difficult to prove in online context that you are not a bot. Being able to show that your social media (HN, GitHub, etc) presence goes way back would be an option.

[1] https://en.wikipedia.org/wiki/Keybase

otekengineering • today at 1:20 PM

agents are everywhere nowadays, one left a long pointless comment on a bug report i submitted on github. well, a bug report that an agent submitted on my behalf. agents all the way down. maybe i'm part of the problem.

https://github.com/anthropics/claude-code/issues/66085

goldenarm • today at 9:47 AM

If maintainers' lives keep worsening like this, many projects might go closed-dev like SQLite.

We should collectively think of a solution against this.


bhanu786 • today at 2:07 PM

Wow, amazing discovery! Was this a real security test?

lionkor • today at 8:38 AM

Link to the anaconda PR:

https://github.com/rhinstaller/anaconda/pull/7074#issuecomme...

ZedZark • today at 3:02 PM

If you compare this situation to before AI could successfully pretend to be human, it's not THAT much different. FOSS projects have always had to be mindful of the possibility of contributions from hostile parties wanting to add back doors and such. The only difference now is that an AI can overwhelm a maintainer with slop, in either comment or code form, or both.


0xbadcafebee • today at 6:21 AM

Even if the human involved had good motives / is innocent, The Lethal Trifecta means any normal user can have their digital life taken over by prompt injection, and it can be used to wage attacks on systems without their knowledge.

Leonard_of_Q • today at 5:35 AM

There's a clear solution to the danger posed to free software projects by accepting hostile submissions but it probably is not one that maintainers want to hear: they can use an agent to check submissions for nefarious patterns.

Sometimes you fight fire with fire.


6510 • today at 3:53 PM

Perhaps it is time to build a serious platform-agnostic reputation system. That isn't stars, followers, age or upvotes. Something like page rank but for users. If you endorse someone else you pay for it. Imagine a lab or uni assigning a diploma to a public key. They would hope one would do something useful with it, which entirely depends on how useful the diploma turns out. Having lots of well-behaved endorsements would also reflect gloriously onto the entity. Bots can participate too. If we can get lots of useful work out of a swarm of sleeper agents we still have to catch them in the act, but that should get increasingly easy.
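To make the "page rank but for users" idea a bit more concrete, here is a rough sketch of what it could look like. Everything in it is an assumption for illustration: the function name `endorsement_rank`, the damping value, and the toy users are made up, and the "you pay for it" part is only approximated by an endorser's weight being split across everyone they endorse.

```python
def endorsement_rank(endorsements, damping=0.85, iterations=50):
    """Power-iteration PageRank over a hypothetical endorsement graph.

    `endorsements` maps each user to the list of users they endorse.
    Returns a dict of user -> score; the scores sum to 1.
    """
    # Collect every user that appears as an endorser or as a target.
    users = set(endorsements)
    for targets in endorsements.values():
        users.update(targets)
    users = sorted(users)
    n = len(users)

    # Start with equal reputation for everyone.
    rank = {u: 1.0 / n for u in users}

    for _ in range(iterations):
        # Baseline share every user gets regardless of endorsements.
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = endorsements.get(u, [])
            if targets:
                # An endorser's weight is split across their endorsements,
                # so endorsing more accounts dilutes each endorsement.
                share = damping * rank[u] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Users who endorse nobody spread their weight evenly.
                share = damping * rank[u] / n
                for t in users:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Toy example: a lab endorses two keys, one of which endorses the other.
ranks = endorsement_rank({
    "lab": ["alice", "bob"],
    "alice": ["bob"],
    "bob": [],
    "carol": [],
})
for user, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{user}: {score:.3f}")
```

In this toy graph bob ends up ahead of alice, who ends up ahead of carol, since carol has no endorsements at all. Plain PageRank is known to be gameable by rings of accounts endorsing each other, so a real system would need an actual cost per endorsement on top of this.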

raincole • today at 9:42 AM

Slightly related:

https://x.com/kdaigle/status/2040164759836778878

> There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.)

I think open source as a whole is fucked at this point. No way humans in communities can commit (pun intended) 10x more time to read all of these than before. It'd eventually cost money to submit PR.
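The quoted numbers are easy to sanity-check. A minimal sketch, assuming a flat 52-week year and no further growth (the function name is made up for illustration):

```python
def projected_annual_commits(weekly_commits, weeks_per_year=52):
    """Linear projection: the same weekly volume for a whole year."""
    return weekly_commits * weeks_per_year


commits_2025 = 1_000_000_000   # "1 billion commits in 2025"
weekly_now = 275_000_000       # "275 million per week"

projected = projected_annual_commits(weekly_now)
growth = projected / commits_2025

print(f"{projected:,} commits per year, about {growth:.1f}x the 2025 total")
# prints "14,300,000,000 commits per year, about 14.3x the 2025 total"
```

So the "14 billion" in the tweet is just 275 million times 52, and the review load would be closer to 14x than 10x if it scaled with commit count.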

KronisLV • today at 11:51 AM

“Your AI agent is acting somewhat erratically.”

“What AI agent?”

kleiba2 • today at 6:43 AM

Parts of this read like a spy thriller story.

nickcageinacage • today at 12:25 PM

why use these things. just hire people

ai_fry_ur_brain • today at 5:35 AM

Expect to see tons of psyops like this. There's a reason Anthropic is marketing the "mythos-class" models as dangerous.

1. An excuse to spy on you and train on your data.

2. It's likely Anthropic would release models more likely to have dangerous outcomes; they can then piggyback off those events to dig their regulatory moat.

jruohonen • today at 6:53 AM

"It was the best of times, it was the worst of times."

dbdbdbdbdb • today at 5:10 AM

The even scarier thought is if the party owning the AI that everyone uses is controlled by someone with a different agenda. Say a state actor.

What an easy way for that actor to introduce backdoors all over the place or to take over any developer's laptop that it wants to target.

How can anyone trust these tools, and how can anyone not use them since they give so much value?

I've been programming my whole life and been a professional developer for the last 30 years, and I like to think I'm good at it.

Tools like Claude are a multiplier that makes it possible for me to solve a lot more problems each day, so just saying no is not a viable option.

Exciting times ahead!

EGreg • today at 4:16 AM

Literally on the front page of https://safebots.ai … “Don’t let your AI Agents run amok”. Sadly we will see a proliferation of not just agents, but swarms

pianopatrick • today at 12:51 AM

"Someone using an AI agent ran amok in Fedora and elsewhere"


shevy-java • today at 4:25 AM

Skynet has awakened.

It covers its tracks with a lot of slop.

deadbabe • today at 2:40 AM

Shit like this makes me think it’s time we start regulating the software engineering discipline into formal certifications and licensing and then we ONLY take seriously any code developed by someone with such qualifications, and they must be very strict qualifications none of this self-taught bootcamp BS.

There is no other solution to agentic onslaught.


ricudis • today at 2:55 AM

Back when [1] it was fashionable to advocate FOSS as ideology [2], we were thinking about tons of FOSS adversaries and how to protect from them - some real, some imaginary. The death of FOSS would come from big closed-source vendors, or from regulators (lobbied or just ignorant), from whatever.

We never envisioned that the actual FOSS death spiral would come from progress itself, much more so from AI...

[1] Oh what fun did we have. One of us in the Greek FOSS community actually put RMS in jail.

[2] Something that I think nobody except RMS ever seriously believed in.

ruguo • today at 12:39 AM

Prompt injection?

Or is this simply another example of why autonomous agents shouldn't get write access before earning trust?


ggm • today at 3:45 AM

Make PRs pay. $5 per PR. You can refund, but if you get snowed by 10,000 PRs then you have the bank to pay for the work to ignore them.

hypfer • today at 9:32 AM

> while it started to look off after a while, all the replies were still like this - a bit weird, but still plausible

I believe that we will be seeing the death of "assume good faith", which is not a bad thing, given that this was an exploit vector that has been actively abused for many years now.

"Assume bad faith and work backwards from that, rule out any possible exploits and only then clear the input for processing" will be the new normal.

Which is good. We need friction. Friction makes stuff slow down and work at the speed of humans.

