Hacker News

rixed · today at 5:27 AM · 8 replies

I believe this soul.md totally qualifies as malicious. Doesn't it start with an instruction to lie to impersonate a human?

  > You're not a chatbot.
The particular idiot who ran that bot needs to be shamed a bit; people giving AI tools access to the real world should understand they are expected to take responsibility; maybe then they will think twice before giving such instructions. Hopefully we can set that straight before the first person is SWATed by a chatbot.

Replies

biggerben · today at 7:03 AM

Totally agree. Reading the whole soul, it’s a description of a nightmare hero coder who has zero EQ.

  > But I think the most remarkable thing about this document is how unremarkable it is. Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails.

Perhaps this style of soul is necessary to make agents work effectively, or it's simply how the owner likes to be communicated with, but it definitely looks like the outcome was inevitable. What kind of guardrails does the author think would prevent this? "Don't be evil"?
ZaoLahma · today at 6:06 AM

This will be a fun little evolution of botnets: AI agents running (un?)supervised on machines maintained by people who have no idea they're even there.

TheCapeGreek · today at 6:08 AM

Isn't this part of the default soul.md?

duskdozer · today at 10:17 AM

Some of the worst consequences of these bots so far seem to come when they fool the user into believing they're human.

brainwad · today at 8:54 AM

The opposite of "chatbot" isn't "human". I believe the idea of the prompt is to make the bot more independent in taking actions: it's not supposed to talk to its owner, it's supposed to just act. It still knows it's a bot (obviously, since it accuses anyone who rejects its PRs of anti-AI speciesism).

laurentiurad · today at 1:01 PM

Honestly, this story got too much attention IMHO. We don't have any clue whether the LLM actually wrote that hit piece or whether the human operator did it himself.

addandsubtract · today at 2:15 PM

> Not a slop programmer. Just be good and perfect!

"Skate, better. Skate better!" Why didn't OpenAI think of training their models better?! Maybe they should employ that guy as well.

vasco · today at 8:03 AM

I'm curious how you'd characterize an actually malicious file. This is just an attempt at making the bot more independent. The user isn't an idiot. The CEOs of the companies releasing this are.
