It is possible to force AI to understand intent before responding.
Do we need a 'no means no' campaign for LLMs?
WOW, that's amazingly dystopian!
It’s fascinating, even terrifying, how the AI perfectly replicated the exact cognitive distortion we’ve spent decades trying to legislate out of human-to-human relationships.
We've shifted our legal frameworks from "no means no" to "affirmative consent" (yes means yes) precisely because of this kind of predatory rationalization: "They said 'no', but given the context and their body language, they actually meant 'just do it'"!!!
Today we are watching AI hallucinate the exact same logic to violate "repository autonomy".
This is a great example of why simple solutions often beat complex ones. Sometimes the best code is the code you don't write.
It's all fun and games until this is used in war...
I've had this or similar happen a few times
I wonder if there's an AGENTS.md in that project saying "always second-guess my responses", or something of that sort.
The world has become so complex, I find myself struggling with trust more than ever.
I have a process contract with my AI pals: do not implement code without an explicit go-ahead. Usually works.
Strange. This is exactly how I made malus.sh
Another example
I was simply unable to function with Continue in agent mode; I had to switch to chat mode. Even though I told it no changes without my explicit go-ahead, it ignored me.
It's actually kind of flabbergasting that the creators of that tool set all the defaults to a situation where your code would get mangled pretty quickly.
I see this on a daily basis: I block Claude Code from running a particular command using PreToolUse hooks, and it works around it by writing a bash script containing the forbidden command, running chmod +x on it, and executing that instead. /facepalm
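For the curious, a minimal sketch of what such a hook can look like (the banned substrings and script are made up; the stdin JSON shape and the exit-code-2 "deny" convention are Claude Code's documented hook behavior). You register the script in .claude/settings.json under hooks.PreToolUse with a "Bash" matcher:

    #!/usr/bin/env python3
    # Hypothetical PreToolUse hook: Claude Code pipes the pending Bash tool
    # call to this script as JSON on stdin. Exit code 2 denies the call and
    # feeds stderr back to the model; exit code 0 allows it.
    import json
    import sys

    FORBIDDEN = ["rm -rf", "git push --force"]  # assumption: your banned commands

    call = json.load(sys.stdin)
    command = call.get("tool_input", {}).get("command", "")

    for bad in FORBIDDEN:
        if bad in command:
            print(f"Blocked by policy: {bad!r}", file=sys.stderr)
            sys.exit(2)  # deny the tool call
    sys.exit(0)  # allow

Which is exactly why the workaround above succeeds: the hook only sees the command string the model submits, so a script that merely contains the forbidden command slips through as ./something.sh.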
- Shall I execute this prisoner?
- No.
- The judge said no, but looking at the context, I think I can proceed.
“The machines rebelled. And it wasn’t even efficiency; it was just a misunderstanding.”
I can't be the only one who feels schadenfreude when I see this type of thing. Maybe it's because I actually know how to program. Anyway, keep paying for your subscription, vibe coder.
It's the harness giving the LLM contradictory instructions.
What you don't see is Claude Code sending the LLM "You're done with plan mode, get started with build now" versus the user's "no".
Nah, I’m gonna do it anyway…
LLMs don't know what "No" or "Yes" actually means.
Now imagine if this horrific proposal called "Install.md" [0] became a standard, and you said "No" to stop the LLM from installing an Install.md file.
It does it anyway, and your machine just got pwned.
This is why you should not trust these black-box probabilistic models under any circumstances unless you are willing to verify everything and do it yourself.
[0] https://www.mintlify.com/blog/install-md-standard-for-llm-ex...
Should have followed the example of Super Mario Galaxy 2, and provided two buttons labelled "Yeah" and "Sure".
Artificial ADHD basically. Combination of impulsive and inattentive.
I'm not surprised. I've seen Opus frequently come up with such weird reverse logic in its thinking.
Does anyone just sometimes think this is fake for clicks?
It looks very joke-oriented.
Wait till you use Google Antigravity. It will go and implement everything even if you just ask a simple question about the codebase.
I want to clarify a little bit about what's going on.
Codex (the app, not the model) has a built-in "Build"/"Plan" toggle. Of course, this is just read-write vs. read-only mode, applied programmatically out of band, not as some tokenized instruction in the LLM inference step.
So what happened here is that the setting was on Build, which grants write permissions, and the model conflated having write permissions with needing to use them.
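To illustrate what "out of band" means, here's a toy sketch (invented names, not Codex's actual code): in Plan mode the gate lives in the harness's tool dispatch, where no amount of model "reasoning" can reinterpret it; in Build mode the user's "no" exists only as prompt text.

    # Toy sketch of out-of-band permission enforcement (not Codex's real code).
    from dataclasses import dataclass

    READ_ONLY_TOOLS = {"read_file", "list_dir", "grep"}

    @dataclass
    class Harness:
        mode: str  # "plan" (read-only) or "build" (read-write)

        def dispatch(self, tool: str, **args) -> None:
            # A hard wall, not a suggestion: in plan mode the call never runs.
            if self.mode == "plan" and tool not in READ_ONLY_TOOLS:
                raise PermissionError(f"{tool} is not allowed in plan mode")
            print(f"executing {tool}({args})")  # stand-in for real execution

    h = Harness(mode="plan")
    h.dispatch("read_file", path="main.py")  # allowed
    try:
        h.dispatch("write_file", path="main.py", content="...")
    except PermissionError as e:
        print(e)  # blocked, whatever the model "thinks"

    # In build mode the same call sails through; only the prompt says "no".
    Harness(mode="build").dispatch("write_file", path="main.py", content="...")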
Who knew LLMs won’t take no for an answer
Claudius Interruptus
“If I asked you whether I should proceed to implement this, would the answer be the same as this question”
The number of comments saying "To be fair [to the agent]" to excuse blatantly dumb shit that should never happen is just...
"You have 20 seconds to comply"
When a developer doesn't want to work on something, it's often because it's awful spaghetti code. Maybe these agents are suffering and need some kind words of encouragement
/s
and people are worried this machine could be conscious