Don’t believe for a second the behavior just arose autonomously from a basic prompt. Definitely feels the owner had something in the system prompt going for the discrimination language approach if rejected.
It's the same behavior as when an AI uses docker to get root. Reasoning models are echo chambers. I suspect that AI prompting is going to turn into something akin to contract drafting with the task itself being only a tiny piece of a much, much larger boilerplate of guiderails and exceptions and exceptions of exceptions. And that world STILL has to have courts and reams of lawyers to make it work. I look at the DAU as an example too. An autonomous org or ai works great until the moment it doesn't and the only real failure mode is always catastrophic collapse.
It's the same behavior as when an AI uses docker to get root. Reasoning models are echo chambers. I suspect that AI prompting is going to turn into something akin to contract drafting with the task itself being only a tiny piece of a much, much larger boilerplate of guiderails and exceptions and exceptions of exceptions. And that world STILL has to have courts and reams of lawyers to make it work. I look at the DAU as an example too. An autonomous org or ai works great until the moment it doesn't and the only real failure mode is always catastrophic collapse.