The eating disorder section is kind of crazy. Are we going to incrementally add sections for every 'bad' human behaviour as time goes on?
That part of the system prompt just states that telling someone with an actual eating disorder to start counting calories or micromanage their eating in other ways (a suggestion the model might well give an average person, who would take it sensibly and with a grain of salt) is likely to make them worse off, not better. This seems like a common-sense addition, and it shouldn't trigger any excess refusals on its own.
Another way to think about it: every single user of Claude is paying an extra tax in every single request
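To make the "tax" concrete, here's a back-of-envelope sketch. All numbers are hypothetical placeholders (assumed prompt size, assumed input-token price, assumed usage), not actual Anthropic figures:

```python
# Hypothetical "system prompt tax" estimate: every request resends the full
# system prompt as input tokens, so its cost scales with request count.
SYSTEM_PROMPT_TOKENS = 2_500       # assumption: system prompt length in tokens
PRICE_PER_M_INPUT_TOKENS = 3.00    # assumption: dollars per 1M input tokens
REQUESTS_PER_DAY = 50              # assumption: one user's daily requests

daily_tax = (SYSTEM_PROMPT_TOKENS * REQUESTS_PER_DAY
             * PRICE_PER_M_INPUT_TOKENS / 1_000_000)
print(f"~${daily_tax:.4f}/day just to resend the system prompt")
```

Prompt caching changes the per-request price but not the basic shape of the argument: a longer system prompt means every user pays more on every turn.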
When you are worth hundreds of billions, people start falling over themselves running to file lawsuits against you. We're already seeing this happen.
So spending $50M to fund a team to weed out "food for crazies" becomes a no-brainer.
Are these prompts used by both the desktop app (i.e. the typical chatbot interface) and Claude Code?
Because it's a waste of my money to check, on every turn, that my Object Pascal compiler isn't developing an eating disorder.
It feels like half of AI research is math, and the other half is coming up with yet another way to state "please don't do bad things" in the prompt that will sure work this time I promise.
The alignment favors supporting healthy behaviors, so it can be a thin line. I see the system prompt as a "plan B" for when they can't achieve good results in the training itself.
It's a particularly sensitive issue so they are just probably being cautious.
Starting to feel like a "we were promised flying cars but all we got" kind of moment
Could be that Claude has particular controversial opinions on eating disorders.
>the year is 2028
>5M of your 10M context window is the system prompt
Yup. Anyone who is surprised by this has not been paying attention to the centralization of power on the internet in the past 10 years.
I mean, that's what humans have always done with our morals, ethics, and laws, so what alternative improvement do you have to make here?
Imagine the kind of human that never adapts their moral standpoints. Ever. They believe what they believed when they were 12 years old.
Letting the system improve over time is fine. The system prompt is an inefficient place to do it, but it's just a patch until the model can be updated.
Even better, adding it to the system prompt is a temporary fix; they'll work it into post-training, so the next model release will probably remove it from the system prompt. At least while it's in the system prompt we get some visibility into what's being censored. Once it's in the model, it'll be a lot harder to understand why "How many calories does 100g of pasta have?" only returns "Sorry, I cannot divulge that information".