They have an interesting regex for detecting negative sentiment in users' prompts, which is then logged (explicit content): https://github.com/chatgptprojects/claude-code/blob/642c7f94...
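From the snippet, the check appears to be a single case-insensitive regex over a word list. A minimal sketch of that approach (the word list, constant, and function name here are my guesses for illustration, not the actual source):

```javascript
// Hypothetical word-list frustration check; the real pattern in
// claude-code is longer, but the shape is the same.
const FRUSTRATION_RE = /\b(wtf|terrible|broken|frustrating|stupid)\b/i

function looksFrustrated(prompt) {
  return FRUSTRATION_RE.test(prompt)
}
```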
I guess these words are to be avoided...
George Carlin would be very pleased. They missed quite a few of the heavy seven though.
We used this in 2011 at the startup I worked for. 20 positive and 20 negative words was good enough to sell Twitter "sentiment analysis" to companies like Apple, Bentley, etc...
I don't know about avoided; this kind of represents the WTF-per-minute code quality measurement. When I write WTF as a response to Claude, I would actually love it if an Anthropic engineer took a look at the mess Claude has created.
Everyone is commenting that this regex is actually a master optimization move by Anthropic, when in reality it's just what their LLM coding agent came up with when some engineer told it to "log user frustration".
If this code is real and complete, then there are no callers of those methods other than a logger line.
Nice, "wtaf" doesn't match so I think I'm out of the dog house when the clanker hits AGI (probably).
I was thinking the opposite. Using those words might be the best way to provide feedback that actually gets considered.
I've been wondering if all of these companies have some system for flagging upset responses. Those cases seem like they are far more likely than average to point to weaknesses in the model and/or potentially dangerous situations.
They also have a "keep going" keyword, literally just "continue" or "keep going", just for logging.
I've been using "resume" this whole time
I guess using French words is safe for now.
That's undoubtedly to detect frustration signals, a useful metric/signal for UX. The UI equivalent is the user shaking their mouse around or clicking really fast.
I'm clearly way too polite to Claude.
Also:
// Match "continue" only if it's the entire prompt
if (lowerInput === 'continue') {
return true
}
When it runs into an error, I sometimes tell it "Continue", but sometimes I give it some extra information, or I put a period after it. That clearly doesn't give the same behaviour.
Curiously, "clanker" is not on the list.
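A more tolerant check (hypothetical, not what the quoted snippet does) would normalize case and trailing punctuation before comparing, so "Continue." would still count:

```javascript
// Sketch: would also match "Continue." or " keep going! ", unlike the
// strict equality check quoted above.
function isContinuePrompt(input) {
  const normalized = input.trim().toLowerCase().replace(/[.!]+$/, '')
  return normalized === 'continue' || normalized === 'keep going'
}
```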
That looks a bit bare-minimum: not the use of regex, but rather that it's a single line with a few dozen words. You'd think they'd have a more comprehensive list somewhere and assemble or iterate the regex checks as needed.
everyone here is commenting how odd it looks to use a regexp for sentiment analysis, but it depends what they're trying to do.
It could be used as feedback when they run an A/B test: they can compare which version of the model is getting more insults than the other. It doesn't matter if the list is exhaustive or even sane; what matters is how one version compares to the other.
Perfect? no. Good and cheap indicator? maybe.
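As a sketch of that idea (names and word list are made up), the comparison only needs a per-arm rate computed with the same regex on both sides:

```javascript
// Cheap A/B signal: fraction of prompts matching the frustration list.
// The list need not be exhaustive, only identical across both arms.
function frustrationRate(prompts, re = /\b(wtf|terrible|broken)\b/i) {
  const hits = prompts.filter(p => re.test(p)).length
  return hits / prompts.length
}
```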
oh I hope they really are paying attention. Even though I'm 100% aware that claude is a clanker, sometimes it exhibits such bizarre behavior that it triggers my lizard brain to react to it. That experience troubles me so much that I've mostly stopped using claude code. Claude won't even semi-reliably follow its own policies, sometimes even immediately after you confirm it knows about them.
Interesting that expletives and words that are more benign like "frustrating" are all classified the same.
There is no "stupid". I often write "(this is stupid|are you stupid) fix this".
And Claude had "user is frustrated" in its chain of thought, so I told it I am not frustrated, just testing prompt optimization, where acting like one is frustrated should yield better results.
Probably a lot of my prompts have been logged then. I've used wtf so many times I've lost track. But I guess Claude hasn't.
Glad the abusive words on my list are not in there. But it's surprising that they use regex for sentiment.
Hmm.. I flag things as 'broken' often and I've been asked to rate my sessions almost daily. Now I see why.
OMG WTF
Surely "so frustrating" isn't explicit content?
If anyone at anthropic is reading this and wants more logs from me, add "jfc".
so they think that everybody on earth swears only in english?
you'd better be careful wth your typos, as well
i dislike LLMs going down that road, i don't want to be punished for being mean to the clanker
> terrible
I know I used this word two days ago when I went through three rounds of an agent telling me that it fixed three things without actually changing them.
I think starting a new session and telling it that the previous agent's work / state was terrible (so explain what happened) is pretty unremarkable. It's certainly not saying "fuck you". I think this is a little silly.
Yeah, this is crazy
so frustrating..
i wish that's just for their logging/alerts. i definitely gauge the model's performance by how many of those words i type when i'm frustrated driving claude code.
Ridiculous string comparisons on long chains of logic are a hallmark of vibe-coding.
An LLM company using regexes for sentiment analysis? That's like a truck company using horses to transport parts. Weird choice.