> The issue is that you shouldn't be looking for substrings in the first place.
Why? They clearly just want to log conversations that are likely to display extreme user frustration with minimal overhead. They could do a full-blown NLP-driven sentiment analysis on every prompt but I reckon it would not be as cost-effective as this.
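For what it's worth, the kind of check being defended here can be sketched in a few lines of Python. This is only an illustration: the phrase list and the names `FRUSTRATION_MARKERS`, `looks_frustrated`, and `should_log` are made up for the example, and nothing in the thread says what is actually matched.

```python
# Hypothetical marker phrases; the real list (if any) is not known.
FRUSTRATION_MARKERS = ("this is useless", "you are not listening", "wtf")


def looks_frustrated(prompt: str) -> bool:
    """Return True if the prompt contains any marker phrase, ignoring case."""
    text = prompt.casefold()
    return any(marker in text for marker in FRUSTRATION_MARKERS)


def should_log(conversation: list[str]) -> bool:
    """Flag a conversation for logging if any single prompt looks frustrated."""
    return any(looks_frustrated(prompt) for prompt in conversation)
```

The cost is a handful of substring scans per prompt, which is the "minimal overhead" part. The trade-off is that it misses any frustration phrased differently from the list.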
>Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
The only time to use a regex is when searching with a human in the loop. All other uses are better handled some other way.
>They could do a full-blown NLP-driven sentiment analysis on every prompt but I reckon it would not be as cost-effective as this.
Every conversation is sent to an LLM at least a thousand times the size of GPT-2, which could have one-shot this nearly a decade ago.