An LLM company using regexes for sentiment analysis? That's like a truck company using horses to transport parts. Weird choice.
Because they want it to be executed quickly and cheaply without blocking the workflow? Doesn’t seem very weird to me at all.
Oh it’s worse than that. This one ended up getting my account banned: https://github.com/anthropics/claude-code/issues/22284
Because they actually want it to work 100% of the time and cost nothing.
What you're suggesting would be like a truck company using trucks to move things within the truck.
Good to have more than a hammer in your toolbox!
LLMs are good at writing complex regex, from my experience
A lot of things don't make sense until you involve scale. Regex could be good enough to give a general gist.
It's more like workers on a large oil tanker using bicycles to move around it, rather than trying to use another oil tanker.
It's more like a truck company using people to transport some parts. I could be wrong here, but I bet this happens in Volvo's factories a lot.
Cloud hosted call centers using LLMs is one of my specialties. While I use an LLM for more nuanced sentiment analysis, I definitely use a list of keywords as a first level filter.
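A first-level keyword filter like that can be a few lines; a minimal sketch (hypothetical keyword list and function names, Python just for illustration):

```python
import re

# Hypothetical list of frustration keywords used as a cheap first-pass filter.
FRUSTRATION_KEYWORDS = ["wtf", "ugh", "broken", "terrible", "useless"]

# One compiled pattern: word-boundary anchored, case-insensitive.
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, FRUSTRATION_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def needs_llm_review(message: str) -> bool:
    """Only escalate to the (slow, paid) LLM when a keyword hits."""
    return KEYWORD_RE.search(message) is not None
```

Everything that fails the cheap check skips the LLM call entirely; only the hits pay for the nuanced analysis.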
Don’t worry, they used an llm to generate the regex.
Using some ML to derive a sentiment regex seems like a good idea, actually?
This just proves it's vibe coded, because LLMs love writing solutions like that. I probably have a hundred examples just like it in my history.
Not everything done by claude-code is decided by an LLM. They need the wrapper to be deterministic (or one-time-generated) code?
> That's like a truck company using horses to transport parts. Weird choice.
Easy way to claim more “horse power.”
LLMs cost money, regular expressions are free. It really isn't so strange.
Because the impact of a "WTF" might be lost in the results of the analysis if you rely solely on an LLM.
Parsing "WTF" with a regex also preserves its impact and reduces the noise in the metrics.
"Determinism > non-determinism": when you're analysing sentiment, why not make some things more deterministic?
Cool thing about this solution is that you can evaluate the LLM's sentiment accuracy against the regex-based approach and analyse the discrepancies.
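That evaluation can be tiny; a sketch with stand-in labelers (both functions and their patterns are made up, and the LLM call is stubbed):

```python
import re

NEGATIVE_RE = re.compile(r"\b(wtf|awful)\b", re.IGNORECASE)

def regex_sentiment(text: str) -> str:
    # Cheap deterministic label.
    return "negative" if NEGATIVE_RE.search(text) else "neutral"

def llm_sentiment(text: str) -> str:
    # Stand-in for the LLM call; returns a label from the same set.
    return "negative" if "awful" in text.lower() else "neutral"

def discrepancies(messages: list[str]) -> list[str]:
    """Messages where the two labelers disagree -- worth a manual look."""
    return [m for m in messages if regex_sentiment(m) != llm_sentiment(m)]
```

Running `discrepancies` over a sample of logged messages gives you exactly the cases where one approach is missing something.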
I used regexes in a similar way, but my implementation was vibecoded. Hmmm, by your analysis, Claude Code writes code by hand.
More like a car company transporting their shipments by truck. It's more efficient
They had the problem of sentiment analysis. They use regexes.
You know the drill.
As far as I can tell they do nothing with it. They just log it.
Using regex with LLMs isn't uncommon at all.
It's all regex anyways
Because they are engineers? The difference between an engineer and a hobbyist is an engineer has to optimize the cost.
As they say: any idiot can build a bridge that stands, only an engineer can build a bridge that barely stands.
The amount of trust and safety work that depends on google translate and the humble regex, beggars the imagination.
Asking non-deterministic software to act like deterministic software (a regex) can be a significantly higher use of tokens/compute for no benefit.
Some things will be much better with inference, others won’t be.
hmm not a terrible idea (I think).
You have a semi-expensive process, but you want to keep particular known context out of it. So you put a quick-and-dirty search in front of the expensive process: instead of 'figure sentiment (20 seconds)', you have 'quick check sentiment (<1 sec)' and then 'figure sentiment v2 (5 seconds)'. Now, if it were just pure regex, your analogy would hold up just fine.
I could see me totally making a design choice like that.
They're searching for multiple substrings in a single pass, regexes are the optimal solution for that.
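To illustrate the single-pass point (illustrative term list, not the actual claude-code patterns): the naive approach rescans the text once per substring, while one compiled alternation of literals lets the engine walk the text once.

```python
import re

TERMS = ["wtf", "omg", "broken", "thanks"]

# Naive: one scan of the text per term.
def any_term_naive(text: str) -> bool:
    lowered = text.lower()
    return any(term in lowered for term in TERMS)

# Single pass: one compiled alternation over all terms.
SINGLE_PASS = re.compile("|".join(map(re.escape, TERMS)), re.IGNORECASE)

def any_term_single_pass(text: str) -> bool:
    return SINGLE_PASS.search(text) is not None
```

Both return the same answers; the difference only matters once the term list or the input gets long.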
It's like a faster-than-light spaceship company using horses. There have been countless solutions that do this better, even CPU-only, for years lol.
It's almost as if LLMs are unreliable
The difference in response time - especially versus a regex running locally - is really difficult to express to someone who hasn't made much use of LLM calls in their natural language projects.
Someone said 10,000x slower, but that's off - in my experience - by about four orders of magnitude. And that's average, it gets much worse.
Now personally I would have maybe made a call through a "traditional" ML widget (scikit, numpy, spaCy, fastText, sentence-transformers, etc) but - for me anyway - that whole stack is Python. Transpiling all that to TS might be a maintenance burden I don't particularly feel like taking on. And in client-facing code I'm not really sure it's even possible.