Hacker News

the_arun · last Friday at 3:10 PM

Just curious: if we remove stop words from prompts before sending them to the LLM, wouldn't that reduce the token count? And would the LLM's response stay the same (original prompt vs. the one without stop words)?


Replies

kylecazar · last Friday at 3:55 PM

Search engines can afford to throw out stopwords because they're often keyword-based. But (frontier) LLMs need the nuance and semantics those words signal, so they don't automatically strip them. There are probably special-purpose models that do this, or certain parts of a RAG pipeline, but that's the exception.
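
To make that concrete, here's a minimal sketch of how stripping stopwords can flip a prompt's meaning. The stopword set is a hand-picked subset of typical English stopword lists (e.g. NLTK's, which does include "not"); everything here is illustrative:

    STOPWORDS = {"do", "not", "the", "a", "an", "is", "to", "of", "and"}

    def strip_stopwords(text: str) -> str:
        return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)

    print(strip_stopwords("Do not delete the production database"))
    # -> "delete production database": the negation is gone, and the
    # instruction now means the opposite of what was asked.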

Yeah, it'll be fewer input tokens if you omit them yourself. It's not guaranteed to keep the response the same, though: you're asking the model to work with less context and more ambiguity. So stripping stopwords from your prompt will save you a negligible amount of money and potentially cost a lot in model quality.
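
You can measure the actual savings yourself. Here's a rough sketch using OpenAI's tiktoken library (pip install tiktoken); the stopword set and the prompt are made up for illustration, and real savings depend on your prompts and tokenizer:

    import tiktoken

    STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "that"}

    def strip_stopwords(text: str) -> str:
        return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)

    # cl100k_base is the encoding used by GPT-4-era OpenAI models.
    enc = tiktoken.get_encoding("cl100k_base")

    prompt = ("Summarize the main argument of the attached paper in a way "
              "that is accessible to an audience of non-experts.")
    stripped = strip_stopwords(prompt)

    print(len(enc.encode(prompt)), "tokens before stripping")
    print(len(enc.encode(stripped)), "tokens after stripping")
    # The savings are a modest fraction of the prompt, while the model
    # loses the grammatical relationships those words encoded.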

cubefox · last Friday at 7:41 PM

Don't know, but GPT-5 Thinking strips out a lot of words in its reasoning trace to save tokens. Someone on Twitter jailbroke it to get the original CoT traces.