In the plethora of all these articles that explain the process of building projects with LLMs, one thing I never understood is why the authors seem to write the prompts as if talking to a human who cares how good their grammar or syntax is, e.g.:
> I'd like to add email support to this bot. Let's think through how we would do this.
and I'm not even talking about the usage of "please" or "thanks" (which this particular author doesn't seem to be doing).
Is there any evidence that suggests the models do a better job if I write my prompt like this instead of "wanna add email support, think how to do this"? In my personal experience (mostly with Junie) I haven't seen any advantage of being "polite", for lack of a better word, and I feel like I'm saving on seconds and tokens :)
I think it mattered a lot more a few years ago, when the user's prompt was almost all the context the LLM had to go by. A prompt written in a sloppy style would cause the LLM to respond in a sloppy style (since it's a snazzy autocomplete at its core). LLMs reason in tokens, so a sloppy style leads them to mimic the reasoning found in the sloppy writing of their training data, which is worse reasoning.
These days, the user prompt is just a tiny part of the context it has, so it probably matters less or not at all.
I still do it though, much like I try to include relevant technical terminology to try to nudge its search into the right areas of vector space. (Which is the part of the vector space built from more advanced discourse in the training material.)
The reasoning is that by being polite, the LLM is more likely to stay on a professional path: at its core an LLM tries to make your prompt coherent with its training set, and a polite prompt + its answer will score higher (give a better result) than a prompt that is out of place with the answer. I understand that to some people it could feel like anthropomorphising and could turn them off, but to me it's purely about engineering.
Edit: wording
Because some people like to be polite? Is it this hard to understand? Your hand-written prompts are unlikely to take up a significant chunk of the context window anyway.
My view is that when some "for bots only" type of writing becomes a habit, communication with humans will atrophy. Tokens be damned, but this kind of context switch comes at much too high a cost.
For models that reveal reasoning traces I've seen their inner nature as a word calculator show up as they spend way too many tokens complaining about the typo (and AI code review bots also seem obsessed with typos to the point where in a mid harness a few too many irrelevant typos means the model fixates on them and doesn't catch other errors). I don't know if they've gotten better at that recently but why bother. Plus there's probably something to the model trying to match the user's style (it is auto complete with many extra steps) resulting in sloppier output if you give it a sloppier prompt.
I write "properly" (and I do say "please" and "thank you"), just because I like exercising that muscle. The LLM doesn't care, but I do.
I prompt politely for two reasons: I suspect it makes the model less likely to spiral (but have no hard evidence either way), and I think it's just good to keep up the habit for when I talk to real people.
I just don't want to build the habit of being a sloppy writer, because it will eventually leak into the conversations I have with real humans.
Related to this, has anyone investigated how much typos matter in your chats? I would imagine that typing 'typescfipt' would not be a token in the input training set, so how would the model recognize this as actually meaning 'typescript'? Or does the tokenizer deal with this in an earlier stage?
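On the tokenizer part of that question: as far as I know, modern subword tokenizers (BPE and friends) don't have "unknown" words in the usual sense. A string that isn't in the vocabulary as a whole gets split into smaller known pieces, down to single characters or bytes if needed. Here's a rough sketch of that idea in Python. To be clear, this is a toy greedy longest-match splitter, not real BPE, and the `VOCAB` set and `tokenize` function are made up for illustration and not taken from any actual model.

```python
# Toy vocabulary: hypothetical, NOT taken from any real model's tokenizer.
VOCAB = {"typescript", "types", "type", "script", "ip"}


def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split that falls back to single characters.

    This is a simplification: real BPE tokenizers apply learned merge
    rules instead, but they share the property that any input can be
    encoded as a sequence of smaller pieces.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until something matches.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            # A single character is always accepted as a last resort.
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


print(tokenize("typescript"))  # one piece: the whole word is in the toy vocab
print(tokenize("typescfipt"))  # several pieces: the typo breaks the match
```

So in this toy setup the typo isn't rejected, it just turns into more and smaller tokens, and whether the model still reads that as "typescript" comes down to what it learned from context, not to the tokenizer.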
With current models this isn't as big of a deal, but why risk being an asshole in any context? I don't think treating something like shit simply because it's a machine is a good excuse.
Also consider the insanity of intentionally feeding bullshit into an information engine and expecting good things to come out the other end. The fact that they often perform well despite the ugliness is a miracle, but I wouldn't depend on it.
Just stream of consciousness into the context window works wonders for me. It's more important to provide the model with good context for your question.
There is evidence of that, but more importantly, it wouldn't occur to me to write "wanna add email support". That's not my natural voice.
Some people are just polite by nature & habits are hard to break
I suspect they just find it easier and more natural to write with proper grammar.
One reason to do that could be that it's trained on conversations that happened between humans.
I choose to talk in a respectful way, because that's how I want to communicate: it's not because I'm afraid of retaliation or burning bridges. It's because I am caring and conscious. If I think that something doesn't have feelings or long-term memory, whether it's AI or a piece of rock on the side of a trail, it in no way leads me to be abusive to it.
Further, an LLM being inherently sycophantic leads to it mimicking me, so if I talk to it in a stupid or abusive (which is just another form of stupidity, in my eyes) manner, it will behave stupidly. Or, that's what I'd expect. I've not researched this in a focused way, but I've seen examples where people get LLMs to be very unintelligent by prompting riddles or intelligence tests in highly-stylized speech. I wanted to say "highly-stupid speech", but "stylized" is probably more accurate, e.g.: `YOOOO CHATGEEEPEEETEEE!!!!!!1111 wasup I gots to asks you DIS.......`. Maybe someone can prove me wrong.
agree, prompting a token predictor like you’re talking to a person is counterproductive and I too wish it would stop
the models consistently spew slop when one does it, I have no idea where positive reinforcement for that behavior is coming from
I can't speak for everyone, but to me the most accurate answer is that I'm role-playing, because it just flows better.
In the back of my head I know the chatbot is trained on conversations and I want it to reflect a professional and clear tone.
But I usually keep it simpler in most cases. Your example:
> I'd like to add email support to this bot. Let's think through how we would do this.
I would likely write as:
> if i wanted to add email support, how would you go about it
or
> concise steps/plan to add email support, kiss
But when I'm in a brainstorm/search/rubber-duck mode, then I write more as if it was a real conversation.