Hacker News

capnrefsmmat · last Saturday at 11:46 PM · 7 replies

I work on research studying LLM writing styles, so I am going to have to steal this. I've seen plenty of lists of LLM style features, but this is the first one I've noticed that mentions "tapestry", which we found is GPT-4o's second-most-overused word (after "camaraderie", for some reason).[1] We used a set of grammatical features in our initial style comparisons (like present participles, which GPT-4o loved so much that they were a pretty accurate classifier on their own), but it shouldn't be too hard to pattern-match some of these other features and quantify them.
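
To illustrate the kind of feature this is, here's a minimal sketch of a participle-rate style feature. This is a crude suffix heuristic I made up for illustration, not the parser-based feature extraction from the paper; it overcounts gerunds and misses irregular forms, but it shows how a single grammatical signal can be quantified:

```python
import re

def participle_rate(text: str) -> float:
    """Rough rate of present-participle-like tokens (-ing words) per word.

    Crude heuristic, not the paper's method: it also catches gerunds
    and some nouns, but it's enough to compare texts.
    """
    words = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return 0.0
    ing = [w for w in words if w.endswith("ing") and len(w) > 4]
    return len(ing) / len(words)

human = "The cat sat on the mat and looked out the window."
llm = "Weaving a rich tapestry of ideas, blending insights and fostering camaraderie."
print(participle_rate(human) < participle_rate(llm))  # → True
```

A real classifier would combine many such rates into a feature vector, but even a single rate like this separates some human and LLM samples surprisingly well.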

If anyone who works on LLMs is reading, a question: When we've tried base models (no instruction tuning/RLHF, just text completion), they show far fewer stylistic anomalies like this. So it's not that the training data is weird. It's something in instruction-tuning that's doing it. Do you ask the human raters to evaluate style? Is there a rubric? Why is the instruction tuning pushing such a noticeable style shift?

[1] https://www.pnas.org/doi/10.1073/pnas.2422455122, preprint at https://arxiv.org/abs/2410.16107. Working on extending this to more recent models and other grammatical features now.


Replies

orbital-decay · yesterday at 10:21 AM

I have nothing to contribute but speculation based on intuition, but IMO RLHF (or rather human preference modeling in general, including the post-training dataset formatting) is a relatively small factor in this; RL-induced mode collapse is a much bigger one. Take a look at the original DeepSeek R1 Zero, the point of which was to train a model with very little human preference, because they were on a budget and human preference doesn't scale. It's pretty unhinged in its writing, like the base model, but unlike the base model it converges onto stable writing patterns, and the output diversity is as non-existent as in models with carefully engineered "personalities" like Claude. Ask it to name a random city and look at the logits, and you'll still see a pretty narrow distribution. At the same time, some models with RLHF (e.g. the old RedPajama) have more diverse outputs.
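
The "narrow distribution in the logits" point can be made concrete with Shannon entropy over the next-token distribution. The logit values below are made up for illustration (they're not from any actual model); the point is only that a collapsed model concentrates probability mass on one candidate:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    """Shannon entropy in bits; low entropy = narrow, collapsed distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy logits over five candidate city tokens (invented numbers).
base_model = [2.0, 1.8, 1.9, 1.7, 1.6]  # fairly flat: many cities plausible
collapsed  = [8.0, 1.0, 0.5, 0.2, 0.1]  # mode-collapsed: one city dominates

print(entropy(softmax(base_model)) > entropy(softmax(collapsed)))  # → True
```

Sampling "name a random city" many times and histogramming the answers measures the same thing from the outside, when you don't have logit access.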

Mode collapse makes the models truncate entire token trajectories and repeat themselves, and indirectly it does something MUCH deeper: they converge on an almost 1:1 input-to-output concept mapping (instead of one-to-many, like in base models). The same lack of variety can be seen in diffusion models, GANs, VAEs, and any other model trained on human preference, regardless of architecture.
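
One common way to put a number on this lack of variety is a distinct-n metric: the fraction of unique n-grams across a set of samples from the same prompt. This is a standard diversity metric, sketched here on toy strings rather than real model outputs:

```python
def distinct_n(samples, n=2):
    """Fraction of unique n-grams across generated samples.

    Values near 1.0 mean diverse outputs; low values indicate the
    collapsed, repetitive behavior described above.
    """
    ngrams = []
    for s in samples:
        toks = s.split()
        ngrams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

varied   = ["the cat sat", "a dog ran home", "birds fly south"]
repeated = ["a rich tapestry of ideas", "a rich tapestry of sound"]
print(distinct_n(varied) > distinct_n(repeated))  # → True
```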

Moreover, these patterns are generational. Old ones get replaced with new ones, and the list in the OP is going to be obsolete in a year. This has already happened to previous models several times, from what I can tell. Supposedly it's because they scrape a web polluted by previous-gen models.

djoldman · yesterday at 12:09 AM

RLHF is what creates these anomalies. See "delve" from Kenya and Nigeria.

Interestingly, because perplexity is the optimization objective, the pretrained models should reflect the least surprising outputs of all.
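
Strictly, the pretraining objective is cross-entropy loss; perplexity is its exponential, so minimizing one minimizes the other. A toy computation with invented per-token probabilities shows why low perplexity corresponds to "least surprising" output:

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood per token.

    Minimizing this rewards assigning high probability to the actual
    next token, i.e. producing the least 'surprising' continuations.
    """
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

confident = [0.9, 0.8, 0.95]  # model rarely surprised by the text
uncertain = [0.2, 0.1, 0.3]   # model often surprised
print(perplexity(confident) < perplexity(uncertain))  # → True
```

A model that assigns probability 0.5 to every token has perplexity exactly 2: on average it's as "surprised" as a fair coin flip per token.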

kristianp · today at 8:53 AM

> It's something in instruction-tuning that's doing it.

Isn't the instruction tuning done with huge amounts of synthetic data? I wonder if the lack of diversity comes from LLM-generated data used for instruction tuning.

networked · yesterday at 12:06 AM

You may be interested in my links on AI's writing style: https://dbohdan.com/ai-writing-style. I've just added your preprint and tropes.fyi. It has "hydrogen jukeboxes: on the crammed poetics of 'creative writing' LLMs" by nostalgebraist (https://www.tumblr.com/nostalgebraist/778041178124926976/hyd...), which features an example with "tapestry".

> Why is the instruction tuning pushing such a noticeable style shift?

Gwern Branwen has been covering this: https://gwern.net/doc/reinforcement-learning/preference-lear....

grey-area · yesterday at 9:10 AM

I wonder if the style shift has anything to do with training for conversation (i.e. tuning models to respond well in a chat situation)?

red_hare · yesterday at 6:47 AM

I wonder if it has to do with how meaning is tied to the tokens: c+amara+derie (using the official GPT-5 tokenizer).
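
The splitting effect can be illustrated with a toy greedy longest-match segmenter. This is a stand-in, not the actual GPT tokenizer (real GPT tokenizers apply learned BPE merge rules over bytes), and the vocabulary below just contains the pieces the comment reports:

```python
def greedy_segment(word, vocab):
    """Greedy longest-match subword segmentation.

    A toy stand-in for BPE: real GPT tokenizers use learned merge
    rules, but the effect on a word's pieces is similar in spirit.
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])  # fall back to single chars
                i = j
                break
    return pieces

# Hypothetical vocabulary built from the pieces reported in the comment.
vocab = {"c", "amara", "derie"}
print(greedy_segment("camaraderie", vocab))  # → ['c', 'amara', 'derie']
```

The point being that none of the pieces carries the word's meaning on its own, unlike a split such as camarade+rie would for a French reader.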

There's also just that weird thing where they're obsessed with emoji, which I've always assumed is because they're the only logograms in English and therefore carry a lot of weight per byte.

albert_e · yesterday at 2:42 AM

There is an organization named Tapestry (parent of Coach Inc).

Wonder how they can avoid the trope while not censoring themselves out.