Hacker News

Semantic ablation: Why AI writing is generic and boring

204 points | by benji8000 | today at 4:12 PM | 168 comments

Comments

barrkel today at 4:41 PM

This is a good statement of what I suspect many of us have found when rejecting the rewriting advice of AIs. The "pointiness" of prose gets worn away, until it doesn't say much. Everything is softened. The distinctiveness of the human voice is converted into blandness. The AI even says its preferred rephrasing is "polished" - a term which specifically means the jaggedness has been removed.

But it's the jagged edges, the unorthodox and surprising prickly bits, that tear open a hole in the inattention of your reader, that actually gets your ideas into their heads.

show 7 replies
SignalStackDev today at 6:02 PM

Something I noticed building multi-agent pipelines: the ablation compounds. Had a 4-step pipeline - summarize, expand, review, refine - and by step 3 everything had the same rhythm and vocabulary. Anchoring the original source text explicitly at each step helped, but only partially.

The more interesting cause I think: RLHF is the primary driver, not just the architecture. Fine-tuning is trained on human preference ratings where "clear," "safe," and "inoffensive" consistently win pairwise comparisons. That creates a training signal that literally penalizes distinctiveness - a model that says something surprising loses to one that says something expected. Successful RLHF concentrates probability mass toward the median preferred output, basically by definition.

Base models - before fine-tuning - are genuinely weirder. More likely to use unusual phrasing, make unexpected associative leaps, break register mid-paragraph. Semantic ablation isn't a side effect of the training process, it's the intended outcome of the objective.

Which makes the fix hard: you can't really prompt your way out of it once a model is heavily tuned. Temperature helps a little but the distribution is already skewed. Where we've gotten better results is routing "preserve the voice" tasks to less-tuned models, and saving the heavily RLHF'd models for structured extraction and classification where blandness is actually what you want.
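The "concentrates probability mass toward the median" claim can be made concrete with a toy sketch. Real RLHF is KL-regularized policy optimization, not this naive reweighting, and the win rates here are invented, but it shows how even a slight rater bias toward "expected" styles ends up as near-total mode collapse:

```python
# Toy illustration: if raters even slightly prefer "expected" styles in
# pairwise comparisons, iterated preference optimization concentrates
# almost all probability mass on the blandest one. Win rates are invented.
styles = {"bland": 0.25, "neutral": 0.25, "vivid": 0.25, "weird": 0.25}
win_rate = {"bland": 0.60, "neutral": 0.55, "vivid": 0.45, "weird": 0.40}

for _ in range(100):  # each round, reweight styles by how often raters prefer them
    styles = {k: p * win_rate[k] for k, p in styles.items()}
    z = sum(styles.values())
    styles = {k: p / z for k, p in styles.items()}

# After enough rounds, "bland" holds essentially all of the mass.
print({k: round(p, 4) for k, p in styles.items()})
```

A 60/40 preference split doesn't sound like much, but compounded over training it is decisive, which is consistent with the "you can't prompt your way out of it" point above.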

show 2 replies
stephc_int13 today at 4:49 PM

The "AI voice" is everywhere now.

I see it in recent blog posts, news articles, obituaries, YT channels. Sometimes mixed with voice impersonations of famous physicists like Feynman or Susskind.

I find it genuinely soul-crushing and even depressing, but I may be oversensitive to it, as most readers don't seem to notice.

show 5 replies
delis-thumbs-7e today at 5:05 PM

I personally think “generative AI” is a misnomer. The more I understand the mathematics behind machine learning, the more I am convinced that it should not be used to generate text, images, or anything meant for people to consume, even the blandest of emails. Sometimes you might get lucky, but most of the time you only get what the most boring person at the most boring cocktail party would say if forced to be creative with a gun pointed at his head. It can help in a multitude of other ways, including helping a human in the creative process itself, but generating anything even mildly creative by itself… I’ll pass.

show 4 replies
tasty_freeze today at 4:53 PM

Bible scholar and YouTube guy Dan McClellan had an amazing "high entropy" phrase that slayed me a few days ago.

https://youtu.be/605MhQdS7NE?si=IKMNuSU1c1uaVCDB&t=730

He ended a critical commentary by suggesting that the author he was responding to should think more critically about the topic rather than repeating falsehoods because "they set off the tuning fork in the loins of your own dogmatism."

Yeah, AI could not come up with that phrase.

show 2 replies
rorylaitila today at 4:51 PM

Yes, I noticed this as well. I was recently writing a landing page for our new studio. Emotion-filled. Telling a story. I ran it through Grok to improve it. It removed all of the character, no matter what prompt I gave. I'm not a great writer, but I think those rough edges are necessary to convey the soul of the concept. I think AI writing is better used for ideation and "what have I missed?", and then you write out the changes yourself.

show 2 replies
causal today at 5:53 PM

YES, this hits the nail on something I've been trying to express for some time now. Semantic ablation: love it, going to use it a lot from now on when arguing why someone's ChatGPT-washed email sucks.

Semantic ablation is also why I'm doubtful of everyone proclaiming that Opus 4 would be AGI if we just gave it the right agent harness and let all the agents run free on the web. In reality they would distill it to a meaningless homogeneous stew.

show 1 reply
Arifcodes today at 9:01 PM

The core mechanic described here is real. RLHF does optimize toward the mean; that is just what happens when you train on human preference ratings and raters consistently reward clear, safe, "polished" output.

But the damage is not uniform. For code comments, API docs, commit messages: low-entropy output is often fine. The problem is people using LLMs for things that require a distinct voice and then wondering why the result sounds like everyone else on the internet.

The part nobody talks about: you can partially fight this if you know what you lost. Prompts like "preserve unusual word choices" or "do not normalize my rhetorical structure" help, but only if you have a strong enough baseline to catch the drift. Most people using AI for writing assistance do not have that baseline, which is why the ablation goes undetected. They see polished output and ship it.

show 1 reply
sieste today at 9:24 PM

All these forced metaphors and clumsy linguistic flourishes made me cringe. Just add some typos and grammar mistakes like the rest of us to prove that you're human.

crabmusket today at 9:24 PM

For those who haven't seen it yet, this Wiki page also has what I think is very good advice about writing:

https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing

While the page's purpose is to help editors detect AI contributions, you can also detect yourself doing these same things sometimes, and fix them.

morgengold today at 5:38 PM

I wonder how much of it could be prompted away.

For example the anthropic Frontend Design skill instructs:

"Typography: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive choices that elevate the frontend's aesthetics; unexpected, characterful font choices. Pair a distinctive display font with a refined body font."

Or

"NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character." 1

Maybe something similar would be possible for writing nuances.

1 https://github.com/anthropics/skills/blob/main/skills/fronte...

show 2 replies
lakhotiaharshit today at 8:54 PM

This article on AI writing being boring seems to be written by AI. The em dashes and the sentence structure all seem like AI output. Or have humans started adopting this style too?

Espressosaurus today at 4:44 PM

This matches what I saw when I tried using AI as an editor for writing.

It wanted to replace all the little bits of me that were in there.

co_king_5 today at 5:26 PM

The original title of the article is: "Why AI writing is so generic, boring, and dangerous"

Why was the title of the link on HackerNews updated to remove the term "Dangerous"?

The term was in the link on HackerNews for the first hour or so that this post was live.

show 1 reply
doomslayer999 today at 8:44 PM

Great article and exactly why I use AI less and less. I basically find it to be rotting my brain towards the middle of the distribution. It's like all the nuance and critical thinking that actually goes into things gets stripped out.

Once a company perfects an agent that essentially performs condensed search and generates coding boilerplate, that is probably where LLMs end for me. Perplexity and Claude are on the right track but not at all close.

conartist6 today at 4:36 PM

Race to the middle really sums up how I feel about AI.

show 3 replies
ranprieur today at 5:08 PM

This isn't new to AI. The same kind of thing happens in movie test screenings, or with autotune. If something is intended for a large audience, there's always an incentive to remove the weird stuff.

tpoacher today at 7:20 PM

The part about a change in entropy was interesting.

Is there an easy way to get or compare the entropy of two passages (e.g. to see if it has indeed dropped after gen-AI manipulation)?

And could this be used to flag AI-generated text (or at least boring, soulless-sounding text)?
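A crude way to do that comparison is word-level Shannon entropy over each passage's own word frequencies. This is only a rough unigram proxy (real detectors score per-token log-probs under a language model, i.e. perplexity), but a sketch could look like:

```python
import math
import re
from collections import Counter

def word_entropy(text):
    """Shannon entropy (bits per word) of the passage's word-frequency distribution."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Two passages from this thread: the distinctive phrasing scores higher
# per-word entropy than the flattened paraphrase.
before = "they set off the tuning fork in the loins of your own dogmatism"
after = "you should think more carefully and critically about the topic"

print(f"before: {word_entropy(before):.2f} bits/word")
print(f"after:  {word_entropy(after):.2f} bits/word")
```

Note the caveat: unigram entropy mostly measures vocabulary variety within a passage, so short texts are noisy and it will miss structural blandness (sentence rhythm, stock transitions) entirely.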

show 1 reply
notepad0x90 today at 6:30 PM

Isn't this more to do with how LLMs are trained for general purpose use? Are LLMs with a specific use and dataset in mind better? Like if the dataset was fiction novels, would it sound more booky? If it was social-media, would it sound more click-baity and engaging?

I've had AI be boring, but I've also seen things like original jokes that were legitimately funny. Maybe it's the prompts people use: they don't give it enough of a semantic and dialectic direction to not be generic. IRL, we look at a person and get a feel for them and the situation to determine those things.

ZoomZoomZoom today at 9:04 PM

What a weird use of "Romanesque" and "Baroque". Doesn't compute for me at all.

resiros today at 4:46 PM

I wonder why AI labs have not worked on improving the quality of the text outputs. Is this as the author claims a property of the LLMs themselves? Or is there simply not much incentive to create the best writing LLM?

show 4 replies
ux266478 today at 5:57 PM

> The AI identifies unconventional metaphors or visceral imagery as "noise" because they deviate from the training set's mean.

That's certainly a take. In the translation industry (the primogenitor of and driver for much of the architecture and theory behind LLMs), the models are known for making extremely unconventional choices, to such a degree that it actively degrades the quality of translation.

andai today at 4:45 PM

Could we invert a sign somewhere and get the opposite effect?

(Obviously a different question from "is an AI lab willing to release that publicly" ;)

show 1 reply
aleph_minus_one today at 5:08 PM

Couldn't you simply increase the temperature of the model to somewhat mitigate this effect?

show 3 replies
simonw today at 4:45 PM

I'd like to see some concrete examples that illustrate this - as it stands this feels like an opinion piece that doesn't attempt to back up its claims.

(Not necessarily disagreeing with those claims, but I'd like to see a more robust exploration of them.)

show 5 replies
esafak today at 5:02 PM

I think they can fix all that but they can't fix the fact that the computer has no intention to communicate. They could imbue it with agency to fix that too, but I much prefer it the way things are.

reilly3000 today at 4:46 PM

Those transformations happen to mirror what happens to human intelligence when you take antipsychotics. Please know the risks before taking them. They are innumerable and generally irreversible.

AreShoesFeet000 today at 5:33 PM

How much money would it take for me to take an open weight model, treat it nice, and go have some fun? Maybe some thousands, right?

josefritzishere today at 4:51 PM

As a writer who has been published many times and has edited many other writers for publication... It seems like AI can't make stylistic determinations. It is generally good with spelling and grammar, but the text it generates is very homogeneous across formats. It's readable but it's not good, and always full of fluff, like an online recipe harvesting clicks. It's kind of crap, really. If you just need filler it's OK, but if you want something pleasant you definitely still need a human.

somewhereoutth today at 4:50 PM

> What began as a jagged, precise Romanesque structure of stone is eroded into a polished, Baroque plastic shell

Not to detract from the overall message, but I think the author doesn't really understand Romanesque and Baroque.

(as an aside, I'd most likely associate Post-Modernism as an architectural style with the output of LLMs - bland, regurgitative, and somewhat incongruous)

nalllar today at 6:05 PM

The article itself reads as AI-generated output, complete with classic "Not Just X … Y" hallmarks from forever ago, and it scores 100% on Pangram's low-false-positive detector. I'm not sure if it's some experiment on their reader base or what. Pangram result: https://www.pangram.com/history/02bead1c-c36e-461b-8fa7-8699...

So many AI-generated AI-bashing articles lately. I wrote a post complaining about running into these, and when I asked the people who'd sent me such articles where they found them, multiple came from HN. https://lunnova.dev/articles/ai-bashing-ai-slop/

book_mike today at 4:52 PM

Semantic ablation... that's some technobabble.

show 1 reply
vessenes today at 5:09 PM

Meh. Semantic Ablation - but toward a directed goal. If I say "How would Hemingway have said this, provided he had the same mindset he did post-war while writing for Collier's?"

Then the model will look for clusters that don't fit what the model considers to be Hemingway/Colliers/Post-War and suggest in that fashion.

"edit this" -> blah

"imagine Tom Wolfe took a bunch of cocaine and was getting paid by the word to publish this after his first night with Aline Bernstein" -> probably less blah

show 1 reply
co_king_5 today at 4:54 PM

> Semantic ablation is the algorithmic erosion of high-entropy information. Technically, it is not a "bug" but a structural byproduct of greedy decoding and RLHF (reinforcement learning from human feedback).

> Domain-specific jargon and high-precision technical terms are sacrificed for "accessibility." The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym, effectively diluting the semantic density and specific gravity of the argument.

> The logical flow – originally built on complex, non-linear reasoning – is forced into a predictable, low-perplexity template. Subtext and nuance are ablated to ensure the output satisfies a "standardized" readability score, leaving behind a syntactically perfect but intellectually void shell.

What a fantastic description of the mechanisms by which LLMs erase and distort intelligence!
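The quoted "1-of-10,000 token with a 1-of-100 synonym" substitution maps directly onto a drop in surprisal; a back-of-the-envelope sketch:

```python
import math

def surprisal_bits(p):
    """Information content, in bits, of a token with probability p."""
    return -math.log2(p)

rare = surprisal_bits(1 / 10_000)   # the 1-of-10,000 technical term
common = surprisal_bits(1 / 100)    # the 1-of-100 "accessible" synonym

# Since log2(10_000) = 2 * log2(100), each such substitution exactly
# halves the information the word carries.
print(f"rare term: {rare:.1f} bits, synonym: {common:.1f} bits")
```

That is the "diluting the semantic density" claim in its literal information-theoretic form.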

I agree that AI writing is generic, boring, and dangerous. Further, I think only someone without a genuine appreciation for writing could feel otherwise.

I feel strongly that LLMs are positioned as an anti-literate technology, currently weaponized by imbeciles who have not and will never know the joy of language, and who intend to extinguish that joy for any of those around them who can still perceive it.

show 1 reply
swyx today at 5:33 PM

the word choice here is so obtuse as to trigger my radar for "is this some kind of parody where this itself was AI generated". it appears to be entirely serious, which is disappointing, it could have been high art.

the term TFA is looking for is mode collapse https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-... and the author could herself learn to write more clearly.

show 2 replies
52-6F-62 today at 5:54 PM

Because you simply can't engineer creativity. Maybe you can describe where it comes from, in a circuitous, abstract way with mathematics (and ultimately run face first into ħ and then run in circles for eternity). But to engineer it, you'd have to start over from the first principles of the stuff of the cosmos. One's a map and the other the territory.

spwa4 today at 5:10 PM

As someone long involved in software development, can we call this "best practices" instead of something like "semantic ablation" that nobody understands?

show 1 reply
lyu07282 today at 4:50 PM

> The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym

Do we see this in programming too? I don't think so? Unique, rarely used API methods aren't substituted the same way when refactoring. Perhaps that could give us a clue on how to fix that?

show 1 reply
lurquer today at 4:56 PM

Nonsense. I’ve written bland prose for a story and AI made it much better by revising it with a prompt such as this: “Make the vocabulary and grammar more sophisticated and add in interesting metaphors. Rewrite it in the style of a successful literary author.”

Etc.

show 2 replies
black_13 today at 4:59 PM

[dead]