lucb1e • yesterday at 9:24 PM • 3 replies

In case the authors are here, the first sentence contains the bytes e2 80 94, which would be UTF-8 for an em dash, but they have been reinterpreted as 3 single-byte characters using https://en.wikipedia.org/wiki/Windows-1252#Code_page_layout and shown on the page as â€”. Further down, there are a lot of similar errors, such as a single right quote (U+2019) in K'nex. Firefox seems to have first removed its encoding configuration menu in version 89, then introduced a new button in version 91, and that one is disabled now as well, so there's no fixing this user-side it seems :/

Edit: ah, the page is from 2012-03-19, according to the <meta property="article:published_time"> tag
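
The reinterpretation described in this comment can be reproduced with a minimal Python sketch. It assumes the original bytes are valid UTF-8 and uses Python's `cp1252` codec as a stand-in for Windows-1252; the helper name `misread_utf8_as_cp1252` is hypothetical and only for illustration.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    Hypothetical helper: it simulates a page that was saved as UTF-8
    but is displayed as if it were Windows-1252.
    """
    return text.encode("utf-8").decode("cp1252")


em_dash = "\u2014"

# The three UTF-8 bytes mentioned in the comment.
print(em_dash.encode("utf-8").hex(" "))  # e2 80 94

# Each byte becomes one Windows-1252 character:
# 0xE2 -> U+00E2, 0x80 -> U+20AC, 0x94 -> U+201D.
print(misread_utf8_as_cp1252(em_dash))

# A right single quote (U+2019) is e2 80 99 in UTF-8, so the last
# byte maps to U+2122 (the trademark sign) instead.
print(misread_utf8_as_cp1252("K\u2019nex"))
```

Python's `cp1252` codec leaves a few byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined and raises `UnicodeDecodeError` for them, so this sketch will not handle every character.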

Replies

rmunn • today at 7:44 AM

I was just mentioning the Japanese word mojibake on the plain-text thread (https://news.ycombinator.com/item?id=47897681), and here you give an example. In fact, UTF-8 misinterpreted as Windows-1252 is the mojibake I personally encounter most often. Curly quotes (most often a right apostrophe inside a word like can't or it's or didn't) are the most common ones, with em dashes being only slightly less common. The other direction (Windows-1252 text being read as UTF-8) produces � (U+FFFD) everywhere instead, but either way, I still see those from time to time today. But far, FAR less frequently than I used to back in the late 2000s or early 2010s. I used to see â€” and similar sequences all the time 15-20 years ago, and now it's rare enough that I actually notice when it happens.
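
A rough Python sketch of the other direction, plus one way to undo the more common kind. Both helper names are hypothetical, and the repair is only a heuristic that assumes the damaged text went through exactly one UTF-8 to Windows-1252 misread.

```python
def misread_cp1252_as_utf8(text: str) -> str:
    """Encode text as Windows-1252, then decode it leniently as UTF-8.

    Bytes that are not valid UTF-8 become U+FFFD, the replacement
    character.
    """
    return text.encode("cp1252").decode("utf-8", errors="replace")


def repair_mojibake(text: str) -> str:
    """Try to reverse a single UTF-8-read-as-Windows-1252 misread.

    If the round trip fails, the text is assumed to be undamaged and
    is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Windows-1252 stores the curly apostrophe as the single byte 0x92,
# which is not valid UTF-8 on its own.
print(misread_cp1252_as_utf8("it\u2019s"))

# The garbled apostrophe (U+00E2, U+20AC, U+2122) goes back to U+2019.
print(repair_mojibake("shouldn\u00e2\u20ac\u2122t"))
```

Text that was never damaged, such as a correctly encoded "café", fails the strict UTF-8 decode and comes back unchanged, which is why the fallback returns the input rather than raising.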

londons_explore • yesterday at 10:13 PM

This is probably a case of a bodged migration from one CMS to another.

My blog suffered the same, and going through loads of old pages to check and fix them just isn't worth the effort.

taneq • today at 12:01 AM

> Why shouldnâ€™t we be able to?

I have no idea why but my brain immediately interpreted this as a Scottish accent, like ‘shouldnae’. Weird.
