
rmunn · today at 7:45 AM

Haven't watched the videos yet, but from the slides, it looks like part of the issue he was talking about was encodings (there's a slide illustrating UTF-16LE vs UTF-16BE, for example). Thankfully, with UTF-8 becoming the default everywhere (so that you need a really good reason not to use it for any given document), we're back at "yes, there is such a thing as plain text" again. It has a much larger set of valid characters, but if you receive a text file without knowing its encoding, you can just assume it's UTF-8 and have a 99.7% chance of being right.

FINALLY.
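To make the byte-order point concrete, here is a minimal Python sketch (my illustration, not from the slides) encoding the same string under the three schemes mentioned above:

```python
# The same two-character string produces different bytes under each encoding.
s = "hi"

print(s.encode("utf-16-le").hex())  # 68006900 - low byte of each code unit first
print(s.encode("utf-16-be").hex())  # 00680069 - high byte first
print(s.encode("utf-8").hex())      # 6869     - ASCII-compatible, no byte order

# Decoding with the wrong byte order silently yields different characters,
# not an error - which is why guessing wrong was so painful.
garbled = s.encode("utf-16-le").decode("utf-16-be")
print(garbled == s)  # False
```

Because UTF-8 is a byte stream with no per-code-unit byte order, the LE/BE ambiguity disappears entirely, which is part of why "just assume UTF-8" works.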


Replies

bmitc · today at 1:14 PM

The point is that a lot of work went into making that happen. That is, plain text as it exists today is not some inherent property of computing: it is a binary protocol, and displaying text through fonts is also not a trivial matter.

So my question is: what are we leaving on the table by over-focusing on text? What about graphs and visual elements?

ButlerianJihad · today at 12:48 PM

vaxocentrism, or “All the World’s a VAX”

http://www.catb.org/esr/jargon/html/V/vaxocentrism.html

thaumasiotes · today at 10:21 AM

> Thankfully, with UTF-8 becoming the default everywhere (so that you need a really good reason not to use it for any given document), we're back at "yes, there is such a thing as plain text" again.

Whenever I hear this, I hear "all text files should be 50% larger for no reason".

UTF-8 is pretty similar to the old code page system.
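The size complaint is real for non-Latin scripts. A minimal Python sketch (cp1251 chosen here as a representative legacy single-byte Cyrillic code page) comparing byte counts for the same text:

```python
# Compare a legacy single-byte code page against UTF-8 for Cyrillic text.
text = "привет"  # "hello" in Russian, 6 characters

legacy = text.encode("cp1251")  # 1 byte per character in the old code page
utf8 = text.encode("utf-8")     # 2 bytes per Cyrillic character in UTF-8

print(len(legacy))  # 6
print(len(utf8))    # 12
```

For Cyrillic the overhead is actually 100%, not 50%; for pure ASCII text UTF-8 costs nothing extra, since its single-byte range is ASCII itself.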
