Hacker News

hk__2 yesterday at 7:41 PM

What do you mean? What would you suggest instead? Fixed-length encoding? It would take a looot of space given all the character variations you can have.


Replies

gertop yesterday at 7:57 PM

UTF-16 is both simpler to parse and more compact than UTF-8 when writing non-English characters.

UTF-8 didn't win on technical merits; it won because it was mostly backwards compatible with all the American software that previously used only ASCII.

When you leave the anglosphere, you'll find that some languages still default to other encodings because of how large UTF-8 ends up being for them (Chinese and Japanese, to name two).
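
For anyone who wants to check the size argument, here is a minimal sketch that counts the encoded bytes of a few short strings in UTF-8, UTF-16, and the fixed-length UTF-32 mentioned upthread. The helper name `encoded_sizes` and the sample strings are my own picks for illustration, not a benchmark, and the little-endian codecs are used only so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` occupies in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" variants skip the byte order mark, so only the
        # characters themselves are counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical sample strings, chosen only to cover a few scripts.
samples = {
    "English": "hello",
    "Russian": "привет",
    "Chinese": "你好",
    "Japanese": "こんにちは",
}

for language, text in samples.items():
    print(language, encoded_sizes(text))
```

With these particular samples, UTF-8 is half the size of UTF-16 for the English string, the same size for the Russian one, and 1.5 times the size for the Chinese and Japanese ones. UTF-32 is the largest in every case. Real documents mix scripts, so the actual ratio depends on the content.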
