Hacker News

vishnuharidas yesterday at 9:41 PM (2 replies)

UTF-8 can encode all 1,114,112 Unicode code points (U+0000 through U+10FFFF). Unicode 15.1 (2023, https://www.unicode.org/versions/Unicode15.1.0/) defines 149,813 characters, which covers most of the world's languages and scripts, plus emoji. That leaves roughly 960K code points not yet assigned to characters, available for future expansion.
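
For anyone who wants to check the arithmetic, here is a quick Python sketch. The constant names are my own, and the 149,813 figure is simply the count quoted above, not something computed from the Unicode data files.

```python
# Unicode code space runs from U+0000 through U+10FFFF:
# 17 planes of 65,536 code points each.
TOTAL_CODE_POINTS = 0x10FFFF + 1

# Character count for Unicode 15.1 as quoted in the comment (assumed, not derived here).
ASSIGNED_15_1 = 149_813

remaining = TOTAL_CODE_POINTS - ASSIGNED_15_1

# The highest code point still fits in UTF-8's 4-byte form.
last_utf8 = chr(0x10FFFF).encode("utf-8")

print(f"total:     {TOTAL_CODE_POINTS:,}")   # 1,114,112
print(f"remaining: {remaining:,}")           # 964,299
print(f"U+10FFFF in UTF-8: {last_utf8.hex(' ')} ({len(last_utf8)} bytes)")
```

Note that the raw difference overstates the truly free space a bit, since it still includes ranges such as the surrogates and private-use areas that will never be assigned to characters.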

So, it won't fill up during our lifetime I guess.


Replies

jaza yesterday at 11:15 PM

I wouldn't be too quick to jump to that conclusion. We could easily shove another 960K emoji into the spec!

unnouinceput today at 7:14 AM

Wait until we get to know another species. Then we will not just fill that Unicode space, we will ditch any UTF-16 compatibility so fast it will make your head spin on a swivel.

Imagine the code points we'll need to represent an alien culture :).
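
For context on why UTF-16 compatibility is the thing that would have to go: the U+10FFFF ceiling is exactly what a UTF-16 surrogate pair can reach. A rough Python sketch of that calculation follows; the variable names are my own, and the formula is the standard surrogate-pair decoding.

```python
# Surrogate ranges used by UTF-16 for code points above U+FFFF.
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # high (lead) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF     # low (trail) surrogates

# Decode the largest possible pair: each surrogate contributes 10 bits,
# offset by 0x10000 (the first code point outside the BMP).
max_code_point = 0x10000 + ((HIGH_END - HIGH_START) << 10) + (LOW_END - LOW_START)

print(hex(max_code_point))  # 0x10ffff

# Cross-check against Python's own UTF-16 encoder.
encoded = chr(max_code_point).encode("utf-16-be")
print(encoded.hex(" "))  # db ff df ff
```

Going past that limit would need an encoding that UTF-16 as currently defined cannot express.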