Hacker News

Mikhail_Edoshin, today at 3:46 AM

I once saw a good byte encoding for Unicode: 7 bits for data, 1 bit for continuation/stop. Three bytes give 21 bits of data, which is enough for the whole range (up to U+10FFFF). ASCII-compatible, at most 3 bytes per character. Very simple: the description is sufficient to implement it.
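
To show how little code the description needs, here is a minimal Python sketch. The comment does not specify the details, so two things are my assumptions: a set high bit means "more bytes follow" (a clear high bit marks the last byte, which is what keeps ASCII unchanged), and the 7-bit groups are stored most significant first. The names encode_cp, encode and decode are made up for illustration.

```python
def encode_cp(cp: int) -> bytes:
    """Encode one code point as 1 to 3 bytes, 7 data bits per byte."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    # Split into 7-bit groups, least significant first.
    groups = [cp & 0x7F]
    cp >>= 7
    while cp:
        groups.append(cp & 0x7F)
        cp >>= 7
    groups.reverse()  # assumption: most significant group goes first
    # Assumption: high bit set = continuation, high bit clear = stop.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def encode(text: str) -> bytes:
    """Encode a whole string, one code point at a time."""
    return b"".join(encode_cp(ord(ch)) for ch in text)


def decode(data: bytes) -> str:
    """Decode bytes produced by encode()."""
    out = []
    value = 0
    pending = False  # True while we are inside an unfinished sequence
    for b in data:
        value = (value << 7) | (b & 0x7F)
        if b & 0x80:
            pending = True
        else:
            out.append(chr(value))  # chr() rejects values above 0x10FFFF
            value = 0
            pending = False
    if pending:
        raise ValueError("input ends in the middle of a sequence")
    return "".join(out)
```

With these assumptions, encode("A") is the single byte b"A", encode("€").hex() is "c12c", and the highest code point U+10FFFF takes exactly three bytes ("c3ff7f").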