Hacker News

nine_k, yesterday at 3:40 AM

Word boundaries are complicated, especially in languages like Chinese or Japanese. Whitespace and punctuation are much simpler to detect, even across the full Unicode range. So the boundary where formatting is considered lies between (whitespace | punctuation) and anything else.
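
A minimal sketch of the rule described above, assuming Python and its standard `unicodedata` module. The helper names `is_delimiter` and `formatting_boundaries` are hypothetical, made up for illustration, and treating Unicode general categories `P*` and `Z*` as "punctuation" and "whitespace" is an assumption, not something any particular markup spec prescribes.

```python
import unicodedata


def is_delimiter(ch: str) -> bool:
    """Return True if ch is whitespace or punctuation (assumed definition)."""
    # General categories starting with "P" are punctuation and "Z" are
    # separators; str.isspace() additionally catches tabs and newlines.
    return ch.isspace() or unicodedata.category(ch)[0] in ("P", "Z")


def formatting_boundaries(text: str) -> list[int]:
    """Return indices i where text[i-1] and text[i] sit on opposite sides
    of the (whitespace | punctuation) vs. anything-else split."""
    return [
        i
        for i in range(1, len(text))
        if is_delimiter(text[i - 1]) != is_delimiter(text[i])
    ]


print(formatting_boundaries("ab cd"))
print(formatting_boundaries("日本語、テスト"))
```

Note that the classification relies on the Unicode character database that ships with the Python runtime, and that under this assumed definition symbol characters such as `+` or `$` (categories `Sm`, `Sc`) do not count as punctuation.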


Replies

xigoi, yesterday at 6:17 AM

So now you have to distribute a character class table in every implementation of your language, which is precisely what the author of Djot wanted to avoid.