I once saw a good byte encoding for Unicode: each byte carries 7 bits of data and 1 bit for continuation/stop. Three bytes give 21 bits of data, which is enough for the whole range (the highest code point, U+10FFFF, fits in 21 bits). It is ASCII compatible and takes at most 3 bytes per character. Very simple: the description is sufficient to implement it.
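
To show how little it takes, here is a minimal sketch in Python. The description does not say which bit is the flag or in what order the 7-bit groups go, so two choices here are my own assumptions: the high bit set means "more bytes follow", and the lowest 7 bits come first (the same layout as LEB128). The names `encode_char`, `encode` and `decode` are made up for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point, fits in 21 bits


def encode_char(cp: int) -> bytes:
    """Encode one code point as 1 to 3 bytes, 7 data bits per byte."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"code point out of range: {cp:#x}")
    out = bytearray()
    while True:
        low7 = cp & 0x7F          # next 7 data bits, lowest group first (assumed order)
        cp >>= 7
        if cp:
            out.append(low7 | 0x80)  # high bit set: more bytes follow (assumed flag sense)
        else:
            out.append(low7)         # high bit clear: last byte of this character
            return bytes(out)


def encode(text: str) -> bytes:
    """Encode a whole string by concatenating the per-character encodings."""
    return b"".join(encode_char(ord(ch)) for ch in text)


def decode(data: bytes) -> str:
    """Decode bytes produced by encode(); reject malformed input."""
    chars = []
    cp = 0
    shift = 0
    for byte in data:
        cp |= (byte & 0x7F) << shift
        if byte & 0x80:
            # Continuation: the next byte supplies the next 7 bits.
            shift += 7
            if shift > 14:
                raise ValueError("character longer than 3 bytes")
        else:
            if cp > MAX_CODE_POINT:
                raise ValueError(f"code point out of range: {cp:#x}")
            chars.append(chr(cp))
            cp = 0
            shift = 0
    if shift:
        raise ValueError("input ends in the middle of a character")
    return "".join(chars)
```

With these choices, plain ASCII text encodes to itself (`encode("A")` is `b"A"`), code points up to U+3FFF take two bytes, and everything above that takes three. Putting the highest group first instead would work just as well; only the shifting in the two loops changes.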