I read online that codepoints are formatted with 4 hex chars for historical reasons. U+41 (Latin A) is formatted as U+0041.