logoalt Hacker News

deepsunyesterday at 7:43 PM1 replyview on HN

That's assuming the text is not corrupted or maliciously modified. There were (are) _numerous_ vulnerabilities due to parsing/escaping of invalid UTF-8 sequences.

Quick googling (not all of them are on-topic tho):

https://www.rapid7.com/blog/post/2025/02/13/cve-2025-1094-po...

https://www.cve.org/CVERecord/SearchResults?query=utf-8


Replies

s1mplicissimusyesterday at 9:09 PM

I was just wondering a similar thing: If 10 implies start of character, doesn't that require 10 to never occur inside the other bits of a character?

show 3 replies