Isn't working with the UTF-8 stream sufficient? Especially if you only have ASCII keywords/operators/brackets, I feel an ASCII parser should work with UTF-8 out of the box.
Yeah, a parser has no need to understand what a string or glyph is, let alone ASCII or UTF-8. The point is to take a stream of arbitrary data and process it into something that can be reasoned about. Unless you know your input stream is regular in some way, processing it at the finest level of granularity (usually bytes) is probably the only thing to do.
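To make that concrete, here is a rough sketch of a byte-level tokenizer in Python. The function name and the delimiter set are made up for illustration and are not from any real parser. It leans on one property of UTF-8: every byte of a multi-byte sequence is 0x80 or higher, so a byte below 0x80 is always a real ASCII character and never a fragment of something else. The tokenizer only ever compares bytes against ASCII delimiters and passes everything else through untouched.

```python
# Hypothetical toy tokenizer; DELIMS, SPACE and tokenize() are illustrative names.
DELIMS = b"()[]{}+-*/=;,"   # ASCII operators/brackets, each emitted as its own token
SPACE = b" \t\r\n"          # ASCII whitespace, used only to separate tokens


def tokenize(data: bytes) -> list[bytes]:
    """Split raw bytes on ASCII delimiters without ever decoding them."""
    tokens = []
    start = None  # index where the current word began, or None if not in a word
    for i, b in enumerate(data):  # iterating over bytes yields ints
        if b in DELIMS or b in SPACE:
            if start is not None:
                tokens.append(data[start:i])  # close the pending word
                start = None
            if b in DELIMS:
                tokens.append(data[i:i + 1])
        elif start is None:
            start = i  # any other byte, including those >= 0x80, starts a word
    if start is not None:
        tokens.append(data[start:])
    return tokens


source = "größe = (naïve + 日本語);".encode("utf-8")
print([t.decode("utf-8") for t in tokenize(source)])
# ['größe', '=', '(', 'naïve', '+', '日本語', ')', ';']
```

The decode call at the end is only there to print something readable; the tokenizer itself never needs to know the input was UTF-8. This would not hold for encodings such as UTF-16, where ASCII-valued bytes can show up inside other characters.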