Hacker News

mrkeen, today at 11:50 AM

I'll push back and say that the lexer/parser split is well worth it.

And the best thing about the parser combinator approach is that each phase is just a kind of parser, something like

  type Lexer e m = ParsecT e ByteString m [Token]

  type Parser e m = ParsecT e [Token] m Expr

All the usual helper functions like many or sepBy work equally well in the lexing and parsing phases.
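A minimal sketch of that two-phase setup, assuming megaparsec. The `Tok` and `Expr` types and the tiny addition grammar are hypothetical, chosen only for illustration, and the lexer reads a `String` rather than a `ByteString` to keep the example short. The token type is named `Tok` because `Text.Megaparsec` already exports a type family called `Token`.

```haskell
import qualified Data.Set as Set
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char, space)
import qualified Text.Megaparsec.Char.Lexer as L

-- Hypothetical token and AST types, not taken from the comment above.
data Tok = TNum Int | TPlus | TLParen | TRParen
  deriving (Eq, Ord, Show)

data Expr = Num Int | Add Expr Expr
  deriving (Eq, Show)

type Lexer  = Parsec Void String [Tok]  -- characters in, tokens out
type Parser = Parsec Void [Tok]         -- tokens in, syntax tree out

-- Lexing phase: all whitespace handling lives here and nowhere else.
lexer :: Lexer
lexer = space *> many (tok <* space) <* eof
  where
    tok = choice
      [ TNum <$> L.decimal
      , TPlus <$ char '+'
      , TLParen <$ char '('
      , TRParen <$ char ')'
      ]

-- Parsing phase: the same combinators (sepBy1, between, <|>) over tokens.
expr :: Parser Expr
expr = foldl1 Add <$> atom `sepBy1` single TPlus

atom :: Parser Expr
atom = token numTok Set.empty
   <|> between (single TLParen) (single TRParen) expr

-- Accept only number tokens, turning them into leaf nodes.
numTok :: Tok -> Maybe Expr
numTok (TNum n) = Just (Num n)
numTok _        = Nothing

-- Run both phases; Nothing if either one fails.
parseExpr :: String -> Maybe Expr
parseExpr src = parseMaybe lexer src >>= parseMaybe expr
```

With this split, `expr` and `atom` never mention whitespace at all, which is the point being made about not having to remember whether it has already been trimmed.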

It really beats getting to the parentheses-interacting-with-ordering-of-division-operations stage and still having to think "have I already trimmed off the whitespace here or not?"


Replies

omcnoe, today at 9:48 PM

I am writing a whitespace-sensitive parser - trimming whitespace matters because whitespace consumption is used to implement the indentation rules/constraints.

For example, I pass an indentation-sensitive whitespace consumer to a parser inside `many` to consume all of an indented child block. If I split lexing and parsing, I think I'd have to insert indentation tokens into the stream, and I'd end up with the same indentation logic in the parser regardless, just matching on those indentation tokens instead.
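As a rough illustration of indentation handled through whitespace consumption, here is a sketch using megaparsec's built-in `L.nonIndented` and `L.indentBlock`, which take a space consumer as an argument. This is not necessarily how the parser described above is written; the item grammar and the names `scn`, `sc`, `pItem`, and `pItemList` are assumptions made for the example.

```haskell
import Control.Applicative (empty)
import Control.Monad (void)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, char, space1)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Space consumer that also eats newlines; used between lines of a block.
scn :: Parser ()
scn = L.space space1 empty empty

-- Space consumer that stays on the current line; used after each lexeme.
sc :: Parser ()
sc = L.space (void (some (char ' ' <|> char '\t'))) empty empty

-- One word, followed by same-line whitespace only.
pItem :: Parser String
pItem = L.lexeme sc (some alphaNumChar)

-- A header at column 1 followed by zero or more items indented under it.
pItemList :: Parser (String, [String])
pItemList = L.nonIndented scn (L.indentBlock scn header)
  where
    header = do
      name <- pItem
      -- IndentMany: collect every child line more indented than the header.
      return (L.IndentMany Nothing (\items -> return (name, items)) pItem)

parseBlock :: String -> Maybe (String, [String])
parseBlock = parseMaybe pItemList
```

The indentation rule lives entirely in which space consumer is passed where, so there are no indentation tokens to generate or match.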

I have found that order-of-operations is somewhat trivially solved by `makeExprParser` from `Control.Monad.Combinators.Expr`.
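A small sketch of `makeExprParser` on arithmetic, assuming megaparsec together with the parser-combinators package that provides `Control.Monad.Combinators.Expr`. The `Expr` type and the helper names are hypothetical. The operator table is listed from highest to lowest precedence, and `InfixL` makes each operator left-associative.

```haskell
import Control.Applicative (empty)
import Control.Monad.Combinators.Expr (Operator (..), makeExprParser)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (space1)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Hypothetical AST for the example.
data Expr
  = Num Int
  | Add Expr Expr
  | Sub Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- Whitespace consumer with no comment syntax.
sc :: Parser ()
sc = L.space space1 empty empty

lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc

symbol :: String -> Parser String
symbol = L.symbol sc

-- Terms are integer literals or parenthesised expressions.
term :: Parser Expr
term = Num <$> lexeme L.decimal
   <|> between (symbol "(") (symbol ")") expr

-- Rows are ordered from tightest-binding to loosest-binding.
table :: [[Operator Parser Expr]]
table =
  [ [ InfixL (Mul <$ symbol "*"), InfixL (Div <$ symbol "/") ]
  , [ InfixL (Add <$ symbol "+"), InfixL (Sub <$ symbol "-") ]
  ]

expr :: Parser Expr
expr = makeExprParser term table

parseArith :: String -> Maybe Expr
parseArith = parseMaybe (sc *> expr)
```

Under these definitions `8 / 4 / 2` groups to the left and `8 / (4 / 2)` follows the parentheses, without any hand-written precedence climbing.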