logoalt Hacker News

armchairhackertoday at 12:31 PM2 repliesview on HN

It seems to me LL and LR parser generators are overrated, and hand-written recursive descent is best in practice. I understand why academics teach them, but not why some spend so long on different parsing techniques, nor why hobbyists who just want to compile their toy language are directed to them.

I work in PL, and from my first compiler to today, have always found recursive descent easiest, most effective (less bugs, better error diagnostics, fast enough), and intuitive. Many popular language compilers use recursive descent: I know at least C# (Roslyn) and Rust, but I believe most except Haskell (GHC) and ocaml.

The LR algorithm was simple once I learned it, and yacc-like LR (and antlr-like LL) parser generators were straightforward once I learned how to resolve conflicts. But recursive descent (at least to me) is simpler and more straightforward.

LR being more expressive than LL has never mattered. A hand-written recursive descent parser is most expressive: it has unlimited lookahead, and can modify parsed AST nodes (e.g. reordering for precedence, converting if into if-else).

The only solution that comes close is tree-sitter, because it implements GLR, provides helpful conflict messages, and provides basic IDE support (e.g. syntax highlighting) almost for free. But it’s a build dependency, while recursive descent parsers can be written in most languages with zero dependencies and minimal boilerplate.


Replies

wglbtoday at 2:27 PM

> It seems to me LL and LR parser generators are overrated, and hand-written recursive descent is best in practice

I would now agree with that. My compiler experience was on a team that happened to have a LALR expert, Jeanne Musinski PhD, a student of Jeffrey Ullman. She invented a better error recovery for the language. Recursive descent would have been perfectly suited to the task.

> LR being more expressive than LL has never mattered.

Quite agree. One might guess that a language that needs that might be hard to program in.

_old_dude_today at 1:02 PM

Parser generators are great in Python (Lark for me) so you can iterate fast and get a runtime spec of your grammar.

A hand-written recursive descent parser is something you do later when you start to industrialize your code, to get better error messages, make the parser incremental, etc.

Bison/ANTLR are code generators, they do not fit well in that model.