logoalt Hacker News

mananaysiempreyesterday at 1:24 PM1 replyview on HN

Worth mentioning (haven’t checked if the paper talks about this) that while the industry mostly forgot about derivatives and extended REs (i.e. REs with intersection and negation), academia did not. Unfortunately, there have been some pretty discouraging results: the DFA for an extended RE (including a lazy DFA implemented using derivatives, as here) is worst-case doubly exponential in the length of the expression[1], not just exponential as for normal REs. So there is a potential reason not to support intersections in one’s RE matcher, even if they are enticingly easy to implement in terms of derivatives (and even if I personally like to see experimentation in this direction).

[1] https://www.sciencedirect.com/science/article/pii/S030439751...


Replies

someplaceguyyesterday at 3:43 PM

> the DFA for an extended RE (including a lazy DFA implemented using derivatives, as here) is worst-case doubly exponential in the length of the expression

The authors seem to claim linear complexity:

> the result is RE#, the first general-purpose regex engine to support intersection and complement with linear-time guarantees, and also the overall fastest regex engine on a large set of benchmarks

show 3 replies