Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I have this book. This book is good, and very complete from a traditional formals grammars point-of-view (LL, LR, SLR, LALR), and has very good sections on top-down and bottom-up parsing, but it is rather outdated in terms of what people are using now.

Nowadays many people use parser combinator libraries or packrat-style PEG parsers. This book provides no information about either.



Parser combinator libraries are usually just embedded DSLs for constructing data structures (implicitly or explicitly) that are then interpreted during traversal, mostly in a way which has equivalent power to LL(1) or LL with backtracking - and frequently without the advantages of a "proper" analysis, such as detection of grammar ambiguities that you need first and follow sets to find. They're very cute, IMO; an appealing piece of hackery. I wouldn't try and push them much beyond fairly small ad-hoc parsing problems though. At core, they are a form of metaprogramming, where the text of your combinator expressions is a program that writes a program, and the extra distance your code needs to go through to get to the hardware is liable to introduce problems: either unforseen issues with the library, unexpected overuse of CPU or memory, or hard to diagnose errors.

PEGs, meanwhile, are all-out backtracking exponential matchers that try and get back the performance by spending memory on memoizing. A PEG parser would be my last choice in writing a parser, unless I had to recognize a complex language with no tools and almost no time, but plentiful CPU and memory. With tools, I'd prefer GLR parsing if the problem is hard enough; with time, I'd prefer hand-written LL(k) recursive descent, which gives substantial flexibility. Most commercial compilers use hand-written recursive descent for a reason.

In summary, if you have good knowledge of even just LL(1) alone, you should be able to understand how both parser combinator libraries and PEG parsers work very easily, and also see why they can be problematic. So I wouldn't be too fast to dismiss the book as dated.


There is a little bit of information in the 2nd edition. Parser combinators get a few pages in "17.4.2 Functional Programming". PEGs and packrat parsing are briefly covered in "15.7 Recognition Systems".




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: