Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

What do you mean exactly by "error-tolerance"? Is it like, each node is wrapped into a result type, that you have to match against each time you visit it, even though you know for a fact, that it is not empty or something like that?

I suppose that one of the pros of using tree-sitter is its portability? For example, I could define my grammar to both parse my code and to do proper syntax highlighting in the browser with the same library and same grammar? Is that correct? Also it is used in neovim extensively to define syntax for a languages? Otherwise it would have taken to slightly modify the grammar.





Oh nono, with tree-sitter, you get an untyped syntax tree. That means, you have a Cursor object to walk the tree, which creates Node objects as you traverse, that have a "kind" (name of the tree-sitter node), span, and children. (I recommend using the rust tree-sitter bindings itself, not the rust wrapper rust-sitter).

Yes, portability like that is a huge benefit, though I personally utilized it for that yet. I just use it as an error-tolerant frontend to my compiler.

As to how errors are reported, tree-sitter creates an ERROR or MISSING node when a particular subtree has invalid syntax. I've found that it never leaves a node in an invalid state, (so never would it create a binaryop(LeftNode(...), Op, ERROR) if RightNode is not optional. Instead it would create an ERROR for binaryop too. This allows you to safely unwrap known fields. ERROR nodes only really bunch up in repeat() and optional()s where you would implicity handle them.

For an example, I can only point you to my own use: https://github.com/pc2/sus-compiler

tree-sitter-sus has the grammar

sus-proc-macro has nice proc macros for dealing with it (kind!("binop"), field!("name"), etc)

src/flattening/parser.rs has conveniences like iterating over lists

and src/flattening/flatten.rs has the actual conversion from syntax tree to SUS IR


Error tolerance in this context means the parser produces a walkable AST even if the input code is syntactically invalid, instead of just throwing/reporting the error. It’s useful for IDEs, where the code is often in an invalid state as the developer is typing, but you still want to be able to report diagnostics on whatever parts of the code are syntactically valid.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: