
If I understand the gist of this article, it goes like ...

1. The scanner divides the source-code string into ordered chunks, each carrying identifying information: the type and content of the chunk.

2. The next stage had better NOT be a "Parser" but a "Reader", which assembles the chunks into a well-formed tree structure, thus recognizing which chunks belong together in the branches of such trees.

3. The Parser then assigns "meaning" to the nodes and branches of the tree produced by the Reader, by visiting them. "Meaning" basically means (!) what kind of calculation will be performed on some nodes of the tree.

4. It is beneficial if the programming language has primitives for accessing the output of the reader, so that it can have macros that transform the reader-produced tree and then ask the parser to do its job on the transformed tree.
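
In code, I picture steps 1-3 roughly like this (a toy sketch in Python with hypothetical names, not anything from the article):

  import re

  def scan(src):
      # 1. Scanner: split the source string into typed chunks.
      return re.findall(r"[()]|[^\s()]+", src)

  def read(tokens):
      # 2. Reader: assemble chunks into a well-formed tree,
      # knowing only which chunks belong together.
      token = tokens.pop(0)
      if token != "(":
          return token
      tree = []
      while tokens[0] != ")":
          tree.append(read(tokens))
      tokens.pop(0)  # drop the closing ")"
      return tree

  def parse(tree):
      # 3. Parser: assign meaning to nodes by visiting them.
      if isinstance(tree, list) and tree[0] == "+":
          return sum(parse(t) for t in tree[1:])
      return int(tree)

  print(parse(read(scan("(+ 1 2 (+ 3 4))"))))  # 10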

Did I get it close?



> 2. The next stage had better NOT be a "Parser" but a "Reader", which assembles the chunks into a well-formed tree structure, thus recognizing which chunks belong together in the branches of such trees.

> 3. The Parser then assigns "meaning" to the nodes and branches of the tree produced by the Reader, by visiting them. "Meaning" basically means (!) what kind of calculation will be performed on some nodes of the tree.

So, an "AST builder" that is followed by a "semantic pass". That's... how most of the compilers have been structured, at least conceptually, since their invention. In particularly memory-starved environments those passes were actually separate programs, launched sequentially; most famously the ancient IBM FORTRAN compilers were structured like this (they couldn't manage fit both the program being compiled and the whole compiler into the core; so they've split the compiler into 60-something pieces).


It helps to read the article... the author was not introducing this as a novel concept, but elaborating on how this is a better mental model for how an interpreter or compiler works. It's not Tokenize -> Parse, it's Tokenize -> Read -> Parse.

The article discusses this particularly with regard to the meme of LISPs being "homoiconic". The author elaborates that the difference between LISPs and other programming languages actually lies not in "homoiconicity" (a JavaScript string can contain a program, and you can run `eval` on it, hence JavaScript is "homoiconic"), but in which step of the parsing pipeline they let you access: with JavaScript, it's before Tokenization happens; with LISPs, it's after Reading has happened, before the actual Parse step.
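
Python happens to expose both hooks, which makes the contrast easy to sketch. The stdlib calls below are real, but the analogy to the article's pipeline is mine (and ast.parse performs the full parse, not just the read):

  import ast

  eval("1 + 2")                       # string in, like JS eval: you
                                      # intervene before tokenization

  tree = ast.parse("1 + 2", mode="eval")        # a tree you can inspect...
  print(ast.dump(tree))
  print(eval(compile(tree, "<tree>", "eval")))  # ...and hand back: 3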


I've actually read the article, thank you; the author also argues that this "bicameral" style is what allows one to have useful tooling, since tools can then consume a tree-like AST instead of plain strings. Unfortunately, that is not a unique advantage of "languages with bicameral syntax", although the author appears (?) to believe it to be. IDEs had been dealing with ASTs long before LSP was introduced, although, indeed, this has only been seriously explored since the late nineties or so, I believe.

So here is a problem with the article: the author believes that what he calls "bicamerality" is unique to LISPs, and that it also requires some S-expr/JSON/XML-like syntax. But that's not true, is it? Java, too, has a tree-like AST which can be (very) easily produced (especially when you don't care about the semantic passes, such as resolving imports and binding name mentions to their definitions), and it has a decidedly non-LISP-like syntax.

And no, I also don't believe the author actually cares all that much about the reader/parser/eval being available inside the language itself: in fact, the article is structured in a way that mildly argues against making this a requirement for a language to be said to have "bicameral syntax".


    > So here is a problem with the article: the author believes
    > that what he calls "bicamerality" is unique to LISPs, and
    > that it also requires some S-expr/JSON/XML-like syntax.

I didn't find that assumption anywhere in the article. My reading is that all interpreters and compilers, for any language, are built to implement two non-intersecting sets of requirements, namely to "read" the language (build an AST) and to "parse" the language (check whether the AST is semantically meaningful). Therefore, all language implementations require Tokenization, Reading, and Parsing steps, but not all interpreters and compilers are structured in a way that cleanly separates the latter two of these three sets of concerns (or "chambers"), and (therefore) not all languages give the programmer access to the results of the intermediate steps.

Java obviously has an AST, but a Java program, unlike a LISP program, can't use macros to modify its own AST. The programmer has no access to what the compiler "read" and can't modify it.
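
For a concrete (if very un-Lisp-like) picture of "modifying what the compiler read", here is a toy transform using Python's stdlib ast module. The API calls are real; the "macro" itself is my own illustration, not anything from the article:

  import ast

  class SwapAddToMul(ast.NodeTransformer):
      def visit_BinOp(self, node):
          self.generic_visit(node)
          if isinstance(node.op, ast.Add):
              node.op = ast.Mult()    # rewrite every + into *
          return node

  tree = SwapAddToMul().visit(ast.parse("2 + 3", mode="eval"))
  ast.fix_missing_locations(tree)
  print(eval(compile(tree, "<macro>", "eval")))  # 6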


Mmmm. This article is like one of those duck-rabbit pictures, isn't it? With a slight mental effort, you can read it one way or the other.

So, here are some excerpts:

    These advantages ("It’s a lot easier to support matching, indentation, coloring, and so on", and "tools hit the trifecta of: correct, useful, and relatively easy") are offset by one drawback: some people just don’t like them. It feels constraining to some to always write programs in terms of trees, rather than more free-form syntax.

    Still, what people are willing to embrace for writing data seems to irk them when writing programs, leading to the long-standing hatred for Lispy syntaxes.

    But, you argue, “Now I have a bicameral syntax! Nobody will want to program in it!” And that may be true. But I want you to consider the following perspective.

    [...] a bicameral syntax that is a very nice target for programs that need to generate programs in your language. This is no longer a new idea, so you don’t have to feel radical: formats like SMT-LIB and WebAssembly text format are s-expressions for a reason.

The last three paragraphs play upon each other: people hate Lispy syntax; people dislike bicameral syntaxes; S-expressions are bicameral syntax.

And notice that nothing in those excerpts, and nothing in the text surrounding them (sections 4 to 7), really refers to the ability to access the program's syntax from inside the program itself. In fact, sections 1 and 2 argue that such an ability is not really all that important and is not what makes LISPs LISPs. Then what does? The article goes on about "bicamerality" (the explicit distinction between the reader and the parser) but doesn't ever mention again the ability of the program to modify its own syntax, or eval.

I can't help but draw the tacit deduction that those never-again-mentioned things are not part of "bicamerality". You, perhaps, instead take those things as an implicit, never-out-of-sight context that is always implied to be important: they are never mentioned again because enough has already been said about them, yet they remain a crucial part of "bicamerality".

It's a duck-rabbit article. We both perceive it very differently; perhaps in reality it's just an amalgam of ideas that, when mixed together in writing, lacks coherent meaning?


Yes, I understand your meaning now (and no longer understand the article's, which indeed seems to quack like a rabbit).


No, this isn't what the article says. I have not bothered saying anything about the "semantic pass", which is downstream from getting an AST. What the article talks about is not what "ancient IBM FORTRAN compilers" did.


The output of the Lisp reader is not an AST. It is completely unaware of many of the syntactic rules of the language, and carries no context. The equivalent in a C-like language would be a stage that quite willingly generates a tree for the following:

  void foo(int int) {
    else {
      x = 3;
    }
  }
Most compilers will never construct a tree for this, despite it following some unifying rules for the structure of code in a C-like language (braces and parentheses are balanced, a statement has a semicolon after it, &c.).
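
The same point in miniature: a toy s-expression reader (my own sketch, not the article's) will happily build a tree for the analogous nonsense, because rejecting it is the parser's job, not the reader's:

  import re

  def read(tokens):
      # Build a tree from balanced parentheses; no grammar checks.
      token = tokens.pop(0)
      if token != "(":
          return token
      tree = []
      while tokens[0] != ")":
          tree.append(read(tokens))
      tokens.pop(0)  # drop the closing ")"
      return tree

  nonsense = "(void foo (int int) (else ((= x 3))))"
  print(read(re.findall(r"[()]|[^\s()]+", nonsense)))
  # ['void', 'foo', ['int', 'int'], ['else', [['=', 'x', '3']]]]
  # Well-formed as a tree, meaningless as a program.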


Author here. Yes, very close. #4 is a bit strong: there is value to doing this even if you don't have macros, for instance because of other benefits (e.g., decent support from editors). But of course it also makes macros relatively easy and very powerful.


And what about homoiconicity in Lisp vs. other languages? In Lisp it means that programs are "lists" and so is "data". Programs in Lisp are more than strings (as in most other languages); they are "nested lists". Lisps let us write programs as lists and store data as lists. JavaScript only allows us to write programs as (structureless) strings.

Of course that is well known, but I think it is a big deal that you have such homoiconicity in Lisp but not in most other languages. Prolog, maybe?
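
A sketch of what "programs are lists" buys you, transliterated into Python lists (Python itself is not homoiconic; the encoding and the tiny evaluator below are my own toy):

  # A program represented as an ordinary nested list.
  program = ["+", 1, ["*", 2, 3]]

  def evaluate(expr):
      # Atoms evaluate to themselves; lists are (op, args...).
      if not isinstance(expr, list):
          return expr
      op, *args = expr
      vals = [evaluate(a) for a in args]
      return sum(vals) if op == "+" else vals[0] * vals[1]

  program[0] = "*"          # plain list surgery edits the program
  print(evaluate(program))  # the edited program runs: 1 * (2 * 3) = 6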



