
I liked the first half of the article, but I'm not sure I got anything from the second half. As the author notes, in order to be useful a definition must exclude something, and the "bicameral" distinction doesn't seem to exclude anything; even Python eventually gets parsed into a tree. Conceptually splitting out "parsing" into "tree validation" and "syntax validation" is slightly interesting (although isn't this now a tricameral system?), but in practice it just seems like a simple aid to constructing DSLs.

> These advantages are offset by one drawback: some people just don’t like them. It feels constraining to some to always write programs in terms of trees, rather than more free-form syntax.

I think this is misdiagnosing why many people are averse to Lisp. It's not that I don't like writing trees; I love trees for representing data. But I don't think that thinking of code as data is as intuitive or useful as Lisp users want me to think it is, despite how obviously powerful the notion is.



I also struggled with the "bicameral" definition. The best I could come up with is that because e.g. Scheme represents code and data in the same way (isn't there a word for this?), it's possible to represent and manipulate (semantically) invalid code. This is because the semantics are handled in the other "chamber". The example given was `(lambda 1)`, which is a perfectly good sexp but will error if you eval it.
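To make the two chambers concrete, here is a minimal sketch in Python rather than Scheme. The names `read` and `parse_lambda` are hypothetical, and the rule that a lambda needs a parameter list plus exactly one body is a simplifying assumption, not Scheme's full grammar. The reader only checks that brackets balance; the parser decides whether the resulting tree is a lambda.

```python
def read(text):
    """Reader: turn text into nested lists of atoms. Checks bracket structure only."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_form():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read_form())
            if pos >= len(tokens):
                raise SyntaxError("missing )")
            pos += 1  # consume the closing ")"
            return items
        if tok == ")":
            raise SyntaxError("unexpected )")
        try:
            return int(tok)   # integer literal
        except ValueError:
            return tok        # anything else is treated as a symbol

    form = read_form()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return form


def parse_lambda(form):
    """Parser: check that a tree has the shape (lambda (params...) body)."""
    if not (isinstance(form, list) and form and form[0] == "lambda"):
        raise SyntaxError("not a lambda form")
    if len(form) != 3 or not isinstance(form[1], list):
        raise SyntaxError("lambda needs a parameter list and a body")
    if not all(isinstance(p, str) for p in form[1]):
        raise SyntaxError("parameters must be symbols")
    return {"params": form[1], "body": form[2]}


tree = read("(lambda 1)")   # the reader accepts this: it is a well-formed tree
try:
    parse_lambda(tree)      # the parser rejects it: it is not a valid lambda
    rejected = False
except SyntaxError:
    rejected = True
```

Between the two calls, `tree` is ordinary data that a program can inspect or rewrite, which is where a macro stage would sit.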

This could be contrasted with C where code (maybe more precisely program logic) is opaque (modulo preprocessor) and can only be represented by function pointers (unless you're doing shellcode). Here the chamber that does the parsing from text (if we don't look inside GCC) also does semantic "checking" and so while valid functions can be represented within C (via the memory contents at the function pointer), the unchecked AST or some partial program is not represented.

I've tried not to give too many parentheticals above, but I'm not sure the concept holds water if you play tricks. Any Turing machine can represent any program, presumably in a way that admits cutting it up into atoms and rearranging to an arbitrary (potentially invalid) form. I'd be surprised if this hasn't been discussed in more detail somewhere in the literature.

This


It excludes languages that build a single AST directly from tokens. I am pretty sure Clang is like this, and probably v8. (They don't have structured macros, so it's not observable by users.)

As opposed to building first an untyped CST (concrete syntax tree), and then transforming that into a typed AST.

CPython does exactly this, but it has no macro stage either, so it's not exposed to users. (Python/ast.c is the CST -> AST transformation. It transforms an untyped tree to a typed tree.)

So the key reason it matters is that it's a place to insert the macro stage.

---

I agree that the word "bicameral" is confusing people, but it basically means "reader --> parser" as opposed to just "parser".

The analogies in the article are very clear to me -- in this world, JSON and XML parsers are "readers", but they are NOT "parsers"! (and yes that probably confuses many people, some new words could be necessary)

The JSON Schema or XML Schema would be closer to the parser -- it determines whether you have a "for loop" or "if statement", or an "employee" and "job title", etc.
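A rough illustration of that split, using only Python's standard library. The `Employee` type, the field names, and `parse_employee` are hypothetical, and the hand-written checks stand in for what a real JSON Schema validator would do. `json.loads` plays the reader, and `parse_employee` plays the parser.

```python
import json
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    job_title: str


def parse_employee(tree):
    """Parser stage: decide whether a generic JSON tree is an 'employee'."""
    if not isinstance(tree, dict):
        raise ValueError("expected an object")
    if set(tree) != {"name", "job_title"}:
        raise ValueError("expected exactly the fields 'name' and 'job_title'")
    if not all(isinstance(value, str) for value in tree.values()):
        raise ValueError("field values must be strings")
    return Employee(name=tree["name"], job_title=tree["job_title"])


# Reader stage: any well-formed JSON text becomes a generic tree.
tree = json.loads('{"name": "Ada", "job_title": "Engineer"}')

# Parser stage: only some trees count as employees.
employee = parse_employee(tree)
```

A document like `{"name": "Ada", "salary": 1}` gets through the reader without complaint and fails only in `parse_employee`, which mirrors `(lambda 1)` being a valid sexp but not a valid lambda.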

Another clarifying comment - https://lobste.rs/s/ici6ek/bicameral_not_homoiconic#c_bmx0vf


I'll also argue that the ideas in this post absolutely matter in practice.

For example, GitHub Actions uses YAML as its Reader / S-expression / CST layer.

And then it has a separate "parser", for say "if" nodes, and then another parser for the string value of those "if" nodes.

https://docs.github.com/en/actions/writing-workflows/workflo...

    if: ${{ ! startsWith(github.ref, 'refs/tags/') }}

    if: github.repository == 'octo-org/octo-repo-prod'

This fact is poorly exposed to users:

> You must always use the ${{ }} expression syntax or escape with '', "", or () when the expression starts with !, since ! is reserved notation in YAML format.

So I feel that they could have done a better job with language design by taking some lessons from the past.

GitLab has the same kind of hacky language on top of YAML, as far as I remember.



