S-Expressions: The Fat-Free Alternative to JSON (shinkirou.org)
90 points by DanielRibeiro on Jan 20, 2012 | 45 comments



JSON's popularity is largely due to its ability to convert easily to native data structures regardless of what language you happen to be using. In fact, it's less of a data encapsulation format (like XML) and more of a serialization format.

If you need to markup your data, use a markup language like XML or S-Expressions. If you need to serialize data, JSON is an excellent choice.


> native data structures regardless of what language you happen to be using

C, asm, ML, Lisp, Java, Tcl?


Did you have a point? Yes, JSON really is the best format for serializing data from any of those; for Tcl/Lisp/ML you might get lazy and use the native object format, but doing so will bite you sooner or later. For C/asm you might prefer ASN.1 but in my experience human-readability is worth the effort, and you can and should use a library rather than your own code for serialization. For Java it really is crystal clear; how else would you do it?


The last two letters of each acronym really do say a lot:

JavaScript Object Notation

eXtensible Markup Language


"Packed structs, the fat-free alternative to S-Expressions".

JSON's a nearly-perfect balance between human-readability, ease of generation/parsing, and data size. I'm not going to say that it can't get better, but I feel pretty okay saying that I don't feel a burning need to replace it.


JSON can be more compact if it simply used collapsible spaces to delimit elements within an array and key-value pairs within a dictionary. For example:

    [1,[2,3],4,{"a":5,"b":6}] // 25 characters
can be rewritten as:

    [1[2 3]4{"a":5 "b":6}]    // 22 characters
Also, currently, JSON requires all string literals to be quoted. If we relax that constraint and only quote string literals that contain escape characters, we can achieve even more space savings. The above example can be rewritten as:

    [1[2 3]4{a:5 b:6}]        // 18 characters
And, in my opinion, the formatted version (with uncollapsed spaces) is just as readable as the original JSON expression:

    [1 [2 3] 4 {a:5 b:6}]     // 21 characters
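The character counts above can be checked with a small encoder. This is only a sketch of the relaxed syntax as described in this comment, not an existing format: the names `encode`, `encode_string` and `join` are made up for illustration, and the rule for which strings may go unquoted (identifier-like, and not `true`/`false`/`null`) is an assumption.

```python
import json

# Assumption: these words must stay quoted so they are not read as literals.
RESERVED = {"true", "false", "null"}

def encode_string(text, bare):
    # Assumption: a string may go unquoted only if it looks like an identifier.
    if bare and text.isidentifier() and text not in RESERVED:
        return text
    return json.dumps(text)

def join(parts):
    # Insert a space only where two items would otherwise run together,
    # i.e. when neither side of the boundary is a bracket.
    out = ""
    for part in parts:
        if out and out[-1] not in "]}" and part[0] not in "[{":
            out += " "
        out += part
    return out

def encode(value, bare=False):
    if isinstance(value, dict):
        pairs = [f"{encode_string(k, bare)}:{encode(v, bare)}" for k, v in value.items()]
        return "{" + join(pairs) + "}"
    if isinstance(value, list):
        return "[" + join([encode(v, bare) for v in value]) + "]"
    if isinstance(value, str):
        return encode_string(value, bare)
    return json.dumps(value)  # numbers, true/false, null

data = [1, [2, 3], 4, {"a": 5, "b": 6}]
print(json.dumps(data, separators=(",", ":")))  # [1,[2,3],4,{"a":5,"b":6}]  (25)
print(encode(data))                             # [1[2 3]4{"a":5 "b":6}]     (22)
print(encode(data, bare=True))                  # [1[2 3]4{a:5 b:6}]         (18)
```

A matching parser would be needed before any of this is usable; the sketch only covers the size comparison.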


I guess maintaining compatibility with JavaScript is more important than minor space optimizations. Also, if you really care about such things, you shouldn't use JSON in the first place (see Protocol Buffers etc.)


I'm not advocating for changing the JSON spec. I'm saying that, as a human-readable data format, it could be more compact. Perhaps this observation can be useful to those who are designing a new language.


If this format is intended to be primarily human-readable, why should compactness be so crucial? Looking at your example, I personally think that commas increase readability by visually separating individual elements (at the expense of space they take).


This HN post is a discussion about data expressions leaner than JSON. I believe the points I've made contribute to it, and I've never said such compactness is preferable. It's merely an observation.


For small one-off test cases like these, this looks like a major improvement (28%), but in practice the network overhead will dwarf the transmission time for 25 bytes, so a reduction to 18 bytes will be negligible. Hell, the HTTP headers alone will be much larger.

If the data set is fairly large, then the markup will contribute much less to the overall size, so the improvement will probably be more like 5-10%. After HTTP compression, the difference might be negligible.

Testing should definitely be done to see if the space saving is worth the additional parsing overhead (relaxing constraints makes the language more complicated).
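
One way to run that test is sketched below. Everything in it is an assumption made for illustration: the 200 synthetic records, the `lean_record` helper (bare keys, spaces instead of commas), and `zlib` at level 9 standing in for HTTP gzip/deflate. It prints raw and compressed sizes so the two encodings can be compared on your own data.

```python
import json
import zlib

# Synthetic, deterministic data set (an assumption; substitute real payloads).
records = [{"id": i, "name": f"user{i}", "score": i * 7 % 100} for i in range(200)]

# Standard JSON with no optional whitespace.
standard = json.dumps(records, separators=(",", ":"))

def lean_record(record):
    # Hypothetical lean encoding for flat records: unquoted keys, space-delimited pairs.
    return "{" + " ".join(f"{k}:{json.dumps(v)}" for k, v in record.items()) + "}"

lean = "[" + "".join(lean_record(r) for r in records) + "]"

def sizes(text):
    raw = text.encode("utf-8")
    return len(raw), len(zlib.compress(raw, 9))

standard_raw, standard_z = sizes(standard)
lean_raw, lean_z = sizes(lean)

print(f"standard: {standard_raw} bytes raw, {standard_z} bytes deflated")
print(f"lean:     {lean_raw} bytes raw, {lean_z} bytes deflated")
```

This measures size only; the parsing cost of the relaxed syntax would need a separate benchmark.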

Same applies to the original post, especially since the examples seem even more contrived. E.g., the author goes from <p></p> in (X)HTML to delimit a paragraph to {"format":"Paragraph","content":[]} in JSON, when one could just as easily contrive <format type="paragraph"></format> in XML versus {"p":[]} to make JSON look better.


These are all the sorts of optimisations that are only really relevant for saving bytes over the wire. Since JSON is usually sent over HTTP, gzip/deflate will get 90% of the reduction anyway.


Another issue is serializing binaries. Sometimes you just have to pass a binary through. I wish there were a rational way to do that with JSON without having to go through base64.
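
For reference, the base64 detour looks roughly like this with Python's standard library. The `pack`/`unpack` helpers and the `"payload"` key are hypothetical names chosen for the example; the point is the extra encode/decode step and the 4/3 size overhead.

```python
import base64
import json

def pack(blob: bytes) -> str:
    # JSON strings cannot hold arbitrary bytes, so encode them as ASCII text first.
    return json.dumps({"payload": base64.b64encode(blob).decode("ascii")})

def unpack(message: str) -> bytes:
    return base64.b64decode(json.loads(message)["payload"])

blob = bytes(range(256)) + b"\x00\xff" * 22   # 300 arbitrary bytes
message = pack(blob)

assert unpack(message) == blob                 # round-trips intact
encoded = json.loads(message)["payload"]
print(len(blob), len(encoded))                 # 300 400: base64 adds one third
```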


And MongoDB's BSON does reasonably well at serialising data in a binary format when the receiver doesn't have prior knowledge of the structure.


JSON is nearly sexps already.

It's about as lispy as you can get without being Lisp. The curly braces could be changed to parentheses and it would be rather trivial to write a Lisp macro to parse JSON into a nested set of associative arrays or hash maps.

In fact, if you just remove the JS ':' member notation, you can shove most of JSON into a Clojure repl already, since Clojure supports the convenient {} notation for maps.

There's no reason to make JSON look precisely like Lisp. Given that many people seem to be allergic to Lisp-style nested parentheses once they've been educated that it's Lisp, JSON offers the flexibility of constructing arbitrary sexp trees without people ever noticing they're almost generating and parsing Lisp.

When writing Lisp, it's of course sensible to use sexps instead of JSON for internal marshalling.


How can control structures be implemented with JSON, though?


The same as in Lisp. Since code is data and data is code, an 'if' is a function that merely evaluates and returns either its second or its third argument, depending on how the first argument evaluates.

Reminds me of this tiny LISP: http://www.brool.com/index.php/the-tiniest-lisp-in-python


Lispish: ( if ( > a b ) 'foo 'bar )

JSON: ["if", [">", "a", "b"], "foo", "bar"]

If you are writing code then I'd go for the Lisp approach, for general purpose data then these days I'd go with JSON.
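
To make the 'if' idea concrete, here is a minimal evaluator for expressions written as JSON arrays. It is a sketch under stated assumptions, not anyone's actual implementation: the `OPS` table and the `evaluate` function are invented for the example, and a string is treated as a variable only if it appears in the environment (JSON has no symbol type, so some such convention is needed).

```python
import json
import operator

# Hypothetical operator table; extend as needed.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq, "+": operator.add}

def evaluate(expr, env):
    if isinstance(expr, str):
        # Assumption: strings found in env are variables, all others are literals.
        return env.get(expr, expr)
    if not isinstance(expr, list) or not expr:
        return expr  # numbers, booleans, null, empty list
    head, *args = expr
    if head == "if":
        test, then, otherwise = args
        # Only the chosen branch is evaluated, as in Lisp.
        return evaluate(then if evaluate(test, env) else otherwise, env)
    return OPS[head](*(evaluate(arg, env) for arg in args))

program = json.loads('["if", [">", "a", "b"], "foo", "bar"]')
print(evaluate(program, {"a": 2, "b": 1}))  # foo
print(evaluate(program, {"a": 1, "b": 2}))  # bar
```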


And how can that S-expression control structure be usable in all other scripting languages besides Lisp?

If you can answer this question then you have already answered your own question.

But if your answer is "I don't have to because I use Lisp and I don't use other languages" then you miss the point of JSON.


I don't get the article: if you want to copy sexps in JSON, why don't you just use JSON lists, mapping symbols to strings? e.g.

    (p "Paragraph text here, " (b "bold text"))
becomes

    ['p','paragraph text here, ',['b','bold text']]
where spaces become commas, and '()' become '[]'.

The JSON example for HTML formatting is deliberately verbose, with no reason to be.
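
The mapping is mechanical enough to sketch in a few lines. The `TOKEN` pattern and `parse_sexp` function below are illustrative names, and the reader is deliberately tiny: it assumes balanced parentheses and double-quoted strings with JSON-style escapes, and it turns symbols into plain strings, so symbols and strings become indistinguishable on the JSON side.

```python
import json
import re

# Tokens: parentheses, double-quoted strings, or bare symbols.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def parse_sexp(text):
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume ")"
            return items
        if tok.startswith('"'):
            return json.loads(tok)  # assumption: JSON-compatible escapes
        return tok  # symbols become plain strings

    return read()

tree = parse_sexp('(p "Paragraph text here, " (b "bold text"))')
print(json.dumps(tree, separators=(",", ":")))
# ["p","Paragraph text here, ",["b","bold text"]]
```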


The author's json example could have been something like:

  { 'p': ['Paragraph text here', { 'b': 'bold text' }] }
Still more verbose than the sexp example but not terribly so.


Yeah - I make my templates in much the style you describe, and render them with https://github.com/twfarland/don.


    (or (= 0 (+ x y)) (print "hi"))
    ['or', ['=', 0, ['+', 'x', 'y']], ['print', 'hi']]


Following the examples in the blog post, seems like the author would translate the following JSON expressions:

    {'a': 1}
    {'a': [1]}
    ['a', 1]
    ['a', [1]]
into this:

    ('a' 1)
Isn't that ambiguous? Or, do they need to be demarcated using special tokens (e.g. a for array and o for object) like this?

    {'a': 1}   => (o ('a' 1))
    {'a': [1]} => (o ('a' (a 1)))
If that's the case, nested arrays would require more characters to encode in S-expressions than in JSON.

    ['a',['b',['c',['d',['e']]]]]
vs

    (a 'a' (a 'b' (a 'c' (a 'd' (a 'e')))))
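The tagged scheme can be written down directly. This sketch uses the `a`/`o` tags from the comment above; the `to_sexp` name is made up, and scalars are simply emitted in their JSON spelling (double-quoted strings, `null`, `true`, `false`), which is an assumption rather than anything the article specifies.

```python
import json

def to_sexp(value):
    if isinstance(value, dict):
        # Objects: (o (key value) ...)
        pairs = "".join(f" ({json.dumps(k)} {to_sexp(v)})" for k, v in value.items())
        return f"(o{pairs})"
    if isinstance(value, list):
        # Arrays: (a item ...)
        items = "".join(f" {to_sexp(v)}" for v in value)
        return f"(a{items})"
    return json.dumps(value)  # scalars keep their JSON spelling

print(to_sexp({"a": 1}))      # (o ("a" 1))
print(to_sexp({"a": [1]}))    # (o ("a" (a 1)))
print(to_sexp(["a", 1]))      # (a "a" 1)
print(to_sexp([]), to_sexp({}))  # (a) (o)

nested = ["a", ["b", ["c", ["d", ["e"]]]]]
print(len(json.dumps(nested, separators=(",", ":"))), len(to_sexp(nested)))  # 29 39
```

With the tags in place the four inputs no longer collide, at the cost of the extra characters counted on the last line.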


You're correct, the major disadvantage here is the ambiguity not just in this JSON/S-expression translation but in any possible JSON/S-expression translation. Linear arrays and key-value dictionaries in most programming languages are intuitive and meaningful and usually even have a native data type. The conversion is usually very obvious. Not so much with S-expressions.

An even simpler problem is how do you express an empty array and an empty object in S-expressions? Is it () for both?

EDIT: and what about nulls? What about strings that resemble numbers?


Yes, there are certainly more problems. I think the reason why the author thought S-expressions are more compact is because the comparison was made between formatted versions of the expressions (how they would appear in code):

    ["a", "b", "c"]
    ("a" "b" "c")
Since people generally put spaces after their commas.


I did this in my code:

    {'a': 1} -> (dict ('a' 1))
    {'a': [1]} -> (dict ('a' (1)))
    ['a', 1] -> ('a' 1)
    ['a', [1]] -> ('a' (1))
    ['a',['b',['c',['d',['e']]]]] -> ('a' ('b' ('c' ('d' ('e')))))


Quoting Greenspun's tenth rule:

  Any sufficiently complicated C or Fortran program
  contains an ad-hoc, informally-specified, bug-ridden,
  slow implementation of half of Common Lisp.


And then:

    Any sufficiently complicated Common Lisp
    program contains an ad-hoc, informally-specified,
    bug-ridden, slow implementation of half of Prolog.


One difference - the former is true, the latter, not so much.


Corollary:

  Including Common Lisp.


As alluded to by Steve Yegge in a blog post in 2005 (http://sites.google.com/site/steveyegge2/the-emacs-problem), if you are using a Lisp, storing data as sexp means that you can define the appropriate functions, and "execute" the data to transform it however you like.


And here I thought JSON was the fat-free alternative to.. anything except binary formats :-)


Of course, if you know your dataset and can just use split (or explode) with a single delimiter (e.g. |), that's always going to be much faster and more compact.


That would only work for one-level lists of values, not general data structures like lists inside lists, dictionaries, in arbitrary nesting etc.

It's not really a general serialization solution at all.
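
A quick illustration of the limitation, using made-up sample values: a flat list survives a join/split round trip, while a nested one has to be flattened first and comes back without its structure.

```python
flat = ["alice", "bob", "carol"]
line = "|".join(flat)
assert line.split("|") == flat          # flat lists round-trip fine

nested = ["alice", ["bob", "carol"]]

def flatten(items):
    # A single delimiter has no way to mark where an inner list starts or ends.
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

restored = "|".join(flatten(nested)).split("|")
print(restored)                          # ['alice', 'bob', 'carol']
assert restored != nested                # the nesting is gone
```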


> Introducing S(ymbolic)-Expressions ...

I laughed at that. Lisp-ers might want to have a word with him.


we just finally got away from XML for good and JSON is great. last thing we need is another alternative. let's try to create standards and stop coming up w/ exotic languages so we can put our efforts into proper development.


i'm waiting for the i-can't-believe-it's-not-xml! xml


In a few years I bet JavaScript developers will discover the simplicity of tab-separated values.


Of course.. because a simple list is of course better than arbitrary nested data structures which are then directly translated into nested object structures.


Whoever does not understand JSON, is doomed to reinvent it.


I think you have that backwards.


I don't understand how anyone could think that I am not aware of the original expression which I have paraphrased. As others have noted, JSON is a local optimum for a particular group of problems. Using S-expressions does not improve on that solution; they are in fact a worse fit. Just as many misunderstand Lisp, whoever thinks S-expressions would be a better fit for the problems JSON addresses misunderstands JSON (or those problems).

Well, I thought it was a clever joke, anyway.


My apologies, I took your comment a bit too seriously. I guess it was too clever of a joke.


"Whoever does not re-invent JSON is doomed to understand it" ?



