S-Expressions: The Fat-Free Alternative to JSON

maratd · on Jan 20, 2012

JSON's popularity is largely due to its ability to easily convert to native data structures regardless of what language you happen to be using. In fact, it's less of a data encapsulation format (like XML) and more of a serialization format.

If you need to mark up your data, use a markup language like XML or S-Expressions. If you need to serialize data, JSON is an excellent choice.

anon_d · on Jan 20, 2012

> native data structures regardless of what language you happen to be using

C, asm, ML, Lisp, Java, Tcl?

lmm · on Jan 20, 2012

Did you have a point? Yes, JSON really is the best format for serializing data from any of those; for Tcl/Lisp/ML you might get lazy and use the native object format, but doing so will bite you sooner or later. For C/asm you might prefer ASN.1 but in my experience human-readability is worth the effort, and you can and should use a library rather than your own code for serialization. For Java it really is crystal clear; how else would you do it?

bunderbunder · on Jan 21, 2012

The last two letters of each acronym really do say a lot:

JavaScript Object Notation

eXtensible Markup Language

cheald · on Jan 20, 2012

"Packed structs, the fat-free alternative to S-Expressions".

JSON's a nearly-perfect balance between human-readability, ease of generation/parsing, and data size. I'm not going to say that it can't get better, but I feel pretty okay saying that I don't feel a burning need to replace it.

buddydvd · on Jan 20, 2012

JSON can be more compact if it simply used collapsible spaces to delimit elements within an array and key-value pairs within a dictionary. For example:

    [1,[2,3],4,{"a":5,"b":6}] // 25 characters

can be rewritten as:

    [1[2 3]4{"a":5 "b":6}]    // 22 characters

Also, currently, JSON requires all string literals to be quoted. If we relax that constraint and only quote string literals that contain escape characters, we can achieve even more space savings. The above example can be rewritten as:

    [1[2 3]4{a:5 b:6}]        // 18 characters

And, in my opinion, the formatted version (with uncollapsed spaces) is just as readable as the original JSON expression:

    [1 [2 3] 4 {a:5 b:6}]     // 21 characters
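
The character counts above are easy to check mechanically. A minimal Python sketch (the variable names are only illustrative, and the relaxed forms are treated as plain strings because no standard parser accepts them):

```python
# The four spellings of the same data from the comment above.
original = '[1,[2,3],4,{"a":5,"b":6}]'
no_commas = '[1[2 3]4{"a":5 "b":6}]'
no_quotes = '[1[2 3]4{a:5 b:6}]'
formatted = '[1 [2 3] 4 {a:5 b:6}]'

for label, text in [("original", original), ("no commas", no_commas),
                    ("no quotes", no_quotes), ("formatted", formatted)]:
    print(f"{label}: {len(text)} characters")

# Relative saving of the most compact form over standard JSON.
saving = 100 * (len(original) - len(no_quotes)) / len(original)
print(f"saving: {saving:.0f}%")
```

Only the first string is valid JSON; the other three would need a custom parser.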

adambyrtek · on Jan 20, 2012

I guess maintaining compatibility with JavaScript is more important than minor space optimizations. Also, if you really care about such things, you shouldn't use JSON in the first place (see Protocol Buffers etc.)

buddydvd · on Jan 20, 2012

I'm not advocating for changing the JSON spec. I'm saying, as a human-readable data format, it could be more compact. Perhaps this observation can be useful to those who are designing a new language.

adambyrtek · on Jan 20, 2012

If this format is intended to be primarily human-readable, why should compactness be so crucial? Looking at your example, I personally think that commas increase readability by visually separating individual elements (at the expense of space they take).

buddydvd · on Jan 20, 2012

This HN post is a discussion about data expressions leaner than JSON. I believe the points I've made contribute to it, and I've never said such compactness is preferable. It's merely an observation.

kbolino · on Jan 20, 2012

For small one-off test cases like these, this looks like a major improvement (28%), but in practice the network overhead will dwarf the transmission time for 25 bytes, so a reduction to 18 bytes will be negligible. Hell, the HTTP headers alone will be much larger.

If the data set is fairly large, then the markup will contribute much less to the overall size, so the improvement will probably be more like 5-10%. After HTTP compression, the difference might be negligible.

Testing should definitely be done to see if the space saving is worth the additional parsing overhead (relaxing constraints makes the language more complicated).

Same applies to the original post, especially since the examples seem even more contrived. E.g., the author goes from <p></p> in (X)HTML to delimit a paragraph to {"format":"Paragraph","content":[]} in JSON, when one could just as easily contrive <format type="paragraph"></format> in XML versus {"p":[]} to make JSON look better.

chrisfarms · on Jan 20, 2012

These are all the sorts of optimisations that are only really relevant for saving bytes over the wire. Since JSON is usually used over HTTP, gzip/deflate will get 90% of the reduction anyway.
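
Whether compression really absorbs most of the difference depends on the payload, so it is worth measuring. A rough Python sketch, assuming a made-up dataset of repetitive records (the `records` data and the `report` helper are hypothetical):

```python
import gzip
import json

# Hypothetical sample data: 1000 small records with repeated keys.
records = [{"id": i, "name": f"user{i}", "tags": ["a", "b"]} for i in range(1000)]

# Default separators (", " and ": ") versus no optional whitespace at all.
spaced = json.dumps(records).encode("utf-8")
compact = json.dumps(records, separators=(",", ":")).encode("utf-8")

def report(label, raw):
    """Print and return the raw and gzip-compressed sizes of a payload."""
    packed = gzip.compress(raw)
    print(f"{label}: {len(raw)} bytes raw, {len(packed)} bytes gzipped")
    return len(raw), len(packed)

spaced_raw, spaced_gz = report("spaced", spaced)
compact_raw, compact_gz = report("compact", compact)
```

Comparing the two gzipped sizes against the two raw sizes shows how much of the whitespace saving survives compression for that particular dataset.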

rdtsc · on Jan 20, 2012

Another issue is serializing binaries. Sometimes you just have to pass a binary through. I wish there was a rational way to do that with JSON without having to go through base64.
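
For reference, the base64 detour looks roughly like this in Python. It is only a sketch: the `filename` and `data` field names are invented for the example.

```python
import base64
import json

# Arbitrary binary data, including bytes that are not valid UTF-8.
payload = bytes(range(256))

# Encode: bytes -> base64 text -> JSON string value.
encoded = base64.b64encode(payload).decode("ascii")
message = json.dumps({"filename": "blob.bin", "data": encoded})

# Decode: JSON string value -> base64 text -> bytes.
restored = base64.b64decode(json.loads(message)["data"])

print(len(payload), "bytes become", len(encoded), "base64 characters")
```

Base64 turns every 3 bytes into 4 characters, so the payload grows by about a third before any compression.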

alexchamberlain · on Jan 20, 2012

And MongoDB's BSON does reasonably well at serialising data in a binary format when the receiver doesn't have prior knowledge of the structure.

yason · on Jan 20, 2012

JSON is nearly sexps already.

It's about as lispy as you can get without being Lisp. The curly braces could be changed to parentheses and it would be rather trivial to write a Lisp macro to parse JSON into a nested set of associative arrays or hash maps.

In fact, if you just remove the JS ':' member notation, you can shove most of JSON into a Clojure repl already, since Clojure supports the convenient {} notation for maps.

There's no reason to make JSON look precisely like Lisp. Given that many people seem to be allergic to Lisp-style nested parentheses once they've been educated that it's Lisp, JSON offers the flexibility of constructing arbitrary sexp trees without people ever noticing they're almost generating and parsing Lisp.

When writing Lisp, it's of course sensible to use sexps instead of JSON for internal marshalling.

itmag · on Jan 20, 2012

How can control structures be implemented with JSON, though?

lloeki · on Jan 20, 2012

The same as in LISP. Code being data and data being code, an 'if' is a function that merely evaluates and returns either the second or the third argument, depending on what the first argument evaluates to.

Reminds me of this tiny LISP: http://www.brool.com/index.php/the-tiniest-lisp-in-python

arethuza · on Jan 20, 2012

Lispish: ( if ( > a b ) 'foo 'bar )

JSON: [ "if", [ ">", "a", "b" ], "foo", "bar" ]

If you are writing code then I'd go for the Lisp approach, for general purpose data then these days I'd go with JSON.
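
To make the JSON form concrete, here is a minimal Python sketch of an evaluator for such lists. The `OPS` table, the `evaluate` function, and the rule that a string is a variable only when it is bound in `env` are all assumptions made for this example, not part of any standard.

```python
import json
import operator

# Hypothetical operator table; a real evaluator would need many more entries.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq, "+": operator.add}

def evaluate(expr, env):
    """Evaluate a JSON-decoded expression against a dict of variable bindings."""
    if isinstance(expr, str):
        # Assumption: a string naming a bound variable is a symbol,
        # any other string is a literal.
        return env.get(expr, expr)
    if not isinstance(expr, list):
        return expr  # numbers, booleans, null
    head, *args = expr
    if head == "if":
        condition, then_branch, else_branch = args
        # Only the selected branch is evaluated, as in Lisp.
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    return OPS[head](*(evaluate(arg, env) for arg in args))

program = json.loads('["if", [">", "a", "b"], "foo", "bar"]')
print(evaluate(program, {"a": 2, "b": 1}))
print(evaluate(program, {"a": 1, "b": 2}))
```

The symbol-versus-string rule is the awkward part: JSON has no symbol type, so any such encoding has to pick a convention for telling names apart from string literals.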

joesb · on Jan 20, 2012

And how can that S-EXP control structure be usable in all the other scripting languages besides Lisp?

If you can answer this question then you have already answered your own question.

But if your answer is "I don't have to because I use Lisp and I don't use other languages" then you miss the point of JSON.

riffraff · on Jan 20, 2012

I don't get the article: if you want to copy sexps in JSON, why don't you just use JSON lists, mapping symbols to strings? e.g.

    (p "Paragraph text here, " (b "bold text"))

becomes

    ['p','paragraph text here, ',['b','bold text']]

where spaces become commas, and '()' becomes '[]'.

The JSON example for HTML formatting is deliberately verbose for no good reason.

hakanensari · on Jan 20, 2012

The author's json example could have been something like:

  { 'p': ['Paragraph text here', { 'b': 'bold text' }] }

Still more verbose than the sexp example but not terribly so.

twfarland · on Jan 20, 2012

Yeah - I make my templates in much the style you describe, and render them with https://github.com/twfarland/don.

anon_d · on Jan 20, 2012

    (or (= 0 (+ x y)) (print "hi"))
    ['or', ['=', 0, ['+', 'x', 'y']], ['print', 'hi']]

buddydvd · on Jan 20, 2012

Following the examples in the blog post, seems like the author would translate the following JSON expressions:

    {'a': 1}
    {'a': [1]}
    ['a', 1]
    ['a', [1]]

into this:

    ('a' 1)

Isn't that ambiguous? Or, do they need to be demarcated using special tokens (e.g. a for array and o for object) like this?

    {'a': 1}   => (o ('a' 1))
    {'a': [1]} => (o ('a' (a 1)))

If that's the case, nested arrays would require more characters to encode in S-expression than in JSON.

    ['a',['b',['c',['d',['e']]]]]

vs

    (a 'a' (a 'b' (a 'c' (a 'd' (a 'e')))))

qntm · on Jan 20, 2012

You're correct, the major disadvantage here is the ambiguity not just in this JSON/S-expression translation but in any possible JSON/S-expression translation. Linear arrays and key-value dictionaries in most programming languages are intuitive and meaningful and usually even have a native data type. The conversion is usually very obvious. Not so much with S-expressions.

An even simpler problem is how do you express an empty array and an empty object in S-expressions? Is it () for both?

EDIT: and what about nulls? What about strings that resemble numbers?

buddydvd · on Jan 20, 2012

Yes, there are certainly more problems. I think the reason why the author thought S-expressions are more compact is because the comparison was made between formatted versions of the expressions (how they would appear in code):

    ["a", "b", "c"]
    ("a" "b" "c")

Since people generally put spaces after their commas.

rdtsc · on Jan 20, 2012

I did this in my code:

    {'a': 1}   -> (dict ('a' 1))
    {'a': [1]} -> (dict ('a' (1)))
    ['a', 1]   -> ('a' 1)
    ['a', [1]] -> ('a' (1))
    ['a',['b',['c',['d',['e']]]]] -> ('a' ('b' ('c' ('d' ('e')))))
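
A mapping along these lines can be sketched in a few lines of Python. This is a guess at the scheme rather than the actual code: the `to_sexp` name is hypothetical, and rendering null/true/false as bare symbols is an assumption the comment does not cover.

```python
def to_sexp(value):
    """Render a JSON-compatible Python value as S-expression text."""
    if value is None:
        return "null"  # assumption: bare symbol for JSON null
    if isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int.
        return "true" if value else "false"
    if isinstance(value, dict):
        # Objects are tagged with a leading `dict` symbol; arrays stay untagged.
        pairs = [f"({to_sexp(k)} {to_sexp(v)})" for k, v in value.items()]
        return "(" + " ".join(["dict"] + pairs) + ")"
    if isinstance(value, list):
        return "(" + " ".join(to_sexp(item) for item in value) + ")"
    if isinstance(value, str):
        return "'" + value + "'"  # no escaping: quotes inside strings are not handled
    return str(value)  # ints and floats

print(to_sexp({"a": [1]}))
print(to_sexp(["a", ["b", ["c", ["d", ["e"]]]]]))
print(to_sexp([]), to_sexp({}))
```

Because only objects carry a tag, an empty array comes out as `()` and an empty object as `(dict)`, and a string `'1'` stays distinct from the number `1` as long as strings are always quoted.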

plaes · on Jan 20, 2012

Quoting Greenspun's tenth rule:

  Any sufficiently complicated C or Fortran program
  contains an ad hoc, informally-specified, bug-ridden,
  slow implementation of half of Common Lisp.

rdtsc · on Jan 20, 2012

And then:

    Any sufficiently complicated Common Lisp
    program contains an ad-hoc, informally-specified,
    bug-ridden, slow implementation of half of Prolog.

anamax · on Jan 20, 2012

One difference - the former is true, the latter, not so much.

koenigdavidmj · on Jan 20, 2012

Corollary:

  Including Common Lisp.

notaddicted · on Jan 20, 2012

As alluded to by Steve Yegge in a blog post in 2005 (http://sites.google.com/site/steveyegge2/the-emacs-problem), if you are using a Lisp, storing data as sexp means that you can define the appropriate functions, and "execute" the data to transform it however you like.

sgt · on Jan 20, 2012

And here I thought JSON was the fat-free alternative to.. anything except binary formats :-)

ck2 · on Jan 20, 2012

Of course, if you know your dataset and can just use split (or explode) with a single delimiter (e.g. |), that's always going to be much faster and more compact.

batista · on Jan 20, 2012

That would only work for one-level lists of values, not general data structures like lists inside lists, dictionaries, in arbitrary nesting etc.

It's not really a general serialization solution at all.
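
A small Python illustration of where the single-delimiter approach holds up and where it stops working (the sample values are made up):

```python
# A flat list of delimiter-free strings round-trips cleanly.
flat = ["a", "b", "c"]
encoded = "|".join(flat)
decoded = encoded.split("|")
print(decoded == flat)

# A value that contains the delimiter is silently split in two,
# and there is no way at all to mark where a nested list begins or ends.
tricky = ["a|b", "c"]
broken = "|".join(tricky).split("|")
print(broken)
```

Fixing either case means adding escaping or bracketing rules, at which point the format starts growing toward JSON or S-expressions.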

rdtsc · on Jan 20, 2012

> Introducing S(ymbolic)-Expressions ...

I laughed at that. Lisp-ers might want to have a word with him.

jaequery · on Jan 20, 2012

we just finally got away from XML for good and JSON is great. last thing we need is another alternative. let's try to create standards and stop coming up w/ exotic languages so we can put our efforts into proper development.

lani · on Jan 20, 2012

i'm waiting for the i-can't-believe-it's-not-xml! xml

wavephorm · on Jan 20, 2012

In a few years I bet JavaScript developers will discover the simplicity of tab-separated values.

fforw · on Jan 20, 2012

Of course.. because a simple list is of course better than arbitrary nested data structures which are then directly translated into nested object structures.

mambodog · on Jan 20, 2012

Whoever does not understand JSON, is doomed to reinvent it.

spacemanaki · on Jan 20, 2012

I think you have that backwards.

mambodog · on Jan 20, 2012

I don't understand how anyone could think that I am not aware of the original expression which I have paraphrased. As others have noted, JSON is a local optimum for a particular group of problems. Using S-Expressions does not improve on that solution; they are in fact a worse fit. Just as many people misunderstand Lisp, anyone who thinks S-Expressions would be a better fit for the problems JSON addresses misunderstands JSON (or those problems).

Well, I thought it was a clever joke, anyway.

spacemanaki · on Jan 21, 2012

My apologies, I took your comment a bit too seriously. I guess it was too clever of a joke.

OliverM · on Jan 20, 2012

"Whoever does not re-invent JSON is doomed to understand it" ?