JSON isn't a good configuration language

GauntletWizard · on July 18, 2018

I want to point out Jsonnet[1] - It's a language that renders to JSON, it's simple to write, functional, and has good tooling for Kubernetes[2].

[1]https://jsonnet.org/learning/tutorial.html [2]https://ksonnet.io/

rojopolis · on July 19, 2018

We're using jsonnet for our config DSL here: https://github.com/rojopolis/crt So far, it looks extremely promising. I believe the Kubernetes extensions also allow it to import YAML.

tappleby · on July 19, 2018

We have been using jsonnet for Amazon ECS service and task definitions, works quite well.

dacompton · on July 19, 2018

Does it look anything like https://github.com/dan-compton/ecsonnet

How are you managing the registration of service and task definitions?

TanakaTarou · on July 19, 2018

It surprises me that the author doesn't list XML as an option. XML has been used for configuration for 15+ years where I work with little or no issues. It is easy to use and easy to read and because of that I will continue to prefer XML over the alternate format listed in the article.

geezerjay · on July 19, 2018

> It surprises me that the author doesn't list XML as an option.

The author points out that in his opinion the fact that JSON objects are sandwiched between brackets makes the language overly verbose. By the same reasoning this makes XML an unusable absurd mess of a language, even without taking into account all the boilerplate code.

Mikhail_Edoshin · on July 19, 2018

Well, when it comes to object notation XML is actually less noisy than JSON. It has more characters (maybe), but fewer special symbols. For example, let's imagine a notation to describe a chess party:

    <party xmlns="urn:mycompany:games:Chess">
      <move white="e4" black="e5" />
      <move ... />
    </party>

    { "__type": "urn:mycompany:games:Chess:Party",
      "moves" : [
        {"white": "e4", "black": "e5" },
        { ... }
       ]
    }

Compare the lines to describe the moves:

    <move white="e4" black="e5" />
    {"white": "e4", "black": "e5" },

The one for JSON is actually two characters longer and (in my opinion) is harder to read because of all these noisy symbols to enclose and separate values. Let's look at them in isolation:

    XML  <=""=""/>
    JSON {"":"","":""},

This is indeed boilerplate.
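The counts above can be checked mechanically. A minimal sketch, assuming "special symbols" means every character that is not a letter, digit, or whitespace (the helper name `structural_symbols` is made up for this illustration):

```python
# The two "move" lines from the comparison above, copied verbatim.
xml_line = '<move white="e4" black="e5" />'
json_line = '{"white": "e4", "black": "e5" },'


def structural_symbols(line: str) -> str:
    """Keep only characters that are neither letters, digits, nor whitespace."""
    return "".join(ch for ch in line if not ch.isalnum() and not ch.isspace())


for name, line in (("XML", xml_line), ("JSON", json_line)):
    symbols = structural_symbols(line)
    print(f"{name:4} length={len(line)} symbols={len(symbols)} {symbols}")
```

Under that definition the JSON line is two characters longer overall and carries 14 structural symbols against 9 for the XML line.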

Also note that XML includes the element name, while with JSON we are supposed to know the object structure. (And while I'm at it note that this XML name, despite being just a generic "move", is actually globally unique so when you process XML with any tools you'll never mess up two different "move" elements from different XML dialects. With JSON our best shot is to have an in-house convention of how we mark object types and roll our own tools to check this.)
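To illustrate the point about globally unique names, here is a small sketch using Python's standard `xml.etree.ElementTree` on the chess document from above (trimmed to one move). It only shows how one namespace-aware parser exposes the names; other tools may present them differently.

```python
import xml.etree.ElementTree as ET

doc = """
<party xmlns="urn:mycompany:games:Chess">
  <move white="e4" black="e5" />
</party>
""".strip()

root = ET.fromstring(doc)

# ElementTree expands the default namespace into every tag name.
print(root.tag)     # {urn:mycompany:games:Chess}party
print(root[0].tag)  # {urn:mycompany:games:Chess}move

# A lookup has to name the namespace, so a "move" element from some
# other XML dialect would not match this query.
ns = {"chess": "urn:mycompany:games:Chess"}
moves = root.findall("chess:move", ns)
print([(m.get("white"), m.get("black")) for m in moves])

# A bare, un-namespaced "move" finds nothing in this document.
print(root.findall("move"))
```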

At the end JSON kind of becomes more concise compared to XML:

    ]}
    </party>

I don't think it's such a huge win, especially because these are again special symbols. Besides, being overly concise is not always a virtue. Compare, for example, the syntax of functions in Ada:

    procedure Test is
    ...
    begin
    ...
    end Test;

Note that the designers of Ada decided to require a qualifier after the generic "end" to clearly mark it as the end of a specific procedure. It's interesting to read Ada Design Goals[1], which clearly mentions this in the section "Concern for programmer: Language is small and consistent. Readability favored over writability".

JSON has its strong points compared to XML, but for some reason every discussion about XML is riddled with over-the-top negative generalizations like "unusable absurd mess of a language", while in fact XML is a very solidly engineered thing. Just like Ada :)

[1]: https://www.radford.edu/~nokie/classes/320/designGoals.html

geezerjay · on July 19, 2018

> For example, let's imagine a notation to describe a chess party:

In that example XML is missing the XML declaration. Furthermore, JSON doesn't require a type field. That skews the comparison quite a bit.

Furthermore, the amount of symbols is irrelevant. Otherwise brainfuck would be considered exemplar wrt clarity.

> Also note that XML includes the element name, while with JSON we are supposed to know the object structure.

Exactly, that's another way JSON is simpler than XML as it requires no boilerplate code to convey structure.

That is particularly obvious in the sort of application covered in this discussion.

Mikhail_Edoshin · on July 19, 2018

XML declaration is optional (in 1.0, which everybody uses). My point was not that JSON is longer; JSON is noisier. XML in this example is longer, but easier to read because it has more content and less visual and cognitive overhead to convey the same structure (plus extra: element name). Some structural symbols are necessary, of course; we cannot just write "whitee4blacke5" (we can, in fact, it's parseable), but JSON uses more than XML.

XML element names are not boilerplate, they're type names. Maybe configs do not need types, but who knows; maybe they do. Types are there for a reason. E.g. if users write configs, it would be totally cool to verify they're at least syntactically correct before using them. With JSON there's no standard way to do this; with XML there is. (And users write configs all the time; I, for example, being a user of Make and Apache routinely write Makefiles and .htaccess files.)

nineteen999 · on July 19, 2018

I have entire environments marked up as XML, including application servers and various parameters, database clusters and their replication parameters/partners etc.

This feeds into automation for everything from inventory/hostvars for ansible, DNS zone file contents, DHCP configuration, placement of VMs on VMware hosts etc. Clear yet flexible structure and the ability to have attributes on the nodes themselves as well as data within them makes it so much more versatile than YAML or JSON for me.

Add to that XPath queries etc. xmllint/xmlstarlet/python+lxml are so useful, every time I try to process JSON with tools like jq I feel like I am going crosseyed.
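As a rough idea of what such queries look like, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath (lxml, mentioned above, supports much more). The inventory document and all of its element and attribute names are invented for illustration; they are not the poster's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory; element and attribute names are made up.
INVENTORY = """
<environment name="prod">
  <appserver host="app01" port="8080" role="web" />
  <appserver host="app02" port="8080" role="worker" />
  <dbcluster name="main">
    <node host="db01" />
    <node host="db02" primary="db01" />
  </dbcluster>
</environment>
""".strip()

root = ET.fromstring(INVENTORY)

# Attribute predicate: direct children with role="web".
web_hosts = [e.get("host") for e in root.findall("./appserver[@role='web']")]

# Descendant search: any node, at any depth, replicating from db01.
replicas = [e.get("host") for e in root.findall(".//node[@primary='db01']")]

print(web_hosts, replicas)
```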

edejong · on July 19, 2018

Yes, and editors with tab completion and validation based on XML schemas.

thayne · on July 25, 2018

I didn't mention XML for a couple of reasons:

1. As others mentioned it can be pretty verbose (especially with doctypes and namespaces)

2. Potential security concerns (https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Pr...) although that doesn't matter so much if the configuration is trusted (not always the case though) and can be mitigated if you know what you are doing.

passivepinetree · on July 18, 2018

My workplace has bought pretty heavily into Azure, which I generally love. However, the specification for any given Azure resource is stored in a JSON file called an ARM (Azure Resource Manager) template. This is usually okay, except for two main problems.

One is the lack of comments that is always brought up when criticizing JSON.

Two is the fact that Azure provides a way to access certain keys, IDs, etc. of resources in an ARM template, but it's very clunky. It's basically a specification for a service call, often involving mod operators or looping, in a single line of JSON. Such specifications are unreadable and unavoidable when provisioning any mildly complicated Azure resources, and they make ARM templates much more difficult to maintain.

If it's wanted, I can provide examples of what I mean.

alphabettsy · on July 18, 2018

TOML is probably my favorite.

stock_toaster · on July 19, 2018

I personally like ucl[1], but toml is pretty decent too.

[1]: https://github.com/vstakhov/libucl

mlthoughts2018 · on July 19, 2018

JSON is a great configuration language. I find it far more human-readable than Toml or similar ini-styles, while being more restricted than YAML.

Since config files ought to be version controlled outside of the application and delivered as an installable artifact (yes even your local Postgres conf file, etc.), it means end users can customize options solely using the 12-factor approach of storing config in environment variables, and never fishing around to change them in a hand-edited config file.

This pattern is really pleasant to work with, especially using container definitions or shell scripts to define different combinations of config (as ENV vars), instead of needing to maintain a mapping between random and often scattered files into their meanings for reproducing certain environments or settings.
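A minimal sketch of that pattern, assuming the defaults ship as JSON and each key can be overridden by an environment variable. The keys, the `APP_` prefix, and the `load_config` helper are all hypothetical names chosen for this example, not part of any particular tool.

```python
import json
import os

# Defaults as they might ship in an immutable, version-controlled artifact.
# The keys and the APP_ prefix are invented for illustration.
DEFAULTS_JSON = '{"enable_foo": false, "port": 8080, "log_level": "info"}'


def load_config(environ=os.environ, prefix="APP_"):
    """Return the shipped defaults with per-key overrides from the environment."""
    config = json.loads(DEFAULTS_JSON)
    for key, default in config.items():
        raw = environ.get(prefix + key.upper())
        if raw is None:
            continue  # no override supplied, keep the shipped default
        if isinstance(default, str):
            config[key] = raw
        else:
            # Parse the override as JSON so numbers and booleans keep their type.
            config[key] = json.loads(raw)
    return config


# An environment definition (shell script, container spec, ...) supplies overrides:
print(load_config({"APP_PORT": "9090", "APP_ENABLE_FOO": "true"}))
```

The shipped file is never edited here; anything environment-specific arrives through the mapping passed in, which defaults to the process environment.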

A nice side effect is that you’d truly never want or need comments in the config file. They would be in a man page, user guide, etc., explaining why defaults are chosen and how to override with ENV vars (and you’re still free to add comments to environment definitions like Dockerfiles or shell scripts).

After working this way for a while now, I’ve come around to realizing it is so much superior to relying on modifying local files and especially better than needing to fish around for in-line config comments (that are just as prone to get out of sync as anything else) in scattered config files. I was resistant to the 12 Factor config best practices for a while, and I think I was foolish for it.

In the end, JSON is really compact, scoping and nesting are immediately easy to read, data structures are immediately obvious, and there is a ton of mature tooling for it.

JSON really is a great config language.

pmoriarty · on July 18, 2018

YAML is even worse. Meaningful whitespace should die.

jdenning · on July 18, 2018

YAML is a superset of JSON, so you can use JSON-style braces if you prefer. Just FYI.

krapp · on July 18, 2018

JSON with significant whitespace is just everything bad about JSON and a little bit worse, though.

tenryuu · on July 19, 2018

But YAML supports comments and multiline strings. You can't just call it everything bad about JSON while ignoring the flaws of JSON that the article is actually about.

geezerjay · on July 19, 2018

The lack of comments is not a flaw but one of JSON's many virtues which takes into account the many lessons in life on how comments are used and abused to extend a basic format with ad-hoc directives. Even so, anyone who wishes to support comments in JSON is free to do so by employing a filter.
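For reference, such a filter can be quite small. The sketch below is one possible approach, not a standard tool: it assumes only `//` line comments need support and removes them before handing the text to the regular JSON parser. The function name `strip_line_comments` is made up for this example.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character so an escaped quote
                # does not end the string early.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


commented = """
{
  // where to fetch updates from
  "url": "https://example.com/feed",  // the // inside the string is kept
  "retries": 3
}
"""
config = json.loads(strip_line_comments(commented))
print(config)
```

The filter has to track string literals, otherwise the `//` in a URL would be cut off; that is most of its complexity.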

krapp · on July 19, 2018

If we removed every useful feature of a language because it could potentially be abused, we wouldn't have any features in any languages.

Removing comments from JSON because some people might use them for parsing directives was an overreaction. The alternative always presented is to pollute your objects with "comment" values which, unlike actual comments, take up space and need to be parsed or filtered. Just imagine how silly it would be in any other language for comments to be replaced with variables.

And... nothing is stopping people from adding parsing directives in JSON itself anyway.

evil-olive · on July 19, 2018

IMO, the only thing more confusing than a complex YAML file is a half-YAML, half-JSON hybrid.

geezerjay · on July 19, 2018

> YAML is a superset of JSON

Actually it isn't. That was left as an afterthought and IMHO as a trick to try to ride the coat tails of JSON. YAML in the wild is a god-awful mess that bears no resemblance to JSON. To me YAML's only purpose is to enable lazy people like me to take notes in a text editor with code folding and syntax highlighting.

likeclockwork · on July 19, 2018

That sounds like a light version of org-mode.

thesephist · on July 19, 2018

Like the article briefly mentions at the end, I'm really a fan of Webpack's approach to configuration -- exporting the configuration from an executable script. It gets around most of the problems outlined here (numbers, comments, multiline strings...) while keeping all the JSON niceties. It no longer becomes cross-language, but seems like exporting a config object from an executable is a good alternative.
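Webpack's config is JavaScript; the same idea in Python might look like the sketch below. The setting names are invented for illustration, and this is only an analogy to the approach, not how Webpack itself works.

```python
import json

# Comments are ordinary code comments.
WORKERS_PER_CORE = 2
CORES = 4

config = {
    # Numbers can be computed instead of hard-coded.
    "workers": WORKERS_PER_CORE * CORES,
    "timeout_seconds": 5 * 60,
    # Multiline strings need no escaping tricks.
    "banner": """Welcome.
Maintenance window: Sundays.""",
}

# If another tool needs plain JSON, the object serializes directly.
rendered = json.dumps(config, indent=2)
print(rendered)
```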

sbok · on July 19, 2018

I agree. Webpack's approach is great, but only for JS projects.

For Scala (or any other JVM environment) I use https://github.com/lightbend/config. It doesn't get better than that.

fusl · on July 19, 2018

This post is wrong on soooo many levels, mentioning all of them would explode this comment, but just to prove one of them: JSON does NOT "requires curly braces around the entire document". In fact, you can enclose a document with [ and ] as well, which is why you need curly brackets to write configuration files. This has nothing to do with a JSON "require"ment.

cdmckay · on July 19, 2018

Wrapping the document in [] wouldn’t be very useful for config. So, what’s your point exactly?

Is that really the worst thing you can find in this post?

harg · on July 19, 2018

Just a primitive like a string, number, boolean etc is also valid json.
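This is easy to confirm with Python's standard `json` module, which accepts any JSON value at the top level (as RFC 8259 allows; the older RFC 4627 required an object or array):

```python
import json

# Each of these is a complete JSON document for Python's parser.
documents = ['"hello"', "42", "true", "null", "[1, 2, 3]", '{"a": 1}']
parsed = [json.loads(doc) for doc in documents]

for doc, value in zip(documents, parsed):
    print(f"{doc!r:14} -> {value!r}")
```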

thayne · on July 25, 2018

What does that have to do with using JSON as a configuration format? Having a single primitive (or for that matter an array) is not very useful as a configuration file.

singularity2001 · on July 19, 2018

{ json5:"should be the new standard" //really/ }

dogma1138 · on July 18, 2018

Any configuration language which is not easily humanly readable is not a good option in my opinion.

There is nothing wrong with JSON as an intermediary format it just should not be used as the base format to store the configuration information in.

krapp · on July 18, 2018

I absolutely support using Lua for a configuration language.

Yes, you have to load a parser, but it's tiny, and the syntax is everything people like about JSON but without any of the listed drawbacks, and even simpler.

sbjs · on July 19, 2018

Lua was created as a (dynamic) configuration language and I think it’s great for that. It is as full featured as any modern language, complete with lexical functions, and the only thing it lacks is a thriving ecosystem. Its current ecosystem is not as enthusiastic and active as most others.

PenguinCoder · on July 19, 2018

Lua is an amazing language, nearly full featured, and as you say, tiny. Anything I code that needs some type of built in flexible system configuration or embedded scripting, I try and use Lua for. Wish it were more prevalent in some circles.

greencurry43 · on July 18, 2018

I wrote a little library for building DSLs in JavaScript [0] because of this very issue—JSON is not good for this kind of stuff. I want to be able to create small DSLs to solve problems rather than squeezing the language into a JSON format. I want the full power of a language AND the ability to serialize the semantics to send over the wire. Maybe we'll move toward that one day.

[0] https://github.com/smizell/treebranch

krapp · on July 18, 2018

But you shouldn't want the full power of a Turing complete language in a config language. All of that belongs in application-space, configuration is supposed to be simple and static, with as little logic as possible, preferably none.

greencurry43 · on July 19, 2018

The difference is that I am not proposing a Turing complete config language, but rather proposing to build configurations with plain old JavaScript (or whatever language you want). I think a good read about this thinking is by Martin Fowler on Language Oriented Programming [0], especially the section about internal DSLs.

Another interesting read is around the configuration complexity clock [1], in that over time we move from hard coding things to building configurations to coming full circle and hard coding things again. I like to think internal DSLs close that loop well.

[0] https://www.martinfowler.com/articles/languageWorkbench.html

[1] http://mikehadlow.blogspot.com/2012/05/configuration-complex...

sbjs · on July 19, 2018

That’s exactly what Lua was created for, to be a Turing complete configuration language. You can remove the whole standard library though, leaving only the built in operators, variables, functions, and control flow. To some it may seem ugly to define a configuration this way but it allows configurations to reduce duplication and add more dynamism. It’s a trade-off, power for simplicity. But I guess the logical end of that is to end up evolving the configuration into a scripting language which is how Lua got where it is today.

krapp · on July 19, 2018

>You can remove the whole standard library though, leaving only the built in operators, variables, functions, and control flow.

I just choose not to use them, and limit myself to tables or an API that generates tables when using it for config files.

amriksohata · on July 19, 2018

Yaml tab formatting is terrible

score127 · on July 18, 2018

[flagged]

mlthoughts2018 · on July 19, 2018

I strongly agree with you and tried to defend my view just a few days ago: < https://news.ycombinator.com/item?id=17526792 >.

Unfortunately HN is a bit of an echo chamber bandwagon for Toml lately.

evil-olive · on July 19, 2018

Can you point me to an example of what you'd consider a good configuration file? Something that doesn't have comments, and wouldn't be improved by adding any?

mlthoughts2018 · on July 19, 2018

This project is useful as a toy reference for the 12 factor principles, including making the app totally isolated from any config file (all config must be consumed from the environment). In such a case, you would document artifacts that assemble different environments, with usage instructions outside of any config / container definition / shell script / etc.

https://github.com/WASdev/sample.microservices.12factorapp

evil-olive · on July 19, 2018

> making the app totally isolated from any config file (all config must be consumed from the environment)

Wait, so is it "config files shouldn't have comments" or "config files shouldn't exist at all"?

If all configuration is consumed from the environment, where are the values of the environment variables defined? eg, if instead of {"enable_foo": true} in a JSON config file you have an ENABLE_FOO environment variable, where does the value ENABLE_FOO=1 live? A shell script, a config database such as Consul or etcd, or something else?

It can't be turtles all the way down - at some point you have to have the configuration stored somewhere, whether that's a file or a different abstraction like a database. Wherever that storage location is, why are human-readable comments alongside the machine-readable values a bad thing?

mlthoughts2018 · on July 19, 2018

Nobody’s saying there would be no files anywhere, just not in the application itself. Config files would be like third party packages that get installed by an environment orchestration tool (container, chef, puppet, whatever) for purposes of defining an environment.

This way the “API” for a third party providing custom overrides at runtime to config is through ENV vars, not editing files. (Creating a modified config file would be for purposes of creating a whole new installable package that third parties can use. Whoever config files are shipped to, sys admins, end users, yourself on your local laptop, those users would never modify the config file — it would be immutable from their point of view and only extended by ENV).

So yes, some files store config somewhere to define environments, but what I’m saying is that when these files are treated as separate installable artifacts outside the app itself, then placing documentation into those config files is unwanted, even an anti-pattern because you’d want them to be in a user guide for people to know how to choose between config files and the use case at a high level, that’s the right place to document why foo is set to 57 or something.

Putting human readable comments into the actual environment defining config file is bad because these files can become scattered and out of sync, and it encourages reading the file itself as a source of information instead of central usage guidelines. If you also don’t let people inject custom config into the file (use ENV instead) then there’s no reason for the extra hassle, risk of getting out of sync, or risk of mixed signals that it’s ok to edit the file locally.