
People balk at XML, but its verbosity plus DTD allows it to pull off tricks which you can't do in other formats.

Well, everything has its place, but I think XML is very well suited to cases where you need to serialize complex things to a readable file and verify it while it's being written and read back.



Indeed. I get a lot of value out of my strongly typed XML documents. I generally have code that validates them during writing and after reading. Those who don’t understand XML end up learning why it is verbose when they eventually add all of the features they need to whatever half-baked format they are using.


The 'XML is verbose' argument is exactly analogous to the 'static typing is verbose' argument. JSON is decent, but it quickly breaks down if you want to have any sort of static sanitisation on input data, and the `"$schema"` attribute is quite strange. YAML makes no sense whatsoever to me.

XML is by far the most bulletproof human-readable serialisation-deserialisation language there is.


> The 'XML is verbose' argument is exactly analogous to the 'static typing is verbose' argument.

It’s two things: the static typing analog is definitely there but I’d extend the comparison to something like the J2EE framework fetish & user-hostile tools, too. There were so many cases where understanding an XML document required understanding a dozen semi-documented “standards” and since few of the tools actually had competent implementations you were often forced to write long-form namespace references in things like selectors or repeat the same code.

I worked with multiple people who were pretty gung ho about static typing everything, but the constant friction of that self-inflicted toil wore on them over time. I sometimes wonder whether something more in the Rust spirit, where the tools are smart enough not to waste your time, might be more successful.


I agree. Here in 2024, I hope everyone agrees that types are great.

Static types aren't just verbose, they're clunky. They only work in a perfect world - dynamic types provide the functionality to actually thrive.

> I sometimes wonder whether something more in the Rust spirit where the tools are smart enough not to waste your time might be more successful.

That could help, the problem being XML. You mention the J2EE framework and semi-documented "standards" - the world is rife with bad xml implementations, buggy xml implementations, and bad programmers reading 1 GB xml documents into memory (or programs needing to be re-worked to support a SAX parser).
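
For anyone who hasn't hit the "1 GB document in memory" problem, here is a rough sketch of the streaming alternative using `iterparse` from Python's standard library rather than a classic SAX handler. The `record` tag and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

def count_records(source, tag="record"):
    """Count matching elements while reading the document incrementally."""
    count = 0
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's children and text; only the emptied
            # element stays attached to the root, so memory grows slowly.
            elem.clear()
    return count

# Small in-memory stand-in for a large file (hypothetical data).
sample = io.BytesIO(
    b"<records>" + b"<record><id>1</id></record>" * 1000 + b"</records>"
)
total = count_records(sample)
```

The same function accepts a filename or an open file object, so it can be pointed at a real file without reading it whole.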

There's too much baggage at the feet of XML, and the tools that maybe could have helped were always difficult to use/locked behind (absurdly expensive) proprietary paywalls.

JSON started to achieve popularity because as a format, it was relatively unencumbered. Its biggest tie was to Javascript - if certain tools hadn't been brain-dead about rejecting JSON that wasn't strictly just JSON, it might have achieved the same level of type safety as schema-validated XML, without much of the cruft. But that's not what the tools did, and so JSON became a (sort-of) human-readable data-interchange format, with no validation.

So in 2024 we have no good data-x-change formats, just random tools in little niches that make life better in your chosen poison format. We await a Rust - a good format with speed, reliability, interoperability, extensibility, and easy-to-use tools/libraries built in.


I think PDML hits a sweet spot. The author didn't set out to recreate XML in a less verbose, more human readable syntax, but pretty much ended up doing so. I'd like to see it mature and gain more widespread adoption.


Agreed. XML is clunky, no doubt, but it's partly that the tools were just clunky.

Having said that, I do like that you can flip between YAML and JSON. If we could do that with XML (attributes vs sub-elements a problem here) it would be much more useful I think.
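
A small sketch of why the attribute vs sub-element question gets in the way. The `naive_to_dict` helper is hypothetical and deliberately lossy: two differently shaped XML documents collapse to the same mapping, so there is no single obvious way to turn that mapping back into XML.

```python
import xml.etree.ElementTree as ET

def naive_to_dict(elem):
    """Flatten attributes and leaf child elements into one dict (lossy on purpose)."""
    data = dict(elem.attrib)
    for child in elem:
        data[child.tag] = child.text
    return data

# The same information, once as attributes and once as sub-elements.
as_attribute = ET.fromstring('<user id="1" name="ann"/>')
as_elements = ET.fromstring("<user><id>1</id><name>ann</name></user>")

# Both shapes collapse to {"id": "1", "name": "ann"}.
same = naive_to_dict(as_attribute) == naive_to_dict(as_elements)
```

Going the other way, a converter has to pick one of the two shapes by convention, which is exactly the choice YAML and JSON never force on you.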


An XML document without a schema is strictly worse than JSON without a schema. JSON with a schema is strictly better than XML with a schema. XML structure does not map neatly into the data types you actually want to use. You do not want to use a tree of things with string attributes all over your code. If you do have a schema, the first thing you will want to do is turn your data into native language data types. After that point, the serialization method does not matter anymore, and XML would have just been slower. Designing a schema for XML is also more tedious than for JSON.
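
To make the "turn your data into native data types" step concrete, here is a minimal sketch. The `Point` type and both loader functions are invented for illustration; the point is only that the rest of the code sees the same object regardless of the wire format.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def point_from_json(text):
    raw = json.loads(text)
    return Point(x=int(raw["x"]), y=int(raw["y"]))

def point_from_xml(text):
    elem = ET.fromstring(text)
    # XML attribute values are always strings, so convert explicitly.
    return Point(x=int(elem.get("x")), y=int(elem.get("y")))

a = point_from_json('{"x": 3, "y": 4}')
b = point_from_xml('<point x="3" y="4"/>')
```

Here `a == b` holds, since dataclasses compare by field values. This says nothing about parsing speed either way.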


> XML structure does not map neatly into the data types you actually want to use.

> After that point, the serialization method does not matter anymore, and XML would have just been slower.

Considering I have mapped 3D objects to (a lot of) C++ objects containing thousands of facets under 12ms incl. parsing, sanity checking, object creation, initialization and cross linking of said objects on last decade's hardware, I disagree with that sentiment.

Regarding your first point, even without a schema, an XML document shows its structure and what it expects. So JSON feels like it's hacked together when compared to XML in terms of structure and expressiveness.

It's fine for serializing dark data that people won't see, but if eyes need to inspect it, XML is way more expressive by nature.

Heck, you even need to hack JSON for comments. C'mon :)
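
A quick sketch of the comments point using only Python's standard library. The `retries` setting is a made-up example; the strict JSON parser rejects the commented document, while the XML parser reads past the comment.

```python
import json
import xml.etree.ElementTree as ET

# Strict JSON has no comment syntax, so this fails to parse.
json_text = '{"retries": 3 // bump if flaky\n}'
try:
    json.loads(json_text)
    json_accepts_comment = True
except json.JSONDecodeError:
    json_accepts_comment = False

# XML comments are part of the format; the default parser skips them.
xml_text = "<config><!-- bump if flaky --><retries>3</retries></config>"
retries = int(ET.fromstring(xml_text).findtext("retries"))
```

The usual JSON workaround is a dummy key such as `"_comment"`, which then becomes part of the data.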


I enjoy JSON for internal stuff and where it does not matter that JSON is not very expressive. JSON Schema is a poor substitute for a proper schema. For anything where I am interfacing with another person or team, I send them a DTD or XSD, which documents the attributes and does not have nonsense like confusing integers and floating point values.

For quick and dirty, I agree about JSON. For serious data interchange, I use XML.


> JSON with a schema is strictly better than XML with a schema.

I am baffled by this assertion. XML Schema (XSD) is much more expressive than JSON Schema.


That is a bug, not a feature.


Not to me. I have lots of data exchanging going on where the format is expressed well in XSD and in JSON Schema it is expressed through documentation, code, and a history of angry emails.


If it can't be expressed as a JSON schema, it's a bad idea. If it can be expressed by a JSON schema, it may be a good idea.


This does not match my experience.

Can you provide some clear explanation or examples of why having less power to express a schema is desirable?


Paralysis of choice.


This remains an unconvincing argument.


Not everybody needs convincing. JSON already won.


It won in places where one does not need to express an integer larger than 2^53. Lots of us have harder requirements.
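
A sketch of the 2^53 limit, assuming a consumer that stores every JSON number as an IEEE 754 double (as JavaScript's `JSON.parse` does). Python's own `json` module keeps integers exact, so `parse_int=float` is used here to emulate the lossy behaviour.

```python
import json

big = 2**53 + 1  # 9007199254740993

# Python's json module keeps integers exact by default.
exact = json.loads(str(big))

# parse_int=float emulates a parser that stores every number as a double.
as_double = json.loads(str(big), parse_int=float)

# The double cannot represent 2**53 + 1, so the value silently changes.
lost_precision = int(as_double) != big
```

This is why IDs and counters above that range tend to get shipped as JSON strings.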


XML + DTD + XMLSchema had things we're still figuring out how to do with YAML and JSON.

You could easily generate a UI based on just the DTD and Schema that could be used to fill in a perfectly valid XML file.

Validating incoming XML was a breeze, just give it to the validator class along with the DTD and Schema and boom, done.


> Validating incoming XML was a breeze, just give it to the validator class along with the DTD and Schema and boom, done.

See the boom? It's boomer tech. We can't have old, boomer tech in 2024.

Jokes aside, I wish people spent the time to understand the technologies before disliking them and blindly implementing a different, inferior one.


XML is more popular today than it's ever been. It's just called JSX now.


Besides being aesthetically similar to SGML, because it maps to HTML, JSX has nothing to do with XML. It is Javascript.


It just looks like JavaScript version of JSP to be honest.


It's literally shorthand for "Javascript XML" and its templating syntax is the same as XML. It has a lot to do with XML.


JSX stands for JSX. Your definition is something that people just imagine to be true. The React docs do not mention the word XML at all. The “templating” syntax is not XML. It has no defined semantics and does not generally support crucial XML features like namespaces.


you obviously didn't spend two seconds researching this

https://en.wikipedia.org/wiki/JSX_(JavaScript)

https://facebook.github.io/jsx/


All of that is doable with JSON Schema, though, so it's not something that we’re still figuring out how to do.


> It is specified in an Internet Draft at the IETF, currently in 2020-12 draft, which was released on January 28, 2021

It took them a few decades to catch up to XML. Support for Schema is still a bit dodgy in most big libraries.


Indeed, XML is a decent document language because of the quality of tools available and its power/flexibility. I hate when people use it for config files and other things that are usually human edited where readability is paramount though.


absolutely, 100%.

When I first encountered XSLT I seriously thought it was the most ridiculous thing I had ever seen. A frickin' programming language whose syntax was XML.

But then I learned it and I don't think I've ever seen another language that could do what XSLT could do in such a small amount of code. The trick was to treat it like a functional language (I got this advice from someone else and they were absolutely correct). Where most people got into trouble was thinking of it as an imperative language.

Pattern matching expressions is the kool kid on the block, but XSLT had that to the nth degree 20 years ago.



