And the problem would not exist if we just used the good ol' s-expressions. You get all the structural benefits of XML and JSON with few of the drawbacks.
The problem people have with XML is not the syntax. Also, how would you represent (optional) attributes in an s-expression? I could think of a couple of ideas, but none that are nice.
Basic S-expressions also don't distinguish between lists and maps, which is something that turns out to be very convenient in practice. Sure, a map is just a list of pairs - but the deserializer needs to be aware of its meaning to parse it into the appropriate data structure. So you either need a schema even for the most trivial cases, or you need a distinct syntax.
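To make the ambiguity concrete, here is a minimal sketch in Python. The `parse` function is a toy reader written for this illustration (atoms and parentheses only, no strings or quoting), not any standard library. It shows that the same text comes back as a plain nested list, and turning it into a map is a decision the caller has to make:

```python
def parse(text):
    """Toy s-expression reader: atoms stay strings, lists become Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Returns (parsed value, index of the next unread token).
        if tokens[pos] == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    value, _ = read(0)
    return value


data = parse("((a 1) (b 2))")
print(data)        # the reader only ever sees a list of two-element lists
print(dict(data))  # treating it as a map is the caller's choice, not the syntax's
```

Nothing in the input says which of the two interpretations was intended, which is exactly where a schema or a distinct map syntax comes in.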
Basic S-exp syntax can easily be extended to denote dictionaries. Just like #(...) gives us vectors and #S structures, some #H can provide hash tables.
If you really needed to, you could just define that the second list element is an attribute list/map, turning <foo bar="baz"><quux /></foo> into (foo ((bar "baz")) (quux)).
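A rough sketch of that convention in Python, using the standard `xml.etree.ElementTree` module. The `to_sexp` helper is hypothetical, and it assumes one rule not spelled out above: an element with neither attributes nor children is written as just `(tag)`, otherwise the second element is always the attribute list, even when it is empty. Text content and escaping of quotes in attribute values are ignored to keep it short:

```python
import xml.etree.ElementTree as ET


def to_sexp(elem):
    """Render an element as (tag (attrs...) children...); text nodes are ignored."""
    children = list(elem)
    if not elem.attrib and not children:
        # Assumed shorthand: a bare element carries no attribute list at all.
        return f"({elem.tag})"
    attrs = " ".join(f'({name} "{value}")' for name, value in elem.attrib.items())
    parts = [elem.tag, f"({attrs})"] + [to_sexp(child) for child in children]
    return "(" + " ".join(parts) + ")"


print(to_sexp(ET.fromstring('<foo bar="baz"><quux /></foo>')))
# (foo ((bar "baz")) (quux))
```

With this rule an element that has children but no attributes comes out as `(a () (b))`, so a reader can always treat the second element as the attribute list.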
The problem with XML itself is being unnecessarily verbose (and thus difficult for both humans and machines to read) for what's just a way to encode trees. Attributes are arguably XML's self-inflicted gunshot wound in the foot; you mainly need them because of the visual noise caused by regular nodes.
I believe that, more than being verbose (closing tags, for example), the problem is that the whole spec is enormous, with entities and namespaces making it even more complex. Still, some of that is actually needed, as the various JSON path/schema projects show.
Mapping a 1:1 s-expression translation onto HTML/XML/JSON/YAML etc. solves nothing. YAML has (had) code execution security problems and parser incompatibility issues. HTML/XML actually need namespaces in a few cases. JSON is abused as a "compile target".
There can be no one format that works for everything. This is why I like TOML: it is really good at what it tries to do and stops there.
You may be right, some JSON parsers even support comments. But you never know which parser is strict and which is not, because comment support is not part of the specification.
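As one data point, Python's standard `json` module is on the strict side here. A quick sketch (the `accepts` helper is just a name made up for this example) that checks whether a document parses:

```python
import json


def accepts(text):
    """Return True if json.loads parses the text, False if it rejects it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(accepts('{"a": 1}'))             # True
print(accepts('{"a": 1 /* note */}'))  # False: block comment rejected
print(accepts('// note\n{"a": 1}'))    # False: line comment rejected
```

A lenient parser would accept all three inputs, so the same file can be valid in one toolchain and a syntax error in another.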
XML has other problems though.