Bzzt! Wrong! I have worked with ASN.1 for many years, and I love ASN.1. :)
Really, I do.
In particular I like:
- that ASN.1 is generic, not tied to any particular set of encoding rules (compare to XDR, which is both a syntax and a codec specification)
- that ASN.1 lets you get quite formal if you want to in your specifications
For example, RFC 5280 is the base PKIX spec, and if you look at RFCs 5911 and 5912 you'll see the same types (and those of other PKIX-related RFCs) with more formalisms. I use those formalisms in the ASN.1 tooling I maintain to implement a recursive, one-shot codec for certificates in all their glory.
- that ASN.1 has been through the whole evolution of "hey, TLV rules are all you need and you get extensibility for free!!1!" through "oh no, no that's not quite right is it" through "we should add extensibility functionality" and "hmm, tags should not really have to appear in modules, so let's add AUTOMATIC tagging" and "well, let's support lots of encoding rules, like non-TLV binary ones (PER, OER) and XML and JSON!".
Protocol Buffers is still stuck on TLV, all done badly by comparison to BER/DER.
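To make the TLV comparison concrete, here's a minimal sketch (plain Python, nothing from any spec text; simplified to single-byte DER tags and definite lengths, as noted in the comments) showing that both framings boil down to walking tag/length headers:

```python
# Hedged sketch: both DER and protobuf frame data as tag/length headers.
# Simplifications: single-byte DER tags, definite-length encodings only.

def der_header(buf: bytes, off: int = 0):
    """Read one DER tag + length; return (tag, length, offset_of_value)."""
    tag = buf[off]
    first = buf[off + 1]
    if first < 0x80:                        # short form: length fits in 7 bits
        return tag, first, off + 2
    n = first & 0x7F                        # long form: next n bytes hold length
    return tag, int.from_bytes(buf[off + 2 : off + 2 + n], "big"), off + 2 + n

def protobuf_key(buf: bytes, off: int = 0):
    """Read one protobuf varint field key; return (field_no, wire_type, next_off)."""
    shift = key = 0
    while True:
        b = buf[off]
        key |= (b & 0x7F) << shift
        off += 1
        if not b & 0x80:                    # high bit clear ends the varint
            break
        shift += 7
    return key >> 3, key & 0x07, off        # low 3 bits select the wire type

# 0x30 = DER SEQUENCE of 3 bytes; 0x0A = protobuf field 1, wire type 2
print(der_header(b"\x30\x03\x02\x01\x05"))  # (48, 3, 2)
print(protobuf_key(b"\x0a\x05hello"))       # (1, 2, 1)
```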
Yeah, I know I'm making fun of it a lot (mostly in jest) but it genuinely is a really interesting specification, and it's definitely sad - but not surprising - that it's not a very popular choice outside of its few niche areas.
:) Glad to see someone else who's gone down this road as well.
I feel the experience of many people working with ASN.1 is that of dealing with PKI or telecom protocols, which attempt to build worldwide interop between actually very different systems. The spec is one thing, but implementing it by the book is not sufficient to get something actually interoperable; there are a ton of quirks to work around.
If it were being used in homogeneous environments the way protocol buffers typically are, where the schemas are usually more reasonable and both the read and write sides are owned by the same entity, it might not have gotten such a bad rap...
How do you feel about something like CBOR? At which stage of that evolution would you say it's stuck, compared to ASN.1 (since you said Protobuf is still TLV)?
CBOR and JSON are just encodings, not schema, though there are schema languages for them. I've not looked at those schema languages, but I doubt they support typed-hole formalisms (though they could be added, since it's just schema). And because CBOR and JSON are just encodings, they are stuck being what they are -- new encodings will have compatibility problems. For example, CBOR is mostly just like JSON but with a few new types, but then things like jq have to evolve too, or else those new types are not really usable. Whereas ASN.1 has much more freedom to introduce new types and new encoding rules, because ASN.1 is schema, and just because you introduce a new type doesn't mean that existing code has to accept it, since you will evolve _protocols_. But to be fair, JSON is incredibly useful sans schema, while ASN.1 is really not useful at all if you want to avoid defining modules (schemas).
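To make the jq point concrete, here's a small, hedged illustration using the third-party cbor2 package (my example, not something from the spec): CBOR's tagged date-time type round-trips fine in CBOR-aware code, but has no native slot in JSON-only tooling downstream.

```python
# Illustrative sketch with the third-party "cbor2" package: CBOR tag 0
# wraps an RFC 3339 date-time string, a type plain JSON doesn't have.
import json
from datetime import datetime, timezone

import cbor2

blob = cbor2.dumps(datetime(2024, 1, 1, tzinfo=timezone.utc))  # emits tag 0
decoded = cbor2.loads(blob)
print(type(decoded))        # <class 'datetime.datetime'>: CBOR-aware code wins

try:
    json.dumps(decoded)     # JSON-only consumers have no slot for the new type
except TypeError as e:
    print("JSON tooling has to evolve too:", e)
```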
I was considering CBOR+CDDL heavily for a project a while back, so they're a tad intertwined in my head. I very much liked CBOR's capability of defining wholly new types and describing them neatly in CDDL. You could even add some basic value constraints (less than, greater or equal, etc.). That seemed really powerful, and, lacking ASN.1 experience, it sounds like a very lite, JSON-like subset of that.
I worked with ASN.1 for a few years in the embedded space because it's used for communications between aircraft and air traffic control in Europe [1]. I enjoyed it. BER encoding is pretty much the tightest way to represent messages on the wire, and when you're charged per-bit for messaging, it all adds up. When a messaging syntax is defined in ASN.1 in an international standard (ICAO 9880, anyone?), it's going to be around for a while. Haven't been able to get my current company to adopt ASN.1 to replace our existing homegrown serialization format.
As a former PKI enthusiast (tongue firmly in cheek with that description) I can say if you can limit your exposure to simply issuing certs so you control the data and thus avoid all edge cases, quirks, non-canonical encodings, etc, dealing with ASN.1 is “not too terrible.” But it is bad. The thing that used to regularly amaze me was the insane depths of complexity the designers went to … back in the 70’s! It is astounding to me that they managed to make a system that encapsulated so much complexity and is still in everyday use today.
ASN.1 is from the mid-80s, and PKI is from the late 80s.
The problems with PKI/PKIX all go back to terrible, awful, no good, very bad ideas about naming that people in the OSI/European world had in the 80s -- the whole x.400/x.500 naming style where they expected people to use something like street addresses as digital names. DNS already existed, but it seems almost like those folks didn't get the memo, or didn't like it.
They got grant money to work on anything but TCP/IP. :-) A lot of European oral history about how "the Internet" got to a Uni talks about how they were supposed to only use ISO/OSI but eventually unofficially installed IP anyway.
There's the other story of corporate vendors saying "yes, we will implement OSI, give us X time, but buy our product now and we will deliver OSI" then actually going "we mangled BSD Sockets enough to work if you squint enough, let's try to wait the client out while raking in the profit".
Unless you need things like the ability to address groups in flexible ways, which is why X.400 survives in various places (in addition to actually supporting inline cryptography and binary attachments).
What people forget is that you do not have to use the whole set of schema attributes.
Does Internet email not support binary attachments? Of course it does.
And encrypted and/or signed email? That too, though very poorly, but the issue there is key management, and DAP/LDAP don't help because in the age of spam public directories are not a thing. Right now the best option for cryptographic security for email is hop-by-hop encryption using DANE for authentication in SMTP, with headers for requesting this, and headers for indicating whether received email transited with cryptographic protection all the way from sender to recipient.
As for the "ability to address groups in flexible ways", I'm not sure what that means, but I've never see group distribution lists not be sufficient.
And how long did it take for binary attachments to be reliable, encodings unfucked, etc?
As for group addressing, distribution lists are pitiful in comparison, especially on the discovery side.
Anyway, ultimately the big issue is that the DAP schema is always presented as "oh, you need all the details", when... you don't. And we never got to the point of really implementing things well outside the more expected use case, where people do not, actually, use them directly but pick by name/function from a directory.
> And how long did it take for binary attachments to be reliable, encodings unfucked, etc?
Oh, I can't remember. Binary attachments have worked since I started using them long, long ago. They worked at least as far back as the mid-90s. Back then I was using both Internet email and x.400 (HP OpenMail!), and x.400 was a massive pain (for me especially, since I was one of the people who maintained a gateway between the two). I know what you're referring to: it took a long time for email to get "8-bit clean" / MIME because of the way SMTP works, but MIME was very much a thing by the mid-90s.
So it took a while if you count the days of UUCP email -- round it to two decades. But "by the mid-90s" was plenty good enough, because that's when the Internet revolution hit big companies. Lack of binary attachments wasn't something that held back Internet adoption. As far as the public and corps are concerned, the Internet only became a thing circa 1994 anyways.
> As for group addressing, distribution lists are pitiful in comparison especially on discovery side.
Discovery, meaning directories. Those are nice inside corporate networks, which is where you need this functionality, so I agree, and yes people use Exchange / Exchange 365 / Outlook for this sort of thing, though even mutt can do LDAP-based discovery (poorly, but yes). Outside corporate networks directories are only useful within academia and governments / government labs. Outside all of that no one wants directories because they would only encourage the spammers.
Binary attachments mostly started to work with fewer surprises by the second half of the 1990s, but 8-bit-unclean issues persisted in my experience... I wanted to say 2001, but I recall getting hit by them until 2010 at least.
And in some ways people's desire to send non-7-bit-ASCII text as email is also a continued brokenness in SMTP email.
As for directories - my point was more that directories hid the addressing details from surface UI. Otherwise AFAIK X.400 works perfectly fine without using everything in the possible schema.
Fun fact - Exchange is actually an X.400 system, despite no longer having non-SMTP connection options. But its internals are even wonkier, like Exchange and Outlook not supporting HTML email (no, really: MAPI.DLL crashes on HTML email; if you send/receive HTML email it's stored as HTML wrapped in RTF, and unwrapped when sent elsewhere).
No we're not. We're using dNSName subjectAlternativeName values. We used to use the CN attribute of the subject DN, and... there is still code for that, but it's obsolete.
We _are_ using subject DNs for linking certs to their issuers, but even though those are "free-form", we don't parse them; we only check for equality.
SANs are not free-form. A dNSName SAN is supposed to have an FQDN. An rfc822Name SAN is supposed to carry an email address. And, ok, sure, email addresses' mailbox part is basically free-form, but so what, you don't interpret that part unless you've accepted that certificate for that email address' domain part, and then you interpret the mailbox part the way a mail server would because you're probably the mail server. Yes, you can have directoryName SANs, but the whole point of SANs is that DNs suck because x.400/x.500 naming sucks so we want to use something that isn't that.
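For illustration, here's a hedged sketch of what that looks like in practice, using Python's third-party cryptography package (the "server.pem" filename is hypothetical): name checks pull the dNSName SANs, and issuer linkage is an equality check, not DN parsing.

```python
# Hedged sketch with the third-party "cryptography" package; "server.pem"
# is a hypothetical certificate file. Modern name checks read dNSName SANs.
from cryptography import x509

with open("server.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName)
dns_names = san.value.get_values_for_type(x509.DNSName)
print(dns_names)            # e.g. ["example.com", "www.example.com"]

# Issuer/subject DN linkage, by contrast, is equality, not parsing:
# candidate_issuer.subject == cert.issuer
```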
There was an amusing chain of comments the last time protobuf was mentioned, in which some people were arguing that it had been a terrible idea and ASN.1, as a standard, should have been used.
It was hilarious because clearly none of the people who were in favor had ever used ASN.1.
Cryptonector[1] maintains an ASN.1 implementation[2] and usually has good things to say about the language and its specs. (Kind of surprised he's not in the comments here already :) )
Thanks for the shout-out! Yes, I do have nice things to say about ASN.1. It's all the others that mostly suck, with a few exceptions like XDR and DCE/Microsoft RPC's IDL.
Derail accepted! Is your approval of DCE based only on the serialization not being TLV or on something else too? I have to say, while I do think its IDL is tasteful, its only real distinguishing feature is AFAICT the array-passing/returning stuff, and that feels much too specialized to make sense of in anything but C (or largely-isomorphic low-level languages like vernacular varieties of Pascal).
Well, I do disapprove of the RPC-centric nature of both, XDR and DCE RPC, and I disapprove of the emphasis on "pointers" and -in the case of DCE- support for circular data structures and such. The 1980s penchant for "look ma'! I can have local things that are remote and you can't tell because I'm pretending that latency isn't part of the API hahahaha" research really shows in these. But yeah, at least they ain't TLV encodings, and the syntax is alright.
I especially like XDR, though maybe that's because I worked at Sun Microsystems :)
"Pointers" in XDR are really just `OPTIONAL` in ASN.1. Seems so silly to call them pointers. The reason they called them "pointers" is that that's how they represented optionality in the generated structures and code: if the field was present on the wire then the pointer is not null, and if it was absent the then pointer is null. And that's exactly what one does in ASN.1 tooling, though maybe with a host language Optional<> type rather than with pointers and null values. Whereas in hand-coded ASN.1 codecs one does sometimes see special values used as if the member had been `DEFAULT` rather than `OPTIONAL`.
You're likely to find my comments among those saying that. I've been using ASN.1 in some way for a couple of decades, and I've been an ASN.1 implementor for about half a decade.
It's not entirely horrible: parsing DER dynamically enough to handle interpreting most common certificates can be done in some 200-300 lines of C#, so I'd take that any day over XML.
The main problem is that to work with the data you need to understand the semantics of the magic object identifiers, and while things like the PKIX module can be found easily, the definitions for other, more obscure namespaces for extensions can be harder to locate, as they're scattered across documentation from various standardization organizations.
So, protobuf could very well have been transported in DER; the issue was probably more one of Google not seeing any value in interoperability and wanting to keep it simple (or worse, clashes caused by oblivious users re-using the wrong, less well documented namespaces).
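Circling back to the dynamic-parsing point: here's a hedged Python sketch of that approach (simplified to single-byte tags, definite lengths, and a single-byte first OID sub-identifier) that walks the TLV tree and surfaces OBJECT IDENTIFIERs, whose meaning still has to come from an out-of-band table -- exactly the documentation hunt described above.

```python
# Hedged sketch of "dynamic" DER interpretation: recurse through TLVs and
# decode OIDs; what each OID *means* must come from external documentation.
# Simplified: single-byte tags, definite lengths, single-byte first OID arc.

OID_NAMES = {                   # tiny illustrative table; real ones are huge
    "2.5.4.3": "commonName",
    "1.2.840.113549.1.1.11": "sha256WithRSAEncryption",
}

def read_header(buf: bytes, off: int):
    """Read one tag + length; return (tag, length, offset_of_value)."""
    tag, ln, off = buf[off], buf[off + 1], off + 2
    if ln & 0x80:                           # long-form length
        n = ln & 0x7F
        ln, off = int.from_bytes(buf[off : off + n], "big"), off + n
    return tag, ln, off

def decode_oid(body: bytes) -> str:
    arcs = list(divmod(body[0], 40))        # first byte packs two arcs
    val = 0
    for b in body[1:]:                      # remaining arcs are base-128
        val = (val << 7) | (b & 0x7F)
        if not b & 0x80:
            arcs.append(val)
            val = 0
    return ".".join(map(str, arcs))

def walk(buf: bytes, depth: int = 0) -> None:
    off = 0
    while off < len(buf):
        tag, ln, voff = read_header(buf, off)
        body = buf[voff : voff + ln]
        if tag & 0x20:                      # constructed: recurse into it
            walk(body, depth + 1)
        elif tag == 0x06:                   # universal OBJECT IDENTIFIER
            oid = decode_oid(body)
            print("  " * depth + oid, OID_NAMES.get(oid, "(meaning unknown)"))
        off = voff + ln
```

Feed it the DER of a certificate and it prints the OIDs it finds; everything past that point is the hunt through standards documents for what they mean.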
I suspect that typical interactions with ASN.1 are benign because people are interested in reading and writing a few specific preexisting data structures with whatever encoding is required for interoperability, not in designing new message structures and choosing encodings for them.
For example, when I inherited a public-key signature system (mainly retrieving certificates and feeding them to cryptographic primitives, and downloading and checking certificate revocation lists), everything troublesome was left by dismissed consultants; there were libraries for dealing with ASN.1, and I only had to educate myself about the different messages and their structure, like with any other standard protocol.