    In order to maintain backwards compatibility with existing
    documents, the first 256 characters of Unicode are identical to
    ISO 8859-1 (Latin 1).
This isn't true in a useful sense. It is true in Unicode code point space [1], but it can't hold in any specific encoding of Unicode, because Latin-1 uses all 256 byte values. In UTF-8, for example, the overlap is exact only for bytes 0-127 (7-bit ASCII).

(Though maybe this means you could convert latin1 to utf-16 by interleaving null bytes with the latin1 bytes?)

[1] https://en.m.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_...
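To make that concrete, here's a quick check (a sketch in TypeScript, assuming Node's Buffer): only code points 0x00-0x7F encode to the same single byte in UTF-8 as in Latin-1, while 0x80-0xFF become two bytes in UTF-8.

    for (let cp = 0; cp < 256; cp++) {
      const ch = String.fromCharCode(cp);
      const sameBytes = Buffer.from(ch, "utf8").equals(Buffer.from(ch, "latin1"));
      console.assert(sameBytes === (cp < 0x80), `differs at code point ${cp}`);
    }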




> (Though maybe this means you could convert latin1 to utf-16 by interleaving null bytes with the latin1 bytes?)

Yes. In fact, things like JS JITs end up storing strings as either UTF-16 strings or Latin1 strings internally to take advantage of this fact.
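A minimal sketch of the interleaving idea, again assuming Node's Buffer: UTF-16LE puts the low byte first, so each Latin-1 byte is simply followed by 0x00.

    const latin1 = Buffer.from([...Array(256).keys()]);   // every Latin-1 byte value
    const utf16le = Buffer.alloc(latin1.length * 2);      // zero-filled
    for (let i = 0; i < latin1.length; i++) {
      utf16le[i * 2] = latin1[i];                         // high byte stays 0x00
    }
    console.assert(utf16le.equals(Buffer.from(latin1.toString("latin1"), "utf16le")));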


JavaScript uses (used, until a recent version) UCS-2, not UTF-16!


Most JavaScript implementations have a bunch of different string types used internally, depending on what you're doing with the string. In-memory representation has no bearing on the API visible to the outside world.

And while the JavaScript APIs only allow you to deal with UCS-2, the string contents themselves are, in fact, usually UTF-16.
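A small illustration of that split, runnable in Node: the string API counts and indexes 16-bit code units, UCS-2 style, while codePointAt sees through surrogate pairs.

    const s = "😀";                                // U+1F600, outside the BMP
    console.log(s.length);                         // 2: counted in 16-bit code units
    console.log(s.charCodeAt(0).toString(16));     // "d83d": a lone surrogate, the UCS-2-style view
    console.log(s.codePointAt(0)!.toString(16));   // "1f600": the full code point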


It is useful for hacks in languages or APIs that don't distinguish between uint8[] and Unicode string types. When you need to handle binary data, you can build a string out of code points 0-255 and pass it to I/O APIs as Latin-1 to produce exactly the byte sequence you want.
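A rough sketch of that hack with Node's Buffer (the string-only API in the middle is hypothetical):

    const payload = Buffer.from([0x00, 0xff, 0x80, 0x41]);   // arbitrary binary data
    const asText = payload.toString("latin1");                // lossless: one code point per byte

    // ...hand `asText` to some API that only accepts strings...

    const recovered = Buffer.from(asText, "latin1");          // byte-for-byte identical
    console.assert(recovered.equals(payload));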

And of course it also works in the other direction. To read binary data into Unicode strings safely, decode it as Latin-1 instead of UTF-8 and you won't hit validation errors, since every byte sequence is valid Latin-1 while not every byte sequence is valid UTF-8.
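The reverse direction, sketched the same way (TextDecoder in fatal mode stands in for a strict UTF-8 decoder):

    const blob = Buffer.from([0xff, 0xfe, 0xc3, 0x28]);        // not valid UTF-8
    try {
      new TextDecoder("utf-8", { fatal: true }).decode(blob);  // throws on invalid UTF-8
    } catch {
      // UTF-8 validation rejects these bytes
    }
    const text = blob.toString("latin1");                       // never fails
    console.assert(Buffer.from(text, "latin1").equals(blob));   // and round-trips losslessly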



