Before Unicode, most systems were effectively "byte-transparent" and encoding on...

deathanatos · on March 24, 2024

A single user system, perhaps.

I've worked on a system that … well, didn't predate Unicode, but was sort of near the leading edge of it and was multi-system.

The database columns containing text were all byte arrays. And because the client (a Windows tool, but honestly Linux isn't any better off here) just took an LPCSTR or whatever, the bytes were just in whatever locale the client was using. But that was recorded nowhere, and of course, the rows were all in different locales.

I think that would be far more common, today, if Unicode had never come along.
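A minimal sketch of that failure mode, in Python. The two rows and the code pages cp1252 and cp1251 are assumptions picked for illustration, not details of the actual system:

```python
# Hypothetical rows: each client wrote bytes in its own locale's code page,
# and the code page was never stored alongside the data.
rows = [
    "café".encode("cp1252"),  # written by a Western European client
    "кафе".encode("cp1251"),  # written by a Cyrillic client
]


def read_column(guess: str) -> list[str]:
    """Decode every row with one guessed code page, as a reader would have to."""
    return [row.decode(guess, errors="replace") for row in rows]


for guess in ("cp1252", "cp1251"):
    print(guess, read_column(guess))
```

Whichever code page the reader guesses, one of the two rows comes back as plausible-looking garbage, and nothing in the bytes says which one.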

bawolff · on March 24, 2024

My understanding is that way back in the day, people would use the ASCII backspace to combine an ASCII letter with an ASCII accent character.

kps · on March 24, 2024

ASCII 1967 (and the equivalent ECMA-6) suggested this, and that the characters ,"'`~ could be shaped to look like a cedilla, diaeresis, acute accent, grave accent, and raised tilde respectively for that purpose. But I've never once seen or heard of that method used.

ASCII also allowed the characters @[\]^{|}~ to be replaced by others in ‘national character allocations’, and this was commonly used in the 7-bit ASCII era.

In the 8-bit days, for alphabetic scripts, typically the range 0xA0–0xFF would represent a block of characters (e.g. an ISO 8859¹ range) selected by convention or explicitly by ISO 2022². (There were also pre-standard similar methods like DEC NRCS and IBM's EBCDIC code pages.)

¹ https://en.wikipedia.org/wiki/ISO/IEC_8859

² https://en.wikipedia.org/wiki/ISO/IEC_2022
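To make the 7-bit ‘national character allocations’ concrete, here is a rough Python sketch of how ASCII text would look on a terminal set to the German variant (DIN 66003). The substitution table is my assumption of that variant's assignments and should be checked against the standard; the function name is made up for the example.

```python
# Assumed German (DIN 66003) reassignments of the replaceable ASCII positions.
# The byte values stay the same; only the glyph shown for each value changes.
GERMAN_646 = str.maketrans("@[\\]{|}~", "§ÄÖÜäöüß")


def as_german_terminal(ascii_text: str) -> str:
    """Show how the same 7-bit codes would be displayed under the German variant."""
    return ascii_text.translate(GERMAN_646)


print(as_german_terminal("int a[3] = {1, 2, 3};"))
```

This is why source code full of brackets and braces was awkward to read on such terminals.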

bawolff · on March 25, 2024

Googling, I saw people link to http://git.savannah.gnu.org/cgit/bash.git/tree/doc/bash.0 as an example of overstriking (albeit for bold, not accents). The Telnet RFC also makes reference to it. I also see lots of references in the context of APL.

I suppose the 60s/70s were the era of teletypewriters, where overstriking would more naturally be a thing.

I also found references to less supporting this sort of thing, but that seems to be about bold and underline, not accents.

kps · on March 25, 2024

nroff did do overstriking for underlining and bold. I don't remember if it did so for accents, but in any case it was for printer output and not plain text itself.

APL did use overstriking extensively, and there were video terminals that knew how to compose overstruck APL characters.
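For reference, the nroff-style convention for bold and underline can be sketched in a few lines of Python. The helper names are made up, and the stripping function is a naive approximation of what a filter has to do, not how less or any other tool implements it:

```python
import re

BS = "\b"  # ASCII backspace, 0x08


def bold(text: str) -> str:
    """Strike each character twice: char, backspace, same char."""
    return "".join(f"{c}{BS}{c}" for c in text)


def underline(text: str) -> str:
    """Strike an underscore, back up, then strike the character over it."""
    return "".join(f"_{BS}{c}" for c in text)


def strip_overstrikes(text: str) -> str:
    """Drop every 'char + backspace' pair, keeping the last character struck."""
    return re.sub(r".\x08", "", text)


print(repr(bold("NAME")))
print(strip_overstrikes(bold("NAME") + " " + underline("bash")))
```

Note that this stripping would not work for accents: with a letter overstruck by an accent mark, it keeps only whichever one was struck last.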

TheRealPomax · on March 25, 2024

SHIFT-JIS and EUC would like a word.
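One way to read that remark, sketched in Python: in Shift-JIS the second byte of a two-byte character can fall in the ASCII range, so software that treats bytes as opaque ASCII-plus-extras can misfire. The character 表 is chosen as a commonly cited example.

```python
text = "表"

sjis = text.encode("shift_jis")
euc = text.encode("euc_jp")

# The two encodings produce different bytes for the same character.
print("Shift-JIS:", sjis.hex(), " EUC-JP:", euc.hex())

# In Shift-JIS the trailing byte is 0x5C, the ASCII backslash.
print("backslash byte in Shift-JIS form:", b"\\" in sjis)

# In EUC-JP both bytes have the high bit set, so no ASCII byte appears.
print("EUC-JP bytes all >= 0x80:", all(b >= 0x80 for b in euc))
```

Code that scans for a backslash byte by byte, such as path splitting or string escaping, would trip over the Shift-JIS form but not the EUC-JP one.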