
Before Unicode, most systems were effectively "byte-transparent" and encoding was only a top-level concern. Those working in one language would use the appropriate encoding (likely CP1252 for most Latin languages) and there wouldn't be confusion about different bytes for same-looking characters.


A single user system, perhaps.

I've worked on a system that … well, didn't predate Unicode, but was sort of near the leading edge of it and was multi-system.

The database columns containing text were all byte arrays. And because the client (a Windows tool, but honestly Linux isn't any better off here) just took an LPCSTR or whatever, the bytes were just in whatever locale the client was in. But that was recorded nowhere, and of course, different rows were in different locales.

I think that would be far more common, today, if Unicode had never come along.
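To make the failure mode concrete, here is a minimal Python sketch of that kind of table. The rows, the choice of Windows code pages, and the helper name `decode_all` are all made up for illustration; the real system's locales are not known.

```python
# Hypothetical rows: each client stored bytes in its own locale, with no record of which.
rows = [
    "café".encode("cp1252"),   # Western European Windows client
    "кафе".encode("cp1251"),   # Cyrillic Windows client
    "καφέ".encode("cp1253"),   # Greek Windows client
]

def decode_all(raw_rows, assumed_encoding):
    """Decode every row with one assumed code page, as a reader without metadata must."""
    return [raw.decode(assumed_encoding, errors="replace") for raw in raw_rows]

# Only the row that happens to match the guess survives; the rest become mojibake.
as_western = decode_all(rows, "cp1252")
for raw, text in zip(rows, as_western):
    print(raw, "->", text)
```

Nothing in the bytes says which guess is right, so the encoding has to be stored next to the data or fixed by convention.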


My understanding is that way back in the day, people would use the ASCII backspace to combine an ASCII letter with an ASCII accent character.


ASCII 1967 (and the equivalent ECMA-6) suggested this, and that the characters ,"'`~ could be shaped to look like a cedilla, diaeresis, acute accent, grave accent, and raised tilde respectively for that purpose. But I've never once seen or heard of that method used.
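For anyone curious what that scheme would look like in data, here is a rough Python sketch that reads letter, backspace, accent sequences and maps them to modern Unicode. The `ACCENTS` table and the function name are my own assumptions, and it only handles the letter-first order.

```python
import unicodedata

# Assumed mapping from the ASCII character doubling as an accent to a combining mark.
ACCENTS = {
    ",": "\u0327",   # cedilla
    '"': "\u0308",   # diaeresis
    "'": "\u0301",   # acute accent
    "`": "\u0300",   # grave accent
    "~": "\u0303",   # tilde
}

def compose_overstrikes(text):
    """Turn letter + BS + accent sequences into precomposed characters where possible."""
    out = []
    i = 0
    while i < len(text):
        if (
            i + 2 < len(text)
            and text[i].isalpha()
            and text[i + 1] == "\b"
            and text[i + 2] in ACCENTS
        ):
            # Letter followed by a combining mark; NFC below merges them if it can.
            out.append(text[i] + ACCENTS[text[i + 2]])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return unicodedata.normalize("NFC", "".join(out))

print(compose_overstrikes("fac\b,ade"))
```

On a printing terminal the backspace physically moved the carriage back, so no such table was needed: the second glyph simply landed on top of the first.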

ASCII also allowed the characters @[\]^{|}~ to be replaced by others in ‘national character allocations’, and this was commonly used in the 7-bit ASCII era.
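As an illustration of a national allocation, the German variant (DIN 66003) reused eight of those code positions for umlauts, ß and §. The Python sketch below shows how US-ASCII text would appear on a terminal set to that variant; the function name is hypothetical.

```python
# Code positions where the German national variant (DIN 66003) differs from US-ASCII.
GERMAN_VARIANT = str.maketrans({
    "@": "§",
    "[": "Ä", "\\": "Ö", "]": "Ü",
    "{": "ä", "|": "ö", "}": "ü",
    "~": "ß",
})

def show_as_german_terminal(ascii_text):
    """Render US-ASCII text the way a DIN 66003 device would display the same bytes."""
    return ascii_text.translate(GERMAN_VARIANT)

print(show_as_german_terminal("a[i] = {1}"))
print(show_as_german_terminal("Stra~e"))
```

The bytes on the wire are identical either way; only the glyph assigned to each position changes, which is why brackets and braces in source code turned into letters on such terminals.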

In the 8-bit days, for alphabetic scripts, typically the range 0xA0–0xFF would represent a block of characters (e.g. an ISO 8859¹ range) selected by convention or explicitly by ISO 2022². (There were also pre-standard similar methods like DEC NRCS and IBM's EBCDIC code pages.)

¹ https://en.wikipedia.org/wiki/ISO/IEC_8859

² https://en.wikipedia.org/wiki/ISO/IEC_2022
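A quick way to see the "selected by convention" part is to decode one high byte under several ISO 8859 parts. This is just a sketch using Python's built-in codecs; the list of parts and the helper name are arbitrary choices of mine.

```python
# A few ISO 8859 parts: Latin-1, Latin-2, Cyrillic, Greek.
PARTS = ["iso8859_1", "iso8859_2", "iso8859_5", "iso8859_7"]

def interpretations(byte_value):
    """Decode a single byte under each part and return the results by codec name."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in PARTS}

# The lower half (ASCII) is the same everywhere; the upper half depends on the part.
print(interpretations(0x41))
print(interpretations(0xE9))
```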


Googling, I saw people link to http://git.savannah.gnu.org/cgit/bash.git/tree/doc/bash.0 as an example of overstriking (albeit for bold, not accents). The Telnet RFC also makes reference to it. I also see lots of references in the context of APL.

I suppose the 60s/70s were the era of teletypewriters, where overstriking would more naturally be a thing.

I also found references to less supporting this sort of thing, but that seems to be about bold and underline, not accents.


nroff did do overstriking for underlining and bold. I don't remember if it did so for accents, but in any case it was for printer output and not plain text itself.
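For reference, the convention in that kind of output is a character, a backspace, then the same character again for bold, and an underscore, a backspace, then the character for underline. Here is a small Python sketch that builds and strips such sequences; the function names are made up, and the stripping is only a rough approximation of what a tool like col -b does.

```python
import re

def bold(text):
    """Overstrike each character with itself: c, BS, c."""
    return "".join(c + "\b" + c for c in text)

def underline(text):
    """Overstrike an underscore with each character: _, BS, c."""
    return "".join("_" + "\b" + c for c in text)

def strip_overstrikes(text):
    """Drop every character that is immediately followed by a backspace."""
    return re.sub(r".\x08", "", text)

page = bold("NAME") + "\n    " + underline("bash") + " - shell\n"
print(repr(page))
print(strip_overstrikes(page))
```

A printer would simply strike the positions twice, while a pager has to recognise the pattern and translate it into its own bold or underline rendering.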

APL did use overstriking extensively, and there were video terminals that knew how to compose overstruck APL characters.


Shift JIS and EUC would like a word.



