Linux TTY font for Chinese, but treat it as a syllabic writing (github.com/oldherl)
62 points by gslin on March 13, 2024 | 30 comments


Is this like the Chinese version of Mark Twain's spelling reform?

"For example, in Year 1 that useless letter c would be dropped to be replased either by k or s, and likewise x would no longer be part of the alphabet. The only kase in which c would be retained would be the ch formation, which will be dealt with later.

Year 2 might reform w spelling, so that which and one would take the same konsonant, wile Year 3 might well abolish y replasing it with i and Iear 4 might fiks the g/j anomali wonse and for all.

Jenerally, then, the improvement would kontinue iear bai iear with Iear 5 doing awai with useless double konsonants, and Iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants.

Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez c, y and x — bai now jast a memori in the maindz ov ould doderez — tu riplais ch, sh, and th rispektivli.

Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld."

- Mark Twain


Worth mentioning that a spelling reform like Twain's would mean that every accent would need to be spelled differently, or the sounds wouldn't match for most speakers. It's quite possible that some British dialects would be very hard to read for some Americans and vice versa, even if they were used to the system for their own dialect.

For instance, Twain spells "after" like "aafte", but in my accent I would probably spell it "afdr".

Twain has also run into the problem that spoken English has far more vowels than written English does. He uses "ai" in both "minds"/"maindz" and "replace"/"riplais", but presumably they represent different sounds.

Presumably OP's system has the same problems with the dialects of Chinese, now that I think about it. Although I know very little about Chinese unfortunately (only what I know now from a few seconds of googling).


My Chinese knowledge is limited, but I believe that words in Chinese would not be affected by accent in the same way. This is because even though, for example, in the Taiwanese accent sh is pronounced like s, "shi" and "si" are still "spelt" differently and consistently when using a phonetic writing system like pinyin or zhuyin. There are still individual words that are pronounced differently in different regions, but that's not related to accent.


An amusing example of this is the character Bascule in Banks' Feersum Endjinn, who narrates like so:

> Woak up. Got dresd. Had brekfast. Spoke wif Egrates the ant who sed itz juss been wurk wurk wurk 4 u lately master Bascule, Y dont u 1/2 a holiday? & I agreed & that woz how we decided we otter go 2 c Mr Zoliparia in thi I-ball ov thi gargoyle Rosbrith.

I personally enjoyed it and never had much trouble figuring it out, but know people who found it unpleasant and incomprehensible.


You jest, but Simplified Chinese came into existence because Traditional Chinese is (and derivatives like Japanese Kanji are) generally regarded as too complicated.

Of course, the degree to which the complexity is disdained varies depending on who you ask: Japanese saw fit to reuse it mostly as-is, Hangul chose to replace it wholesale, and so on. Simplified Chinese is somewhere in between, not replacing wholesale but simplifying characters, often to the point of bastardization.


Japanese does have shinjitai characters, which are also simplified, but differently, although there are definitely fewer such characters than in Simplified Chinese.

My favorite is the simplification of 龍 (dragon) - in Simplified Chinese, they took the right hand side and derived 龙; in Japanese Shinjitai, they took the left hand side and obtained 竜.


Chinese speakers would probably think it is 龟 instead, lol


I learned Japanese first, and 龟 still regularly trips me up when reading simplified characters.


It wasn’t too complicated. 95% of the country couldn’t read or write it. And printing was impossible.


You may want to read up on the history of printing in East Asia: https://en.wikipedia.org/wiki/History_of_printing_in_East_As...


Not too far off from IPA


Year 15 seems particularly grating: English had letters for ch, th, and sh. Should have just started using those again. The kicker being that the single consonant for "ch" was <drumroll> just c.


Had a good laugh reading this as it transformed along the way. That's great, and it could make for interesting storytelling that shows the passing of time through a changing writing system.


You may like the novel Ella Minnow Pea

https://en.m.wikipedia.org/wiki/Ella_Minnow_Pea


For reference, if my counting is correct, Mandarin Chinese has:

  ~1465 syllables considering tone (is there a word for this?):
    ~416 syllables ignoring tone:
      21+1 initials (b, p, m, f, d, t, n, l, g, k, h, j, q, x, zh, ch, sh, r, z, c, s, empty)
      35+1 finals, consisting of a subset of the possible pairs of:
        3+1 medials (empty, i/yi, u/wu, ü/yu; but note that "u" is used instead of "ü" after some initials)
        10+3 rimes (empty, e, ê, a, ei, ai, ou, ao, en, an, eng, ang, er; note however that the spelling often varies based on what initial and medial precede it, in particular this is how "o" appears in several unrelated places):
          ?6+2 combined non-medial vowels (empty, e, ê, a, ei, ai, ou, ao; but this isn't a particularly meaningful thing to count since medials can occur without a main vowel, and there are so many irregularities of the pronunciation and spelling)
          2+2 codas (empty, n, ng, r; note that "m" can also appear as a syllable of its own but is an initial with a vowel that has been lost, unlike "r" which is clearly a bare coda with a practically-absent vowel; syllabic "n" and "ng" are bare codas unless for a character also pronounced bare "m")
    4+1 tones (¯, ´, ˇ, `, and neutral; neutral tone is only allowed for a handful of syllables; fairly often only 2 or 3 tones actually exist for a given syllable)

Most of the weirdest cases are for interjections and core function words - only a few characters, but fairly common in (informal) speech. But note that Pinyin absolutely is a mess of exceptions, not any kind of organized thing, and only some of that can be blamed on the fact that it's trying to model human speech.
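
The counts above can be sanity-checked with a little arithmetic. This is only a rough sketch that takes the approximate figures quoted above at face value; the constant and function names are made up for illustration.

```python
# Approximate figures quoted in the breakdown above, taken at face value.
INITIALS = 21 + 1          # 21 consonant initials plus the empty initial
FINALS = 35 + 1            # 35 finals plus the empty final
TONES = 4 + 1              # four tones plus the neutral tone
ATTESTED_TONELESS = 416    # syllables ignoring tone
ATTESTED_TONED = 1465      # syllables considering tone


def coverage(attested: int, possible: int) -> float:
    """Fraction of the combinatorially possible slots that are actually used."""
    return attested / possible


# Every initial paired with every final, whether or not the pair exists.
possible_toneless = INITIALS * FINALS          # 792
# Every attested toneless syllable paired with every tone.
possible_toned = ATTESTED_TONELESS * TONES     # 2080

print(f"initial x final slots used: {coverage(ATTESTED_TONELESS, possible_toneless):.0%}")
print(f"syllable x tone slots used: {coverage(ATTESTED_TONED, possible_toned):.0%}")
```

If those figures are right, only about half of the initial/final pairs occur at all, which fits the remark that fairly often only 2 or 3 tones exist for a given syllable.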


> syllables considering tone (is there a word for this?)

Prosodic units [1], which happen to be syllables with tones in Chinese. (In other languages such suprasegmental features can spread throughout multiple syllables, so a single prosodic unit would contain all of them.)

[1] https://en.wikipedia.org/wiki/Prosodic_unit


Can coda 'r' be combined with any initial, medial, or rhyme other than 'er'?

I thought er-hua was only used for 儿 and homonyms.


Man, I thought I was having a stroke reading this. It would frankly be a lot more readable if they'd just used pinyin directly.

As an example, they use 下在 in place of 下载. These are pronounced differently (zài in the former, zǎi in the latter). The former also sort-of means "here is" or "below are" (although nobody would actually write it that way), which means you get something that looks kind of like "below are size: 188.17 MiB" instead of "download size".

I do appreciate the author pointing out that this is just an exercise, because I would never be able to use this in practice, and I'm not even a native speaker.


Author here. Fair point for using pinyin directly, but the TTY font requires the glyphs to be of the same size. Squashing the longest syllables like "chuang" into one cell doesn't look nice.
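
The fixed-size constraint is visible in the font file itself. Below is a minimal sketch, assuming a PSF version 2 file (the layout used by the kbd project's console fonts, as far as I know): the header stores one width and one height for the whole font, so every glyph gets the same cell. The header bytes here are synthetic and do not come from the author's font, and `parse_psf2_header` is a hypothetical helper name.

```python
import struct

PSF2_MAGIC = b"\x72\xb5\x4a\x86"
# magic, then version, headersize, flags, length, charsize, height, width,
# all stored as little-endian 32-bit unsigned integers (32 bytes in total).
PSF2_HEADER = struct.Struct("<4s7I")
PSF2_HAS_UNICODE_TABLE = 0x01


def parse_psf2_header(data: bytes) -> dict:
    """Read the font-wide fields from the first 32 bytes of a PSF2 font."""
    if len(data) < PSF2_HEADER.size:
        raise ValueError("too short for a PSF2 header")
    (magic, _version, _headersize, flags,
     length, charsize, height, width) = PSF2_HEADER.unpack_from(data)
    if magic != PSF2_MAGIC:
        raise ValueError("not a PSF2 font")
    return {
        "glyph_count": length,
        "bytes_per_glyph": charsize,
        "height": height,   # one height for every glyph
        "width": width,     # one width for every glyph
        "has_unicode_table": bool(flags & PSF2_HAS_UNICODE_TABLE),
    }


# Synthetic header for illustration: 512 glyphs, each 8 pixels wide and 16 tall.
demo = PSF2_HEADER.pack(PSF2_MAGIC, 0, 32, PSF2_HAS_UNICODE_TABLE, 512, 16, 16, 8)
info = parse_psf2_header(demo)
print(info)
```

Each glyph bitmap occupies `height * ((width + 7) // 8)` bytes, so there is no per-glyph field that would let a long syllable such as "chuang" take a wider cell.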


It looks like each character is taking up twice the width of an ASCII symbol, but half of that is empty space. Why is that? Is that space completely unusable?


> I'm not even a native speaker.

It might actually be easier for non-native speakers. It seems a lot of native speakers have a harder time reading pinyin than non-native speakers do.


Technically, the official pronunciation of 下载 is xia4zai4, though I've never heard it said that way.
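
This subthread mixes two pinyin notations: tone marks (zài, zǎi) and tone numbers (xia4zai4). A small sketch of converting the numbered form to the marked form follows, using the usual placement rule as I understand it: the mark goes on "a" or "e" if present, on the "o" of "ou", and otherwise on the last vowel. `mark_tone` is a made-up helper, not part of any library.

```python
import unicodedata

# Combining diacritics for tones 1-4 (macron, acute, caron, grave).
TONE_COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def mark_tone(syllable: str) -> str:
    """Convert one numbered syllable such as 'zai4' into 'zài'."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_COMBINING:
        return base  # treat 0 or 5 as the unmarked neutral tone
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("ou")
    else:
        positions = [i for i, ch in enumerate(base) if ch in VOWELS]
        if not positions:
            return base  # no vowel to carry the mark, e.g. a bare "ng"
        idx = positions[-1]
    marked = base[: idx + 1] + TONE_COMBINING[tone] + base[idx + 1:]
    # Compose the letter and the combining mark into one code point.
    return unicodedata.normalize("NFC", marked)


print(mark_tone("xia4") + mark_tone("zai4"))  # the "official" reading above
print(mark_tone("xia4") + mark_tone("zai3"))  # the commonly heard reading
```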


I am not a Chinese speaker/reader, but my impression is that this assumes Mandarin pronunciation, whereas written Chinese is typically mutually intelligible between regional Chinese languages even if they aren't mutually intelligible when spoken. Is that a fair take on this?


You are right. This assumes Mandarin. Given the constraints and so many compromises I've taken, this isn't too bad a stretch. After all, all other dialects in Mainland China are seldom used in written text, especially in the field of technology. Mandarin is the one and only go-to option. I am not judging whether this is good or not, but it is the current situation of Chinese languages.


Kind of like Japanese, where dialects can be different enough to be mutually unintelligible but are all written using Japanese, and the national dialect/lingua franca is "Hyoujun-go" (literally "Standard Language"), based almost entirely on the Tokyo dialect, which itself is a subset of the Kanto dialect.


Written Chinese is usually Mandarin.

Chinese is a phonetic language. If you want to write in Shanghainese (as in the language that is part of the Wu family), your choice of characters for even common words would be different than if you wrote in Mandarin Chinese.

https://en.m.wikipedia.org/wiki/Shanghainese


I did this for Hangul years ago on an embedded video OSD system to save font ROM space. But this composition of letters into syllables is fundamental to how Hangul works. Once I had rediscovered the old “Johab” concept of making the syllables, it was simple to adapt it for my purpose.
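
For readers unfamiliar with the idea, here is a rough sketch of letter-to-syllable composition. It uses the Unicode precomposed-syllable arithmetic, which is related to but not the same as the Johab encoding mentioned above, and it is not the commenter's implementation. The function name is hypothetical.

```python
# Jamo in Unicode's ordering for precomposed Hangul syllables.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 leading consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"        # 21 vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]          # 27 finals plus "none"

SYLLABLE_BASE = 0xAC00  # code point of the first precomposed syllable


def compose(lead: str, vowel: str, tail: str = "") -> str:
    """Combine two or three jamo into one precomposed Hangul syllable."""
    l, v, t = LEADS.index(lead), VOWELS.index(vowel), TAILS.index(tail)
    return chr(SYLLABLE_BASE + (l * len(VOWELS) + v) * len(TAILS) + t)


print(compose("ㅎ", "ㅏ", "ㄴ"))
```

A renderer only needs glyph parts for 19 + 21 + 27 letters (in a few positional variants) instead of bitmaps for all 19 × 21 × 28 = 11172 syllables, which is where the ROM saving comes from.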

I looked briefly into Chinese, but concluded this technique wouldn’t be possible. But a couple of years later, while visiting a font artwork exhibit at a museum in Seoul, I saw an example of using this composition technique for Chinese symbols, using combinations of woodblocks. I’ve always wondered if this method had ever been implemented for computers.

Cool Project!


Thanks. It helps to find some interesting phenomena in neuroscience.


Oh god that looks cursed. Have you thought of just representing the characters using Zhuyin?


1. You still have to squeeze three Zhuyin symbols into one cell for syllables like "chuang", due to the limitation of psfu fonts. And no tones can be added.

2. I'm not a native Zhuyin user, so it's better done by someone else.

You're welcome to develop a Zhuyin version if you like.
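
To illustrate point 1, here is a tiny sketch that splits a toneless pinyin syllable into its Zhuyin symbols. The tables are deliberately partial (only enough entries for the syllables mentioned in this thread), and `to_zhuyin` is a hypothetical helper, not anything from the project.

```python
# Partial lookup tables: just enough for the examples in this thread.
INITIALS = {"zh": "ㄓ", "ch": "ㄔ", "sh": "ㄕ", "x": "ㄒ", "z": "ㄗ"}
MEDIALS = {"i": "ㄧ", "u": "ㄨ"}
RIMES = {"a": "ㄚ", "ai": "ㄞ", "ang": "ㄤ"}


def to_zhuyin(syllable: str) -> list[str]:
    """Split a toneless pinyin syllable into a list of Zhuyin symbols."""
    symbols = []
    rest = syllable
    # Try two-letter initials first so "zh" is not mistaken for "z".
    for initial in sorted(INITIALS, key=len, reverse=True):
        if rest.startswith(initial):
            symbols.append(INITIALS[initial])
            rest = rest[len(initial):]
            break
    if rest in RIMES:                       # no medial, e.g. "zai"
        return symbols + [RIMES[rest]]
    if rest and rest[0] in MEDIALS and rest[1:] in RIMES:
        return symbols + [MEDIALS[rest[0]], RIMES[rest[1:]]]
    raise ValueError(f"syllable not covered by this partial table: {syllable!r}")


for s in ("chuang", "xia", "zai"):
    print(s, "".join(to_zhuyin(s)), len(to_zhuyin(s)))
```

"chuang" comes out as three symbols, so a one-cell-per-syllable font would still have to fit three shapes into a single cell, with no room left for a tone mark.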



