
At minimum every software developer must also know about normalization, and must know what pre-composed and decomposed forms are (especially normal forms, NFC and NFD).

One can mostly ignore these things, until one can't.

Most input modes produce pre-composed (but not necessarily NFC) output. But some things will decompose (e.g., HFS+ will decompose filenames). So... if you cut-n-paste non-ASCII Unicode from an HFS+ file picker UI... you'll get into trouble if the software you paste into is unaware of these things.
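To make the pre-composed vs. decomposed distinction concrete, here is a minimal sketch using Python's standard unicodedata module. The sample strings and the helper name same_text are made up for illustration; the point is only that two strings that look identical can differ code point by code point until both are normalized.

```python
import unicodedata

# Two spellings of the same visible text (illustrative sample data).
precomposed = "caf\u00e9"   # 'é' as the single code point U+00E9
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A naive comparison sees different code point sequences.
naive_equal = precomposed == decomposed

# Normalizing first makes the comparison behave the way a user expects.
normalized_equal = same_text(precomposed, decomposed)
```

Software that compares or hashes pasted text without a step like this is the kind that gets into trouble with decomposed input.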

Ultimately, every software developer needs:

- UTF validator (at least for UTF-8)

- UTF converters (unless only supporting UTF-8)

- case mapping (probably)

- normalization (almost certainly)

- collation (probably)

That's... not too bad.
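As a rough illustration of how small the first few items can be in practice, here is a sketch that leans on Python's built-in codecs and unicodedata. The function names are hypothetical, and caseless_key is a simplification (NFC plus casefold) rather than the full caseless-matching procedure from the Unicode Standard. Collation is left out because it needs locale data that the standard library does not provide portably.

```python
import unicodedata

def is_valid_utf8(data: bytes) -> bool:
    """UTF-8 validator: strict decoding rejects malformed, overlong,
    and surrogate sequences."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def utf16le_to_utf8(data: bytes) -> bytes:
    """UTF converter: transcode UTF-16LE bytes to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")

def caseless_key(text: str) -> str:
    """Case mapping plus normalization: a simplified key for
    case-insensitive comparison (an assumption, not the full algorithm)."""
    folded = unicodedata.normalize("NFC", text).casefold()
    return unicodedata.normalize("NFC", folded)
```

For example, caseless_key("Stra\u00dfe") and caseless_key("STRASSE") produce the same key, which a plain lower() comparison would not.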

Networking software _may_ also need:

- IDNA2008 implementation

- UTS#46 implementation
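One hedged illustration of why the specific IDNA version matters: Python's built-in "idna" codec implements the older IDNA2003 rules, not IDNA2008 or UTS#46, so it is shown here only to demonstrate the kind of mapping involved. The function name and the example domains are made up for this sketch.

```python
def to_ascii_idna2003(domain: str) -> bytes:
    """Convert a domain name to its ASCII form using Python's built-in
    codec. NOTE: this codec follows IDNA2003, not IDNA2008 or UTS#46."""
    return domain.encode("idna")

# A label with a non-ASCII letter becomes a Punycode "xn--" label.
buecher = to_ascii_idna2003("b\u00fccher.example")

# IDNA2003 maps the sharp s to "ss" before encoding. IDNA2008 instead
# treats it as a valid character of its own, so the two versions can
# resolve the same typed name to different ASCII domains.
strasse = to_ascii_idna2003("stra\u00dfe.example")
```

Networking code that needs IDNA2008 or UTS#46 behavior would have to use a separate implementation rather than this codec.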

Word processing / typesetting software also absolutely needs to know about grapheme clusters in order to determine the size of each grapheme. Also: modern fonts.
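A quick sketch of why grapheme clusters are their own concern: counting code points does not give the number of user-perceived characters. The helper below is a deliberately rough approximation that only attaches combining marks to the preceding base character. It is not the UAX #29 segmentation algorithm and will get emoji ZWJ sequences, Hangul jamo, and regional indicator pairs wrong; real text layout needs a proper segmentation library.

```python
import unicodedata

def approx_clusters(text: str) -> list[str]:
    """Group each combining mark with the character before it.
    Rough approximation only; NOT full UAX #29 grapheme segmentation."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # attach the mark to the previous cluster
        else:
            clusters.append(ch)     # start a new cluster
    return clusters

word = "cafe\u0301"                 # decomposed spelling, 5 code points
code_points = len(word)
clusters = approx_clusters(word)    # 4 user-perceived characters
```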



So you're basically just listing everything you know about Unicode and then stating that every software developer MUST know the same. That's bullshit of course.


WAT? I gave a breakdown of when you might need support for specific aspects of Unicode. That is not an exhaustive list, just a list that will get 95% of developers covered in most cases. I didn't mention bi-di, for example, though I probably should have.



