Unrelated: In Unicode that would be a pretty difficult problem. The uppercase version of 'ß' is actually two characters, "SS". There are a lot of messed-up casing rules in the Unicode standard. http://unicode.org/faq/casemap_charprop.html#11
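
For anyone who wants to see this for themselves, here is a quick sketch in Python, whose built-in str.upper() applies the full (multi-character) case mappings. The variable names are just for illustration; nothing here comes from the linked FAQ.

```python
# Sketch: uppercasing can change the length of a string,
# and lowercasing the result does not get the original back.
s = "ß"                      # U+00DF LATIN SMALL LETTER SHARP S
upper = s.upper()            # full case mapping gives two characters
round_trip = upper.lower()   # "ss", not the original "ß"

print(repr(upper), len(s), len(upper))
print(repr(round_trip), round_trip == s)
```

So any code that assumes len(s) == len(s.upper()), or that upper/lower are inverses, is already wrong for German text.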


Sure, casing is complicated, but determining whether a codepoint has the Uppercase property is easy: just use ICU's u_isUUppercase().

Where it does get complicated is determining whether a codepoint or a digraph of multiple codepoints is a letter, and that's culture-dependent; e.g. IJ in Dutch.
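
To make the IJ case concrete, here is a rough sketch in Python. The function capitalize_word and its dutch flag are made up for this example (they are not an ICU or standard library API), and the rule is simplified to "a leading ij is one letter in Dutch".

```python
def capitalize_word(word: str, dutch: bool = False) -> str:
    """Hypothetical helper: uppercase the first *letter* of a word.

    With dutch=True, a leading "ij" is treated as a single letter,
    so both codepoints are uppercased together.
    """
    if dutch and word[:2].lower() == "ij":
        return "IJ" + word[2:]
    # Default: treat the first codepoint as the first letter.
    return word[:1].upper() + word[1:]


naive = capitalize_word("ijsselmeer")
dutch = capitalize_word("ijsselmeer", dutch=True)
print(naive, dutch)
```

The per-codepoint version gives "Ijsselmeer", which is wrong in Dutch; you only get "IJsselmeer" if the code knows the locale, and no property lookup on a single codepoint can tell you that.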



