You should post this to Show HN. Also you have a typo on your README ("characgte...

jart · on Sept 28, 2021

The one built into Python will get you most of the way there:

    >>> import unicodedata
    >>> unicodedata.normalize('NFKD', '𝓗℮𝐥1೦𝗵𝗲𝗹𝗹𝗼')
    'H℮l1೦hello'

Obviously it isn't going to remap leetspeak characters like 1 -> l, but it covers a lot of cases.
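
For the leetspeak part, one rough way to cover it is a small hand-written substitution table applied after normalization. This is only a sketch: `LEET_MAP` and `fold` are hypothetical names made up for illustration, not part of `unicodedata` or of the project being discussed, and the table is deliberately tiny.

```python
import unicodedata

# Hypothetical, deliberately tiny leetspeak table (an assumption for illustration);
# a usable one would need many more entries.
LEET_MAP = str.maketrans({"1": "l", "0": "o", "3": "e", "4": "a", "5": "s"})

def fold(text: str) -> str:
    """Apply NFKD compatibility decomposition, then the hand-written leetspeak table."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.translate(LEET_MAP).lower()

print(fold("𝗵𝗲𝗹1𝗼"))
print(fold("h3ll0"))
```

The obvious cost is that it also rewrites digits that really were meant as digits, so it probably only makes sense for matching, not for display.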

zerocrates · on Sept 29, 2021

Obviously you're saying it doesn't cover everything, but a big thing it's not going to catch beyond leetspeak-type situations is the kind of thing you (used to) see in internationalized domain spoofing: legitimate non-Latin-script letters that just look the same or nearly the same.

NFKC/NFKD will handle "this is another form of the Latin letter A" type stuff but not "Cyrillic A looks like Latin A."
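
A quick way to see the difference, using only the standard library. The `nfkc` helper is just a local shorthand for this example:

```python
import unicodedata

latin_a = "A"            # U+0041 LATIN CAPITAL LETTER A
fullwidth_a = "\uFF21"   # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"    # U+0410 CYRILLIC CAPITAL LETTER A

def nfkc(s: str) -> str:
    """Local shorthand for NFKC normalization."""
    return unicodedata.normalize("NFKC", s)

# A compatibility variant of the Latin letter folds to plain "A".
print(nfkc(fullwidth_a) == latin_a)        # True

# The Cyrillic letter is a distinct character and is left untouched.
print(nfkc(cyrillic_a) == latin_a)         # False
print(unicodedata.name(nfkc(cyrillic_a)))  # CYRILLIC CAPITAL LETTER A
```

Catching the second case needs a separate look-alike mapping, along the lines of the confusables data Unicode publishes with UTS #39, rather than normalization.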

wanderingstan · on Sept 28, 2021

Thanks, I've fixed the typo! It was such a simple project, hardly seems worthy of a "Show HN".

jonplackett · on Sept 28, 2021

I've seen crazier things get to #1

8note · on Sept 28, 2021

Test for the library: would it catch that that typo still refers to characters?