Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

yeah I think this is an area where rust (and python) just get it wrong. files, the Internet and input devices can all give you invalid Unicode. IMO it's better to have a primary string type that includes invalid Unicode since most algorithms will handle it correctly anyway, and the ones that won't can pretty clearly check and throw errors appropriately (especially since very few algorithms work correctly for all of Unicode in the first place)



There's a primary type that holds invalid utf8; it's [u8]. If you want a string from that, you try turn it into a string, and deal with the errors then.

If an algorithm works for invalid Unicode, it should probably be an algorithm on bytes, not strings.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: