
PEP 393 is a stupid compromise. They couldn't choose between UCS-2 and UCS-4, so they are using both. They are wasting tons of CPU cycles converting between them, and a single character outside the narrower range doubles the size of the whole string.
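For anyone who wants to check the size effect, here is a rough sketch, assuming CPython 3.3 or later (`sys.getsizeof` figures are implementation details, and `storage_width` is just a helper name made up for this example). Note that PEP 393 also has a 1-byte Latin-1 form, so the width per character is 1, 2, or 4 bytes, chosen by the widest character in the string.

```python
import sys


def storage_width(sample: str, n: int = 1000) -> int:
    """Estimate bytes stored per character for strings built from `sample`.

    Comparing two lengths of the same content cancels out the fixed
    object header, leaving only the per-character cost.
    Hypothetical helper; relies on CPython-specific sys.getsizeof behaviour.
    """
    short = sample * n
    long_ = sample * (2 * n)
    extra_bytes = sys.getsizeof(long_) - sys.getsizeof(short)
    extra_chars = len(long_) - len(short)
    return extra_bytes // extra_chars


if __name__ == "__main__":
    print(storage_width("a"))           # 1 on CPython 3.3+ (ASCII)
    print(storage_width("\u20ac"))      # 2 on CPython 3.3+ (BMP, euro sign)
    print(storage_width("\U0001F600"))  # 4 on CPython 3.3+ (astral emoji)
    # One wide character sets the width for the whole string:
    print(storage_width("a\u20ac"))     # 2 on CPython 3.3+
```

The last line is the case complained about above: every "a" in that string costs 2 bytes because of the euro sign next to it.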

I don't fully understand the use case for extracting code points from strings, but they could have just added a Java-like codePoints method and kept returning code units from the old methods. That would have been CPU and memory efficient and 100% backwards compatible.
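To make the code unit vs. code point distinction concrete, here is a small sketch in Python 3.3+ terms. `utf16_code_units` and `code_points` are hypothetical helper names, not an existing API; the first counts what a UTF-16 based `length` would report, the second is roughly what a codePoints-style method would yield.

```python
def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, i.e. what a UTF-16 based length would report."""
    # utf-16-le writes no BOM, so every code unit is exactly 2 bytes.
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> list:
    """Return the Unicode code points, similar in spirit to a codePoints method."""
    return [ord(ch) for ch in text]


if __name__ == "__main__":
    sample = "a\U0001F600"  # 'a' followed by one astral-plane emoji
    print(len(sample))               # 2 code points on Python 3.3+
    print(utf16_code_units(sample))  # 3 code units: the emoji is a surrogate pair
    print([hex(cp) for cp in code_points(sample)])  # ['0x61', '0x1f600']
```

Under the proposal above, the old methods would keep reporting the 3, and only the new method would see 2.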

I think the problem is that the same could have been done in Python 2 (with UTF-8), which would have meant fewer reasons for Python 3.


