
UTF-encode the malicious snippet and disguise it as the regex pattern haha


Doesn't UTF-8 just give you the exact same letters, since it's already ASCII?


I think it was meant to be UTF-8 escape codes, as in \u0000 etc.
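To make the difference concrete, here is a minimal sketch. It assumes Python string literals as a stand-in, since the thread doesn't say which language the snippet was in, and the word "eval" is just a hypothetical payload. It shows that plain UTF-8 encoding changes nothing for ASCII text, while \u escapes are a different spelling of the same string in source code:

```python
# Assumption: Python's \uXXXX escapes stand in for the escapes meant above;
# "eval" is a made-up example payload, not something from the thread.
plain = "eval"
escaped = "\u0065\u0076\u0061\u006c"  # U+0065 U+0076 U+0061 U+006C

# The escape is only a source-level spelling; the resulting string is identical.
same_string = escaped == plain

# For ASCII-only text, the UTF-8 bytes are exactly the ASCII bytes,
# so "UTF-8 encoding" alone disguises nothing.
same_bytes = plain.encode("utf-8") == plain.encode("ascii")

print(same_string, same_bytes)
```

So the disguise, if any, lives in how the source text looks to a reader, not in the bytes the program ends up with.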


Pedantry: These are really Unicode escape codes, not UTF-8 escape codes, since they denote Unicode code points. You can write invalid UTF-16 with these escapes (because, unlike UTF-8, the UTF-16 encoding borrows Unicode code points that are not scalar values, called the "surrogates"), but I don't know what the standard says should happen if you do that.
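For what it's worth, here is a rough sketch of what one implementation does with a lone surrogate escape. This is Python's behavior only, picked as an assumption since the thread doesn't name a language, and it doesn't settle what any particular standard requires; the helper `can_encode` is made up for illustration:

```python
# Assumption: this shows CPython 3 behavior only, not any language standard.
lone = "\ud800"  # U+D800, a high surrogate: a code point, but not a scalar value

def can_encode(text: str, encoding: str) -> bool:
    """Hypothetical helper: report whether a strict encode succeeds."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# The literal itself is accepted, but strict UTF-8 encoding rejects it.
lone_is_utf8_encodable = can_encode(lone, "utf-8")

# With the "surrogatepass" error handler, the raw UTF-16 code unit is emitted.
raw_unit = lone.encode("utf-16-le", "surrogatepass")

# Two escapes forming a surrogate pair stay two separate code points in a
# Python str; they only combine when round-tripped through UTF-16.
pair = "\ud83d\ude00"
combined = pair.encode("utf-16-le", "surrogatepass").decode("utf-16-le")

print(lone_is_utf8_encodable, raw_unit, len(pair), len(combined))
```

Other languages make different choices here, so treat this as one data point rather than the answer.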


Ahh that makes sense, thanks.



