I'm not convinced I should be writing my own UTF-8 validation in the year 2020, unless I'm working on the runtime of a programming language.
I expect the I/O facility of a modern programming language to either reject bad UTF-8 or "quarantine" it somehow (map the bad bytes to a safe representation according to some documented rules).
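
To make the two behaviours concrete, here is a minimal sketch using Python's built-in decoder as one example of such a facility. The sample bytes and the helper name `try_strict` are made up for illustration; the `errors` values are the documented standard error handlers.

```python
# Sample input (hypothetical): valid UTF-8 for "café", then a stray 0xFF byte,
# which can never appear in well-formed UTF-8.
bad = b"caf\xc3\xa9 \xff ok"


def try_strict(data: bytes):
    """Reject: the default 'strict' handler raises on malformed input."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


rejected = try_strict(bad)

# Quarantine, lossy: each malformed byte becomes U+FFFD REPLACEMENT CHARACTER.
replaced = bad.decode("utf-8", errors="replace")

# Quarantine, reversible: each malformed byte 0xNN becomes the lone surrogate
# U+DCNN, so the original bytes can be recovered on encode.
escaped = bad.decode("utf-8", errors="surrogateescape")
round_trip = escaped.encode("utf-8", errors="surrogateescape")
```

In both cases the validation itself is done by the language's decoder; the caller only picks a policy for the bad bytes.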