Well, you just need to call `unicode_next_character` everywhere instead of saying `s++`, and similarly for skipping whitespace, for asking whether a character can start or continue an identifier, etc. It does not change the basic nature of the task at all.
Sure, but if you are using a language without support for Unicode, and you don't use a dedicated library (which would already be using a kind of lexer, wouldn't it?), you also have to parse these Unicode characters yourself.
A unicode_next_character function is very simple to write regardless of unicode support in your language.
I usually write it as a small (256-byte) lookup table where each entry tells you how many bytes to skip next. If you don't use a lookup table, it's four single-line `if` statements (and if you do, it's a one-liner, plus however many lines the table takes).
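For anyone following along, here is a rough sketch of both variants in C. It assumes the input is UTF-8, which the comments above don't actually state, and the names `utf8_len_if`, `utf8_skip` and `utf8_init_table` are made up for illustration. It only looks at the lead byte and does no validation, so treat it as a sketch rather than the poster's actual code.

```c
#include <stddef.h>

/* Hypothetical helper: byte length of the UTF-8 sequence whose lead byte is b.
   Stray continuation bytes and invalid lead bytes return 1 so a scanner
   always makes progress. This is the "four single-line ifs" variant. */
static size_t utf8_len_if(unsigned char b)
{
    if (b < 0x80)           return 1;  /* 0xxxxxxx: ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;                          /* continuation or invalid byte */
}

/* The 256-byte table variant: one entry per possible lead byte.
   It is filled at startup here; it could equally be written out as a
   static initializer. */
static unsigned char utf8_skip[256];

static void utf8_init_table(void)
{
    for (int i = 0; i < 256; i++)
        utf8_skip[i] = (unsigned char)utf8_len_if((unsigned char)i);
}

/* The one-liner: advance past the character that starts at s.
   Call utf8_init_table() once before using this. */
static const char *unicode_next_character(const char *s)
{
    return s + utf8_skip[(unsigned char)*s];
}
```

In a lexer loop, `s = unicode_next_character(s);` then takes the place of `s++`. Because nothing here checks the continuation bytes, a string that is cut off in the middle of a multi-byte sequence can push the pointer past the terminator, so real code would want a bounds check or an end pointer.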