I originally laughed at "in five minutes", and while I do not think the article reads in five minutes, it does a surprisingly good job of covering the basics, so well done!

I do wonder whether it is clear to people who are unfamiliar with Unicode. Can anyone who is mostly unfamiliar with the details the article covers say how comprehensible it is?

I would also add a mention of the standard Unicode collation table, which does a passable job for many languages at the same time. The Unicode Collation Algorithm, for which this table is the default, is mentioned, but I think this property of most UCA implementations is worth highlighting.

As for the article's gotchas, multilingual text gets even more complex once you go past 5 minutes, even for "simple" European scripts. E.g. in Bosnian/Croatian/Serbian written in the Roman/Latin alphabet, "nj" is capitalized to "Nj" or "NJ" depending on the rest of the word: e.g. "Njegoš" or "NJEGOŠ". Confusingly, Unicode also includes digraph characters for both capitalization forms (the eternal tension in Unicode between encoding letters, glyphs or characters), even though they are linguistically equivalent. In practice they are never used, which makes their inclusion even more perplexing (they are always spelled out using two characters, and there was no historical reason since none of the 8-bit encodings had them)! "nj" will also sometimes be two distinct letters, especially in loanwords like "konjugovan". This makes things harder when you need to collate texts, since the proper order would be "konjugovan", "kontakt", "konj".
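For anyone who wants to poke at the digraph characters, here is a rough Python sketch using only the standard library. The variable names are mine, and it only illustrates the code points U+01CA, U+01CB and U+01CC and how the usual two-letter spelling behaves. It is not a recommendation to use the digraph characters.

```python
import unicodedata

# The three precomposed "nj" digraph characters in Unicode.
DIGRAPHS = {"upper": "\u01CA", "title": "\u01CB", "lower": "\u01CC"}

for form, ch in DIGRAPHS.items():
    # NFKC replaces each compatibility digraph with its two-letter spelling.
    spelled_out = unicodedata.normalize("NFKC", ch)
    print(form, f"U+{ord(ch):04X}", unicodedata.name(ch), "->", spelled_out)

# Case mapping on the single-character digraph picks the matching digraph.
lower = DIGRAPHS["lower"]
upper_of_digraph = lower.upper()   # U+01CA, the "NJ" form
title_of_digraph = lower.title()   # U+01CB, the "Nj" form

# The spelling people actually use: two ordinary letters.
word = "njegoš"
capitalized = word.capitalize()    # only the "n" is capitalized
uppercased = word.upper()          # both "n" and "j" are capitalized
print(capitalized, uppercased)
```

Note that the two-letter spelling gives the code no way to tell whether "nj" is one letter or two, which is exactly the collation problem with "konjugovan".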

All of this is why I like to joke that Cyrillic script is technically much better for all of these languages, even though it is basically in official use only for the Serbian language. In Cyrillic, there is no conundrum in either of the above examples, since nj=њ (or нј when it is two letters), Nj/NJ=Њ, and the order is clear: конјугован, контакт, коњ.
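To make the ordering point concrete, here is a small Python sketch. The names `SERBIAN_CYRILLIC`, `RANK` and `azbuka_key` are hypothetical helpers of mine, and a per-letter rank table is only a toy stand-in for a real UCA implementation with Serbian tailoring. It works in Cyrillic because every letter is a single character, so no dictionary is needed to decide whether "nj" is one letter or two.

```python
# The 30 letters of the Serbian Cyrillic alphabet, in alphabetical order.
SERBIAN_CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"
RANK = {letter: i for i, letter in enumerate(SERBIAN_CYRILLIC)}

def azbuka_key(word):
    """Sort key: the alphabet position of each letter.

    Characters outside the alphabet sort after all letters (an arbitrary
    choice made for this sketch).
    """
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]

cyrillic = sorted(["коњ", "контакт", "конјугован"], key=azbuka_key)
print(cyrillic)     # конјугован, контакт, коњ

# The same words in Latin script, sorted naively character by character:
# "konj" lands first because it is a prefix of "konjugovan".
latin_naive = sorted(["kontakt", "konj", "konjugovan"])
print(latin_naive)  # konj, konjugovan, kontakt
```

Getting the Latin order right would need to know that the "nj" in "konj" is a single letter while the "nj" in "konjugovan" is not, and no per-character key can tell you that.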