Fun stuff. But this reminds me - what's the state of Unicode URL security? Not my field but curious.
Eg, 2005:
"Here's a demo: it's a Web page that appears to be www.paypal.com but is not PayPal. Everything from the address bar to the hover-over status on the link says www.paypal.com. It works by substituting a Unicode character for the second "a" in PayPal. That Unicode character happens to look like an English "a," but it's not an "a." The attack works even under SSL."
> .eu IDN domain names can consist of Latin, Greek and Cyrillic characters used in any of the official languages of the European Union. Many homoglyphs contain characters from different scripts. A .eu IDN domain name can consist of characters based on one of the three scripts. Therefore, many domain names that would otherwise be confusingly similar are prevented from being registered.
That does remove a large class of attacks, but there are still IDN homograph attacks possible with single-script domains, if they happen to use only characters that can be approximated in another script without mixing scripts. For example, if you registered να.eu ('να!' is a Greek interjection), you could be spoofed by the Latin va.eu.
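To make the να.eu case concrete, here is a rough Python sketch of a one-script check. It is not how any registry actually does it: Python's standard library has no script property, so this guesses the script from the first word of each character's Unicode name, and `scripts_in_label` and `is_single_script` are made-up names.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of scripts in a label, guessed from Unicode character names."""
    scripts = set()
    for ch in label:
        # Digits and hyphens are shared by all scripts, so skip them.
        if ch.isdigit() or ch == "-":
            continue
        # Assumption: the first word of the name ("LATIN", "GREEK",
        # "CYRILLIC") is a good enough stand-in for the script.
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split()[0])
    return scripts


def is_single_script(label):
    """True if every letter in the label appears to come from one script."""
    return len(scripts_in_label(label)) <= 1


print(is_single_script("\u03bd\u03b1"))  # Greek "να": True
print(is_single_script("va"))            # Latin "va": True
print(is_single_script("p\u0430ypal"))   # Latin with a Cyrillic "а": False
```

Both να and va pass, which is exactly the gap: each label is single-script on its own, and the check never compares them to each other.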
ICANN's approach goes a step further and disallows registering a new domain if it is confusingly similar to an already registered domain. To allow that to be checked automatically, they use a set of quasi-equivalent pairs (e.g. Greek 'α' <-> Latin 'a') to determine what counts as confusingly similar.
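A pair-based check like that could look roughly like the sketch below. The table is a tiny hand-picked toy, not ICANN's actual list, and `skeleton` and `confusingly_similar` are names invented for the illustration. The idea is to map every character to a canonical look-alike and compare the results.

```python
# Toy table of quasi-equivalent pairs (assumption: hand-picked for this
# example; a real table would be far larger).
CONFUSABLES = {
    "\u03b1": "a",  # Greek alpha
    "\u03bd": "v",  # Greek nu
    "\u0430": "a",  # Cyrillic a
    "\u0440": "p",  # Cyrillic er
    "\u0443": "y",  # Cyrillic u
}
_TABLE = str.maketrans(CONFUSABLES)


def skeleton(label):
    """Replace each listed character with its Latin look-alike."""
    return label.translate(_TABLE)


def confusingly_similar(new, existing):
    """True if two different labels collapse to the same skeleton."""
    return new != existing and skeleton(new) == skeleton(existing)


# Latin "va" against an already registered Greek "να".
print(confusingly_similar("va", "\u03bd\u03b1"))
# Cyrillic "раура" collapses to the Latin string "paypa".
print(skeleton("\u0440\u0430\u0443\u0440\u0430"))
```

A registry would run the new label against every existing registration, so the quality of the check depends entirely on how complete the pair table is.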
Let's see if I can knock together "paypal" out of Cyrillic...
раура
The hard part is the "L," which you can sort-of fake with a 1.
раура1.bg
The "one-script" idea is better than nothing, but it's far from complete. For any two European scripts, the overlap of shared characters is substantial. Some domains will have good protection, and others nothing at all.
[edit] Oh, here's a better one:
раур.al
This one abuses the Albanian top-level domain to look like a cute shortened version. Albania has a small population of native Russian- and Bulgarian-speakers who would have legitimate cause for using a Cyrillic-script domain.
Well, those all look similar, but they don't look exactly the same with only the Unicode code points differing. I don't think it's the registrar's job to protect users from all possible phishing attacks.
It took me forever to figure out what was going on in that post, because in the font Firefox displays the linked page in, '✪' renders like an 'o', so it looked like he was talking about the rather unremarkable 'odf.ws'.
This is an old argument I've had for years: the address bar should show non-ASCII Unicode characters in a distinctive manner, such as with a different background color. A bright red background would draw the eye.
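As a text-only stand-in for the colored background, here is a small Python sketch that flags every non-ASCII character in a hostname by replacing it with its code point in brackets. `mark_non_ascii` is a hypothetical helper, not something any browser exposes.

```python
def mark_non_ascii(host):
    """Replace each non-ASCII character with a visible [U+XXXX] marker."""
    return "".join(
        ch if ord(ch) < 128 else f"[U+{ord(ch):04X}]"
        for ch in host
    )


# The second "a" here is Cyrillic U+0430, not Latin "a".
print(mark_non_ascii("payp\u0430l.com"))  # payp[U+0430]l.com
print(mark_non_ascii("paypal.com"))       # paypal.com
```

A real address bar would presumably style the character instead of replacing it, but the detection step is the same one-line test.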
It is probably worth noting the latest version of the IDN protocol (RFC 5891) makes this URL illegal. It will only work with implementations that use the earlier deprecated version (RFC 3490).
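For anyone who wants to poke at the older behavior: Python's built-in `idna` codec implements the earlier RFC 3490 rules, so it gives a quick way to see the ASCII (Punycode) form a label is converted to. This is only a sketch; `to_ascii` and `to_unicode` are wrapper names chosen here, and the codec says nothing about what RFC 5891 would accept.

```python
def to_ascii(hostname):
    """Encode a hostname to its ASCII form using Python's RFC 3490 codec."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Decode an ASCII hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


# Cyrillic "раура" under .bg, written with escapes to make the script explicit.
spoof = "\u0440\u0430\u0443\u0440\u0430.bg"
encoded = to_ascii(spoof)
print(encoded)                        # an "xn--" label followed by ".bg"
print(to_unicode(encoded) == spoof)   # round-trips back to the Cyrillic form
```

Showing the `xn--` form instead of the Unicode form is one blunt way for software to make a spoofed label stand out.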
Source for the 2005 quote: http://www.schneier.com/blog/archives/2005/02/unicode_url_ha...