Hacker News
http://www.★.com (★.com)
24 points by Moto7451 on May 3, 2013 | 24 comments


Fun stuff. But this reminds me: what's the state of Unicode URL security? Not my field, but I'm curious.

E.g., 2005: "Here's a demo: it's a Web page that appears to be www.paypal.com but is not PayPal. Everything from the address bar to the hover-over status on the link says www.paypal.com. It works by substituting a Unicode character for the second "a" in PayPal. That Unicode character happens to look like an English "a," but it's not an "a." The attack works even under SSL."

http://www.schneier.com/blog/archives/2005/02/unicode_url_ha...


EURid doesn’t allow mixing scripts:

> Why can’t I mix scripts in my IDN?

> .eu IDN domain names can consist of Latin, Greek and Cyrillic characters used in any of the official languages of the European Union. Many homoglyphs contain characters from different scripts. A .eu IDN domain name can consist of characters based on one of the three scripts. Therefore, many domain names that would otherwise be confusingly similar are prevented from being registered.

http://www.eurid.eu/en/faq

I assume other registrars have similar restrictions.
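
A rough sketch of what a "no mixed scripts" check could look like. This is an assumption about the general idea, not EURid's actual implementation; the helper names `scripts_used` and `is_single_script` are made up, and the script is guessed from the first word of each character's Unicode name, which is only a heuristic that happens to work for Latin, Greek and Cyrillic letters.

```python
import unicodedata

def scripts_used(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Heuristic: for Latin, Greek and Cyrillic letters the first word
            # of the Unicode name is the script name.
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts

def is_single_script(label):
    """True if all letters in the label appear to come from one script."""
    return len(scripts_used(label)) <= 1

print(is_single_script("paypal"))        # all Latin letters
print(is_single_script("p\u0430ypal"))   # second letter is CYRILLIC SMALL LETTER A
```

A production check would presumably use the Unicode Script property rather than character names.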


That does remove a large class of attacks, but there are still IDN homograph attacks possible with single-script domains, if they happen to use only characters that can be approximated in another script without mixing scripts. For example, if you registered να.eu ('να!' is a Greek interjection), you could be spoofed by the Latin va.eu.

ICANN's approach goes a step further and disallows registering a new domain if it is confusingly similar to an already registered domain. To allow that to be checked automatically, they use a set of quasi-equivalent pairs (e.g. Greek 'α' <-> Latin 'a') to determine what counts as confusingly similar.
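
To illustrate the quasi-equivalent-pairs idea, here is a minimal sketch. The table `CONFUSABLE_TO_LATIN` and the functions `skeleton` and `confusingly_similar` are hypothetical; the handful of pairs is chosen for this thread's examples and is not the real table or procedure.

```python
# Hypothetical, tiny table of quasi-equivalent pairs; a real table is far larger.
CONFUSABLE_TO_LATIN = {
    "\u03b1": "a",  # GREEK SMALL LETTER ALPHA
    "\u03bd": "v",  # GREEK SMALL LETTER NU
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
}

def skeleton(label):
    """Map every character to its look-alike representative."""
    return "".join(CONFUSABLE_TO_LATIN.get(ch, ch) for ch in label.lower())

def confusingly_similar(new_label, registered):
    """True if the new label collapses to the same skeleton as a registered one."""
    target = skeleton(new_label)
    return any(skeleton(existing) == target for existing in registered)

# The Greek label from the example above blocks the Latin look-alike "va".
print(confusingly_similar("va", ["\u03bd\u03b1"]))
```

Comparing skeletons rather than raw strings means the check works in both directions: whichever of the two labels is registered first blocks the other.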


Let's see if I can knock together "paypal" out of Cyrillic...

  раура
The hard part is the "L," which you can sort-of fake with a 1.

  раура1.bg
The "one-script" idea is better than nothing, but it's far from complete. For any two European scripts, the overlap of shared characters is substantial. Some domains will have good protection, and others nothing at all.

[edit] Oh, here's a better one:

  раур.al
This one abuses the Albanian top-level domain to look like a cute shortened version. Albania has a small population of native Russian- and Bulgarian-speakers who would have legitimate cause for using a Cyrillic-script domain.
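
For anyone who wants to see what is going on under the glyphs, a small sketch that prints the code points of the Cyrillic string used above. The string is written with escapes on the assumption that the characters are U+0440, U+0430 and U+0443; the ASCII form is printed without stating what it should be.

```python
import unicodedata

# The Cyrillic string that renders like "paypa" (assumed code points).
label = "\u0440\u0430\u0443\u0440\u0430"

for ch in label:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Visually similar, but not the same string as the Latin one.
print(label == "paypa")

# The ASCII-compatible form that actually goes into DNS.
print(label.encode("idna"))
```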


Well, those all merely look similar; they are not strings that look exactly the same and differ only in their Unicode code points. I don’t think it’s the registrar’s job to protect users from all possible phishing attacks.

Especially since https://secure-paypal-processing-eu3.net would work just as well for most scamming purposes.


Indeed, especially when hack jobs like this work: http://www.macrumors.com/2013/05/01/new-apple-id-phishing-ef...




Gruber explained long ago why using these URLs is a bad idea: http://daringfireball.net/2010/09/i_give_up.


It took me forever to figure out what was going on in that post, because in the font Firefox displays the linked page in, '✪' renders like an 'o', so it looked like he was talking about the rather unremarkable 'odf.ws'.


Interesting. Using Firefox, I see the star glyph, not an 'o', both on that page and on this one.


This is an old argument I've had for years: the address bar should show unicode characters in a distinctive manner, such as with a different background color. A bright red background would draw the eye.
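
A text-only sketch of the same idea, assuming a made-up helper `mark_non_ascii`: instead of a colored background, each non-ASCII character is replaced with its code point in brackets so it cannot pass for an ASCII letter.

```python
def mark_non_ascii(host):
    """Replace each non-ASCII character with a visible [U+XXXX] marker."""
    return "".join(
        ch if ord(ch) < 128 else f"[U+{ord(ch):04X}]"
        for ch in host
    )

# A "paypal" whose second letter is CYRILLIC SMALL LETTER A.
print(mark_non_ascii("p\u0430ypal.com"))
```

A real address bar would need something less ugly for domains that are legitimately non-Latin, which is presumably why browsers use per-script or per-TLD policies instead.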


It is probably worth noting the latest version of the IDN protocol (RFC 5891) makes this URL illegal. It will only work with implementations that use the earlier deprecated version (RFC 3490).
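
A rough way to see why, as a sketch: the newer IDNA rules (RFC 5892) start from letters, digits and marks, while the star is a symbol. The function `looks_idna2008_safe` is a made-up approximation based only on Unicode general categories; the real derivation has more rules and exceptions, so treat it as an illustration rather than a validator.

```python
import unicodedata

# General categories RFC 5892 groups as letters, digits and marks.
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

def looks_idna2008_safe(label):
    """Very rough approximation: only letters, digits, marks and hyphens."""
    return all(
        unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES or ch == "-"
        for ch in label
    )

print(unicodedata.category("\u2605"))   # general category of BLACK STAR
print(looks_idna2008_safe("\u2605"))
print(looks_idna2008_safe("example"))
```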


What is "xn--p3h" and how do you convert to it?

Another one I've seen: http://⚡.la


http://en.wikipedia.org/wiki/Internationalized_domain_name#T... describes the process behind the Punycode used in internationalized domain names (IDNs).


#!/usr/bin/env python3

# Python 3: decode the ASCII (Punycode) label back to its Unicode form.
print(b"xn--p3h".decode("idna"))
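
The snippet above goes from the ASCII form back to Unicode. For the "how do you convert to it" half of the question, a minimal Python 3 sketch of the other direction, using the standard library's `idna` codec (which, as far as I know, implements the older RFC 3490 rules):

```python
host = "www.\u2605.com"  # U+2605 BLACK STAR

# Each non-ASCII label is Punycode-encoded and prefixed with "xn--".
ascii_host = host.encode("idna")
print(ascii_host)

# Round trip back to the Unicode form.
print(ascii_host.decode("idna") == host)
```

ASCII-only labels such as `www` and `com` pass through unchanged; only the star label is rewritten.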



https://www.google.com/search?q=%E2%98%85&oq=%E2%98%85&#...

It needs to improve its SEO... or maybe Google could start indexing better?


And, it's squatted. Well, it's good to know that there's a whole new world of Unicode domain names out there just waiting to be misused.

I would have loved to see this go to the IAU (http://iau.org) or a similar organization.


In the address bar, Chrome shows http://www.xn--p3h.com/ while Safari and Firefox show http://www.★.com


I got http://◳.com/ when I noticed these inexplicably-valid Unicode ranges in the IDN spec. But that seems to be on its way to being fixed, sadly.


This is HN: http://➡.ws/hnhnhn

You could make your own: http://tinyarrows.com/


http://🍎.tk is another fun one.


So how exactly does someone register a domain name like this?



