These tests check whether a user agent displays IDNs (Internationalized Domain Names) as Unicode or punycode in the status bar. User agents that try to detect possible homograph attacks do so in different ways. These tests explore some of those approaches. They are not exhaustive, and the results may change over time, since there is no standard for how to proceed in this respect, and some of the tests are based on lists that may change.
For more information about what to expect, see the article "An Introduction to Multilingual Web Addresses".
This section summarizes the results for the user agents tested; the detailed results are given in the tables below.
IE7 doesn't care about the TLD in the IDN, but it does prevent mixing of scripts in most cases. ASCII can be mixed with other scripts, but not with Cyrillic or Greek. Japanese kanji and hiragana and ASCII can be mixed.
IE7 does, however, produce different behaviour depending on which languages are declared in the browser preferences. The others do not. This means that if you are dealing with a language that is not defined in IE7's selection list, you will only see punycode.
Firefox allows any combination of characters, provided that the TLD is on its approved list.
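The rule described for Firefox amounts to a lookup on the TLD. The following is a minimal sketch of that idea, not Firefox's actual implementation. The names `APPROVED_TLDS` and `display_form` are hypothetical, and the set holds only TLDs for which the Firefox column in the result tables shows Unicode; it is not the real list.

```python
# Hypothetical approved-TLD set: only TLDs for which the Firefox column in the
# result tables shows Unicode. This is NOT Firefox's real list.
APPROVED_TLDS = {"is", "hu", "fi", "gr", "jp", "de", "th"}


def display_form(domain: str, approved_tlds=APPROVED_TLDS) -> str:
    """Return the Unicode form if the TLD is approved, else the punycode form."""
    tld = domain.rsplit(".", 1)[-1].lower()
    if tld in approved_tlds:
        return domain
    # Python's built-in "idna" codec (IDNA 2003) produces the xn-- form.
    return domain.encode("idna").decode("ascii")


print(display_form("char\u00fe.is"))   # TLD in the set: shown as Unicode
print(display_form("char\u00fe.com"))  # TLD not in the set: shown as punycode
```

Because the decision depends only on the TLD, the characters in the rest of the name play no part in this sketch. The special handling that the tables suggest for characters such as the fraction slash is not modelled here.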
Opera also allows any combination of characters for TLDs on its whitelist. For TLDs not on the whitelist, the situation is not always clear. From the tests here it seems that any combination of characters is also allowed for many TLDs not on the whitelist, and not just Latin-1 characters as stated in Opera's description. The exception is that combinations of Greek or Cyrillic with Latin characters are displayed as punycode if the TLD isn't on the whitelist. On the other hand, Devanagari is displayed as punycode if the TLD is not whitelisted, unless it is combined with ASCII or accented Latin characters (which seems odd).
Safari displays any IDN containing only characters from one or more scripts in the whitelist as Unicode, and any other IDN as punycode.
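The Safari rule can be sketched as a subset test on the scripts found in the name. This is only an illustration under stated assumptions. Python's standard library does not expose the Unicode Script property, so the hypothetical helper `scripts_in` approximates a character's script by the first word of its Unicode character name. That works for letters in the scripts tested here, but not for symbols such as ♥. The `ALLOWED_SCRIPTS` set is also hypothetical: it was chosen to match the Safari column in the result tables and is not Safari's real default whitelist.

```python
import unicodedata

# Hypothetical script whitelist matching the Safari column in the result
# tables. This is NOT Safari's real default list.
ALLOWED_SCRIPTS = {"LATIN", "CJK", "HIRAGANA", "KATAKANA",
                   "DEVANAGARI", "ARMENIAN", "THAI"}


def scripts_in(domain: str) -> set:
    """Approximate each character's script by the first word of its Unicode name."""
    scripts = set()
    for ch in domain:
        if ch.isascii():
            # Treat ASCII letters, digits, hyphen and the dot as Latin.
            scripts.add("LATIN")
        else:
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def safari_like_display(domain: str) -> str:
    """Show Unicode only if every script in the name is whitelisted."""
    if scripts_in(domain) <= ALLOWED_SCRIPTS:
        return domain
    return domain.encode("idna").decode("ascii")


print(scripts_in("p\u0430ypal.com"))  # contains both LATIN and CYRILLIC
```

Unlike the TLD-based approach, this check ignores the TLD entirely, which is why the same Cyrillic label yields punycode under both .ru and .fi in the Safari column.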
The famous pаypal.com IDN, which has a Cyrillic а after the first p, is displayed as punycode by IE7 because it mixes disallowed scripts, by Firefox because .com is not on Firefox's list of supported TLDs, and by Safari because Cyrillic is not on the default whitelist of scripts. Opera, however, displays this as Unicode, since .com is on its whitelist.
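To make the difference between the two display forms concrete, the short example below converts the spoofed name to punycode using Python's built-in `idna` codec (which implements IDNA 2003). The variable names are illustrative only, and the Cyrillic letter is written as an escape so that it is visible in the source.

```python
# "p" + CYRILLIC SMALL LETTER A (U+0430) + "ypal.com"
spoof = "p\u0430ypal.com"
genuine = "paypal.com"

# The two strings look alike on screen but are different strings.
print(spoof == genuine)  # False

# The punycode form makes the difference obvious.
punycode = spoof.encode("idna").decode("ascii")
print(punycode)  # xn--pypal-4ve.com

# Decoding the punycode form gives back the original Unicode name.
round_trip = punycode.encode("ascii").decode("idna")
print(round_trip == spoof)  # True
```

A user agent that shows the `xn--` form in the status bar therefore gives the user a clear signal that the link does not point to the all-ASCII domain.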
The following user agents were tested on Windows XP.
When Internet Explorer had only English set in the browser preferences, all of the tests produced punycode in the status bar. For the results below the following languages were set in the preferences: Russian, Japanese, German, Greek, Hindi.
At the time the tests were run, Firefox's list and the Opera whitelist in opera6.ini were configured as follows:
The columns represent user agents tested on a given version and date. The cells contain P if the test to the left produced punycode. The notes below the tables attempt explanations of certain aspects of the tests. For Firefox and Opera, colour-coding is used to indicate which TLDs are in their whitelist (green) and which are not (red). For Safari, the same colours are used to indicate whether the IDN contains characters from a script not in the whitelist (red) or not (green).
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
charþ.is | - | - | - | - |
charő.hu | - | - | - | - |
charþ.hu | - | - | - | - |
charő.is | - | - | - | - |
charþ.com | - | P | - | - |
charő.com | - | P | - | - |
charþ.xy | - | P | - | - |
charő.xy | - | P | - | - |
charþ.fi | - | - | - | - |
charő.fi | - | - | - | - |
Notes:
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
кириллица.ru | - | P | - | P |
ελληνικά.gr | - | - | - | P |
漢字.jp | - | - | - | - |
かな.jp | - | - | - | - |
यूनिकोड.in | - | P | P | - |
кириллица.fi | - | - | - | P |
ελληνικά.fi | - | - | - | P |
漢字.fi | - | - | - | - |
यूनिकोड.fi | - | - | P | - |
यूनिकोड.de | - | - | - | - |
Հայերեն.de | - | - | - | - |
Հայերեն.am | - | P | - | - |
ภาษาไทย.th | - | - | - | - |
ภาษาไทย.com | - | P | - | - |
ህሔራዊነት.de | P | - | - | P |
ህሔራዊነት.er | P | P | - | P |
Notes:
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
кириллицаascii.ru | P | P | P | P |
ελληνικάascii.gr | P | - | P | P |
漢字ascii.jp | - | - | - | - |
かなascii.jp | - | - | - | - |
यूनिकोडascii.in | - | P | - | - |
кириллицаascii.de | P | - | - | P |
ελληνικάascii.de | P | - | - | P |
漢字ascii.de | - | - | - | - |
かなascii.de | - | - | - | - |
यूनिकोडascii.de | - | - | - | - |
кириллицchará.ru | P | P | P | P |
ελληνικάchará.gr | P | - | P | P |
漢字chará.jp | P | - | - | - |
かなchará.jp | P | - | - | - |
यूनिकोडchará.in | P | P | - | - |
кириллицchará.de | P | - | - | P |
ελληνικάchará.de | P | - | - | P |
漢字chará.de | P | - | - | - |
かなchará.de | P | - | - | - |
यूनिकोडchará.de | P | - | - | - |
pаypal.com | P | P | - | P |
Notes:
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
漢字かな.jp | - | - | - | - |
漢字かな.de | - | - | - | - |
漢字かな.ru | - | P | - | - |
漢字かな.in | - | P | - | - |
漢字かなascii.jp | - | - | - | - |
漢字かなchará.jp | P | - | - | - |
Notes:
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
кириллица漢字.ru | P | P | P | P |
кириллица漢字.jp | P | - | - | P |
यूनिकोड漢字.in | P | P | P | - |
यूनिकोड漢字.jp | P | - | - | - |
ελληνικά漢字.jp | P | - | - | P |
ελληνικά漢字.gr | P | - | P | P |
Notes:
Test | IE7 | Firefox | Opera | Safari |
---|---|---|---|---|
Version | | 2.0.0.3 | 9.10 | 2.0.1 |
Date tested | 20070323 | 20070323 | 20070323 | 20070324 |
example.com⁄foo.museum | P | P | illegal | - |
I♥NY.museum | P | - | - | - |
Notes: