URI (%C3%A5) versus IRI (å)
Exposing variation across browsers in which id an IRI fragment matches versus which id the equivalent URI fragment matches.
Key question: How do browsers treat an id whose value matches, character by character, the percent-encoded form of a fragment identifier (minus the hash character)?
Results summary:
The Blink behavior indicates that, for percent-encoded URLs, the browser tries a literal match before it tries the 'semantic' UTF-8 based match.
The WebKit behavior indicates that the browser tries a literal match even for directly typed URLs (when the code point is higher than U+009F, as required by the URL spec).
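The matching orders described above can be sketched as a small lookup model. This is only an illustration of the observed behavior, not browser source code: the function `find_target` and its parameters `literal_first` and `encode_first` are hypothetical names, and the model is checked only against the two ids used in Table 1.

```python
from urllib.parse import quote, unquote


def find_target(fragment, ids, literal_first, encode_first=False):
    """Return the id a fragment would select under a simplified lookup model.

    fragment      -- the URL fragment without the leading '#'
    ids           -- set of id values present on the page
    literal_first -- try the fragment as written before its decoded form
    encode_first  -- percent-encode the fragment before matching
    """
    if encode_first:
        # Existing '%' escapes are kept intact; raw non-ASCII characters
        # are replaced by their UTF-8 percent-encoded form.
        fragment = quote(fragment, safe="%")
    decoded = unquote(fragment)  # a single pass of UTF-8 percent-decoding
    order = [fragment, decoded] if literal_first else [decoded, fragment]
    for candidate in order:
        if candidate in ids:
            return candidate
    return None


ids = {"å", "%C3%A5"}

# Decode-first model (consistent with the Ffox and IE11 results in Table 1).
print(find_target("%C3%A5", ids, literal_first=False))
# Literal-first model (consistent with the Chrome results).
print(find_target("%C3%A5", ids, literal_first=True))
print(find_target("å", ids, literal_first=True))
# Encode-then-literal model (consistent with the Safari results).
print(find_target("å", ids, literal_first=True, encode_first=True))
```

Under every variant of this model the fragment `%25C3%25A5` selects id=%C3%A5, because the literal form matches no id and one decoding pass yields `%C3%A5`.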
Testing id="å" and id="%C3%A5".
Table 1: Test URLs
| URL test | Expected idref to be targeted | Actually targeted: id=å | Actually targeted: id=%C3%A5 |
IRI as URI: | #%C3%A5 | id=å | Ffox, IE11 | Safari, Chrome |
Explicitly targeting %C3%A5 | #%25C3%25A5 | id=%C3%A5 | | Ffox, IE11, Safari, Chrome |
IRI as named character entity: | #å | id=å | Ffox, IE11, Chrome | Safari |
IRI as decimal character reference: | #å | id=å | Ffox, IE11, Chrome | Safari |
IRI as hexadecimal character reference: | #å | id=å | Ffox, IE11, Chrome | Safari |
IRI as directly typed character: | #å | id=å | Ffox, IE11, Chrome | Safari |
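The relationship between the fragment spellings in Table 1 can be reproduced with Python's standard library. This is a sketch for orientation only: the entity spellings `&aring;`, `&#229;` and `&#xE5;` are assumed from the row labels, since the table shows only the decoded character, and the variable names are illustrative.

```python
import html
from urllib.parse import quote, unquote

iri_fragment = "å"                   # U+00E5, typed directly
uri_fragment = quote(iri_fragment)   # UTF-8 bytes C3 A5, percent-encoded
escaped_uri = quote(uri_fragment)    # each '%' becomes '%25'

# Assumed HTML spellings of the same character (named, decimal, hexadecimal).
html_spellings = ["&aring;", "&#229;", "&#xE5;"]
decoded_spellings = [html.unescape(s) for s in html_spellings]

print(uri_fragment)
print(escaped_uri)
print(unquote(escaped_uri))
print(decoded_spellings)
```

The three character references decode to the same character before the URL is parsed, so those rows differ from the directly typed row only in how the link is written in the markup.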
Table 2: Idref targets for test URLs in table 1.
Target 1: | Target 2: |
id=å | id=%C3%A5 |
Back to table 1
Testing only id="æ" (no test of id="%C3%A6")
Table 3: Test URLs
| URL test | Expected idref to be targeted | Actually targeted: id=æ | Actually targeted: id=%C3%A6 |
IRI as URI: | #%C3%A6 | id=æ | Ffox, IE11 | Safari, Chrome |
IRI as directly typed character: | #æ | id=æ | Ffox, IE11, Chrome | Safari |
Table 4: Idref target for test URLs in table 3.
Target |
id=æ |
Back to table 3
Testing id="a" and id="%61".
Table 5: Test URLs
| URL test | Expected idref to be targeted | Actually targeted: id=a | Actually targeted: id=%61 |
IRI as URI: | #%61 | id=a | Ffox, IE11 | Safari, Chrome |
Explicitly targeting %61 | #%2561 | id=%61 | | Ffox, IE11, Safari, Chrome |
IRI as decimal character reference: | #a | id=a | Ffox, IE11, Chrome | Safari |
IRI as hexadecimal character reference: | #a | id=a | Ffox, IE11, Chrome | Safari |
IRI as directly typed character: | #a | id=a | Ffox, IE11, Chrome | Safari |
Table 6: Idref targets for test URLs in table 5.
Target 1: | Target 2: |
id=a | id=%61 |
Back to table 5