This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 25253 - IRI issue in Blink/Webkit: href='#%C3%A5' targets both id='%C3%A5' and id="å"
Summary: IRI issue in Blink/Webkit: href='#%C3%A5' targets both id='%C3%A5' and id="å"
Status: RESOLVED NEEDSINFO
Alias: None
Product: WHATWG
Classification: Unclassified
Component: URL
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+urlspec
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-04 03:45 UTC by Leif Halvard Silli
Modified: 2014-05-22 10:27 UTC
CC List: 2 users

See Also:


Attachments
Test cases: A file with 3 sets of tests. (9.19 KB, text/html)
2014-04-04 03:45 UTC, Leif Halvard Silli
Test cases: A file with 3 sets of tests. (9.38 KB, text/html)
2014-04-04 03:52 UTC, Leif Halvard Silli
Test cases: A file with 3 sets of tests. (9.83 KB, text/html)
2014-04-04 10:47 UTC, Leif Halvard Silli
Test cases: A file with 4 sets of tests. (12.90 KB, text/html)
2014-04-04 12:15 UTC, Leif Halvard Silli
Test cases: A file with 6 sets of tests. (16.20 KB, text/html)
2014-04-04 13:05 UTC, Leif Halvard Silli
Test cases: A file with 6 sets of tests. (16.30 KB, text/html)
2014-04-04 19:45 UTC, Leif Halvard Silli

Description Leif Halvard Silli 2014-04-04 03:45:15 UTC
Created attachment 1463 [details]
Test cases: A file with 3 sets of tests.

Problem: 

1 For percent-encoded bytes in an activated link, some browsers try a
  literal match for the raw, percent-encoded string before they try a
  "semantic" match for the string of decoded characters.
2 When this happens, the URL can target two different destinations - 
  e.g. an idref that matches the raw percent-encoded string plus an 
  idref that matches the decoded characters. If the raw string exists 
  in an idref, it will "win" over an idref with the decoded string.
3 The URL spec should clarify whether testing for a literal match
  before testing for a decoded match is permitted.
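
The two lookup orders described above can be sketched as follows. This is only an illustrative model, not any browser's actual code: the function name `resolve`, its parameters, and the use of Python's `urllib.parse.unquote` as a stand-in for percent-decoding plus UTF-8 decoding are all assumptions made for the example.

```python
from urllib.parse import unquote


def resolve(fragment, ids, literal_first):
    """Return the id a fragment selects, or None if nothing matches.

    `fragment` is the part after '#', exactly as the link carries it.
    `ids` is the set of id values present in the document.
    (Hypothetical helper, for illustration only.)
    """
    decoded = unquote(fragment)  # percent-decode, then decode as UTF-8
    order = (fragment, decoded) if literal_first else (decoded, fragment)
    for candidate in order:
        if candidate in ids:
            return candidate
    return None


ids = {"å", "%C3%A5"}
print(resolve("%C3%A5", ids, literal_first=True))   # %C3%A5
print(resolve("%C3%A5", ids, literal_first=False))  # å
```

With only id="å" in the document, both orders return the same element; the difference shows up only once an id equal to the raw percent-encoded string is also present.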

Affected browsers:

          (*Not* affected: Firefox and IE11.)

* Blink:  problem manifests itself only if author uses percent-encoded
          strings as URL value (e.g. href='#%C3%A5' or href="#%61").
* Webkit: problem manifests itself even for directly typed URLs when the
          code points are higher than U+009F (in which case the code 
          points are converted to percent-encoded bytes by the URL
          parser). Affects href="å" + also href="å" href="å"

Example - how to replicate:

1) To a document with this fragment, <div  id="å">Fragment 1.</div>
2) add the following two links,
    a) <a href='#å'     >Link A.</a> - directly typed;
    b) <a href='#%C3%A5'>Link B.</a> - percent-encoded;
3) and activate Link A + Link B to verify that each targets Fragment 1.
4) Now, add yet another fragment: <div  id='%C3%A5'>Fragment 2.</div>
5) Activate Link A and Link B, and check which fragment is targeted:

   Expected results: 
       * Both links should target Fragment 1. 
       * No link should target Fragment 2.

   Actual results:
       * IE11/Firefox: Both links target Fragment 1.
       * Blink:        Link B targets Fragment 2.
       * Webkit:       Both links target Fragment 2.
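
The reported results can be reproduced with a small model of the three behaviours. This is a sketch under assumptions drawn from the description above, not a statement about how any engine is implemented: the functions `blink`, `webkit` and `ie11_firefox` are hypothetical, and `urllib.parse.quote`/`unquote` stand in for the URL parser's percent-encoding and for percent-decoding plus UTF-8 decoding.

```python
from urllib.parse import quote, unquote

# The two fragments from the example, keyed by their id value.
IDS = {"å": "Fragment 1", "%C3%A5": "Fragment 2"}


def lookup(candidates):
    """Return the first fragment whose id equals one of the candidates."""
    for candidate in candidates:
        if candidate in IDS:
            return IDS[candidate]
    return None


def blink(fragment):
    # Assumed model: fragment kept as authored, literal match tried first.
    return lookup([fragment, unquote(fragment)])


def webkit(fragment):
    # Assumed model: non-ASCII is percent-encoded by the parser first,
    # then the literal match is tried before the decoded one.
    encoded = quote(fragment, safe="%")
    return lookup([encoded, unquote(encoded)])


def ie11_firefox(fragment):
    # Assumed model: decoded match tried before the literal one.
    return lookup([unquote(fragment), fragment])


for name, engine in [("IE11/Firefox", ie11_firefox),
                     ("Blink", blink),
                     ("Webkit", webkit)]:
    print(name, "| Link A:", engine("å"), "| Link B:", engine("%C3%A5"))
```

Under these assumptions the model prints Fragment 1 for both links in the IE11/Firefox row, Fragment 1 and Fragment 2 in the Blink row, and Fragment 2 for both links in the Webkit row, matching the actual results listed above.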
Comment 1 Leif Halvard Silli 2014-04-04 03:52:55 UTC
Created attachment 1464 [details]
Test cases: A file with 3 sets of tests.
Comment 2 Leif Halvard Silli 2014-04-04 10:47:00 UTC
Created attachment 1465 [details]
Test cases: A file with 3 sets of tests.

Corrected test data about Safari. Conclusion remains the same.
Comment 3 Leif Halvard Silli 2014-04-04 12:15:44 UTC
Created attachment 1466 [details]
Test cases: A file with 4 sets of tests.

Added an extra set of tests to the test cases file.

The new set shows that IE11's main advantage over Blink and Webkit is that it looks for a match for the "semantic", UTF-8 decoded string **before** it looks for a literal match.
Comment 4 Leif Halvard Silli 2014-04-04 13:05:49 UTC
Created attachment 1467 [details]
Test cases: A file with 6 sets of tests.

Added 2 more test sets.
Comment 5 Leif Halvard Silli 2014-04-04 19:45:39 UTC
Created attachment 1468 [details]
Test cases: A file with 6 sets of tests.

Fixed an error in the second set of tests.
Comment 6 Anne 2014-04-15 17:29:58 UTC
This is already defined in HTML. All the URL Standard should do is define parsing a URL and exposing a fragment component HTML can use to perform its further processing on.

Since I'm in a good mood, you can find that algorithm here: http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#scroll-to-fragid

Is there anything you think needs changing still?
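
The split of responsibilities described in this comment (URL parsing exposes a fragment component; the consumer then decides how to match it) can be illustrated roughly as below. Python's `urllib.parse.urlsplit` is used only as an analogy and is not an implementation of the URL Standard; the URL `http://example.com/page#%C3%A5` is a made-up example.

```python
from urllib.parse import urlsplit, unquote

url = "http://example.com/page#%C3%A5"  # hypothetical URL

# Step 1: parsing only exposes the fragment; it is left percent-encoded.
fragment = urlsplit(url).fragment
print(fragment)  # %C3%A5

# Step 2: further processing (here, decoding) is the consumer's decision.
decoded = unquote(fragment)
print(decoded)  # å
```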