<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>11379</bug_id>
          
          <creation_ts>2010-11-22 18:40:56 +0000</creation_ts>
          <short_desc>[pending URL spec] definition of hierarchical URL inconsistent with rfc 3986</short_desc>
          <delta_ts>2012-12-15 10:47:46 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>URL</component>
          <version>unspecified</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WORKSFORME</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P4</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Glenn Adams">glenn</reporter>
          <assigned_to name="Anne">annevk</assigned_to>
          <cc>annevk</cc>
    
    <cc>erik.arvidsson</cc>
    
    <cc>ian</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>public-webapps</cc>
    
    <cc>w3c</cc>
          
          <qa_contact>sideshowbarker+urlspec</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>42701</commentid>
    <comment_count>0</comment_count>
    <who name="Glenn Adams">glenn</who>
    <bug_when>2010-11-22 18:40:56 +0000</bug_when>
    <thetext>Section 2.6.1 defines a hierarchical URL thus:

&quot;An absolute URL is a hierarchical URL if, when resolved and then parsed, there is a character immediately after the &lt;scheme&gt; component and it is a U+002F SOLIDUS character (/).&quot;

However, RFC3986 Section 3 defines all URIs as containing a hierarchical part as follows:

URI         = scheme &quot;:&quot; hier-part [ &quot;?&quot; query ] [ &quot;#&quot; fragment ]

and, further, does not require the hierarchical part to start with &quot;/&quot;. In particular, it defines hier-part as:

hier-part   = &quot;//&quot; authority path-abempty
                  / path-absolute
                  / path-rootless
                  / path-empty

Expanding these components into their definitions gives:

hier-part
          = &quot;//&quot; authority
          | &quot;//&quot; authority 1*( &quot;/&quot; segment )
          | &quot;/&quot; [ segment-nz *( &quot;/&quot; segment ) ]
          | segment-nz *( &quot;/&quot; segment )
          | 0&lt;pchar&gt;

Note that the last two alternatives do not start with &quot;/&quot;, yet are still considered a &quot;hierarchical&quot; part by RFC3986. For example, the following URIs match this syntax, with hier-part mapping to path-rootless:

about:blank
file:foo/bar
urn:example.net:foo:bar

To avoid confusion, it may be desirable for HTML5 to use a term other than &quot;hierarchical URL&quot; here. Alternatively, a note could be added that distinguishes the defined usage from the like-named (but different) construct in RFC3986.

I would also note that, under the definitions in 2.6.1, all &quot;authority-based URLs&quot; are also &quot;hierarchical URLs&quot;. I can&apos;t tell whether this is intentional; if it is, a note saying so would be useful.

Regards,
Glenn</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>43716</commentid>
    <comment_count>1</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2011-01-01 05:50:13 +0000</bug_when>
    <thetext>I&apos;ll look into this in more detail once Adam&apos;s spec on how to parse URLs is ready. From a quick glance, though, it seems not too unreasonable to come up with different terminology if there&apos;s a better term than &quot;hierarchical&quot; here. Any suggestions?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>43745</commentid>
    <comment_count>2</comment_count>
    <who name="Adam Barth">w3c</who>
    <bug_when>2011-01-01 21:51:32 +0000</bug_when>
    <thetext>I&apos;ve been using the term &quot;standard URL&quot; but that might not be the optimal term either.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>53394</commentid>
    <comment_count>3</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2011-08-04 05:13:26 +0000</bug_when>
    <thetext>mass-move component to LC1</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74781</commentid>
    <comment_count>4</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2012-09-28 10:51:51 +0000</bug_when>
    <thetext>http://url.spec.whatwg.org/ defines URLs now. Per that document a URL is always &quot;absolute&quot; (perhaps invalid, but always absolute). The input to the parsing algorithm may be relative to something else, but you always end up with a URL that has all the relevant information (although it could be invalid if there&apos;s relative input and nothing to resolve it against).</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>