Bug 9035 - Interfaces for URL manipulation: concept of "hierarchical"
Status: RESOLVED FIXED
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Windows NT
Importance: P3 normal
Target Milestone: ---
Assigned To: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
Reported: 2010-02-17 09:35 UTC by Julian Reschke
Modified: 2010-10-04 14:48 UTC (History)
Description Julian Reschke 2010-02-17 09:35:39 UTC
It would be good to specify precisely what "hierarchical" means.

In RFC 3986, the distinction does not exist anymore.

For instance, for

  data:,A%20brief%20note%25foo#bar

IE8 returns

  ,A%20brief%20note%25foo

as the pathname, which intuitively seems to be the right thing, given the URI grammar.
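
For reference, a minimal sketch of what the generic URI grammar yields for this example. It uses the component-splitting regular expression from RFC 3986, Appendix B; the helper name `split_uri` is made up for illustration, and the sketch says nothing about how any browser actually computes `pathname`.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five generic components.

    Hypothetical helper for illustration; components that are absent
    come back as None (the path is never None, but may be empty).
    """
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("data:,A%20brief%20note%25foo#bar")
print(parts["path"])  # prints ",A%20brief%20note%25foo"
```

Under the generic grammar, everything between the scheme delimiter and the fragment is the path here, which matches the value IE8 reports.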
Comment 1 Ian 'Hixie' Hickson 2010-02-23 09:36:16 UTC
In a document with URL "data:text/html,aa/bb/cc", how is a relative URL "../" supposed to be processed?
Comment 2 Julian Reschke 2010-02-23 09:44:51 UTC
I'm not sure what the answer is, but it does not depend on the scheme. Section 5 of RFC 3986 should have all the details.
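
A sketch of what the scheme-independent algorithm in RFC 3986, Section 5.2 produces for the case asked about in comment 1. The function names (`split_uri`, `merge`, `remove_dot_segments`, `resolve`) are illustrative, and the code only follows the RFC's generic steps; it is not a description of what any browser does.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    m = URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    # RFC 3986, Section 5.2.4.
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            idx = path.find("/", 1)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    # RFC 3986, Section 5.2.3.
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(base, ref):
    # RFC 3986, Section 5.2.2 (strict), then recomposition per Section 5.3.
    b_scheme, b_auth, b_path, b_query, _ = split_uri(base)
    r_scheme, r_auth, r_path, r_query, r_frag = split_uri(ref)
    if r_scheme is not None:
        scheme, auth, path, query = r_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_auth is not None:
        scheme, auth, path, query = b_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_path == "":
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        if r_path.startswith("/"):
            path = remove_dot_segments(r_path)
        else:
            path = remove_dot_segments(merge(b_auth, b_path, r_path))
        scheme, auth, query = b_scheme, b_auth, r_query
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result


print(resolve("data:text/html,aa/bb/cc", "../"))  # prints "data:text/html,aa/"
```

The generic algorithm treats the slashes in the data: URL as path separators like any others, so it never consults the scheme.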
Comment 3 Ian 'Hixie' Hickson 2010-02-23 09:52:45 UTC
It _did_ depend on the scheme. As far as I can tell, implementations still think it depends on the scheme, and thus RFC3986 is incorrect. I think the right solution here might be to refer to the previous versions of the RFCs (specifically 2396) for the definition of "hierarchical".
Comment 4 Julian Reschke 2010-02-23 10:02:15 UTC
If you seriously believe there's a bug in RFC 3986 then by all means report it (rfc-editor errata page, or on the URI mailing list).
Comment 5 Julian Reschke 2010-02-23 13:30:06 UTC
(In reply to comment #3)
> It _did_ depend on the scheme. As far as I can tell, implementations still
> think it depends on the scheme, and thus RFC3986 is incorrect. I think the
> right solution here might be to refer to the previous versions of the RFCs
> (specifically 2396) for the definition of "hierarchical".

- If HTML5 makes it depend on the property of being "hierarchical", it either needs to define that term, or reference the definition somewhere else. Currently it doesn't, and that's why I raised the bug.

- The definition cannot hard-wire specific scheme names, because that would make it impossible to deploy new schemes; so it needs to be sufficient to inspect the given base URI, given that the scheme may be unknown.

- That being said, the example given above (IE8 parsing data URIs) seems to indicate that implementations do not uniformly special-case non-hierarchical URIs; so dropping the distinction here should be considered.
Comment 6 Ian 'Hixie' Hickson 2010-04-04 07:40:00 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: see diff given below. I went with something more like what WebKit does, basing it on the characters after the scheme.
Rationale: The URL specs changed in a non-backwards-compatible fashion, so the HTML5 spec had to either reference the older drafts or change how it used them. The latter seemed easiest.
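
A rough sketch of what a check "based on the characters after the scheme" could look like. The exact rule adopted in the spec is in the diff and is not reproduced in this thread, so the rule below is an assumption: a URL is taken as hierarchical when the scheme's colon is followed by "/", and as authority-based when it is followed by "//". The function name `classify` is hypothetical.

```python
import re

# Scheme syntax from RFC 3986, Section 3.1, followed by the ":" delimiter.
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def classify(url):
    """Return (hierarchical, authority_based) for an absolute URL.

    ASSUMPTION: "/" after the scheme means hierarchical and "//" means
    authority-based. This only illustrates a scheme-independent test;
    it is not the spec text from the check-in.
    """
    m = SCHEME_RE.match(url)
    if m is None:
        raise ValueError("not an absolute URL: %r" % url)
    rest = url[m.end():]
    return rest.startswith("/"), rest.startswith("//")


for url in ("http://example.com/a/b", "x-new:/a/b", "data:text/html,aa/bb/cc"):
    print(url, classify(url))
```

A test of this shape never looks at the scheme name, so it keeps working for schemes that are unknown to the implementation, which was the concern raised in comment 5.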
Comment 7 contributor 2010-04-04 07:43:21 UTC
Checked in as WHATWG revision r4965.
Check-in comment: Define how a URL is established as being 'hierarchical' or 'authority-based'.
http://html5.org/tools/web-apps-tracker?from=4964&to=4965