This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 24170 - scheme:// does not trigger hierarchical URI parsing
Status: RESOLVED WORKSFORME
Alias: None
Product: WHATWG
Classification: Unclassified
Component: URL
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+urlspec
Reported: 2013-12-26 21:44 UTC by Santiago M. Mola
Modified: 2014-01-11 16:47 UTC

Description Santiago M. Mola 2013-12-26 21:44:47 UTC
RFC 3986 and the URL Standard differ significantly in what's considered a hierarchical URL (one with relative scheme) or an opaque URL (one with scheme data).

In RFC 3986, a "scheme://" prefix is enough to trigger hierarchical URI parsing, whereas in the URL Standard, any scheme other than the defined ones (http, https, ws, wss, file, ftp, gopher) triggers opaque URI parsing.

So, shouldn't the default parsing be as close to RFC 3986 as possible? Scheme-specific quirks, such as enforcing hierarchical URI parsing for http et al. or omitting default ports, could then be applied on top of it.
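
To make the difference concrete, here is a minimal Python sketch of the two classification rules as described above. It is only an illustration under simplifying assumptions, not either specification's actual algorithm: the function names and the `git://` example URL are hypothetical, and the scheme list is the one quoted in this report.

```python
import re

# Schemes this report lists as "relative" (hierarchical) in the URL Standard.
RELATIVE_SCHEMES = {"http", "https", "ws", "wss", "file", "ftp", "gopher"}

# RFC 3986 scheme syntax: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")


def split_scheme(url: str):
    """Return (scheme, rest), or (None, url) if there is no valid scheme."""
    scheme, sep, rest = url.partition(":")
    if sep and SCHEME_RE.match(scheme):
        return scheme.lower(), rest
    return None, url


def is_hierarchical_rfc3986(url: str) -> bool:
    """RFC 3986 style: any scheme followed by '//' introduces an authority."""
    scheme, rest = split_scheme(url)
    return scheme is not None and rest.startswith("//")


def is_hierarchical_url_standard(url: str) -> bool:
    """URL Standard style (simplified): only the listed schemes are relative."""
    scheme, _ = split_scheme(url)
    return scheme in RELATIVE_SCHEMES


if __name__ == "__main__":
    for url in ("http://example.com/", "git://example.com/repo.git"):
        print(url, is_hierarchical_rfc3986(url), is_hierarchical_url_standard(url))
```

Under these assumptions, a `git://` URL is hierarchical by the first rule but opaque (scheme data) by the second, which is the mismatch this report is about.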
Comment 1 Santiago M. Mola 2014-01-02 16:17:25 UTC
Actually, this might be out of scope for the specification. Doing full relative scheme parsing on an unknown scheme doesn't seem sane or useful in a web browser, which is going to delegate any unknown scheme to a third-party app anyway.

The proposed parsing behaviour only makes sense for applications beyond the web browser scope or for a hypothetical browser implementing native support for, let's say, git.

Maybe this should be totally disregarded, or included as a non-normative note...
Comment 2 Anne 2014-01-10 13:32:52 UTC
Well, this is not what browsers are doing. We might want to have the feature at some point, though, to allow other schemes to be parsed in a similar way; not sure.
Comment 3 Santiago M. Mola 2014-01-10 17:44:45 UTC
OK. I've researched this issue further:

- If this change is implemented, parsing of ed2k links will fail (they start with ed2k:// but they're not hierarchical). This would break opening ed2k links in a browser.

- Parsing behaviour for current schemes might not be applicable to other schemes. For example, in acap URLs (RFC 2244), userInfo is not interpreted as "<userword>:<password>", but as "<user>;AUTH=<type>".

- In some cases, current parsing behaviour just leads to a parsing error. For example, aaa URLs (e.g. aaa://host.example.com:1813;transport=udp;protocol=radius) will fail because "port is not valid".

- Some other schemes explicitly apply further restrictions on encoding, allowed parts (no query, or no fragment), etc. Actually, this is the case for git.

So in any case, any new scheme would need to be registered and added to the spec if it conforms to current rules, or new parsing rules would be needed.

TL;DR: Too many things could go wrong.
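
Some of the failure modes listed in this comment can be reproduced with any generic hierarchical parser. The sketch below uses Python's `urllib.parse.urlsplit` as a stand-in; it is not the URL Standard parser, so the exact errors differ, but the mismatch is the same in kind. The `describe` helper and the ed2k and acap example URLs are made up for illustration; the aaa URL is the one quoted above.

```python
from urllib.parse import urlsplit

# The aaa URL is taken from the comment; the other two are invented examples.
AAA = "aaa://host.example.com:1813;transport=udp;protocol=radius"
ED2K = "ed2k://|file|example.iso|1024|0123456789ABCDEF0123456789ABCDEF|/"
ACAP = "acap://fred;AUTH=KERBEROS_V4@acap.example.com/"


def describe(url: str) -> dict:
    """Split a URL generically and report how each part was interpreted."""
    parts = urlsplit(url)
    try:
        # Accessing .port raises ValueError when the port is not an integer.
        port, port_ok = parts.port, True
    except ValueError:
        port, port_ok = None, False
    return {
        "authority": parts.netloc,
        "username": parts.username,
        "port": port,
        "port_ok": port_ok,
        "path": parts.path,
    }


if __name__ == "__main__":
    for url in (AAA, ED2K, ACAP):
        print(url)
        print("   ", describe(url))
```

With this parser, the aaa URL yields an invalid port, the ed2k link's pipe-separated payload ends up being treated as an authority, and the acap userinfo is returned as a single opaque username with no awareness of the ";AUTH=" mechanism.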
Comment 4 Anne 2014-01-11 16:47:45 UTC
Thanks!