[url] follow-ups from the TPAC F2F Meeting

Minuted here:

http://www.w3.org/2014/10/28-webapps-minutes.html#item07

Note that this is a lengthy and comprehensive email covering a number of 
topics.  I encourage replies to use new subject lines, to limit 
themselves to only one part, and to aggressively excerpt out the parts 
of this email that are not relevant to the reply.

---

Short term, there should be a heartbeat of the W3C URL document 
published ASAP.  The substantive content should be identical to the 
current WHATWG URL Standard.  The spec should say this, likely with a 
huge red tab at the bottom like the one that can be found in the 
following document:

http://www.w3.org/TR/2014/WD-encoding-20140603/

The Status section should also reference the current Formal Objections 
so that any readers of this document may be aware that the final 
disposition of this draft may be in the form of a tombstone note.  The 
current Formal Objections I am aware of are listed here:

https://www.w3.org/wiki/HTML/wg/UrlStatus#Formal_Objections

Finally, I would encourage the status section to mention bug 
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25946 so that readers may 
be aware that the URL parsing section may be rewritten.  This indirectly 
references the work I am about to describe, and it does so in a 
non-exclusive manner, meaning that others are welcome to propose 
alternate resolutions.

I am willing to help with this effort.

---

Separately, at this time I would like to solicit feedback on some work I 
have been doing which includes a JavaScript reference implementation, a 
concrete albeit incomplete proposal for resolution to bug 25946, and 
some comparative test results with a number of browser and non-browser 
implementations.  For the impatient, here are some links:

http://intertwingly.net/projects/pegurl/liveview.html
http://intertwingly.net/projects/pegurl/url.html
http://intertwingly.net/projects/pegurl/urltest-results/

For those that want to roll up their proverbial sleeves and dive in, 
check out the code here:

https://github.com/rubys/url

You will find a list of prerequisites that you need to install first at 
the top of the Makefile.  Possible ways to contribute (in order of 
preference): pull requests, GitHub issues, and emails to this 
(public-webapps@w3.org) mailing list.  I've already received and closed 
one pull request; you can be next :-).

https://github.com/rubys/url/pulls?q=is%3Apr

My plans include addressing the Todos listed in the document and 
beginning work on the merge.  That work is complicated by a need to 
migrate the 
URL Standard from anolis to bikeshed.  You can see progress on that 
effort in a separate branch, as well as the discussion that has happened 
to date:

https://github.com/rubys/url/tree/anolis2bikeshed
https://github.com/rubys/url/commit/e617fd66135bd75b1052700081de5319914168a5#commitcomment-8259740

To be clear, my proposed resolution for bug 25946 requires this 
conversion, but this conversion doesn't require my proposed resolution 
to bug 25946.  I mention this as Anne seems to want this document to be 
converted, and that effort can be pulled separately.

---

Now to get to what I personally am most interested in: identifying 
changes to the expected test results, and therefore to the URL 
specification -- independent of the approach that specification takes to 
describing parsing.  To kick off the discussion, here are three examples:

1) http://intertwingly.net/projects/pegurl/urltest-results/7357a04b5b

A number of browsers, namely Internet Explorer, Opera(Presto), and 
Safari, seem to be of the opinion that exposing passwords is a bad idea.  
I suggest that this is a defensible position, and that the 
specification should either standardize on this approach or at a minimum 
permit this.

2) http://intertwingly.net/projects/pegurl/urltest-results/4b60e32190

This is not a valid URL syntax, nor does any browser vendor implement 
it.  Given this state, I think it is fairly safe to say that there 
isn't a wide corpus of existing web content that depends on it.  I'd 
suggest that the specification be modified to adopt the behavior that 
Chrome, Internet Explorer, and Opera(Presto) implement.

3) http://intertwingly.net/projects/pegurl/urltest-results/61a4a14209

This is an example of a problem that Anne is currently wrestling with. 
Note in particular the result produced by Chrome, which identifies the 
host as an IPv4 address and canonicalizes it.
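
To illustrate what canonicalizing an IPv4 host can involve, here is a 
minimal Python sketch.  It is an assumption about the general idea 
(inet_aton-style parts written in decimal, octal, or hex, with the last 
part filling the remaining bytes), not Chrome's actual algorithm and not 
the parsing described in the URL Standard.  The function names are 
hypothetical, and the sample hosts are made up rather than taken from the 
linked test.

```python
from typing import Optional


def parse_number(text: str) -> Optional[int]:
    """Parse one dotted part as hex (0x..), octal (0..), or decimal."""
    if text == "":
        return None
    lower = text.lower()
    if lower.startswith("0x"):
        digits, base, valid = lower[2:], 16, "0123456789abcdef"
    elif len(lower) > 1 and lower.startswith("0"):
        digits, base, valid = lower[1:], 8, "01234567"
    else:
        digits, base, valid = lower, 10, "0123456789"
    if any(ch not in valid for ch in digits):
        return None
    # A bare "0x" is treated as zero in this sketch (an assumption).
    return int(digits, base) if digits else 0


def canonicalize_ipv4_host(host: str) -> Optional[str]:
    """Return dotted-decimal text if host reads as IPv4, else None."""
    parts = host.split(".")
    # Tolerate a single trailing dot, e.g. "192.168.0.1."
    if parts and parts[-1] == "":
        parts.pop()
    if not 1 <= len(parts) <= 4:
        return None
    numbers = []
    for part in parts:
        value = parse_number(part)
        if value is None:
            return None  # not numeric, so not an IPv4 address
        numbers.append(value)
    # Every part except the last must fit in a single byte.
    if any(n > 255 for n in numbers[:-1]):
        return None
    # The last part fills however many bytes remain.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        return None
    address = numbers[-1]
    for index, n in enumerate(numbers[:-1]):
        address += n * 256 ** (3 - index)
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    for sample in ("192.168.0.1", "0xC0.0250.01", "3232235521", "example.com"):
        print(sample, "->", canonicalize_ipv4_host(sample))
```

Under these assumptions the first three sample hosts all canonicalize to 
the same dotted-decimal address, while a host with a non-numeric part is 
left for ordinary domain processing.  Whether the specification should 
require, permit, or forbid this is exactly the open question.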

These are a few that caught my eye.  Feel free to comment on these, or 
any others, or even to propose new tests.

- Sam Ruby

Received on Thursday, 30 October 2014 01:55:13 UTC