This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 20072 - Make use of the URL Standard
Summary: Make use of the URL Standard
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
 
Reported: 2012-11-24 17:10 UTC by Anne
Modified: 2013-04-22 17:21 UTC

Description Anne 2012-11-24 17:10:59 UTC
http://url.spec.whatwg.org/

Quick guide (some questions):

* *Parsed URL* is the result of concept-url-parser /input/ with /base/, optionally with an /encoding override/. At this point you need to check if *parsed URL* is failure.

* concept-url-parser takes care of normalizing the host so no need to apply ToASCII. (This is not fully specified yet, because IDNA is not agreed upon, but the idea is that input is ASCII-fied and only turned into Unicode when requested. Do we only need Unicode when serializing or is having Unicode for the list of domain labels useful too?)

* I think we want to pass Fetch the parsed URL and not a serialized URL.

* HTMLAnchorElement and Location need to implement URLUtils and abide by its semantics. One more thing needs to happen here: there needs to be some hook for when the URL is manipulated. Suggestions for that? <a> would use that hook for a DOM mutation and Location would use it for navigation.
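
A minimal sketch of the first point (parse /input/ against /base/, then check for failure), written against the `URL` class that browsers and Node.js expose today rather than the spec algorithm itself. The helper name `parseUrl` is hypothetical, and `null` stands in for "failure" here. The /encoding override/ argument is not modelled, since the `URL` constructor does not take one.

```typescript
// Hypothetical helper: returns the parsed URL, or null to stand in for "failure".
function parseUrl(input: string, base?: string): URL | null {
  try {
    return new URL(input, base);
  } catch (_err) {
    // The URL constructor throws a TypeError when parsing fails.
    return null;
  }
}

// A relative input is resolved against the base.
const parsed = parseUrl("../a/b?x=1", "http://example.org/dir/page.html");

// A relative input with no base cannot be parsed, so the caller must check.
const failed = parseUrl("not a url");

if (parsed === null) {
  console.log("failure");
} else {
  console.log(parsed.href);
}
```

The point of the sketch is only that the caller must handle the failure case before using the parsed result.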
Comment 1 Anne 2012-11-29 13:11:35 UTC
I introduced concept-UU-update ("update steps") that HTMLAnchorElement and Location can use.
Comment 2 Ian 'Hixie' Hickson 2013-02-09 02:06:16 UTC
Please review!
Comment 3 contributor 2013-02-09 02:07:44 UTC
Checked in as WHATWG revision r7710.
Check-in comment: Integrate with URL standard.
http://html5.org/tools/web-apps-tracker?from=7709&to=7710
Comment 4 Anne 2013-02-12 13:04:22 UTC
1. Should I define valid URL and the various variants HTML prescribes? Or do we want to keep URLs surrounded by spaces and such restricted to HTML?

2. You misunderstood "parse error". The concept is similar to the HTML parser. The URL parser returning "failure" is indicative of an unresolvable problem.

3. If HTML fetch is invoked with a referrer source that is a URL, if that's not a parsed URL something goes wrong. Seems like an oversight.

4. title="relative url" should probably be title="relative URL".

5. Under WebSocket URLs you no longer have to ASCII lowercase things. "ws" will always come out as such, regardless of whether you typed "ws" or "WS". (The old text was buggy, it seems, as later on you do a literal comparison without lowercasing.) The same goes for the host component, although that is not defined yet.

(I also disagree with not reusing href, but we can discuss that in the separate bugs, right?)
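
To illustrate point 5, here is a small sketch using the `URL` class as implemented today (which postdates this discussion). It shows the scheme, and in current implementations the host as well, coming out lowercased from the parser, so a literal comparison works without an extra ASCII-lowercase step. The variable names and the example address are made up for illustration.

```typescript
// The input uses an uppercase scheme and a mixed-case host.
const wsUrl = new URL("WS://Example.COM/chat");

// The parser has already lowercased the scheme, so compare literally.
const isWebSocketScheme = wsUrl.protocol === "ws:" || wsUrl.protocol === "wss:";

console.log(wsUrl.protocol, wsUrl.hostname, isWebSocketScheme);
```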
Comment 5 Ian 'Hixie' Hickson 2013-02-13 00:33:43 UTC
Thanks!

No opinion on #1. If you do take these, please don't rename them, though.

Not surprised by #2 given bug 20914. :-P

Can you elaborate on #3? (I haven't checked this; if it's obvious then don't worry I'll figure it out.)
Comment 6 Anne 2013-02-13 12:25:19 UTC
3. What I meant is that jumping from step 1 in HTML fetch to step 7 is probably wrong. It should jump to step 6 instead (unless referrer source is a parsed URL or a Document always).
Comment 7 Ian 'Hixie' Hickson 2013-04-09 17:35:01 UTC
For #3, I assume you fixed this in fetch.spec.whatwg.org?

(I fixed it here by just moving the label up one <li>.)

I fixed everything else here, I think.
Comment 8 Ian 'Hixie' Hickson 2013-04-09 18:15:25 UTC
There's quite a lot of boilerplate involved, more than there was before the URL spec, if I'm not mistaken. Search for "supports the URLUtils" in the HTML spec to find the four places where I refer to the URL spec for this. Note that they all have multiple paragraphs of nearly-identical boilerplate. IMHO, we should strive to avoid that kind of redundancy. For future specs (not this one, since this is now a sunk cost) we should aim to make it so that one can invoke the other spec with just one sentence and only the bits that are different between each occurrence, with no redundancy.
Comment 9 contributor 2013-04-09 18:18:44 UTC
Checked in as WHATWG revision r7795.
Check-in comment: Update integration with URL spec.
http://html5.org/tools/web-apps-tracker?from=7794&to=7795
Comment 10 Anne 2013-04-10 09:15:18 UTC
I'd be interested in seeing how you'd have defined this instead. It seems perfectly clear to me, without redundancy.

Also, I believe previously matters such as the effects of setting .protocol were not at all defined.
Comment 11 Ian 'Hixie' Hickson 2013-04-12 06:32:09 UTC
Setting .protocol was defined... maybe not correctly, but... :-)

Regarding the boilerplate — I mean that things like this:
----------8<---------
When the element's URLUtils interface invokes its update steps, if the element's URLUtils interface's url is not null, then the user agent must set the element's href content attribute to the serialization of the element's URLUtils interface's url; otherwise, it must set the element's href content attribute to the element's URLUtils interface's input.
----------8<---------

Could instead be:
----------8<---------
When the element's URLUtils interface invokes its update steps with a /new value/, the user agent must set the element's href content attribute to /new value/.
----------8<---------

Similarly, the lines in <a>, <area>, and Location that talk about "set the input" could just be something like (for Location):

   The object's _underlying string_ is the address of the relevant Document.

...or (for <a>): 

   The object's _underlying string_ is the element's href="" content attribute.

But anyway, that's a moot point now, since the spec is written.
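
For readers following the "update steps" discussion, a rough sketch of the hook may help. It models the first formulation quoted above: the host object receives the url and the input and decides what to write back. Everything here is hypothetical (the class `UrlUtilsSketch`, the `UpdateSteps` type, the `anchor` object standing in for an <a> element); it is not the URLUtils interface from the URL spec, only an illustration built on today's `URL` class.

```typescript
// Hypothetical callback type: what the host object does after a manipulation.
type UpdateSteps = (url: URL | null, input: string) => void;

// Hypothetical stand-in for an object that "supports the URLUtils interface".
class UrlUtilsSketch {
  private url: URL | null = null;
  private input = "";
  private readonly updateSteps: UpdateSteps;

  constructor(updateSteps: UpdateSteps) {
    this.updateSteps = updateSteps;
  }

  // "Set the input": remember the string and try to parse it.
  setInput(input: string, base?: string): void {
    this.input = input;
    try {
      this.url = new URL(input, base);
    } catch (_err) {
      this.url = null; // parsing failed; only the raw input is kept
    }
  }

  // Manipulating a component runs the host's update steps afterwards.
  setProtocol(value: string): void {
    if (this.url === null) return;
    this.url.protocol = value;
    this.updateSteps(this.url, this.input);
  }
}

// An <a>-like host: the update steps write the serialization to the href attribute,
// falling back to the input when there is no parsed url.
const anchor = { hrefAttribute: "" };
const utils = new UrlUtilsSketch((url, input) => {
  anchor.hrefAttribute = url !== null ? url.href : input;
});

utils.setInput("http://example.org/path");
utils.setProtocol("https");
console.log(anchor.hrefAttribute);
```

Under the alternative wording proposed above, the callback would instead receive a single /new value/ string. A Location-like host would pass a callback that navigates rather than one that sets an attribute.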
Comment 12 Anne 2013-04-22 17:21:22 UTC
As I explained before, the string passing for "update steps" would be relevant for the Location case. A shorthand for getting the url object serialized might be useful though. That or a choice of update steps?

And a concept of an underlying string is nice, but I think that's not how Location should work (that should simply be a URL object) and that does not explain things in terms of updates to the DOM very explicitly.