Bug 8207 - Change definition of URL to normative reference to IRIBIS
Status: RESOLVED FIXED
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Linux
Importance: P1 normal
Target Milestone: ---
Assigned To: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://lists.w3.org/Archives/Public/p...
Keywords: NE, WGDecision
Duplicates: 7391
Reported: 2009-11-05 21:45 UTC by Sam Ruby
Modified: 2011-04-14 22:18 UTC
Description Sam Ruby 2009-11-05 21:45:48 UTC
Details of the proposed change can be found here:

http://lists.w3.org/Archives/Public/public-html/2009Nov/0153.html
Comment 1 Ian 'Hixie' Hickson 2009-12-12 15:13:50 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: What URL should I use to refer to IRI bis?
Comment 2 Julian Reschke 2009-12-12 15:49:41 UTC
Are you asking for a URL, or for a revision of iri-bis?

I personally would use <http://tools.ietf.org/html/draft-duerst-iri-bis-07>, but <http://www.ietf.org/id/draft-duerst-iri-bis-07.txt> would do as well.

I'm pretty sure you're asking for something else though, so maybe you could elaborate.
Comment 3 Larry Masinter 2009-12-13 06:54:30 UTC
I suggest using a reference:

[IRIBIS] "Internationalized Resource Identifiers (IRIs)", Work in Progress, <http://tools.ietf.org/html/draft-duerst-iri-bis>. Note that the (proposed) charter for IETF work on this document has as a requirement that the document remain suitable as a normative reference from HTML. The IETF work on this document is scheduled for completion before the scheduled Candidate Recommendation for HTML5 and before the likely close of Last Call for this document. It will be necessary to track both during the W3C Last Call period.
   
Comment 4 Ian 'Hixie' Hickson 2009-12-16 02:45:15 UTC
I looked at doing this, but the IRIbis draft isn't yet in a state where I can really do this. There's no algorithm that defines how to resolve an arbitrary string against an absolute base URL, as far as I can tell; in particular, nothing seems to take into account the HRef-charset so as to encode characters differently in different parts of the string. There's no definition of "valid URL" that I can refer to (that takes into account the "HRef-charset"). The parsing algorithm is destructive (e.g. the <path> of "http://example.com/%X" is, as far as I can tell, 5 characters long ("/%25X"), not three as required by Web compat ("/%X")). There's no definition of "absolute URL" that I can use (mostly because the current parsing algorithms are destructive).
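
The "/%X" case above can be illustrated with a minimal Python sketch. It uses the standard library's urllib.parse purely as a stand-in for the two behaviours being contrasted; it is not the IRIbis parsing algorithm, nor what any browser actually runs.

```python
from urllib.parse import quote, urlsplit

url = "http://example.com/%X"

# Non-destructive: the path comes back exactly as written (3 characters).
raw_path = urlsplit(url).path

# Destructive: re-escaping the stray "%" as "%25" produces a 5-character path.
escaped_path = quote(raw_path)

print(raw_path, len(raw_path))
print(escaped_path, len(escaped_path))
```

The first form is the one described above as required by Web compat; the second is what a parser produces if it normalises the malformed escape.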

This is all assuming that the split should be as it is now; this may not be a good assumption. If we should move the interface a bit, that may change matters. For example, it seems to me we probably want the "HRef-charset" definition in HTML5, rather than in the IRI spec.

Please advise on how I should proceed.
Comment 5 Larry Masinter 2009-12-27 22:22:49 UTC
Please clarify what you think is missing:

 http://lists.w3.org/Archives/Public/public-html/2009Nov/att-0670/iri-rewrite-draft.html

attached to

http://lists.w3.org/Archives/Public/public-html/2009Nov/0670.html

contains a complete proposed rewrite of section 2.5.1 including the definition of:
The term resolve (in the context of resolve a URL relative to another URL), which seems to address:

"There's no algorithm that defines how to resolve an arbitrary
  string against an absolute base URL"

Perhaps the text in the "iri-rewrite-draft.html" belongs in the IRI document itself, though, but this adjustment can be coordinated between the two documents.

Comment 6 Ian 'Hixie' Hickson 2010-01-04 08:08:04 UTC
That text is woefully vague. For example the only conformance criteria in the definition of "resolve" is "these parsed components may then be recombined", which isn't even a requirement (it's a may, not a must).

I'm looking for something which is unambiguous, absolutely clear, precise, and explicit about how two incoming Unicode strings and a character encoding are turned into a single Unicode string. Currently this simply isn't present in the detail needed to interoperably implement URL resolution in the face of erroneous URLs. Ideally this would mean a set of steps introduced with a MUST requirement, which can be blindly implemented with little thought, with no interpretation needed.
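
As a rough sketch of the shape of interface described here (two Unicode strings and a character encoding in, one string out), the following Python function may help. The name resolve, the SAFE set, and the choice to encode the path as UTF-8 and the query in the supplied encoding are all assumptions made for illustration; this is not the algorithm of IRIbis, HTML5, or any browser, and it ignores host names and most error cases.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit

# Characters left untouched when percent-encoding (an assumption of this
# sketch). "%" is included so existing escapes, even malformed ones such
# as "%X", pass through unchanged.
SAFE = "/%:@!$&'()*+,;=-._~?"

def resolve(relative: str, base: str, encoding: str) -> str:
    """Hypothetical resolver: two Unicode strings and an encoding in, one string out."""
    # Step 1: combine the relative reference with the absolute base.
    parts = urlsplit(urljoin(base, relative))
    # Step 2: percent-encode the path as UTF-8.
    path = quote(parts.path, safe=SAFE, encoding="utf-8")
    # Step 3: percent-encode the query in the caller-supplied encoding.
    # Characters the encoding cannot represent raise UnicodeEncodeError here.
    query = quote(parts.query, safe=SAFE, encoding=encoding)
    # Step 4: recombine the components into a single string.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

print(resolve("a/é?q=é", "http://example.com/x/y", "windows-1252"))
print(resolve("%X", "http://example.com/", "utf-8"))
```

The point of the sketch is only that each step is explicit and order-dependent, so the same non-ASCII character can end up encoded differently in the path and in the query.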
Comment 7 Julian Reschke 2010-02-04 16:35:47 UTC
Just a few notes:

- the new IETF IRI WG has started work, see http://tools.ietf.org/html/draft-ietf-iri-3987bis-00; so that's what should be discussed

- HTML5 currently refers to WEBADDRESSES, which in http://www.w3.org/html/wg/href/draft#resolving-urls, contains stuff that shouldn't be going into IRIbis because of HTML dependencies:

"Otherwise, let base be the base URI of the element, as defined by the XML Base specification, with the base URI of the document entity being defined as the document base Web address of the Document that owns the element. [XMLBASE]

For the purposes of the XML Base specification, user agents must act as if all Document objects represented XML documents."

This should be fixed ASAP, because it is (and has been for a long time) blocking progress.

I recommend that we come to a consensus on what the interface between HTML5 and IRIBIS should be (who is defining what; for things like "resolving", what are the parameters?). Once we have that, we should be able to move things forward.
Comment 8 Julian Reschke 2010-02-04 16:44:55 UTC
(In reply to comment #6)
> That text is woefully vague. For example the only conformance criteria in the
> definition of "resolve" is "these parsed components may then be recombined",
> which isn't even a requirement (it's a may, not a must).

It's just prose. You do not need a MUST to define how something is done.

I realize you disagree with that, but trying to impose a specific specification style on others doesn't seem to be productive.

> I'm looking for something which is unambiguous, absolutely clear, precise, and
> explicit about how two incoming Unicode strings and a character encoding are
> turned into a single Unicode string. Currently this simply isn't present in the
> detail needed to interoperably implement URL resolution in the face of
> erroneous URLs. Ideally this would mean a set of steps introduced with a MUST
> requirement, which can be blindly implemented with little thought, with no
> interpretation needed.

It would be helpful if you could elaborate what's wrong; maybe by adding an example (I totally believe that something might be wrong, but it's unproductive to do just handwaving).
Comment 9 Maciej Stachowiak 2010-03-30 04:02:33 UTC
*** Bug 7391 has been marked as a duplicate of this bug. ***
Comment 10 Maciej Stachowiak 2010-03-30 04:03:39 UTC
Some time ago this was escalated as:
http://www.w3.org/html/wg/tracker/issues/56
Comment 11 Ian 'Hixie' Hickson 2010-04-04 07:16:18 UTC
*** Bug 9138 has been marked as a duplicate of this bug. ***
Comment 12 Ian 'Hixie' Hickson 2010-04-04 07:16:36 UTC
(From bug 9138:)
WEBADDRESSES contains stuff that needs to go back into HTML5, as it wouldn't be
a candidate for inclusion in IRIbis (for instance,
<http://www.w3.org/html/wg/href/draft>, Part 3 depends on specifics of HTML and
the DOM).
Comment 13 Larry Masinter 2010-04-04 22:43:16 UTC
I scanned the HTML5 document looking for references to [WEBADDRESSES] and the only references were in the definitions covered by the change proposal. So I don't know what else actually needs to be moved back into the HTML document.

That is, I believe the change proposal for ISSUE-56, which is the tracker issue for this bug, is complete and ready to go; if the change were accepted, we'd be done as far as getting HTML5 to Last Call is concerned.
Comment 15 Adam Barth 2011-03-19 15:21:11 UTC
(In reply to comment #14)
> WG Decision: http://lists.w3.org/Archives/Public/public-html/2011Mar/0397.html

I don't understand.  That message seems to be about Content-Language, not URLs.
Comment 16 Sam Ruby 2011-03-19 15:39:30 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > WG Decision: http://lists.w3.org/Archives/Public/public-html/2011Mar/0397.html
> 
> I don't understand.  That message seems to be about Content-Language, not URLs.

My bad.  Correction posted here:

http://lists.w3.org/Archives/Public/public-html/2011Mar/0404.html

- Sam Ruby
Comment 17 Ian 'Hixie' Hickson 2011-04-08 23:14:47 UTC
I presume that this decision is specifically to revert this change:
   http://html5.org/tools/web-apps-tracker?from=3244&to=3245
...and to fix any conflicts resulting. If so, should I also add a warning that the text is known to be quite wrong and that people should ignore it for now? Or is it safe to assume people will ignore it?
Comment 18 Sam Ruby 2011-04-14 00:52:03 UTC
(In reply to comment #17)
> I presume that this decision is specifically to revert this change:
>    http://html5.org/tools/web-apps-tracker?from=3244&to=3245
> ...and to fix any conflicts resulting. If so, should I also add a warning that
> the text is known to be quite wrong and that people should ignore it for now?
> Or is it safe to assume people will ignore it?

Ian: not sure what could have caused that link to rot since the proposal was created; but yes, that does appear to be the change referenced by http://lists.w3.org/Archives/Public/public-html/2010Jul/0035.html

Adam: can you work with Ian to resolve any conflicts?

Ian: if you have new information w.r.t. bugs, please share.
Comment 19 Adam Barth 2011-04-14 00:52:44 UTC
> Adam: can you work with Ian to resolve any conflicts?

Sure.
Comment 20 Maciej Stachowiak 2011-04-14 00:56:42 UTC
(In reply to comment #17)
> I presume that this decision is specifically to revert this change:
>    http://html5.org/tools/web-apps-tracker?from=3244&to=3245
> ...and to fix any conflicts resulting. If so, should I also add a warning that
> the text is known to be quite wrong and that people should ignore it for now?
> Or is it safe to assume people will ignore it?

I think one or more bugs would be a better way to indicate known problems than a blanket warning in the spec.
Comment 21 contributor 2011-04-14 22:18:15 UTC
Checked in as WHATWG revision r6007.
Check-in comment: apply wg decision
http://html5.org/tools/web-apps-tracker?from=6006&to=6007