This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 23005 - IDNA-related bits should instead reference terminology from http://url.spec.whatwg.org/
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on: 23891
Blocks:
Reported: 2013-08-19 11:04 UTC by contributor
Modified: 2014-01-07 21:12 UTC

Description contributor 2013-08-19 11:04:36 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html
Multipage: http://www.whatwg.org/C#unicode-serialization-of-an-origin
Complete: http://www.whatwg.org/c#unicode-serialization-of-an-origin
Referrer: http://www.whatwg.org/specs/web-apps/current-work/multipage/

Comment:
IDNA-related bits should instead reference terminology from
http://url.spec.whatwg.org/

Posted from: 207.218.72.65 by annevk@annevk.nl
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:26.0) Gecko/20100101 Firefox/26.0
Comment 1 Ian 'Hixie' Hickson 2013-10-30 23:24:55 UTC
Do you have specific text in mind, or should I work it out?
Comment 2 Anne 2013-10-31 10:54:43 UTC
http://www.whatwg.org/specs/web-apps/current-work/#unicode-serialization-of-an-origin
-> It seems this algorithm is also defined in https://tools.ietf.org/html/rfc6454. Why is it still in HTML? Or are we abandoning the RFC per my emails to the WHATWG list? You could use http://url.spec.whatwg.org/#concept-domain-to-unicode and then serialize the output if you decide to keep it. Note that this requires the host part of the origin to be stored as a sequence of labels or IPv6, i.e. the result of http://url.spec.whatwg.org/#concept-host-parser

http://www.whatwg.org/specs/web-apps/current-work/#ascii-serialization-of-an-origin
-> See above. You could use http://url.spec.whatwg.org/#concept-domain-to-ascii and then serialize the output if you decide to keep it.
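
For illustration only, the two conversions mentioned above can be approximated with Python's built-in "idna" codec. This is a sketch under assumptions: the codec implements IDNA2003 (RFC 3490) with Unicode 3.2 tables, so it is not the URL Standard's algorithm, and the helper names domain_to_ascii and domain_to_unicode are hypothetical.

```python
# Assumption: Python's "idna" codec (IDNA2003, Unicode 3.2) stands in for the
# URL Standard's "domain to ASCII" / "domain to Unicode" operations.

def domain_to_ascii(domain: str) -> str:
    """Approximate 'domain to ASCII' using the IDNA2003 codec."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Approximate 'domain to Unicode' using the IDNA2003 codec."""
    return domain.encode("ascii").decode("idna")


print(domain_to_ascii("bücher.example"))           # xn--bcher-kva.example
print(domain_to_unicode("xn--bcher-kva.example"))  # bücher.example
```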

http://www.whatwg.org/specs/web-apps/current-work/#dom-document-domain
-> You want to use http://url.spec.whatwg.org/#concept-host-parser and http://url.spec.whatwg.org/#concept-host-serializer here. And maybe keep internal state on the non-serialized value. You cannot really say "if value is IPv4" as IPv4 and domain names are not distinguishable at the syntax level. A host is either IPv6 or a set of domain labels.
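
A rough sketch of the model described above, in which a parsed host is either an IPv6 address or a list of domain labels and is only turned back into a string on serialization. The names parse_host and serialize_host are hypothetical, the parsing is heavily simplified, and Python's IDNA2003 codec is used as a stand-in for the real label processing.

```python
import ipaddress
from typing import List, Union

# A parsed host is either an IPv6 address or a list of ASCII domain labels.
Host = Union[ipaddress.IPv6Address, List[str]]


def parse_host(text: str) -> Host:
    """Hypothetical, simplified host parser (not the URL Standard algorithm)."""
    if text.startswith("[") and text.endswith("]"):
        return ipaddress.IPv6Address(text[1:-1])
    # Everything else is kept as domain labels, including "127.0.0.1".
    return [label.encode("idna").decode("ascii") for label in text.split(".")]


def serialize_host(host: Host) -> str:
    """Hypothetical host serializer matching parse_host above."""
    if isinstance(host, ipaddress.IPv6Address):
        return "[" + host.compressed + "]"
    return ".".join(host)


print(parse_host("127.0.0.1"))                          # ['127', '0', '0', '1']
print(serialize_host(parse_host("[0:0:0:0:0:0:0:1]")))  # [::1]
print(serialize_host(parse_host("bücher.example")))     # xn--bcher-kva.example
```

Keeping the non-serialized value around, as suggested above, means later steps can work on labels directly instead of re-parsing a string.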
Comment 3 Ian 'Hixie' Hickson 2013-10-31 19:21:20 UTC
This text predates that RFC. I just haven't updated it since. I agree that it might be worth just ignoring that RFC though.

What do you mean, you can't identify an IPv4 address? If it's four numeric labels, isn't it unambiguously an IPv4 address?
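
To make the question concrete, here is a small sketch of a "four numeric labels" check alongside a purely syntactic domain-label check. Both helpers are hypothetical, the 0-255 range test is an added assumption, and the label pattern is a common letters-digits-hyphen approximation rather than anything taken from the specs under discussion.

```python
import re

# Assumed label syntax: letters, digits, hyphens; 1 to 63 characters;
# no leading or trailing hyphen.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_four_numeric_labels(host: str) -> bool:
    """True if host is four dot-separated decimal labels, each 0 to 255."""
    labels = host.split(".")
    return len(labels) == 4 and all(
        label.isascii() and label.isdigit() and int(label) <= 255
        for label in labels
    )


def is_syntactic_domain(host: str) -> bool:
    """True if every dot-separated label matches the assumed label syntax."""
    return all(LABEL.fullmatch(label) for label in host.split("."))


for host in ("127.0.0.1", "example.org", "1.2.3"):
    print(host, is_four_numeric_labels(host), is_syntactic_domain(host))
# 127.0.0.1 True True
# example.org False True
# 1.2.3 False True
```

Under these assumptions "127.0.0.1" passes both checks, so a string can match the four-numeric-labels pattern while still being a syntactically valid sequence of domain labels.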
Comment 4 Anne 2013-10-31 19:59:10 UTC
Reportedly that is up to the stack that gets passed the domain labels to figure out. There are no restrictions in DNS either that prevent a host from looking like an IPv4 address.

As for the RFC, there's a mailing list thread. Need to get Adam Barth to comment on it.
Comment 5 Ian 'Hixie' Hickson 2013-11-19 22:04:57 UTC
I agree that domain names might be a superset of IPv4 addresses, but that doesn't mean you can't identify an IPv4 address.

(In reply to Anne from comment #2)
> http://www.whatwg.org/specs/web-apps/current-work/#unicode-serialization-of-an-origin
> -> It seems this algorithm is also defined in
> https://tools.ietf.org/html/rfc6454. Why is it still in HTML? Or are we
> abandoning the RFC per my emails to the WHATWG list? You could use
> http://url.spec.whatwg.org/#concept-domain-to-unicode and then serialize the
> output if you decide to keep it. Note that this requires the host part of
> the origin to be stored as a sequence of labels or IPv6. I.e. the result of
> http://url.spec.whatwg.org/#concept-host-parser

I don't really understand the benefit of invoking url.spec.whatwg.org here. All that does is invoke the same algorithm the HTML spec invokes today, no? Wouldn't that just be adding one step of indirection?

Similar questions with the others. I don't really understand what problem we're solving here.
Comment 6 Anne 2013-11-20 12:02:24 UTC
This might become moot if all browsers move to IDNA2008, UTS #46, or something similar. Right now, however, URL defines something HTML does not: the processing is not exactly IDNA2003, but IDNA2003 with an updated version of Unicode (not 3.2).
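
As a hedged illustration of why the Unicode version matters, Python's unicodedata module ships both a current database and a frozen Unicode 3.2 database (kept for its IDNA2003 codec). The sketch below only compares how the two classify one code point added after 3.2; it does not implement either spec's IDNA processing.

```python
import unicodedata

# U+1E9E LATIN CAPITAL LETTER SHARP S was added in Unicode 5.1,
# so it does not exist in the Unicode 3.2 data that IDNA2003 is pinned to.
CAPITAL_SHARP_S = "\u1e9e"

old = unicodedata.ucd_3_2_0  # frozen Unicode 3.2 database

print(old.unidata_version)                    # 3.2.0
print(old.category(CAPITAL_SHARP_S))          # Cn (unassigned)
print(unicodedata.category(CAPITAL_SHARP_S))  # Lu (uppercase letter)
```

A code point that is unassigned under the 3.2 tables but assigned under newer ones is exactly the kind of input where the two variants of IDNA2003 can diverge.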
Comment 7 contributor 2013-11-22 20:38:41 UTC
Checked in as WHATWG revision r8312.
Check-in comment: Mark areas that need updating for this bug (moving IDNA references to URL standard references).
http://html5.org/tools/web-apps-tracker?from=8311&to=8312
Comment 8 Ian 'Hixie' Hickson 2014-01-07 20:23:14 UTC
Ok, done. Thanks for the explanation, and for taking the IDNA issue. :-)
Comment 9 contributor 2014-01-07 21:12:14 UTC
Checked in as WHATWG revision r8381.
Check-in comment: Defer to URL spec for IDNA stuff.
http://html5.org/tools/web-apps-tracker?from=8380&to=8381