This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 24660 - Constraint validation on input type="email" and punycode conversion
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 enhancement
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
Reported: 2014-02-14 09:24 UTC by Denis Ah-Kang
Modified: 2016-04-22 21:52 UTC
CC List: 7 users

Description Denis Ah-Kang 2014-02-14 09:24:39 UTC
One of the constraint validation rules for input type="email" [1] talks about punycode conversion:
[[
Constraint validation: While the user interface is representing input that the user agent cannot convert to punycode, the control is suffering from bad input.
]]

After talking with people from the i18n WG (Richard Ishida, Martin J. Dürst and John C Klensin), it appears this section is not accurate and should probably be revised.

John C Klensin:
[[
Ideally, it should remove the discussion of the Punycode
algorithm and "punycode strings" entirely.  They have never been
correct and, with the approval of IDNA2008, became less so.
They should be replaced with a discussion of U-labels and
A-labels with a reference to RFC 5890-5893 and a caution, per
RFC 6055, that, if a putative domain name is seen by the browser
application in U-label form, it should be kept in that form as
long as possible.   It would probably also be wise to advise
that only A-labels (and potentially U-labels) be used when
writing HTML references -- whatever the merits of the ongoing
arguments about mappings and support for different mappings in
different implementations, it is definitely much safer and less
prone to ambiguity and IDNA2003 -> IDNA2008 and Unicode version
differences to use the A-label form.

If an implementation is using a high-quality IDN resolver
library (or name resolution algorithm that incorporates one),
that is probably all the HTML writer needs to know. If someone
is trying to evaluate such an implementation, they really need
to rely on the IDNA RFCs not a summary in the HTML spec.
]]
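
To illustrate the U-label / A-label distinction mentioned in the quote, here is a minimal Python sketch. It assumes Python 3.7+ and uses only the standard-library "punycode" codec; the helper names to_a_label and to_u_label are hypothetical. It performs no IDNA2008 validity checks (RFC 5891/5892), so it only shows the encoding relationship between the two forms and is not a conforming implementation.

```python
ACE_PREFIX = "xn--"  # prefix that marks an ASCII-compatible encoded label


def to_a_label(u_label: str) -> str:
    """Encode one label with Punycode and add the ACE prefix.

    ASCII-only labels are returned unchanged. No IDNA2008 validation
    is done here; a real A-label requires a valid U-label.
    """
    if u_label.isascii():
        return u_label
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode one prefixed label back to its Unicode form."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


print(to_a_label("bücher"))         # xn--bcher-kva
print(to_u_label("xn--bcher-kva"))  # bücher
```

The Punycode algorithm itself only produces the part after "xn--", which is one reason "punycode string" is an imprecise name for what appears in a domain name.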


[1] http://www.w3.org/html/wg/drafts/html/CR/forms.html#e-mail-state-%28type=email%29
Comment 1 Mathias Bynens 2014-02-14 09:35:52 UTC
See bug 15489 and bug 18162.
Comment 2 John C Klensin 2014-02-14 16:58:53 UTC
Mathias, 
Yes, but let's be a little careful.  For better or worse, the bug 15489 and bug 18162 threads are ultimately about an enhancement to handle email addresses that SMTP (as well as older versions of HTML) did not anticipate, namely those that contain non-ASCII characters.  If those characters are entirely in the domain part of the address, there is an obvious workaround: use A-labels (the Unicode strings encoded via the Punycode algorithm into an ASCII-compatible form).  Whether that is desirable or not is a UI problem and may be an operating system one (see RFC 6055).
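
A rough sketch of that domain-part workaround in Python follows. It assumes the standard-library "idna" codec, which implements the older IDNA2003 rules (RFC 3490) rather than IDNA2008, so it is illustrative only. The function name domain_to_a_labels is hypothetical.

```python
def domain_to_a_labels(address: str) -> str:
    """Return the address with its domain part converted to A-label form.

    Only the domain is converted. A non-ASCII local part is rejected,
    because the A-label workaround does not cover it.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    if not local.isascii():
        raise ValueError("non-ASCII local part: workaround does not apply")
    # The stdlib codec follows IDNA2003; it raises UnicodeError when the
    # domain cannot be converted.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(domain_to_a_labels("user@bücher.example"))  # user@xn--bcher-kva.example
```

An address whose non-ASCII characters sit in the local part cannot be handled this way, which is the separate enhancement discussed in the other two bugs.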

This particular problem, as far as I can tell, is a terminology one that was wrong but that appeared mostly harmless until it led to confusion about appropriate test cases and the like.  There is no required substantive change to implementations or extension, just a need to correct the description to avoid confusion or worse.  It now appears to be worth the effort to just fix the language and be done with it.

   john
Comment 3 Arron Eicholz 2016-04-22 21:52:48 UTC
HTML5.1 Bugzilla Bug Triage: Incubation needed

This bug constitutes a request for a new feature of HTML. Under the current guidelines [1], such requests are no longer tracked as bugs or issues. Instead, please create a proposal outlining the desired behavior, or at least a sketch of what is wanted (much of which is probably contained in this bug), and start the discussion/proposal in the WICG [2]. As your idea gains interest and momentum, it may be brought back into HTML through the Intent to Migrate process [3].
[1] https://github.com/w3c/html#contributing-to-this-repository
[2] https://www.w3.org/community/wicg/
[3] https://wicg.github.io/admin/intent-to-migrate.html