This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
One of the constraint validation rules for input type="email" [1] talks about Punycode conversion:
[[ Constraint validation: While the user interface is representing input that the user agent cannot convert to punycode, the control is suffering from bad input. ]]
After talking with the guys from the i18n WG (Richard Ishida, Martin J. Dürst and John C Klensin), it appears this section is not accurate and should probably be revised.
John C Klensin: [[ Ideally, it should remove the discussion of the Punycode algorithm and "punycode strings" entirely. They have never been correct and, with the approval of IDNA2008, became less so. They should be replaced with a discussion of U-labels and A-labels with a reference to RFC 5890-5893 and a caution, per RFC 6055, that, if a putative domain name is seen by the browser application in U-label form, it should be kept in that form as long as possible. It would probably also be wise to advise that only A-labels (and potentially U-labels) be used when writing HTML references -- whatever the merits of the ongoing arguments about mappings and support for different mappings in different implementations, it is definitely much safer and less prone to ambiguity and IDNA2003 -> IDNA2008 and Unicode version differences to use the A-label form. If an implementation is using a high-quality IDN resolver library (or name resolution algorithm that incorporates one), that is probably all the HTML writer needs to know. If someone is trying to evaluate such an implementation, they really need to rely on the IDNA RFCs, not a summary in the HTML spec. ]]
[1] http://www.w3.org/html/wg/drafts/html/CR/forms.html#e-mail-state-%28type=email%29
See bug 15489 and bug 18162.
Mathias, Yes, but let's be a little careful. For better or worse, the bug 15489 and 18162 threads are ultimately about an enhancement to handle email addresses that SMTP (as well as older versions of HTML) did not anticipate, namely those that contain non-ASCII characters. If those characters are entirely in the domain part of the address, there is an obvious workaround by using A-labels (the Unicode strings encoded via the Punycode algorithm into an ASCII-compatible form). Whether that is desirable or not is a UI problem and may be an operating system one (see RFC 6055). This particular problem, as far as I can tell, is one of terminology: the wording was wrong but appeared mostly harmless until it led to confusion about appropriate test cases and the like. No substantive change to implementations or extensions is required, just a correction to the description to avoid confusion or worse. It now appears to be worth the effort to just fix the language and be done with it. john
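For readers unfamiliar with the terminology, here is a minimal sketch of the A-label workaround described above, for addresses whose non-ASCII characters are confined to the domain part. It is an illustration only, not text proposed for the spec. The function names are hypothetical, and it relies on Python's built-in "idna" codec, which implements IDNA2003 (RFC 3490) rather than the IDNA2008 rules of RFC 5890-5893 that this bug asks the spec to reference, so results can differ for some labels.

```python
def domain_to_a_labels(domain: str) -> str:
    """Convert every label of a domain name to its ASCII-compatible form.

    Assumption: uses the stdlib "idna" codec (IDNA2003), not IDNA2008.
    """
    return domain.encode("idna").decode("ascii")


def a_labels_to_unicode(domain: str) -> str:
    """Convert A-labels ("xn--...") back to their Unicode form for display."""
    return domain.encode("ascii").decode("idna")


def email_with_ascii_domain(address: str) -> str:
    """Rewrite only the domain part of an address into A-label form.

    The local part is left untouched; a non-ASCII local part is a separate
    problem that this workaround does not address.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    return f"{local}@{domain_to_a_labels(domain)}"


if __name__ == "__main__":
    print(domain_to_a_labels("bücher.example"))
    print(a_labels_to_unicode("xn--bcher-kva.example"))
    print(email_with_ascii_domain("info@bücher.example"))
```

Note that the "xn--" prefix plus the Punycode-encoded string together form the A-label; the Punycode algorithm is only one step of that conversion, which is the distinction the original spec wording blurred. Per RFC 6055, an application that receives a name in U-label form would normally keep it in that form as long as possible and convert only where an ASCII-only protocol requires it.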
HTML5.1 Bugzilla Bug Triage: Incubation needed. This bug constitutes a request for a new feature of HTML. Under the current guidelines [1], rather than tracking such requests as bugs or issues, please create a proposal outlining the desired behavior, or at least a sketch of what is wanted (much of which is probably contained in this bug), and start the discussion/proposal in the WICG [2]. As your idea gains interest and momentum, it may be brought back into HTML through the Intent to Migrate process [3]. [1] https://github.com/w3c/html#contributing-to-this-repository [2] https://www.w3.org/community/wicg/ [3] https://wicg.github.io/admin/intent-to-migrate.html