This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html
Section: http://www.whatwg.org/specs/web-apps/current-work/#e-mail-state
Comment: The valid e-mail address ABNF doesn't include IDN, so for it to work the value must be punycode (or at least be converted to punycode when validating). This isn't mentioned, except as an optional example ("User agents may transform the value for display and editing (e.g. converting punycode in the value to IDN in the display and vice versa).").
Maybe this should be upgraded to a "should". Are actual implementations heeding the optional note? It would be bad if not.
(I was the one who originally submitted this.)
(In reply to comment #1)
> Maybe this should be upgraded to a "should". Are actual implementations
> heeding the optional note? It would be bad if not.
They're not. At least Opera isn't, which I believe is the only browser supporting the e-mail validation at the moment. IDNs fail to validate in Opera. That shortcoming led me to the spec. The optional note ("User agents may transform ...") isn't in the paragraph talking about validation, so it's not immediately clear you need to do that if you want IDNs to validate (well, it should be obvious, but still).
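To make the point above concrete, here is a minimal Python sketch of what "converting to punycode when validating" could look like. The pattern `ASCII_EMAIL` is a simplified, ASCII-only stand-in for the spec's ABNF (an assumption, not the spec's grammar), the helper names are hypothetical, and the conversion uses Python's built-in `idna` codec, which implements IDNA 2003 and may differ from what a browser does.

```python
import re

# Simplified ASCII-only address pattern. This is an assumption standing in
# for the spec's ABNF, not a copy of it.
ASCII_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*"
)


def to_ascii_address(address: str) -> str:
    """Convert only the domain part to punycode (hypothetical helper)."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return address  # leave unconvertible input for the pattern to reject
    return f"{local}@{ascii_domain}"


def to_display_address(address: str) -> str:
    """Convert a punycode domain back to its Unicode form for display."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    try:
        unicode_domain = domain.encode("ascii").decode("idna")
    except UnicodeError:
        return address
    return f"{local}@{unicode_domain}"


def is_valid(address: str, convert_idn: bool = True) -> bool:
    """Validate against the ASCII-only pattern, optionally converting first."""
    candidate = to_ascii_address(address) if convert_idn else address
    return ASCII_EMAIL.fullmatch(candidate) is not None


print(is_valid("user@bücher.example", convert_idn=False))  # False
print(is_valid("user@bücher.example"))                     # True
print(to_ascii_address("user@bücher.example"))  # user@xn--bcher-kva.example
```

Without the conversion step the IDN address is rejected, which matches the behaviour reported for Opera; with it, the same ASCII-only pattern accepts the address.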
Created attachment 944: Test form with email input and IDN domain value. I see that Firefox 4 (beta 9) has now added support for form field validation too, and it doesn't validate IDNs in e-mail addresses either. I've attached a simple form to test with.
I don't believe the spec should have any above-may-level requirements for UI, as they don't affect interoperability in user agents.
Users being unable to enter e-mail addresses with IDNs is not a UI issue at all -- it's a functionality issue. Saying that it's a UI issue just because they could theoretically convert their IDN to punycode and enter it that way really stretches the definition of "UI issue" past the breaking point, since practically no users will know how to do that. This does affect real-world interoperability, because it means that authors who care about IDN support will have to work around lack of browser support in whichever browsers don't support IDNs. This is basically the definition of non-interoperability. The guideline that we don't specify UI only exists because UI doesn't affect interoperability, so it has to be discarded when it does (if this even is a UI issue).
I think this should be marked WFM given that the specification currently says: "User agents may transform the value for display and editing (e.g. converting punycode in the value to IDN in the display and vice versa)."
Changes summary to clarify that this is not WORKSFORME.
Also, I think such email addresses should go in their IDN-form to the server. Punycode is only to be used when the protocol is limited to ASCII, which is not the case here.
Is there any specific reason why we wouldn't want to send IDNs to the server? Is there a substantial amount of software these days that will handle raw IDNs incorrectly?
Isn't e-mail (SMTP) one of the systems that doesn't handle IDN yet? I assumed it was, which is why I didn't allow IDN to go to the server. If it's not I'm happy to change it. File another bug for that with evidence that typical servers won't be screwed up if they naïvely pass IDN to their mail subsystems.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: see diff given below
Rationale: I've made IDN support in input a "should" for e-mails and URLs, to see if that encourages UAs to do better here. If they don't have a good reason not to support it, they should, which is what "should" means. Supporting IDN does seem equivalent here to supporting user input at all, which is also a should.
Checked in as WHATWG revision r5934. Check-in comment: s/may/should/ on IDN support in input. http://html5.org/tools/web-apps-tracker?from=5933&to=5934
(In reply to comment #10)
> Isn't e-mail (SMTP) one of the systems that doesn't handle IDN yet? I assumed
> it was, which is why I didn't allow IDN to go to the server. If it's not I'm
> happy to change it. File another bug for that with evidence that typical
> servers won't be screwed up if they naïvely pass IDN to their mail subsystems.
I do not think this is a good reason to not submit UTF-8 e-mail addresses. Currently, websites using <input type='text'> get UTF-8 e-mail addresses, so they obviously know how to handle them. If they want a client-side check, the pattern attribute would do the job. In addition, it looks like there is some work on a standard for internationalized e-mail addresses [1]. I think the specification should just make clear that UTF-8 e-mail addresses should validate, whether by requiring them to be puny-encoded before validation or by extending the ABNF used for validation to include UTF-8 characters (as in [1]).
[1] http://tools.ietf.org/html/rfc5335#section-4.1
This would need to deal with accept-charset not being UTF-8 or if that is not present the site not being UTF-8.
(In reply to comment #13)
> This would need to deal with accept-charset not being UTF-8 or if that is not
> present the site not being UTF-8.
How is <input type='email'> with UTF-8 values different from any other input with UTF-8 values?
Like URLs, emails have to be UTF-8-encoded to work. Maybe we should not worry about that, however...
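The charset concern raised in the exchange above can be illustrated with a small Python sketch. It only checks whether a given address string can be represented in a given character encoding; it does not model what a browser actually does on form submission. The helper `encodable_in` and the sample address are hypothetical, and the punycode conversion again relies on Python's built-in `idna` codec.

```python
def encodable_in(address: str, charset: str) -> bool:
    """Return True if the address can be encoded in the given charset."""
    try:
        address.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical address with a Japanese IDN domain.
unicode_form = "user@例え.jp"

# Punycode form: convert only the domain part, keep the local part as-is.
local, _, domain = unicode_form.rpartition("@")
punycode_form = f"{local}@{domain.encode('idna').decode('ascii')}"

for charset in ("utf-8", "iso-8859-1"):
    print(charset, encodable_in(unicode_form, charset),
          encodable_in(punycode_form, charset))
```

The Unicode form is representable in UTF-8 but not in ISO-8859-1, while the punycode form is plain ASCII and so fits any ASCII-compatible charset. That is the trade-off being debated: the Unicode form depends on the submission encoding, the punycode form does not.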
mass-move component to LC1
EDITOR'S RESPONSE:

Status: Rejected
Change Description: no spec change
Rationale:
(In reply to comment #12)
> Currently, websites using <input type='text'> get UTF-8 e-mail addresses so
> they obviously know how to handle them.
I assure you that's not a given. I would expect that any scripts I've written, if given non-ASCII e-mail addresses, will either crash or complain. I haven't tested other systems but I wouldn't be surprised to find similar results. As I said in comment 10: If my assumption is wrong, I'm happy to change the spec. File another bug for that *with evidence* that typical servers won't be screwed up if they naïvely pass IDN to their mail subsystems.
Closing as fixed (not wontfix) since the SHOULD in the spec now ensures that people will actually be able to enter IDNs and have them work.
(In reply to comment #18)
> Closing as fixed (not wontfix) since the SHOULD in the spec now ensures that
> people will actually be able to enter IDNs and have them work.
I don't think this can be considered "fixed" when the spec defines "valid email" differently. Re-opening. (Ian, re-close as WONTFIX if you feel like it. I'm just making sure this doesn't have a misleading state.)
Re-resolving per comment 17.