This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11579 - Support for internationalized e-mail addresses
Summary: Support for internationalized e-mail addresses
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-20 10:50 UTC by contributor
Modified: 2011-10-21 11:22 UTC

See Also:


Attachments
Test form with email input and IDN domain value. (275 bytes, text/html)
2011-01-18 13:35 UTC, jorritv

Description contributor 2010-12-20 10:50:06 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html
Section: http://www.whatwg.org/specs/web-apps/current-work/#e-mail-state

Comment:
The valid e-mail address ABNF doesn't include IDN, so for it to work the value
must be punycode (or at least be converted to punycode when validating), but this
isn't mentioned, except as an optional example ("User agents may transform the
value for display and editing (e.g. converting punycode in the value to IDN in
the display and vice versa).").
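
To illustrate the "converted to punycode when validating" option, here is a minimal Python sketch. It is not taken from the spec or from any browser: the ASCII_EMAIL pattern is a simplified stand-in for the ABNF, the helper names are hypothetical, and Python's built-in "idna" codec (which implements IDNA 2003) stands in for whatever conversion a user agent would actually use.

```python
import re

# Simplified ASCII-only pattern (an assumption for illustration, not the
# spec's ABNF): atext characters and dots, "@", then dot-separated labels
# made of letters, digits and hyphens.
ASCII_EMAIL = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+$"
)


def to_ascii_address(address: str) -> str:
    """Convert only the domain part to punycode; the local part is untouched."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no "@" at all; let the pattern reject it
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return address  # not convertible; validate the raw value instead
    return f"{local}@{ascii_domain}"


def is_valid(address: str) -> bool:
    """Validate after converting an IDN domain to its punycode form."""
    return ASCII_EMAIL.match(to_ascii_address(address)) is not None
```

With this sketch, "user@bücher.example" fails the ASCII-only pattern directly but passes once the domain has been converted to "xn--bcher-kva.example". A non-ASCII local part would still be rejected, since only the domain is converted.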

Posted from: 83.160.94.175
Comment 1 Aryeh Gregor 2010-12-20 15:48:05 UTC
Maybe this should be upgraded to a "should".  Are actual implementations heeding the optional note?  It would be bad if not.
Comment 2 jorritv 2010-12-20 17:36:35 UTC
(I was the one who originally submitted this.)

(In reply to comment #1)
> Maybe this should be upgraded to a "should".  Are actual implementations
> heeding the optional note?  It would be bad if not.


They're not. At least Opera isn't, which I believe is the only browser supporting the e-mail validation at the moment. IDNs fail to validate in Opera.

That shortcoming led me to the spec.

The optional note ("User agents may transform ...") isn't in the paragraph talking about validation, so it's not immediately clear you need to do that if you want IDNs to validate (well, it should be obvious, but still).
Comment 3 jorritv 2011-01-18 13:35:33 UTC
Created attachment 944
Test form with email input and IDN domain value.

Firefox 4 (beta 9) has added support for form field validation too I see now, and it doesn't validate IDNs in e-mail addresses either.

I've added a simple form attachment to test.
Comment 4 Ms2ger 2011-01-18 16:23:53 UTC
I don't believe the spec should have any above-may-level requirements for UI, as they don't affect interoperability in user agents.
Comment 5 Aryeh Gregor 2011-01-21 12:37:53 UTC
Users being unable to enter e-mail addresses with IDNs is not a UI issue at all -- it's a functionality issue.  Saying that it's a UI issue just because they could theoretically convert their IDN to punycode and enter it that way really stretches the definition of "UI issue" past the breaking point, since practically no users will know how to do that.

This does affect real-world interoperability, because it means that authors who care about IDN support will have to work around lack of browser support in whichever browsers don't support IDNs.  This is basically the definition of non-interoperability.  The guideline that we don't specify UI only exists because UI doesn't affect interoperability, so it has to be discarded when it does (if this even is a UI issue).
Comment 6 Mounir Lamouri 2011-01-24 13:07:40 UTC
I think this should be marked WFM given that the specification currently says that "User agents may transform the value for display and editing (e.g. converting punycode in the value to IDN in the display and vice versa)."
Comment 7 Aryeh Gregor 2011-01-27 00:05:37 UTC
Changes summary to clarify that this is not WORKSFORME.
Comment 8 Anne 2011-02-04 12:35:50 UTC
Also, I think such email addresses should go in their IDN-form to the server. Punycode is only to be used when the protocol is limited to ASCII, which is not the case here.
Comment 9 Aryeh Gregor 2011-02-04 16:58:37 UTC
Is there any specific reason why we wouldn't want to send IDNs to the server?  Is there a substantial amount of software these days that will handle raw IDNs incorrectly?
Comment 10 Ian 'Hixie' Hickson 2011-03-04 00:58:48 UTC
Isn't e-mail (SMTP) one of the systems that doesn't handle IDN yet? I assumed it was, which is why I didn't allow IDN to go to the server. If it's not I'm happy to change it. File another bug for that with evidence that typical servers won't be screwed up if they naïvely pass IDN to their mail subsystems.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: see diff given below
Rationale: 

I've made IDN support in input a "should" for e-mails and URLs, to see if that encourages UAs to do better here. If they don't have a good reason not to support it, they should, which is what "should" means. Supporting IDN does seem equivalent here to supporting user input at all, which is also a should.
Comment 11 contributor 2011-03-04 00:59:13 UTC
Checked in as WHATWG revision r5934.
Check-in comment: s/may/should/ on IDN support in input.
http://html5.org/tools/web-apps-tracker?from=5933&to=5934
Comment 12 Mounir Lamouri 2011-07-11 14:12:44 UTC
(In reply to comment #10)
> Isn't e-mail (SMTP) one of the systems that doesn't handle IDN yet? I assumed
> it was, which is why I didn't allow IDN to go to the server. If it's not I'm
> happy to change it. File another bug for that with evidence that typical
> servers won't be screwed up if they naïvely pass IDN to their mail subsystems.

I do not think this is a good reason to not submit UTF-8 email addresses. Currently, websites using <input type='text'> get UTF-8 e-mail addresses so they obviously know how to handle them. If they want a client-side check, the pattern attribute would do the job. In addition, it looks like there is some work around a standard for internationalized email addresses [1].

I think the specification should just make clear that UTF-8 e-mail addresses should validate, whether by requiring them to be converted to punycode before validation or by extending the ABNF used for validation to include UTF-8 characters (like in [1]).

[1] http://tools.ietf.org/html/rfc5335#section-4.1
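
The second option (extending the grammar so non-ASCII characters validate directly) could look roughly like the following Python sketch. The pattern is an assumption for illustration only: it simply admits any code point from U+0080 upward wherever the ASCII pattern admits an ordinary address character, which is much looser than the grammar in [1] and is not text from any spec. The names EXTENDED_EMAIL and accepts are hypothetical.

```python
import re

# Hypothetical extended pattern: the usual ASCII address characters plus any
# non-ASCII code point (U+0080 and above) in both the local part and the
# domain labels. A rough sketch only, not a faithful rendering of RFC 5335.
EXTENDED_EMAIL = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~\u0080-\U0010FFFF-]+"
    r"@[A-Za-z0-9\u0080-\U0010FFFF-]+"
    r"(?:\.[A-Za-z0-9\u0080-\U0010FFFF-]+)+$"
)


def accepts(address: str) -> bool:
    """Return True if the address matches the extended pattern."""
    return EXTENDED_EMAIL.match(address) is not None
```

Under this sketch both "user@bücher.example" and "jörg@example.com" are accepted without any conversion step, while an address with a space or without a dot in the domain is still rejected. The trade-off is that the raw non-ASCII value is what reaches the server.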
Comment 13 Anne 2011-07-11 14:33:57 UTC
This would need to deal with accept-charset not being UTF-8 or if that is not present the site not being UTF-8.
Comment 14 Mounir Lamouri 2011-07-11 14:42:38 UTC
(In reply to comment #13)
> This would need to deal with accept-charset not being UTF-8 or if that is not
> present the site not being UTF-8.

How is <input type='email'> with UTF-8 values different from other inputs with UTF-8 values?
Comment 15 Anne 2011-07-11 14:47:18 UTC
Like URLs, emails have to be UTF-8-encoded to work. Maybe we should not worry about that, however...
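
The charset concern can be made concrete with a small Python sketch. It only approximates form submission: urllib.parse.quote stands in for the browser's form encoding, "@" is kept unescaped for readability, and the two charsets are example choices rather than anything named in this bug.

```python
from urllib.parse import quote, unquote

address = "user@bücher.example"

# Percent-encode the same value under two different submission charsets.
as_utf8 = quote(address, safe="@", encoding="utf-8")
as_latin1 = quote(address, safe="@", encoding="iso-8859-1")

# A server that assumes UTF-8 cannot recover the Latin-1 submission:
# the lone 0xFC byte is not valid UTF-8 and decodes to U+FFFD.
misread = unquote(as_latin1, encoding="utf-8")
```

Here the "ü" is sent as %C3%BC in one case and %FC in the other, so the receiving side only gets the intended address back if it decodes with the charset the form was submitted in.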
Comment 16 Michael[tm] Smith 2011-08-04 05:12:41 UTC
mass-move component to LC1
Comment 17 Ian 'Hixie' Hickson 2011-08-06 03:43:41 UTC
Status: Rejected
Change Description: no spec change
Rationale: 

(In reply to comment #12)
> Currently, websites using <input type='text'> get UTF-8 e-mail addresses so
> they obviously know how to handle them.

I assure you that's not a given. I would expect that any scripts I've written, if given non-ASCII e-mail addresses, will either crash or complain. I haven't tested other systems but I wouldn't be surprised to find similar results.

As I said in comment 10: If my assumption is wrong, I'm happy to change the spec. File another bug for that *with evidence* that typical servers won't be screwed up if they naïvely pass IDN to their mail subsystems.
Comment 18 jorritv 2011-08-06 12:39:09 UTC
Closing as fixed (not wontfix) since the SHOULD in the spec now ensures that people will actually be able to enter IDNs and have them work.
Comment 19 Julian Reschke 2011-08-06 13:12:09 UTC
(In reply to comment #18)
> Closing as fixed (not wontfix) since the SHOULD in the spec now ensures that
> people will actually be able to enter IDNs and have them work.

I don't think this can be considered "fixed" when the spec defines "valid email" differently.

Re-opening (Ian, re-close as WONTFIX if you feel like it - I'm just making sure this doesn't have a misleading state)
Comment 20 Aryeh Gregor 2011-08-08 14:42:37 UTC
Re-resolving per comment 17.