This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 9392 - The ABNF looks wrong. Trailing periods shouldn't be allowed before the @. Shouldn't it be: 1*atext *("." 1*atext) ?
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: LC
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://www.whatwg.org/specs/web-apps/...
Duplicates: 9404
 
Reported: 2010-04-03 00:22 UTC by contributor
Modified: 2010-10-04 14:00 UTC

Description contributor 2010-04-03 00:22:57 UTC
Section: http://www.whatwg.org/specs/web-apps/current-work/#e-mail-state

Comment:
The ABNF looks wrong. Trailing periods shouldn't be allowed before the @. 
Shouldn't it be: 1*atext *("." 1*atext) ?

Posted from: 98.234.184.167
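The difference between the rule proposed above and a rule that permits trailing periods can be made concrete with a small sketch. The `PROPOSED` pattern transcribes the reporter's `1*atext *("." 1*atext)`, with `atext` taken from RFC 5322. The `PERMISSIVE` pattern, `1*( atext / "." )`, is only an assumed stand-in for a rule that treats "." like any other allowed character; the thread does not quote the spec's actual ABNF. Both helper names are illustrative.

```python
import re

# atext as defined in RFC 5322, section 3.2.3.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# The reporter's proposed rule: 1*atext *("." 1*atext)
PROPOSED = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")

# Assumption: a permissive rule that treats "." like any other allowed
# character, i.e. 1*( atext / "." ). Not quoted from the spec.
PERMISSIVE = re.compile(rf"(?:{ATEXT}|\.)+")


def matches_proposed(local_part: str) -> bool:
    """True if the whole local part fits the proposed dot-separated rule."""
    return PROPOSED.fullmatch(local_part) is not None


def matches_permissive(local_part: str) -> bool:
    """True if the whole local part fits the assumed permissive rule."""
    return PERMISSIVE.fullmatch(local_part) is not None


# Compare the two rules on a few local parts (the text before the "@").
for local in ("john.doe", "john.", ".john", "john..doe"):
    print(local, matches_proposed(local), matches_permissive(local))
```

Of the four sample local parts, only `john.doe` satisfies the proposed rule, while all four satisfy the permissive one.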
Comment 1 Ms2ger 2010-04-03 08:29:09 UTC
They should be allowed. See <http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-August/022486.html>.
Comment 2 Ms2ger 2010-04-04 18:33:05 UTC
*** Bug 9404 has been marked as a duplicate of this bug. ***
Comment 3 Maciej Stachowiak 2010-04-04 22:18:39 UTC
Reopening so that this gets an editor's response. This and duplicate 9404 were not junk bugs, so they should get a proper response.
Comment 4 Ian 'Hixie' Hickson 2010-04-04 22:47:50 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: It turns out that trailing dots before the @ are used with real e-mail addresses, and that Wikipedia users actually use such e-mail addresses. Not supporting such addresses would mean Wikipedia couldn't use this feature.
Comment 5 Maciej Stachowiak 2010-04-05 03:35:56 UTC
Expanding on the editor's rationale: if there are a more restrictive and a less restrictive rule, each of which seems like a reasonable choice, then it's better for built-in client-side validation to use the less restrictive one. If you want the more restrictive rule, you can add more validation (either with client-side script or on the server side), but if the client were already more restrictive and you wanted the less restrictive rule, there would be no workaround.
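The layering argument above can be sketched as a chain of checks. This is a minimal illustration under stated assumptions, not any browser's behavior: `builtin_check` is a hypothetical stand-in for a permissive built-in rule, and `no_trailing_dot` is a hypothetical extra rule a site might add.

```python
from typing import Callable, Iterable

Check = Callable[[str], bool]


def builtin_check(address: str) -> bool:
    # Hypothetical permissive built-in rule: exactly one "@" with
    # something on both sides of it.
    local, at, domain = address.partition("@")
    return bool(local) and bool(at) and bool(domain) and "@" not in domain


def no_trailing_dot(address: str) -> bool:
    # Hypothetical site-specific rule: reject a "." right before the "@".
    return not address.partition("@")[0].endswith(".")


def accepts(address: str, extra_checks: Iterable[Check] = ()) -> bool:
    # The built-in check always runs; extra checks can only narrow the
    # set of accepted addresses, never widen it.
    return builtin_check(address) and all(check(address) for check in extra_checks)
```

With these definitions, `accepts("john.@example.org")` passes, and a site that wants the stricter rule calls `accepts("john.@example.org", [no_trailing_dot])`, which fails. No choice of extra checks can make `accepts` pass an address that `builtin_check` rejects, which is the asymmetry the comment describes.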
Comment 6 Lars Gunther 2010-04-05 13:31:39 UTC
There are two cases for the first part of an email address according to the RFCs:

1. If it is enclosed in quotes, almost anything goes.

2. If it is not quoted, the rules are restrictive.

Wikipedia will run into trouble the moment it adopts PHP's FILTER_VALIDATE_EMAIL as it stands. The workaround would be to put quotes around those addresses that are currently accepted but that the RFC does not allow. Such a workaround is feasible for the MediaWiki software, so this would not be a real hindrance to HTML5.

The following regular expression makes such a differentiation and is used in PHP as of the latest commits:

/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/i

A similar rule is used in Perl's official (CPAN) validation rule as well.

The argument for using such a rule in HTML5 is:

It's according to the standard.

It's thus the most sensible default.

Re: Maciej's argument: Power users can always implement whatever they like. The built-in rules should follow the specs. JavaScript can be used to override the built-in validation both ways: 1. Catch the error. 2. Re-test using your own, more permissive rule. 3. If OK, go ahead and submit.

This might be another option for MediaWiki.

If JavaScript cannot override in this latter case, that in itself is a spec bug and needs consideration of its own. (And that's not only for email validation.)
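The three-step override described above (catch the error, re-test with a more permissive rule, submit if that passes) could look roughly like the following. The sketch models only the decision logic; it does not use or imply any browser API, and whether script can in fact override built-in validation this way is exactly the open question raised here. `strict_builtin` and `site_rule` are hypothetical stand-ins.

```python
from typing import Callable

Check = Callable[[str], bool]


def strict_builtin(address: str) -> bool:
    # Hypothetical strict default: a non-empty domain and a local part
    # with no leading, trailing, or doubled ".".
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return False
    return all(label for label in local.split("."))


def site_rule(address: str) -> bool:
    # Hypothetical, more permissive site rule: anything non-empty on
    # both sides of the last "@".
    local, at, domain = address.rpartition("@")
    return bool(at and local and domain)


def should_submit(address: str, builtin: Check, override: Check) -> bool:
    if builtin(address):
        return True  # built-in validation passed; nothing to override
    # Step 1: the built-in check failed ("catch the error").
    # Step 2: re-test with the site's own, more permissive rule.
    # Step 3: submit only if that rule accepts the address.
    return override(address)
```

Here `strict_builtin("john.@example.org")` fails, but `should_submit("john.@example.org", strict_builtin, site_rule)` still allows submission because the site rule accepts the address.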

The default should not, however, serve the power user. Actually, the less knowledge a developer has, the more restrictive the rule needs to be. As developer knowledge increases, he or she can write rules of his or her own that relax the checks.


P.S:

Regexp copyright © Michael Rushton 2009-10
http://squiloople.com/
Feel free to use and redistribute this code. But please keep this copyright notice.
Comment 7 Maciej Stachowiak 2010-04-06 01:07:31 UTC
This bug was both reopened and had TrackerRequest added. That's incorrect process. Only one of these two should be done. Please let me know which you prefer:

(1) New round of consideration by the editor, in which case the bug should stay REOPENED and have TrackerRequest removed.

(2) Escalation to the full Working Group, in which case it should keep TrackerRequest and go back to RESOLVED.

If I do not hear back in a few days, I will assume option 1.
Comment 8 Ian 'Hixie' Hickson 2010-04-13 01:14:18 UTC
Status: Rejected
Change Description: no spec change
Rationale:

"It's according to the standard" does not imply "It's thus the most sensible default".

E-mail addresses exist that have a "." before the "@" and their users at least on occasion provide those e-mail addresses in a form that has no quote marks. Therefore, supporting this is the most sensible default.
Comment 9 Julian Reschke 2010-04-13 06:26:46 UTC
(In reply to comment #8)
> E-mail addresses exist that have a "." before the "@" and their users at least
> on occasion provide those e-mail addresses in a form that has no quote marks.
> Therefore, supporting this is the most sensible default.

From what I understand, the trailing dot is invalid, but there is a valid way to enter the address by quoting.

Why not specify that the UA should transform the address to a legal format automatically?
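A transformation of that kind might look like the sketch below. It is an illustration under stated assumptions rather than proposed spec text: the function name is hypothetical, the dot-atom pattern follows RFC 5322's `atext`, only `\` and `"` are escaped, and no attempt is made to validate the domain or to handle control or non-ASCII characters.

```python
import re

# atext as defined in RFC 5322, section 3.2.3.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def quote_local_part_if_needed(address: str) -> str:
    """Return the address with its local part quoted when it is not a dot-atom."""
    local, at, domain = address.rpartition("@")
    if not at or not local:
        raise ValueError(f"no local part found in {address!r}")
    if DOT_ATOM.fullmatch(local):
        return address  # already a legal unquoted local part
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        return address  # looks quoted already; left untouched (simplistic test)
    # Escape backslashes first, then double quotes, before wrapping in quotes.
    escaped = local.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{domain}'
```

For example, `quote_local_part_if_needed("john.@example.org")` returns `"john."@example.org`, while `john.doe@example.org` is returned unchanged.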