This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 22286 - Would be good to give constraints on values UAs should provide (autofill/autocomplete)
Status: RESOLVED NEEDSINFO
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P1 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-05 18:28 UTC by contributor
Modified: 2014-01-02 23:13 UTC
CC List: 6 users

Description contributor 2013-06-05 18:28:11 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/
Multipage: http://www.whatwg.org/C#autofilling-form-controls:-the-autocomplete-attribute
Complete: http://www.whatwg.org/c#autofilling-form-controls:-the-autocomplete-attribute
Referrer: 

Comment:
Would be good to give constraints on values UAs should provide

Posted from: 2620:0:1000:147c:e6ce:8fff:fe07:8094 by ian@hixie.ch
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1521.3 Safari/537.36
Comment 1 Ian 'Hixie' Hickson 2013-06-05 18:34:29 UTC
In particular, for the following fields, if the value described below is one that can be given in the field, we should say it _must_ be given, and the current rules about trying to fit the patterns should only apply if this preferred form can't be given. Other fields are free-form.

cc-number - must be numeric
cc-exp - must be in the form YYYY-MM
cc-exp-month - must be two digits in the range 01..12
cc-exp-year - must be numeric
cc-csc - must be numeric
bday - must be in the form YYYY-MM-DD
bday-day - must be two digits in the range 01..31
bday-month - must be two digits in the range 01..12
bday-year - must be numeric
url - must be a valid URL
photo - if the field is type=file, must be a JPEG or PNG; otherwise, must be a valid URL
tel - must be numeric
tel-country-code - must be numeric
tel-national - must be numeric
tel-area-code - must be numeric
tel-local - must be numeric
tel-local-prefix - must be numeric
tel-local-suffix - must be numeric
tel-extension - must be numeric
email - must be a valid e-mail address (punycoded)
impp - must be a valid URL
language - ISO language code (maybe?)
Comment 2 Ian 'Hixie' Hickson 2013-06-05 19:30:41 UTC
("numeric" meaning 0-9 only, no spaces, no punctuation, etc.)
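As a rough illustration of the constraints listed in comments 1 and 2, the checks could be sketched as below. This is only a sketch in Python and not anything taken from the spec: the `FIELD_PATTERNS` table and the `is_canonical` helper are hypothetical names, only a handful of the fields are covered, and the `bday` pattern does not check the number of days in a given month.

```python
import re

# "numeric" per comment 2: ASCII digits 0-9 only, no spaces or punctuation.
NUMERIC = r"[0-9]+"

# Hypothetical table mapping a few autofill field names to the proposed formats.
FIELD_PATTERNS = {
    "cc-number": NUMERIC,
    "cc-exp": r"[0-9]{4}-(0[1-9]|1[0-2])",  # YYYY-MM
    "cc-exp-month": r"0[1-9]|1[0-2]",  # two digits, 01..12
    "cc-exp-year": NUMERIC,
    "cc-csc": NUMERIC,
    "bday": r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])",  # YYYY-MM-DD
    "bday-day": r"0[1-9]|[12][0-9]|3[01]",  # two digits, 01..31
    "bday-month": r"0[1-9]|1[0-2]",  # two digits, 01..12
    "bday-year": NUMERIC,
}


def is_canonical(field: str, value: str) -> bool:
    """Return True if value matches the proposed format for field.

    Fields that are not in the table are treated as free-form and always pass.
    """
    pattern = FIELD_PATTERNS.get(field)
    if pattern is None:
        return True
    return re.fullmatch(pattern, value) is not None
```

Under these assumptions `is_canonical("cc-exp", "2015-07")` passes, while `is_canonical("cc-exp", "07/2015")` and `is_canonical("cc-number", "4111 1111 1111 1111")` do not.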
Comment 3 Albert Bodenhamer 2013-06-11 00:40:49 UTC
Additional suggestions for i18n:
tel should be an E164 formatted number.
country should be a country code (ISO 3166?)

Open questions:
Language: There are some advantages to having language be either a user entered string or a code of some sort (BCP 47).  It really depends on the context in which the site is requesting the info.  If it's for deciding how to display something to the user BCP 47 makes the most sense.  If it's for something only human-readable (a profile page on a social site for example) then a user-entered string makes the most sense.

Birthdate: Not all users use the Gregorian calendar.  Proposal: If a site requests birthdate the UA should be able to provide it as a Gregorian date.  If the UA chooses to support expressing the date on other calendars it must be able to convert the birthdate from that calendar to Gregorian YYYY-MM-DD.

Subfields: The subfields in the spec (tel-area-code, family-name) currently show a Western bias.  That's noted in the spec and their use is discouraged, but should we consider adding additional subfields to enable broader support?
Comment 4 Ian 'Hixie' Hickson 2013-06-17 23:25:51 UTC
> tel should be an E164 formatted number.

What does this mean, exactly?


For country and language I went with the codes, on the assumption that most people use <select>s to let users pick an answer. It has the disadvantage that in _less_ restricted fields, the country code is still a valid value, which will be ugly. Suggestions? Different canonical values for different types of fields, maybe?


> Birthdate: Not all users use the Gregorian calendar.  Proposal: If a site
> requests birthdate the UA should be able to provide it as a Gregorian date. 
> If the UA chooses to support expressing the date on other calendars it must
> be able to convert the birthdate from that calendar to Gregorian YYYY-MM-DD.

This is basically already what the spec says, in not so many words.


> Subfields: The subfields in the spec (tel-area-code, family-name) currently
> show a Western bias.  That's noted in the spec and their use is discouraged,
> but should we consider adding additional subfields to enable broader support?

I think we should discourage these fields entirely. The other option doesn't scale.
Comment 5 contributor 2013-06-17 23:27:27 UTC
Checked in as WHATWG revision r7983.
Check-in comment: Clarify formats for autofill fields
http://html5.org/tools/web-apps-tracker?from=7982&to=7983
Comment 6 Albert Bodenhamer 2013-06-19 18:13:22 UTC
(In reply to comment #4)
> > tel should be an E164 formatted number.
> 
> What does this mean, exactly?
http://www.itu.int/rec/T-REC-E.164-201011-I/en

It's a numbering scheme for expressing phone numbers in a consistent way.  Basically it's all of the info needed to express a phone number with no extraneous formatting or symbols applied.  For example, Google's main number would look like +16502530000.  Google's Zurich number would be +41446681800.


It's supported by libphonenumber.  The comments in libphonenumber/src/phonenumbers/phonenumberutil.h say:
// INTERNATIONAL and NATIONAL formats are consistent with the definition
// in ITU-T Recommendation E. 123. For example, the number of the Google
// Zürich office will be written as "+41 44 668 1800" in INTERNATIONAL
// format, and as "044 668 1800" in NATIONAL format. E164 format is as per
// INTERNATIONAL format but with no formatting applied e.g. +41446681800.
// RFC3966 is as per INTERNATIONAL format, but with all spaces and other
// separating symbols replaced with a hyphen, and with any phone number
// extension appended with ";ext=".
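
The relationship between the INTERNATIONAL and E164 formats described in that comment can be sketched in a few lines of Python. This is not libphonenumber's implementation; `to_e164` is a hypothetical helper that only strips common separators from a number that already carries a "+" and a country code, and it makes no attempt at per-country validation or at converting NATIONAL format.

```python
import re


def to_e164(international: str) -> str:
    """Strip formatting from an INTERNATIONAL-style number such as "+41 44 668 1800".

    Assumes the input already starts with "+" followed by the country code.
    """
    if not international.startswith("+"):
        raise ValueError("expected a number starting with '+'")
    # Remove spaces, hyphens, dots and parentheses; anything else is an error.
    stripped = re.sub(r"[ \-.()]", "", international[1:])
    # E.164 allows at most 15 digits, country code included.
    if re.fullmatch(r"[0-9]{1,15}", stripped) is None:
        raise ValueError("expected 1 to 15 digits after the '+'")
    return "+" + stripped
```

With this sketch `to_e164("+41 44 668 1800")` returns `"+41446681800"`, and the NATIONAL form `"044 668 1800"` is rejected because it lacks the leading "+".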

> 
> 
> For country and language I went with the codes, on the assumption that most
> people use <select>s to let users pick an answer. It has the disadvantage
> that in _less_ restricted fields, the country code is still a valid value,
> which will be ugly. Suggestions? Different canonical values for different
> types of fields, maybe?

Yeah, it's tricky.  I could imagine someone just throwing up a text field and expecting something human readable to be directly fed into the box.

We could have "country-name" if people really wanted that, but I really don't like that idea.

I think just going with codes is the sanest thing for now in the absence of any other input.

> 
> 
> > Birthdate: Not all users use the Gregorian calendar.  Proposal: If a site
> > requests birthdate the UA should be able to provide it as a Gregorian date. 
> > If the UA chooses to support expressing the date on other calendars it must
> > be able to convert the birthdate from that calendar to Gregorian YYYY-MM-DD.
> 
> This is basically already what the spec says, in not so many words.

Ok.

> 
> 
> > Subfields: The subfields in the spec (tel-area-code, family-name) currently
> > show a Western bias.  That's noted in the spec and their use is discouraged,
> > but should we consider adding additional subfields to enable broader support?
> 
> I think we should discourage these fields entirely. The other option doesn't
> scale.

Sounds good.
Comment 7 Dan Beam 2013-06-27 18:47:08 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > > tel should be an E164 formatted number.
> > 
> > What does this mean, exactly?
> http://www.itu.int/rec/T-REC-E.164-201011-I/en
> 
> It's a numbering scheme for expressing phone numbers in a consistent way. 
> Basically it's all of the info needed to express a phone number with no
> extraneous formatting or symbols applied.  For example, Google's main number
> would look like +16502530000.  Google's Zurich number would be +41446681800.
> 
> 
> It's supported by libphonenumber.  The comments in
> libphonenumber/src/phonenumbers/phonenumberutil.h say:
> // INTERNATIONAL and NATIONAL formats are consistent with the definition
> // in ITU-T Recommendation E. 123. For example, the number of the Google
> // Zürich office will be written as "+41 44 668 1800" in INTERNATIONAL
> // format, and as "044 668 1800" in NATIONAL format. E164 format is as per
> // INTERNATIONAL format but with no formatting applied e.g. +41446681800.
> // RFC3966 is as per INTERNATIONAL format, but with all spaces and other
> // separating symbols replaced with a hyphen, and with any phone number
> // extension appended with ";ext=".
> 
> > 
> > 
> > For country and language I went with the codes, on the assumption that most
> > people use <select>s to let users pick an answer. It has the disadvantage
> > that in _less_ restricted fields, the country code is still a valid value,
> > which will be ugly. Suggestions? Different canonical values for different
> > types of fields, maybe?
> 
> Yeah, it's tricky.  I could imagine someone just throwing up a text field
> and expecting something human readable to be directly fed into the box.
> 
> We could have "country-name" if people really wanted that, but I really
> don't like that idea.

I think we need "country-name", as it satisfies the use case of filling a user-visible <input>.  Non-normatively, Chrome currently refuses to fill invisible fields.  This means site authors that request "country" will be forced to show a [likely cryptic] country code to their users.  This is unlikely to match the user's originally entered input, and will probably look odd or foreign ("I never typed that...") and lead to distrust.

We definitely need a country code version (as we have now), I think it's just best used solely by machines and not shown to users (e.g. a hidden input that's filled via HTMLFormElement#requestAutocomplete()).

Ilya Sherman (a key Chrome Autofill contributor) and I have further comments on the Chrome bug about implementing this change (at least in the context of HTMLFormElement#requestAutocomplete()) [1].

[1] https://code.google.com/p/chromium/issues/detail?id=254682#c6

> 
> I think just going with codes is the sanest thing for now in the absence of
> any other input.
> 
> > 
> > 
> > > Birthdate: Not all users use the Gregorian calendar.  Proposal: If a site
> > > requests birthdate the UA should be able to provide it as a Gregorian date. 
> > > If the UA chooses to support expressing the date on other calendars it must
> > > be able to convert the birthdate from that calendar to Gregorian YYYY-MM-DD.
> > 
> > This is basically already what the spec says, in not so many words.
> 
> Ok.
> 
> > 
> > 
> > > Subfields: The subfields in the spec (tel-area-code, family-name) currently
> > > show a Western bias.  That's noted in the spec and their use is discouraged,
> > > but should we consider adding additional subfields to enable broader support?
> > 
> > I think we should discourage these fields entirely. The other option doesn't
> > scale.
> 
> Sounds good.
Comment 8 Ian 'Hixie' Hickson 2013-07-02 23:44:43 UTC
For the telephone thing, as far as I can tell, what the spec has is basically the format you describe, though we don't limit it to 15 characters. Is that sufficient? We can make it more precise, but I'd hesitate to name that ITU standard normatively since it's quite hard to understand at a glance.

Country codes would work fine with <select>, no? Or do you count those values as not visible?

We could add country-name; would that just be "Free-form text, no newlines"? Would it have any correlation to the "country" field? If so, how/what?
Comment 9 Albert Bodenhamer 2013-07-03 16:04:10 UTC
(In reply to comment #8)
> For the telephone thing, as far as I can tell, what the spec has is
> basically the format you describe, though we don't limit it to 15
> characters. Is that sufficient? We can make it more precise, but I'd
> hesitate to name that ITU standard normatively since it's quite hard to
> understand at a glance.

The spec leaves a lot of room for variance though.  Will you get a leading 1 on US phone numbers?  How are country codes handled? etc.  The ITU spec addresses those questions.

I get what you're saying about it being hard to understand.  Part of that comes from phone numbers being stupidly complex, but the ITU doc is also not very clearly written.  I wonder if we could summarize somehow.
 
> 
> Country codes would work fine with <select>, no? Or do you count those
> values as not visible?

Country codes and a select should work, but that's a fair bit of effort for a site owner to take to map between codes and names.

> 
> We could add country-name; would that just be "Free-form text, no newlines"?
> Would it have any correlation to the "country" field? If so, how/what?
Free-form text intended to be a human readable name of a country.  country and country-name should reference the same country?
Comment 10 Dan Beam 2013-07-03 21:44:56 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > For the telephone thing, as far as I can tell, what the spec has is
> > basically the format you describe, though we don't limit it to 15
> > characters. Is that sufficient? We can make it more precise, but I'd
> > hesitate to name that ITU standard normatively since it's quite hard to
> > understand at a glance.
> 
> The spec leaves a lot of room for variance though.  Will you get a leading 1
> on US phone numbers?  How are country codes handled? etc.  The ITU spec
> addresses those questions.
> 
> I get what you're saying about it being hard to understand.  Part of that
> comes from phone numbers being stupidly complex, but the ITU doc is also not
> very clearly written.  I wonder if we could summarize somehow.
>  
> > 
> > Country codes would work fine with <select>, no? Or do you count those
> > values as not visible?
> 
> Country codes and a select should work, but that's a fair bit of effort for
> a site owner to take to map between codes and names.
> 
> > 
> > We could add country-name; would that just be "Free-form text, no newlines"?
> > Would it have any correlation to the "country" field? If so, how/what?
> Free-form text intended to be a human readable name of a country.  country
> and country-name should reference the same country?

Country-related fields with the same section should correspond to the same country, yes (whereas "shipping country" and "billing country-name" may differ).
Comment 11 Ian 'Hixie' Hickson 2013-07-09 20:06:57 UTC
> The spec leaves a lot of room for variance though.  Will you get a leading 1
> on US phone numbers?  How are country codes handled? etc.  The ITU spec
> addresses those questions.

The spec requires it to be "ASCII digits and U+0020 SPACE characters, prefixed by a U+002B PLUS SIGN character (+)", so you'd have to get the leading 1, no?


> I get what you're saying about it being hard to understand.  Part of that
> comes from phone numbers being stupidly complex, but the ITU doc is also not
> very clearly written.  I wonder if we could summarize somehow.

They have per-country rules, which I'd be hesitant to hard-code, since I don't know how stable they are and don't want to spend the rest of my life chasing a moving target or addressing edge cases (nor do I imagine most browser vendors want to do this!). At a high level, what the HTML spec currently says is kind of a summary...

 
> Free-form text intended to be a human readable name of a country.  country
> and country-name should reference the same country?

How would browsers do this?
Comment 12 Albert Bodenhamer 2013-07-09 21:49:54 UTC
(In reply to comment #11)
> > The spec leaves a lot of room for variance though.  Will you get a leading 1
> > on US phone numbers?  How are country codes handled? etc.  The ITU spec
> > addresses those questions.
> 
> The spec requires it to be "ASCII digits and U+0020 SPACE characters,
> prefixed by a U+002B PLUS SIGN character (+)", so you'd have to get the
> leading 1, no?

Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80 0555 12 12 would all be equally valid?  The one I'd really want would be the first.
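
To make the question concrete, here is a small Python sketch that applies one possible reading of the quoted rule to the four strings. The regular expression is an assumption about what "ASCII digits and U+0020 SPACE characters, prefixed by a U+002B PLUS SIGN character (+)" means, not the spec's own definition, and the helper names are hypothetical.

```python
import re

# Assumed reading of the quoted rule: "+" followed by ASCII digits and spaces.
SPEC_TEL = re.compile(r"\+[0-9 ]+")

CANDIDATES = ["+18005551212", "+8005551212", "+800 555 1212", "+80 0555 12 12"]


def matches_quoted_rule(value: str) -> bool:
    """True if the whole string fits the assumed pattern."""
    return SPEC_TEL.fullmatch(value) is not None


def without_spaces(value: str) -> str:
    """Drop U+0020 SPACE characters so differently grouped numbers compare equal."""
    return value.replace(" ", "")


if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(candidate, matches_quoted_rule(candidate), without_spaces(candidate))
```

Under this reading all four strings pass, and the last three differ only in where the spaces fall: with spaces removed they are the same digit string.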

> 
> 
> > I get what you're saying about it being hard to understand.  Part of that
> > comes from phone numbers being stupidly complex, but the ITU doc is also not
> > very clearly written.  I wonder if we could summarize somehow.
> 
> They have per-country rules, which I'd be hesitant to hard-code, since I
> don't know how stable they are and don't want to spend the rest of my life
> chasing a moving target or addressing edge cases (nor do I imagine most
> browser vendors want to do this!). At a high level, what the HTML spec
> currently says is kind of a summary...

Ugh, yeah.  I don't think anything prevents random country X from throwing a wrench in the works.

> 
>  
> > Free-form text intended to be a human readable name of a country.  country
> > and country-name should reference the same country?
> 
> How would browsers do this?

Chrome currently has lookup tables.  For any given country the country code is fixed and it has a name determined by the current locale.
Comment 13 Ian 'Hixie' Hickson 2013-07-09 23:47:39 UTC
> Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80
> 0555 12 12 would all be equally valid?

Well, yeah, but aren't they all in fact valid? The first is a US number [1], the other three are the same number using the Universal International Freephone Number country code (+800), though they are all short one digit.

Admittedly, the US one is a known-bogus one, so arguably we shouldn't allow it, but I don't think we want browsers implementing that level of checking, as discussed in comment 11. Similarly, we presumably don't want to be checking the number of digits in +800 numbers.


> > > Free-form text intended to be a human readable name of a country.  country
> > > and country-name should reference the same country?
> > 
> > How would browsers do this?
> 
> Chrome currently has lookup tables.  For any given country the country code
> is fixed and it has a name determined by the current locale.

So what would you do if a form had both, and the user picked different values?
Or if the form had a country-name field, but the user entered "Hello"?

I think the only constraint we could put is "if the user agent autofills a form with fields using both 'country' and 'country-name', and the user agent provides a value for the field(s) using 'country', then the field(s) using 'country-name' must be filled using a human-readable name for the same country".

Would that be ok?

If so, I propose to add country-name as a free-form field, with just that requirement somewhere in prose (not in the spec's table(s)).
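
The proposed constraint, combined with the lookup-table approach mentioned for Chrome, could look roughly like the following Python sketch. Everything here is hypothetical: the `COUNTRY_NAMES` table, its two entries, and the `fill_country_fields` helper are illustrations only, not how any browser actually stores or fills this data.

```python
# Hypothetical lookup table: country code -> {locale: human-readable name}.
COUNTRY_NAMES = {
    "US": {"en": "United States", "de": "Vereinigte Staaten"},
    "CH": {"en": "Switzerland", "de": "Schweiz"},
}


def fill_country_fields(fields: list[str], code: str, locale: str) -> dict[str, str]:
    """Fill 'country' and 'country-name' fields from a single stored country.

    Because both values derive from the same code, a form that uses both
    fields always receives a name for the same country as the code.
    """
    filled = {}
    for name in fields:
        if name == "country":
            filled[name] = code
        elif name == "country-name":
            filled[name] = COUNTRY_NAMES[code][locale]
    return filled
```

For example, `fill_country_fields(["country", "country-name"], "CH", "de")` yields `{"country": "CH", "country-name": "Schweiz"}`. The sketch says nothing about what to do when the user types free-form text into a country-name field, which is the harder case raised above.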
Comment 14 Albert Bodenhamer 2013-07-17 16:43:22 UTC
(In reply to comment #13)
> > Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80
> > 0555 12 12 would all be equally valid?
> 
> Well, yeah, but aren't they all in fact valid? The first is a US number [1],
> the other three are the same number using the Universal International
> Freephone Number country code (+800), though they are all short one digit.

Valid, yes, but not consistent.  I really like the idea of minimal surprises.  Maybe that would create too tight of a constraint though given the chaos of phone numbers. :-/

> 
> Admittedly, the US one is a known-bogus one, so arguably we shouldn't allow
> it, but I don't think we want browsers implementing that level of checking,
> as discussed in comment 11. Similarly, we presumably don't want to be
> checking the number of digits in +800 numbers.
> 
> 
> > > > Free-form text intended to be a human readable name of a country.  country
> > > > and country-name should reference the same country?
> > > 
> > > How would browsers do this?
> > 
> > Chrome currently has lookup tables.  For any given country the country code
> > is fixed and it has a name determined by the current locale.
> 
> So what would you do if a form had both, and the user picked different
> values?
> Or if the form had a country-name field, but the user entered "Hello"?
> 
> I think the only constraint we could put is "if the user agent autofills a
> form with fields using both 'country' and 'country-name', and the user agent
> provides a value for the field(s) using 'country', then the field(s) using
> 'country-name' must be filled using a human-readable name for the same
> country".
> 
> Would that be ok?

That works.

> 
> If so, I propose to add country-name as a free-form field, with just that
> requirement somewhere in prose (not in the spec's table(s)).

SGTM
Comment 15 Dan Beam 2013-07-17 22:52:54 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > > Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80
> > > 0555 12 12 would all be equally valid?
> > 
> > Well, yeah, but aren't they all in fact valid? The first is a US number [1],
> > the other three are the same number using the Universal International
> > Freephone Number country code (+800), though they are all short one digit.
> 
> Valid, yes, but not consistent.  I really like the idea of minimal
> surprises.  Maybe that would create too tight of a constraint though given
> the chaos of phone numbers. :-/
> 
> > 
> > Admittedly, the US one is a known-bogus one, so arguably we shouldn't allow
> > it, but I don't think we want browsers implementing that level of checking,
> > as discussed in comment 11. Similarly, we presumably don't want to be
> > checking the number of digits in +800 numbers.
> > 
> > 
> > > > > Free-form text intended to be a human readable name of a country.  country
> > > > > and country-name should reference the same country?
> > > > 
> > > > How would browsers do this?
> > > 
> > > Chrome currently has lookup tables.  For any given country the country code
> > > is fixed and it has a name determined by the current locale.
> > 
> > So what would you do if a form had both, and the user picked different
> > values?
> > Or if the form had a country-name field, but the user entered "Hello"?
> > 
> > I think the only constraint we could put is "if the user agent autofills a
> > form with fields using both 'country' and 'country-name', and the user agent
> > provides a value for the field(s) using 'country', then the field(s) using
> > 'country-name' must be filled using a human-readable name for the same
> > country".
> > 
> > Would that be ok?
> 
> That works.
> 
> > 
> > If so, I propose to add country-name as a free-form field, with just that
> > requirement somewhere in prose (not in the spec's table(s)).
> 
> SGTM

+1
Comment 16 Ilya Sherman 2013-07-19 17:49:16 UTC
(In reply to comment #13)
> > > > Free-form text intended to be a human readable name of a country.  country
> > > > and country-name should reference the same country?
> > > 
> > > How would browsers do this?
> > 
> > Chrome currently has lookup tables.  For any given country the country code
> > is fixed and it has a name determined by the current locale.
> 
> So what would you do if a form had both, and the user picked different
> values?
> Or if the form had a country-name field, but the user entered "Hello"?
> 
> I think the only constraint we could put is "if the user agent autofills a
> form with fields using both 'country' and 'country-name', and the user agent
> provides a value for the field(s) using 'country', then the field(s) using
> 'country-name' must be filled using a human-readable name for the same
> country".
> 
> Would that be ok?

I think even this constraint is a little overly constrained.  For example, suppose a single <form> requests a shipping and a billing address, and happens to expect country codes for one (perhaps shipping, as only a subset of nations are supported) and country names for the other.  In this case, the user might well instruct the UA to fill different addresses into the two sections, and hence the countries wouldn't match.

That's admittedly a somewhat contrived example, but so is just about any other involving a form specifying both country and country-name, since the use cases are pretty different.  I think it's fine to leave country-name loosely spec'ed, as it's mostly intended to be used by existing sites that already have freeform text boxes for the country name.
Comment 17 Ian 'Hixie' Hickson 2013-07-27 15:56:26 UTC
Since this is blocking implementation, I'll try to get to this ASAP.
Comment 18 Ian 'Hixie' Hickson 2013-07-29 20:23:13 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > > Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80
> > > 0555 12 12 would all be equally valid?
> > 
> > Well, yeah, but aren't they all in fact valid? The first is a US number [1],
> > the other three are the same number using the Universal International
> > Freephone Number country code (+800), though they are all short one digit.
> 
> Valid, yes, but not consistent.

I don't understand. How are they not consistent?

Of those four numbers, which one(s) would you disallow? The invalid US number, or the three invalid Universal International Freephone Numbers? And why?


(In reply to comment #16)
> > 
> > I think the only constraint we could put is "if the user agent autofills a
> > form with fields using both 'country' and 'country-name', and the user agent
> > provides a value for the field(s) using 'country', then the field(s) using
> > 'country-name' must be filled using a human-readable name for the same
> > country".
> 
> I think even this constraint is a little overly constrained.  For example,
> suppose a single <form> requests a shipping and a billing address, and
> happens to expect country codes for one (perhaps shipping, as only a subset
> of nations are supported) and country names for the other.  In this case,
> the user might well instruct the UA to fill different addresses into the two
> sections, and hence the countries wouldn't match.

Sure, but those would be using different autofill scopes, so that's fine.
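
A simplified sketch of why the shipping/billing case falls outside the constraint: if the scope is taken to be the leading `section-*` and `shipping`/`billing` tokens of the `autocomplete` value, then "shipping country" and "billing country-name" land in different scopes. The `parse_autofill` helper is hypothetical and deliberately ignores the rest of the spec's token-processing rules.

```python
def parse_autofill(value: str) -> tuple[str, str]:
    """Split an autocomplete value into (scope, field name).

    Simplified assumption: the scope is an optional "section-*" token
    followed by an optional "shipping" or "billing" token; whatever
    remains is treated as the field name.
    """
    tokens = value.lower().split()
    scope = []
    if tokens and tokens[0].startswith("section-"):
        scope.append(tokens.pop(0))
    if tokens and tokens[0] in ("shipping", "billing"):
        scope.append(tokens.pop(0))
    return " ".join(scope), " ".join(tokens)
```

Here `parse_autofill("shipping country")` gives `("shipping", "country")` and `parse_autofill("billing country-name")` gives `("billing", "country-name")`, so a rule that ties `country` to `country-name` within one scope would not force the two addresses to match.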
Comment 19 contributor 2013-07-29 20:47:54 UTC
Checked in as WHATWG revision r8100.
Check-in comment: Add 'country-name' field to autofill section, and some editorial polish around there.
http://html5.org/tools/web-apps-tracker?from=8099&to=8100
Comment 20 Albert Bodenhamer 2013-07-30 23:19:59 UTC
(In reply to comment #18)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > > Wouldn't that mean that +18005551212, +8005551212, +800 555 1212, and +80
> > > > 0555 12 12 would all be equally valid?
> > > 
> > > Well, yeah, but aren't they all in fact valid? The first is a US number [1],
> > > the other three are the same number using the Universal International
> > > Freephone Number country code (+800), though they are all short one digit.
> > 
> > Valid, yes, but not consistent.
> 
> I don't understand. How are they not consistent?

Inconsistent in that if I'm a site owner tagging a field as "tel" I have no way of knowing what the browser is going to fill into that field.

> 
> Of those four numbers, which one(s) would you disallow? The invalid US
> number, or the three invalid Universal International Freephone Numbers? And
> why?

It doesn't really matter.  I would just like to know which I'm going to get.

> 
> 
> (In reply to comment #16)
> > > 
> > > I think the only constraint we could put is "if the user agent autofills a
> > > form with fields using both 'country' and 'country-name', and the user agent
> > > provides a value for the field(s) using 'country', then the field(s) using
> > > 'country-name' must be filled using a human-readable name for the same
> > > country".
> > 
> > I think even this constraint is a little overly constrained.  For example,
> > suppose a single <form> requests a shipping and a billing address, and
> > happens to expect country codes for one (perhaps shipping, as only a subset
> > of nations are supported) and country names for the other.  In this case,
> > the user might well instruct the UA to fill different addresses into the two
> > sections, and hence the countries wouldn't match.
> 
> Sure, but those would be using different autofill scopes, so that's fine.
Comment 21 Ian 'Hixie' Hickson 2013-07-31 21:33:57 UTC
> Inconsistent in that if I'm a site owner tagging a field as "tel" I have no
> way of knowing what the browser is going to fill into that field.

I don't understand why not. How do you have any less of an idea what it'll fill in than the "locality" or "name" fields?


> > Of those four numbers, which one(s) would you disallow? The invalid US
> > number, or the three invalid Universal International Freephone Numbers? And
> > why?
> 
> It doesn't really matter.  I would just like to know which I'm going to get.

I don't understand the question. Won't you just get whichever one the user entered?

I feel like I've misunderstood a fundamental aspect of your concern here.
Comment 22 Albert Bodenhamer 2013-07-31 22:38:33 UTC
(In reply to comment #21)
> > Inconsistent in that if I'm a site owner tagging a field as "tel" I have no
> > way of knowing what the browser is going to fill into that field.
> 
> I don't understand why not. How do you have any less of an idea what it'll
> fill in than the "locality" or "name" fields?

Locality and name are much more arbitrary.  Each is effectively a single opaque symbol.  You MIGHT be able to do some minimal validation on them, but there really aren't any operations that make sense on them.

A phone number is more like a piece of code.  If I hand a properly structured phone number to either a human or a properly equipped machine the result should be a successful phone call.  There should be no ambiguity as to how that call should be executed.

> 
> 
> > > Of those four numbers, which one(s) would you disallow? The invalid US
> > > number, or the three invalid Universal International Freephone Numbers? And
> > > why?
> > 
> > It doesn't really matter.  I would just like to know which I'm going to get.
> 
> I don't understand the question. Won't you just get whichever one the user
> entered?
> 
> I feel like I've misunderstood a fundamental aspect of your concern here.

Yeah.  I'm not doing a great job of making myself understood.  Sorry.  I'm glad to chat face-to-face if it helps.

Let me approach it from a different direction.  Having a canonical value for an autocomplete field means that the browser can perform meaningful validation as soon as the user provides data.  It also means that if the site requests data in the canonical form it shouldn't have to validate data at all.

Today, a user could enter "+42" as their phone number.  That's a legal value in the spec and the browser may choose to accept it.  A site owner who requested the phone number in a form probably will NOT consider "+42" to be a valid phone number and will throw some sort of validation error at the user.  It would have been a much better experience to validate earlier.

This sort of thing becomes a bigger issue if the site uses something like the proposed requestAutocomplete method.  If the browser is required to provide validated canonical data, the site can often avoid doing validation AT ALL.  This is a huge win since a validation error in a requestAutocomplete style flow is VERY disruptive.
Comment 23 Ian 'Hixie' Hickson 2013-08-01 17:26:01 UTC
So what are you proposing instead?
Comment 24 Ian 'Hixie' Hickson 2013-11-13 21:26:49 UTC
I don't see how that problem is solvable. At the end of the day, figuring out the canonical form of most of these data types is not something we can seriously put into every browser, because it changes over time, has numerous obscure exceptions, and is political. It's like time zones, which we similarly don't really constrain.

Also, we'll never be in a world where the server doesn't have to validate input. You can't trust the client to not be malicious.

So I don't know where to go from here.
Comment 25 Ian 'Hixie' Hickson 2014-01-02 23:13:37 UTC
I'm marking this NEEDSINFO; please reopen if you have a proposal. I can't work out what to do to address the issue in comment 22.