This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 13408 - UA should use element locale for i18n
Summary: UA should use element locale for i18n
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: a11y, a11ytf
Depends on:
Blocks:
 
Reported: 2011-07-28 14:47 UTC by Cameron Jones
Modified: 2012-01-24 23:01 UTC
CC List: 10 users

Description Cameron Jones 2011-07-28 14:47:20 UTC
The specification currently notes that input controls are to be presented in the user's preferred locale. This contradicts the specification of the lang attributes and does not allow for an author to explicitly declare i18n input controls. 

If the input element does not explicitly declare a lang attribute, the language should be determined as specified in "3.2.3.3 The lang and xml:lang attributes"
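
As a rough illustration of the resolution referred to here, the sketch below walks from an element up through its ancestors until it finds a lang attribute. The Element class, the resolve_language function and the document_default parameter are hypothetical names used only for this example, and the algorithm in the specification covers more cases (xml:lang, the pragma-set default language, HTTP headers).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element; not a real DOM API."""
    tag: str
    lang: Optional[str] = None          # value of the lang attribute, if present
    parent: Optional["Element"] = None  # None for the root element


def resolve_language(element: Element, document_default: str = "") -> str:
    """Return the lang of the nearest ancestor-or-self that declares one.

    Falls back to document_default (assumed here to come from a pragma or
    HTTP header); the empty string stands for "unknown".
    """
    node: Optional[Element] = element
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return document_default


# A page declared as en-US containing one control explicitly marked fr-CH.
html = Element("html", lang="en-US")
form = Element("form", parent=html)
plain_input = Element("input", parent=form)
french_input = Element("input", lang="fr-CH", parent=form)
```

Under these assumptions, resolve_language(plain_input) inherits the page's language while resolve_language(french_input) uses the control's own declaration, which is the author-level control the report asks for.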

This bug is opened in response to "Last Call feedback on Date/Time controls":

http://lists.w3.org/Archives/Public/public-html/2011Jun/0425.html
Comment 1 Michael[tm] Smith 2011-08-04 05:03:48 UTC
mass-moved component to LC1
Comment 2 Henri Sivonen 2011-11-25 09:24:58 UTC
I think it makes sense to make the proposed spec change.

My own experience is that users assume that input fields on an en-US site have en-US behaviors. In particular, non-technical users may not be able to tell which widgets come from the UA and which ones come from the site, so it's surprising if some parts of the page have behaviors that don't make sense in the context of the site.
Comment 3 Ian 'Hixie' Hickson 2011-12-02 18:54:02 UTC
I'll replace "the user's preferred locale" with "the user's preferred locale or the locale implied by the element's <span>language</span>."
Comment 4 Cameron Jones 2011-12-05 12:35:43 UTC
(In reply to comment #3)
> I'll replace "the user's preferred locale" with "the user's preferred locale or
> the locale implied by the element's <span>language</span>."

this is fine but the notes should not be explicitly required as language resolution is already defined. 

since the notes already exist it is probably better for clarity that they remain but with an updated text.

i would hesitate from describing the locale as being 'implied' and instead use 'defined' as an authoritative statement.

i suggest changing the note something like this:

"The language of an element is determined through language resolution algorithm defined in section 3.2.3.3. Browsers may provide configuration to override and present elements according to the conventions of a user's preferred locale."

This retains the authority of the page and specification but highlights how browsers may customize their user experience through configuration and as a point of differentiation. 

This will not violate the definition of the page if UA overrides are limited to presentational rendering and not used for value representation or form encoding. 

It does however provide for UAs to continue their current implementation behavior which is biased to en-US regional presentation but as explicit customizations and as their responsibility over altering authored applications.
Comment 5 Ian 'Hixie' Hickson 2011-12-09 23:21:18 UTC
Language and locale aren't the same thing. A language can at best imply a locale. Hence my proposed wording.
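
To make the distinction concrete, here is a minimal sketch of how a language tag might be turned into a locale guess. The LIKELY_REGION table and the implied_locale function are hypothetical; a real implementation would consult locale data rather than a hand-written dict, and the tag handling is simplified to language[-region] only.

```python
# Hypothetical "most likely region" table for bare language subtags.
# The entries are illustrative defaults, not a complete or authoritative list.
LIKELY_REGION = {"en": "US", "fr": "FR", "de": "DE", "ja": "JP"}


def implied_locale(language_tag: str) -> str:
    """Guess a locale from a language tag of the form language[-region].

    Script and variant subtags are ignored in this simplified sketch.
    """
    parts = language_tag.split("-")
    language = parts[0].lower()
    if len(parts) > 1 and len(parts[1]) == 2:
        # The author supplied a region, so nothing has to be guessed.
        return f"{language}-{parts[1].upper()}"
    region = LIKELY_REGION.get(language)
    # Without a known default region, only the language itself is known.
    return f"{language}-{region}" if region else language
```

With this table, lang="en" yields "en-US" only because the table says so; the markup itself does not say whether dates should be month-first or day-first, which is why the language can only imply a locale.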

> "The language of an element is determined through language resolution algorithm
> defined in section 3.2.3.3. Browsers may provide configuration to override and
> present elements according to the conventions of a user's preferred locale."

If it's a note, it shouldn't contain the normative word "may". Also, we can't reference section numbers since they change all the time (and vary from edition to edition at any one time). 

I don't see how this is an improvement over the current note (with or without my proposed modification). What is the problem you are trying to solve?


> this is fine but the notes should not be explicitly required as language
> resolution is already defined. 

I don't understand what you mean here.
Comment 6 Cameron Jones 2011-12-13 11:57:01 UTC
(In reply to comment #5)
> Language and locale aren't the same thing. A language can at best imply a
> locale. Hence my proposed wording.

Yes, there is HTML language defined in BCP-47 and the OS locale which may exhibit POSIX codes. The application of a locale in this context is one-way and only of use in its language so the additional information in a locale has no relevance.

> 
> > "The language of an element is determined through language resolution algorithm
> > defined in section 3.2.3.3. Browsers may provide configuration to override and
> > present elements according to the conventions of a user's preferred locale."
> 
> If it's a note, it shouldn't contain the normative word "may". Also, we can't
> reference section numbers since they change all the time (and vary from edition
> to edition at any one time). 

I can see that section numbers can not be maintainable, it could be replaced with a link. the important factor is the reference to the normative section.

What's wrong with using "may" in a non-normative context? if a note is non-normative it shouldn't matter what prose is used? it makes no difference to use the word "can" or some other word denoting an option.

> 
> I don't see how this is an improvement over the current note (with or without
> my proposed modification). What is the problem you are trying to solve?
> 

the problem with the current note is that it *encourages* browsers to use the user locale's language to localise any controls, in preference to and as a mandatory override of the page\element language as defined by the author\server.

this is an issue because if a user navigates to a page in a different language the page will contain foreign text and formats yet the controls are still rendered in the user's native locale. 

take a simple example of booking a US train ticket using a UK laptop. the date\time controls are rendered in the UK locale but the page has been pre-rendered by the server to use US date formats. this results in a mix of MM/DD/YY and DD/MM/YY across the page. 
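
The mismatch in this example can be reproduced with a small sketch. The SHORT_DATE_PATTERNS table and format_short_date function are hypothetical; real user agents take such patterns from their locale data, and only the two patterns named in the example are modelled.

```python
from datetime import date

# Hypothetical short-date patterns for the two locales in the example.
SHORT_DATE_PATTERNS = {
    "en-US": "%m/%d/%y",  # month first
    "en-GB": "%d/%m/%y",  # day first
}


def format_short_date(value: date, locale_tag: str) -> str:
    """Format a date using the pattern assumed for the given locale."""
    return value.strftime(SHORT_DATE_PATTERNS[locale_tag])


departure = date(2012, 1, 5)  # 5 January 2012

# Text pre-rendered by the US site versus a control localised by a UK browser.
server_text = format_short_date(departure, "en-US")   # "01/05/12"
control_text = format_short_date(departure, "en-GB")  # "05/01/12"
```

Both strings describe the same day, yet a reader who sees them side by side has no way to tell which convention applies to which one.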

this creates confusion that exists throughout any cross-language browsing experience, and it matters especially in form controls, an area where user input and clarity are required. 

the problem with the specification is that browsers should not be encouraged to use their user's locale by default. the language of the page has already been defined by the author and this should be used.

the statement that browsers "may" provide override configuration is more a suggestion of progressive enhancement so that if browsers wish to retain current behaviour believing that their users have come to expect the existing behaviour, they could do so as default configuration. at the very least this mandates the ability for users to disable automatic overriding of page\element language, yet i hope that i've been able to illustrate that this is a bug.

> 
> > this is fine but the notes should not be explicitly required as language
> > resolution is already defined. 
> 
> I don't understand what you mean here.

i hope i've clarified the issue with the above. i have no predisposition to what the exact change or text should be as long as the change addresses the existing contradiction.

i also note that at the end of normative section 3.2.3.3 it states:

"User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, or for dictionary selection). "

Maybe this should also be updated to reference form controls?
Comment 7 Jukka K. Korpela 2012-01-12 21:09:46 UTC
(In reply to comment #3)
> I'll replace "the user's preferred locale" with "the user's preferred locale or
> the locale implied by the element's <span>language</span>."

I don’t see the change as implemented, so I guess it was just a proposal. In any case, it would not solve the problem—rather, it would introduce additional vagueness.

In general, the new features like input type="date" have not been considered much from the localization point of view. As such, they tend to create confusion. When using, say, a page in English on a browser with German as the “user’s preferred locale” (an ambiguous concept), how could the user expect that 1.005 will be taken as one thousand and five, against the conventions used on the page?
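
The ambiguity can be shown with a short sketch. The SEPARATORS table and parse_number function are hypothetical and cover only the two conventions mentioned; they are not how any particular browser parses input.

```python
# Hypothetical separator table: language -> (grouping separator, decimal separator).
SEPARATORS = {
    "en": (",", "."),
    "de": (".", ","),
}


def parse_number(text: str, language: str) -> float:
    """Parse text as a number using the separators assumed for the language."""
    group, decimal = SEPARATORS[language]
    # Drop grouping separators first, then normalise the decimal separator.
    normalised = text.replace(group, "").replace(decimal, ".")
    return float(normalised)


as_english = parse_number("1.005", "en")  # a little over one
as_german = parse_number("1.005", "de")   # one thousand and five
```

The same keystrokes differ by a factor of a thousand depending on which convention the control applies, and nothing on an English page signals that the German one is in effect.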

It just gets worse if the browser can apply either the “user’s preferred locale” or the element’s language (if indicated in markup or HTTP headers, I presume).

The only solid basis is the language/locale of the page itself, as indicated in markup. It will then be up to authors to indicate it if they use the new features. If browsers allow users to override such features, using e.g. the “user locale” on all pages, so be it, but this should not be said in a specification; it would just confuse implementors (and authors).

The question then arises which locale definitions shall be used. I’m afraid any realistic specification needs to allow a fallback to a “neutral” (i.e., US-centric) locale, when the page locale is not supported by the implementation of input routines. Otherwise, I suggest that the CLDR definitions be applied. They are not perfect, but the best we’ve got, and they have (in principle at least) a mechanism for proposing changes and filing bugs.
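
One possible shape for such a fallback is sketched below: try the page's full language tag, then progressively shorter prefixes, then the neutral locale. The choose_control_locale function, the supported set and the NEUTRAL_LOCALE constant are hypothetical, and the truncation step is only similar in spirit to language-tag lookup schemes, not a statement of what any specification requires.

```python
from typing import Set

NEUTRAL_LOCALE = "en-US"  # assumed stand-in for the "neutral" fallback


def choose_control_locale(page_language: str, supported: Set[str]) -> str:
    """Pick a locale for a form control from the page's language tag.

    Tries the full tag, then shorter prefixes ("de-CH" -> "de"), and
    finally the neutral fallback if nothing matches.
    """
    parts = page_language.split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in supported:
            return candidate
        parts.pop()  # drop the last subtag and try again
    return NEUTRAL_LOCALE


# Locales this imaginary implementation can render input controls for.
supported_locales = {"en-US", "en-GB", "de", "fr-CH"}
```

A membership test against the same set of supported locales is also roughly what a script would need in order to ask whether a given page language is supported at all.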

I think the API should specify a way to query support for a specific locale. That is, if your page is in Swahili, you should be able to query whether input type="date" supports Swahili, so that you can do something about it (like using some alternate input) if there is no support.
Comment 8 Ian 'Hixie' Hickson 2012-01-20 22:41:11 UTC
(In reply to comment #6)
> 
> Yes, there is HTML language defined in BCP-47 and the OS locale which may
> exhibit POSIX codes. The application of a locale in this context is one-way and
> only of use in its language so the additional information in a locale has no
> relevance.

I'm not sure what you're saying here.


> What's wrong with using "may" in a non-normative context?

"may" is defined to have normative meaning. Please read the spec's introduction, conformance, and terminology sections for a detailed discussion of such issues.


> the problem with the current note is that it *encourages* browsers to use user
> locale's language to localise any controls in preference to and as mandatory
> override over the page\element language as defined by the author\server.

I don't see how it encourages it any more than using the page's locale. Exactly which happens is really up to the user agent and the user. Personally, I would much rather my browser localise everything to use my own preferred formats (the ISO8601 ones) than ever use either my locale's or the page's locale, because my locale's format (UK), and the formats of the pages I use (US), are both very confusing (DD/MM/YY and MM/DD/YY respectively).


> take a simple example of booking a US train ticket using a UK laptop. the
> date\time controls are rendered in UK locale but the page has been pre-rendered
> by the server to use US date formats, this results in a mix of MM/DD/YY and
> DD/MM/YY across the page. 

Indeed. Personally I'd want my browser to use my own preferred format for exactly that reason.

But now consider the case of a Swiss French user browsing a Japanese site. Were I in this situation, I would prefer that my locale be used (with French month names) than that the Japanese locale be used (since I would have no chance of understanding what I was doing in that locale).

Similarly, consider the case of a browser that supports the user's locale but does not support the site's locale. How can it use the site's locale?


> i also note that at the end of normative section 3.2.3.3 it states:
> 
> "User agents may use the element's language to determine proper processing or
> rendering (e.g. in the selection of appropriate fonts or pronunciations, or for
> dictionary selection). "
> 
> Maybe this should also be updated to reference form controls?

Sure, I'll do that.


(In reply to comment #7)
> 
> In general, the new features like input type="date" have not been considered
> much from the localization point of view.

While you may disagree with the results, I assure you that localisation was very much considered.


> As such, they tend to create
> confusion. When using, say, a page in English on a browser with German as the
> “user’s preferred locale” (an ambiguous concept), how could the user expect
> that 1.005 will be taken as one thousand and five, against the conventions used
> on the page?

I don't understand the question.


> I think the API should specify a way to query support for a specific locale.
> That is, if your page is in Swahili, you should be able to query whether input
> type="date" supports Swahili, so that you can do something about it (like using
> some alternate input) if there is no support.

Please file separate bugs for feature requests.
Comment 9 Ian 'Hixie' Hickson 2012-01-20 23:03:19 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given below
Rationale: see discussion above
Comment 10 Cameron Jones 2012-01-21 14:37:34 UTC
I agree with the change-set applied in resolution of this bug.

Localization is a complex issue affecting the entire set of authors, implementers and users. 

The new text ensures that authors can specify their intentions accurately, that implementers can customize their products for competitive and targeted consumption, and that users should be capable of choosing and configuring their software for the browsing experience they desire.

Closing this bug.
Comment 11 Cameron Jones 2012-01-24 17:57:47 UTC
Sorry, i just noticed that the changes to the note were only applied to the Date and Time state (type="datetime"). also, the provided example seems to have escaped the containing box. I assume this was not intended?
Comment 12 Cameron Jones 2012-01-24 18:03:03 UTC
Also should have left to be VERIFIED, will close when due process is completed.
Comment 13 Ian 'Hixie' Hickson 2012-01-24 21:19:25 UTC
reopening for comment 11
Comment 14 Ian 'Hixie' Hickson 2012-01-24 22:59:47 UTC

Status: Accepted
Change Description: see diff given below
Rationale: Sorry about that. I didn't realise the full extent of the required changes. I've had another go, factoring out the notes and replacing them with references to bigger sections about the requirements here.
Comment 15 contributor 2012-01-24 23:00:44 UTC
Checked in as WHATWG revision r6912.
Check-in comment: Fix the changes in r6905 to be more consistent and thorough.
http://html5.org/tools/web-apps-tracker?from=6911&to=6912
Comment 16 Ian 'Hixie' Hickson 2012-01-24 23:01:17 UTC
(BTW: if there's anything else wrong with the checkin please don't hesitate to reopen this bug. I don't normally notice if a bug has pending feedback if it is in a resolved state.)