Meeting minutes
eemeli: There is an explainer with much more detail than we're getting to.
<xfq> explainer
eemeli: session goal: improve English language
… Because Americans got there first, en-US is the default English.
… Firefox usage data: many users go with default locale. 2/3 of the users using Firefox in the en-US locale are browsing from outside North America.
… The default experience is mediated through an American understanding of date formatting and of temperature and mass units.
… In almost all dimensions in which there is variance, en-US is an outlier.
… Goal: provide a better experience for users who don't want to or are unable to modify their experience.
… A solution: define a new locale that is the default for certain regions. It uses US English textual content, but it has a better i18n experience: date formatting and other formatting are more readily understood. Less weird.
… Working identifier: en-zz
… "zz" is recognized already as "unknown region". Not the only option, but is memorable.
… Advancing two things:
… - definition of this locale
… - use this locale once it becomes available
<michaelficarra> why not zz-zz if there's actually no indicated preference for english?
Goal: Want feedback.
… Where is this problematic?
… We are changing the default. We do not want to break.
… Do not change where we have hints that the person is in the US or is asking for en-US.
… Japan is a specific example. Both Japanese and en-US prefer Sunday as day 0 of the week.
… But en-zz would default to Monday as the first day in a datepicker. We need user research to figure out where exactly to roll it out, including Japan. Would like feedback from audience.
… No need for a new standard in the W3C. But want other groups to be aware and to use the en-zz locale where appropriate.
… Eventually this would make it into CLDR data tables.
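For illustration, a minimal sketch of the first-day-of-week difference mentioned above: the same cell in a month grid maps to a different date depending on whether weeks start on Sunday or Monday. The helper name `cell_date` and the example month are assumptions made for this sketch; it uses only Python's standard `calendar` module and says nothing about how any browser datepicker is implemented.

```python
import calendar
from datetime import date

def cell_date(year, month, row, col, first_weekday):
    """Date shown at (row, col) of a month grid whose weeks start on first_weekday.

    first_weekday follows Python's convention (calendar.MONDAY == 0,
    calendar.SUNDAY == 6). Hypothetical helper, for illustration only.
    """
    weeks = calendar.Calendar(firstweekday=first_weekday).monthdatescalendar(year, month)
    return weeks[row][col]

# Same grid position (second row, first column) in July 2024:
monday_first = cell_date(2024, 7, 1, 0, calendar.MONDAY)  # date(2024, 7, 8)
sunday_first = cell_date(2024, 7, 1, 0, calendar.SUNDAY)  # date(2024, 7, 7)
print(monday_first, sunday_first)
```

A user who picks a cell by position rather than by reading the column header lands one day off, which is the kind of mismatch the discussion is concerned about.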
hsivonen: why not zz-zz? There are users not in the US who explicitly want US English. The explainer has examples why. Bug reports say that even people who explicitly pick en-US as the UI language are unhappy with unit representation.
… Most people don't know how to enable en-zz. If the first language in the language picker would otherwise be en-US, we would prepend en-zz based on signals.
… (thanks)
… Datepickers in the Danish. Now that there are datepickers on the web, en-zz would allow them to adapt to "week starts on Monday". A wrong first day of week leads to expensive mistakes: booking hotels for the wrong day. Fahrenheit/Celsius leads to fewer and less expensive mistakes.
… General agreement on SI units, but the first-day-of-week breaks that assumption. Biggest risk: not solving date picker problem because SI units handled otherwise.
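The prepending step described earlier in this turn could be sketched roughly as follows. Everything here is an assumption for illustration: the function name `effective_languages`, the two signals (`asked_for_en_us`, `region_hint`), and the rule for combining them. Which signals actually count is exactly what the discussion leaves open.

```python
def effective_languages(languages, asked_for_en_us=False, region_hint=None):
    """Return the language list with en-ZZ prepended when that looks safe.

    Hypothetical sketch: only touches lists whose first entry is en-US,
    and leaves them alone when the user asked for en-US or appears to be
    in the US.
    """
    languages = list(languages)
    if not languages or languages[0] != "en-US":
        return languages  # en-US is not the default choice here
    if asked_for_en_us or region_hint == "US":
        return languages  # a hint says en-US is really wanted
    return ["en-ZZ"] + languages

print(effective_languages(["en-US", "fi"], region_hint="FI"))
print(effective_languages(["en-US"], region_hint="US"))
```

Keeping en-US in the list after en-ZZ means content negotiation that does not know en-ZZ can still fall through to US English text.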
michaelficarra: en-zz would have all the fields set to most common worldwide.
eemeli: close but not quite. Not always the most common. Most standardized. YYYY-MM-DD for date-time formatting even though it's not the most common.
<benoit> would it try to default to some ISO standard format?
hsivonen: In the explainer, CLDR 001 plus ISO date format. ISO first day of week is always Monday for 001. Right now in 001, min days is not four. There is a parallel debate about whether it's a good idea to have min days in 001. Google Calendar ran into trouble.
… In all the locales where people care about week numbering, people use ISO week numbering. Complaints went down for Google Calendar.
… US people don't care but if they coordinate with a European better to use ISO week numbers.
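To make the "first day of week" and "min days" parameters concrete, here is a minimal sketch of week numbering driven by those two values. The function name `week_of_year` is an assumption for this sketch; the logic is the usual rule that week 1 is the first week containing at least `min_days` days of the new year, and with Monday and four it should agree with Python's built-in `date.isocalendar()`.

```python
from datetime import date, timedelta

def week_of_year(d, first_weekday=0, min_days=4):
    """Return (week_year, week_number) for date d.

    first_weekday uses Python's convention: 0 = Monday ... 6 = Sunday.
    min_days is the minimum number of days of the new year that week 1
    must contain. ISO 8601 week numbering corresponds to (0, 4).
    """
    def week1_start(year):
        jan1 = date(year, 1, 1)
        # How many days Jan 1 is past the start of its week.
        offset = (jan1.weekday() - first_weekday) % 7
        start = jan1 - timedelta(days=offset)
        # The week containing Jan 1 has (7 - offset) days in `year`.
        if 7 - offset < min_days:
            start += timedelta(days=7)
        return start

    year = d.year
    if d >= week1_start(year + 1):
        year += 1  # d already belongs to week 1 of the next year
    elif d < week1_start(year):
        year -= 1  # d still belongs to the last week of the previous year
    return year, (d - week1_start(year)).days // 7 + 1

d = date(2021, 1, 1)               # a Friday
iso_week = week_of_year(d, 0, 4)   # Monday start, min days 4: (2020, 53)
sun_week = week_of_year(d, 6, 1)   # Sunday start, min days 1: (2021, 1)
print(iso_week, sun_week)
```

The same calendar day is week 53 of 2020 under one convention and week 1 of 2021 under the other, which is why two people coordinating by week number need to share the convention.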
eemeli: for people not familiar with CLDR rules, (echoes xfq note on) 001 is the region subtag for world. 150 is Europe. 419 for Latin America and the Caribbean. Plays into inheritance.
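A minimal sketch of how such region subtags play into inheritance: data lookup walks a chain of parent locales until it reaches the root. The `PARENTS` table below is a small hand-written subset chosen for illustration, not the full CLDR parent-locale data, and `fallback_chain` is a hypothetical helper name.

```python
# Illustrative subset of parent relations; check the CLDR supplemental
# data for the authoritative list.
PARENTS = {
    "en-DK": "en-150",   # English in Denmark inherits from English (Europe)
    "en-150": "en-001",  # English (Europe) inherits from English (World)
    "en-GB": "en-001",
    "en-001": "en",
    "en-US": "en",
    "es-MX": "es-419",   # Spanish in Mexico inherits from Spanish (Latin America)
    "es-419": "es",
}

def fallback_chain(locale):
    """Return the lookup order for a locale, ending at the root locale."""
    chain = [locale]
    while chain[-1] in PARENTS:
        chain.append(PARENTS[chain[-1]])
    chain.append("root")
    return chain

print(fallback_chain("en-DK"))  # ['en-DK', 'en-150', 'en-001', 'en', 'root']
print(fallback_chain("es-MX"))  # ['es-MX', 'es-419', 'es', 'root']
```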
michaelficarra: the English part is that as spoken in some part.
hsivonen: there are some languages in CLDR: Arabic with a paradigm split. en paradigm. If you only have en and apply likely subtags, it expands missing subtags (macOS does this de facto). Other paradigm is British English paradigm. 001 -> en-GB. Other regions are "owned" by one of those paradigms.
… en-US is used out of locale a lot. More so than any other locale. Motivation: you don't care about which English, so we can't change spelling based on a heuristic. It affects spellchecking. Users would reject. It has to tie to linguistic aspects. We can for things that trigger on region (-> 001), except for week numbering and date formatting.
johanneswilm: ime, in the smaller Scandinavian countries, smaller than Germany, more people don't understand English. Fewer bug reports available. I hope this doesn't take away from being able to localize. People from all over the world. Come in with laptop from US. Let that organization's webpage. Sometimes the organization wants to mandate
… a convention for week numbering. Organization overrides for widgets like date-pickers.
… It is not like that today.
eemeli: A server can specify locale for served page.
hsivonen: talking about different datepickers. eemeli is talking about JS library case.
<xfq> overriding the date picker's locale is discussed at w3c/
hsivonen: This anecdote about buying train tickets in Denmark. Some JS library. Common use case for buying train tickets in Denmark in English language, English speaker from nearby European country. I, Finnish, buy tickets in English in Denmark. Ties to browser locale, not system week information. We could focus on fixing that first. Always a risk of
… solving a problem for users who think of Monday-starting-weeks, but somewhere on the stack is a US datepicker. But for users who (maybe from Japan) think of weeks starting on Sunday. Fingerprinting that thinks there is one bit to flip. The week is the worst thing. Anecdotally, an Aussie user doesn't care. But for me it's the hardest problem,
… even though I know. But we shouldn't give this problem to Japanese users.
eemeli: Specifically on the JS date picker.
xfq: Currently the HTML standard recommends that the user agent customize date picker elements using the language. Browsers don't implement that, so the picker uses the browser UI language. We have an issue tracking that change. People would like to integrate the i18n status of the date picker with the whole page. Currently it is separate. If page is in different
… language from user agent. Not implemented, despite that non-normative spec note.
hsivonen: but again, the date picker still leads to expensive, misbooked travel and train tickets
… johanneswilm, xfq: tell the browser implementors to implement the thing
hsivonen: disagrees. The browser date picker should use the (system?) first day of week setting.
johanneswilm: company website, one datepicker look and feel for the whole company. I start going shopping, buying plane tickets for personal use, use my locale datepicker. I currently have en-CA to get km for uber.
… I would love en-zz. I would like to have the same datepicker as rest of my organization.
hsivonen: the organization thing is tricky. One reason users use en-US outside US. You have people in your org who don't read the local language and people who do but also read English. The IT department forces en-US on issued laptops.
<xfq> re system locale pref vs node language, we're having a kind of similar thing in the payment specs as well, it is awkward if the message the merchant supplies doesn't match the language (and amount / currency display) in the sheet
hsivonen: But you would want different units. The IT department should be able to force another detail (units?)?
… (thanks)
johanneswilm: as long as the browser is not following the org locale so there is not miscommunication within the organization. But for the external page. They would commercially let the users choose the datepicker.
snek: About the company that wants to control the locale. Can they not just implement a custom thing?
johanneswilm: no. React components do really get complicated. I've had to develop several. It'd be nice if the browser one worked.
eemeli: that sounds interesting but potentially separate from en-zz work.
johanneswilm: I don't want en-zz to block the lang attribute on the date picker.
robwu: why not en-ww for "worldwide"
eemeli: "zz" already means "unknown region"
???: what if we really want unknown english
<xfq> there is no 'ww' subtag in the IANA registry
hsivonen: "ww" is in the unused space and we should not assign meaning to it.
eemeli: "ww" is unassigned
… "ww" only means worldwide in English
eemeli: pet peeve: what about off planet colonies?!? "zz" is unknown, not limited to Earth. Won't someone think of the Martians
hsivonen: stay on Earth
… (sorry)
hsivonen: consider who manages this namespace. Unknown region is in the ISO namespace and the justification is the en component on its own, the likely subtag operation -> US, and zz should inherit most stuff from 001.
… If we were to do en-zz we would need to put zz in the US english paradigm. The en-GB paradigm was listed separately. en-IN was not. But due to a bug report, unenumerated regions are assumed to be in en-GB's paradigm. For this to work with CLDR we'd need to put zz in the en-US paradigm. If we didn't do "zz" we'd have to be in the private use
… space; not impossible; but if we want to bikeshed what's there we should consider private use identifiers without existing CLDR semantics.
johanneswilm: If this really takes off, maybe the status of the US is different. We want the standard English, without "y'all." We can change our definition of standard English separate from US English.
hsivonen: we don't need to plan ahead for that. We can mint new identifiers as needed. en-zz relating to en-US, the overriding reason: empirically, people who are not used to the US system of measures disproportionately use en-US as the UI language.
<michaelficarra> maybe Americans just do a lot of international travel 🤔
hsivonen: We change just the units that trigger on region. We're not trying to change anyone's English UI language. Any other opinion on the "best English" is irrelevant, as we just are trying to change measures.
benoit: can we reserve en-zz?
eemeli: this is not the only use of "zz" so probably not. It's not widely used, but there are other uses. The meaning we assign might end up taking over as de facto.
hsivonen: what needs patching in CLDR for this to work? Fairly small changes. A good sign.
… If we have a browser that exposes this? What do websites do? We need to run that experiment to see if it breaks the web.
… But no need to involve the ISO assignment authority by minting something like "ww".
… Browsers use CLDR. Websites and JavaScript use the same data.
… Meaning in CLDR should guide us.
<benoit> would it be better to prefer "unspecified" instead of "unknown" ? i.e. "English with an unspecified region" (where "region" is not necessarily the right word there)
<michaelficarra> eemeli hasn't met a Scot
<benoit> LOL
eemeli: not thoroughly thought out: re "y'all" concern about US English. We could phrase it as "source English" plus standard measures. English variants are mutually intelligible. Not terrible if en-GB was served as en-zz content. That would be different from what we have currently (in the explainer?).
???: What about other source languages?
eemeli: only looking at en-zz right now.
<benoit> fr-ZZ could start a war
eemeli: Fingerprintability is a concern, but not for en-US. E.g. for nl-zz there is more of a risk.
<eemeli> ack
hsivonen: everything other than en-US is a distraction. Users just do not use other languages out of locale to anything like the same degree.
johanneswilm: support eemeli. A car company or phone company (eg from China) banned in the US might pick another default English. They might not have an en-US translation. Maybe just have a note explaining why it's the source language.
… They might have en-GB or en-IN as their source language. No substantial disagreement.
… A non-normative note would do.
hsivonen: Nokia did use en-GB as their source English. No longer. Users wanting to use en-US to get the untranslated form is one known reason why users choose to use en-US, but this addresses the general observed behavior that many users use en-US outside the U.S. context without getting too focused on any one of the multiple reasons.