2018 TPAC - Internationalization (I18N) WG

22 Oct 2018



Present: r12a, addison, xfq, david, bert, himorin
Chair: Addison Phillips
Scribe: addison, addison_, David_Clarke, Bert


<addison> Agenda https://www.w3.org/International/wiki/2018TPACAgenda

<addison> https://lists.w3.org/Archives/Public/public-i18n-core/2018JulSep/0050.html

Agenda and Intro

South East Asian scripts

<addison> (self intros)

<addison> scribenick: addison

angel: for scripts used in Vietnam and Thailand
... when combined into pictures on a website
... doesn't present correctly
... looking for an example to show
... is it a website design problem or i18n problem?
... looking to gather a specific problem statement

<angel> https://www.lazada.co.th/#hp-most-popular

<angel> https://www.lazada.vn/#hp-just-for-you

richard: could send email to public-i18n-core@
... have preliminary look
... can suggest as an issue in SEALreq repo
... have 20 experts (mostly Lao, Khmer, etc.)
... currently seeding list with questions based on gap analysis
... do you put spaces between characters and such
... had some useful discussion
... need real world issues
... what you can do as Alibaba, can follow sealreq repo
... if have experts in Thai script or SEA scripts
... want to contribute in a practical way, please let me know so I can add them
... two ways to participate
... can follow or can contribute


<r12a> https://github.com/w3c/sealreq

<r12a> https://pages.lazada.co.th/wow/i/th/LandingPage/lp-collection-detail-new?spm=a2o4m.home.collections.8.41df719c5lCJmw&wh_weex=true&wx_navbar_transparent=true&abtest=&pos=7&abbucket=&clickTrackInfo=1007.17383.112843.100200300000000_7f4aebb8-dccc-4885-84ca-07b711da2a46_0_null_th_2622_6%3A0_864168_9_0.0__1_7470684_9_0.0__2_146056181_9_0.0&themeID=2622&acm=icms-zebra-5000381-2586223.1003.1.2262791&algArgs=2622_864168-7470684-146056181&scm=1007.17383.112843.100200300000000

richard: (notes that spaces in the Thai page look odd)

<r12a> https://github.com/w3c/sealreq

richard: all of these task forces are under the interest group
... working group publishes documents and reviews what they are doing
... two ways of collab.
... follow group
... best way is to join public-i18n-sea
... get notifications once per day of changes in this repo
... and also in CSS, HTML, TTML, and a few more that are *related*
... that's keeping up
... if you *are* an expert, you can contribute
... add you to group, to core group that's editing
... have to agree to set of principles
... links in page ("background info")
... that's SEA
... also Arabic/Persian, Indic, Hebrew (needs help)
... Ethiopic
... Chinese, Japanese
... recently reopened JLReq
... advanced publishing lab helping that
... Korean/Hangul have docs but no TF
... just started Mongolian TF
... have a document!
... Tibetan (but nothing happening)
... can create where there is a need
... explain self-review


richard: down here section on reviews

(much clicking)

richard: task-based headings
... get do/don'ts
... click on more to read more
... basic idea that you go through and apply to your spec

richard: (reads fine print at bottom)
... links to right place
... can generate a checklist to put into a github issue
... derived from bp-spec-dev

<r12a> i18n WG moves to Webplat WG meeting

<sangwhan> "korean/hangul have docs but no TF" <- Want me to bang some cans to see if anyone is interested? What exactly is this about?

<addison_> (Break until 11:15)

<addison_> (we attended #webplat)

ECMA-402 'Intl' presentation

<addison_> ScribeNick: addison_

littledan: work on the JavaScript standards committee, TC39, in ECMA
... produce different specs
... the main JS spec, ECMA-262
... ECMA-402 is the i18n spec
... the only two main specs
... functions to provide formatting
... for dates and numbers
... available on the web and in other contexts like Node.js
... if you have a date
... get a representation of the date based on locale.
... second spec by TC39
... use test262 for testing
... numbered stages
... have a call, 2 hours per month
... discuss proposals and impl status
... all aspects of design
... github repo
... contact date for invite
... original spec
... shipping in all browsers
... from IE11
... NumberFormat
... DateTimeFormat
... options

<addison> scribenick: addison

littledan: formatToParts
... Chrome, Firefox, but not Edge/WebKit
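
A minimal sketch of the formatting functions described above (NumberFormat, DateTimeFormat, options, formatToParts). The sample values, locales and options are illustrative assumptions and were not shown in the meeting.

```javascript
// Sample inputs (assumed for illustration).
const amount = 1234567.891;
const date = new Date(Date.UTC(2018, 9, 22, 12, 0, 0)); // 22 Oct 2018, 12:00 UTC

// Intl.NumberFormat: grouping and decimal separators follow the locale.
const enNumber = new Intl.NumberFormat("en-US").format(amount);
const deNumber = new Intl.NumberFormat("de-DE").format(amount);
console.log(enNumber); // 1,234,567.891
console.log(deNumber); // 1.234.567,891

// Intl.DateTimeFormat: the options bag selects which fields appear.
const dateOptions = { year: "numeric", month: "long", day: "numeric", timeZone: "UTC" };
const enDate = new Intl.DateTimeFormat("en-US", dateOptions).format(date);
console.log(enDate); // October 22, 2018

// formatToParts: the same output, split into typed pieces for custom markup.
const parts = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
}).formatToParts(1234.5);
console.log(parts.map((p) => p.type).join(" "));
// currency integer group integer decimal fraction
```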



Meeting with Publishing (see #pwg)

time 1300 to 1430?

record of the PWG discussion: https://www.w3.org/2018/10/22-pwg-irc#T12-46-26


<littledan> https://github.com/whatwg/html/pull/3047

<littledan> https://github.com/whatwg/html/pull/3046

(discussion of 3047)

RESOLUTION: support 3047

<scribe> ACTION: addison: post a comment on ECMA-402 #3047 supporting position

<trackbot> Created ACTION-759 - Post a comment on ecma-402 #3047 supporting position [on Addison Phillips - due 2018-10-29].



<scribe> ACTION: addison: add personal opinion about "settings" serialization in ECMA-402 #3046

<trackbot> Created ACTION-760 - Add personal opinion about "settings" serialization in ecma-402 #3046 [on Addison Phillips - due 2018-10-29].

<r12a> scribenick: David_Clarke

dan: No point in discussing the decisions that are already implemented

<Bert> scribenick: Bert

littledan: Idea is to make these basic things and build message formatting on top of it.

addison: I teach developers the cases "0", "1", "2" and "3".
... "Equal-zero" rule.
... Your API is useful if I don't have an "equal-zero" rule.
... But it is only partial.
... Avoid having developers distinguish only: if a == 1 then... else...

r12a: Not sure I understand the API fully.
... Can you have pr.select(3), e.g.?

addison: In this API, the token "other" means that it is not "one". It's a token, not a word.
... In Russian there are different words to use for referring to one, few and many things, switches around 20.

r12a: How do you do the substitutions with this API?

addison: I'll have a demo later.
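
A rough sketch of the points in this exchange: select() returns a category token rather than a word, and an "equal-zero" case has to be layered on top by the caller. This is not the demo mentioned above; the message table and the formatFileCount helper are hypothetical.

```javascript
// Intl.PluralRules maps a number to a plural category token for the locale.
const enRules = new Intl.PluralRules("en");
console.log(enRules.select(1)); // one
console.log(enRules.select(3)); // other
console.log(enRules.select(0)); // other (English has no "zero" category)

// Russian has more categories. With full CLDR data the CLDR rules give
// one, few, many, one for these inputs.
const ruRules = new Intl.PluralRules("ru");
for (const n of [1, 3, 5, 21]) {
  console.log(n, ruRules.select(n));
}

// Hypothetical message table keyed by category, plus an explicit zero case.
const messages = {
  zero: "No files",
  one: "{n} file",
  other: "{n} files",
};

// Substitution is done by the caller; the API only picks the category.
function formatFileCount(n) {
  const key = n === 0 ? "zero" : enRules.select(n);
  return messages[key].replace("{n}", String(n));
}

console.log(formatFileCount(0)); // No files
console.log(formatFileCount(1)); // 1 file
console.log(formatFileCount(3)); // 3 files
```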

<addison> https://www.unicode.org/cldr/charts/latest/summary/en.html#1567

littledan: These APIs at Stage 3 are about to ship. RelativeTimeFormat takes a positive or negative number (number of units in the past).
... It is a low-level API.
... Avoids many bug reports.

addison: RelativeTimeFormat(100, "day") returns "in 100 days". Does it apply the correct plural rules? E.g., for Russian?
... How often will you use this API? Might be OK for a single label, but can it be used to compose a longer message?

littledan: Something like "yesterday" [...]

addison: When does "yesterday" start?

littledan: That's not part of this API. If you pass -1, it will return "yesterday".
... The Intl.Locale interface is also in Stage 3.
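
A small sketch of the RelativeTimeFormat behaviour discussed above, using the API as it shipped. Note the assumption: "yesterday" for -1 is only returned with the option numeric: "auto"; the default, numeric: "always", spells out the number. The Russian call is included only to show that the plural form is chosen by the formatter.

```javascript
// Default behaviour: numeric "always".
const always = new Intl.RelativeTimeFormat("en");
// Opt in to words such as "yesterday" and "tomorrow".
const auto = new Intl.RelativeTimeFormat("en", { numeric: "auto" });

console.log(always.format(100, "day")); // in 100 days
console.log(always.format(-1, "day"));  // 1 day ago
console.log(auto.format(-1, "day"));    // yesterday
console.log(auto.format(1, "day"));     // tomorrow

// Plural selection happens inside the formatter (output depends on locale data).
const ru = new Intl.RelativeTimeFormat("ru");
for (const n of [1, 2, 5, 100]) {
  console.log(ru.format(n, "day"));
}
```

The API takes a number and a unit; deciding when "yesterday" starts for a given pair of timestamps is left to the caller.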

addison: About BCP 47.

littledan: There are "x-" tags and grandfathered tags.
... You think we should canonicalize them and replace those that can't be canonicalized with "unkn"?

addison: Most of them canonicalize to a real tag.
... Only 5 or 6 don't.
... As long as the tag is valid, you should accept it. What you do after, is up to you.

littledan: Canonicalization rules are not written down anywhere. We should write rules for those 5 or 6.

addison: Should be added to Unicode spec.
... Do you have a list of them?

<addison> ACTION: addison: file a bug on CLDR to add remaining grandfathered tags to canonical mapping rules in UTS#35

<trackbot> Created ACTION-761 - File a bug on cldr to add remaining grandfathered tags to canonical mapping rules in uts#35 [on Addison Phillips - due 2018-10-29].

addison: Any other corner cases?

littledan: That was the big one. Another is whether to validate the tags.
... It may be optional.

addison: Well-formed vs valid.

littledan: We do check well-formed.

addison: A well-formed, invalid tag may be accepted but doesn't do anything.
... Registry maintenance issue.
... We have language subtag rules that a tag that was ever valid will never be re-used for something else.
... There is an ongoing registry for subtags. There are four or five every year.

Addison gives some examples: valencian, cornish...

david: Cornish is a dead language, but people may resurrect it from transcribed usages.

addison: There is no locale data for nearly all of these subtags.

<addison> abc-defg-hi-jk39393

addison: What does "valid" mean in that context? Should implementation track the registry?
... ^^ is a well-formed tag, but I think none of the subtags mean anything.
... What do you do with it?

littledan: Current impls fall back to browser's local locale.

addison: There is no control over the fallback?

littledan: No, there isn't.
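
A sketch of the well-formed versus valid distinction and of the fallback described above. The isWellFormed helper is hypothetical; it relies on Intl.getCanonicalLocales throwing a RangeError for structurally invalid tags. Which locale is resolved for the sample tag depends on the environment, so no output is shown for it.

```javascript
// Hypothetical helper: true if the engine accepts the tag as structurally valid.
function isWellFormed(tag) {
  try {
    Intl.getCanonicalLocales(tag);
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err;
  }
}

// Case is normalized during canonicalization.
console.log(Intl.getCanonicalLocales("EN-us")); // [ 'en-US' ]

// An underscore is not a valid separator, so this is not well-formed.
console.log(isWellFormed("en_US")); // false

// The sample tag from the discussion: structure only is checked here,
// not whether any subtag is registered.
const sample = "abc-defg-hi-jk39393";
console.log(isWellFormed(sample));

// With no matching locale data, the formatter falls back; the result
// reports which locale was actually used.
const resolved = isWellFormed(sample)
  ? new Intl.DateTimeFormat(sample).resolvedOptions().locale
  : null;
console.log(resolved);
```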

addison: On the Web it is always interesting because you have a machine, a browser and a Web page; which determines the locale?
... E.g., on amazon.de, the language falls back to German, because it's the German site. If the user is Polish, but there is no Polish version, falling back to German on that site makes sense.

littledan: Based on inherited language?

addison: Yes, pretty much.

littledan: That seems like a reasonable feature request. I can talk to HTML people about that.
... What about forms?

addison: Page author can control the language.
... It is defined somewhere in HTML.

littledan: Intl.ListFormat interface:

No new comments.
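
For reference, a minimal sketch of Intl.ListFormat as it shipped; the list items and locale are assumptions chosen for illustration.

```javascript
const items = ["Thai", "Lao", "Khmer"];

// "conjunction" joins with "and", "disjunction" with "or".
const and = new Intl.ListFormat("en", { style: "long", type: "conjunction" });
const or = new Intl.ListFormat("en", { style: "long", type: "disjunction" });

console.log(and.format(items)); // Thai, Lao, and Khmer
console.log(or.format(items));  // Thai, Lao, or Khmer
```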

littledan: Intl.Segmenter interface:

Discussion about the names of the fields.

littledan: .segment returns an iterator.

Discussion about UTF-16 and indexing in strings in linear time.

littledan: The segmenter's algorithm is unspecified.
... Implementations can do different things.
... There is a note that mentions recommended Unicode reports.
... Doesn't need to match CSS algorithm either.

addison: Breaking at words not so critical. Harder is breaking at grapheme clusters.
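
A sketch of grapheme and word segmentation with Intl.Segmenter as it later shipped; field names were still under discussion at the meeting and may differ from what was presented. The helper names and sample strings are assumptions. Since the algorithm is implementation-defined, results for languages such as Japanese can vary between engines.

```javascript
// Split a string into grapheme clusters (user-perceived characters).
function graphemes(str, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return Array.from(segmenter.segment(str), (s) => s.segment);
}

// Split a string into word-like segments, dropping spaces and punctuation.
function words(str, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  return Array.from(segmenter.segment(str))
    .filter((s) => s.isWordLike)
    .map((s) => s.segment);
}

// "e" followed by a combining acute accent is one grapheme cluster
// but two UTF-16 code units.
const sample = "cafe\u0301";
console.log(sample.length);            // 5
console.log(graphemes(sample).length); // 4

console.log(words("Hello, world!", "en")); // [ 'Hello', 'world' ]
```

Each segment object also carries an index into the original string, which relates to the UTF-16 indexing discussion above.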

littledan: We could block the spec on getting more clarity.

addison: W3C policy is to have tests.

littledan: ECMA has procedures. Tests can have metadata that makes them conditional.
... No normative references to CLDR, so this process is the only way to make tests.

addison: There may be a lot of interesting corner cases where interop is hard to achieve.
... E.g., we made a device with pretty good Japanese word-based selection. A lot of effort. It does better than desktop browsers.

The APIs work on strings, not on DOM objects.

littledan: Intl.NumberFormat options.

addison: I think there are some gaps in these options. What is the proposal? Where do these options go?

littledan: Extra options in the number format options.

addison: How do you add the units?

littledan: There is a style "units" and another option to set what the unit is.
... There is MB and s, but not MB/s. Our approach is to ask CLDR to add units, if we find they are important.
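
A sketch of unit formatting as it later shipped, which may differ from the proposal as presented: the style value is "unit", and two supported simple units can be combined with "-per-". The helper names are hypothetical. Units outside the list sanctioned by ECMA-402 are rejected with a RangeError.

```javascript
// Format a value with a unit identifier.
function formatUnit(value, unit, locale = "en") {
  return new Intl.NumberFormat(locale, { style: "unit", unit }).format(value);
}

// Hypothetical helper: true if the engine accepts the unit identifier.
function isSupportedUnit(unit) {
  try {
    new Intl.NumberFormat("en", { style: "unit", unit });
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err;
  }
}

console.log(formatUnit(5, "megabyte")); // 5 MB

// Compound unit built from two sanctioned simple units.
console.log(formatUnit(5, "megabyte-per-second"));

// Neither furlong nor fortnight is a sanctioned unit.
console.log(isSupportedUnit("furlong-per-fortnight")); // false
```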

addison: ... furlongs/fortnight...

littledan: Intl.DateTimeFormat.prototype.formatRange

addison: Seems more the domain of UX designers...

littledan: It has a method formatRangeToParts to split it up into parts, to play with individually.
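
A sketch of range formatting using formatRange and formatRangeToParts as they later shipped. The dates and options are assumptions; the exact punctuation and spacing of the range depend on the CLDR version, so no output is shown.

```javascript
const fmt = new Intl.DateTimeFormat("en-US", {
  year: "numeric",
  month: "short",
  day: "numeric",
  timeZone: "UTC",
});

const start = new Date(Date.UTC(2018, 9, 22));
const end = new Date(Date.UTC(2018, 9, 26));

// One string, with fields shared by both dates collapsed.
const range = fmt.formatRange(start, end);
console.log(range);

// The same output as typed parts; "source" says whether a part comes
// from the start date, the end date, or is shared.
const parts = fmt.formatRangeToParts(start, end);
for (const { type, value, source } of parts) {
  console.log(source, type, JSON.stringify(value));
}
```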

Atsushi: Not sure we want to use this for automatic translation in Japanese or Chinese.

addison: For other languages, too.

Atsushi: Japanese format for era.

Addison demos a calendar app, with a switch to Japanese.
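
This is not the demo itself, only a minimal sketch of how the Japanese calendar with era names can be requested through the "ca" Unicode extension in the locale tag. The date and options are assumptions; output depends on the locale data available, so it is not shown.

```javascript
const day = new Date(Date.UTC(2018, 9, 22, 12)); // 22 Oct 2018, 12:00 UTC
const options = { year: "numeric", month: "long", day: "numeric", timeZone: "UTC" };

// Japanese locale with the default (Gregorian) calendar.
const gregorian = new Intl.DateTimeFormat("ja-JP", options);

// Same locale, Japanese imperial calendar, with the era name included.
const japanese = new Intl.DateTimeFormat("ja-JP-u-ca-japanese", {
  ...options,
  era: "long",
});

console.log(gregorian.format(day));
console.log(japanese.format(day));
console.log(japanese.resolvedOptions().calendar); // japanese
```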

littledan: Any comments on the Temporal proposal?

addison: It is a known problem that time stamps in HTML have an offset, not a time zone.
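
A sketch of why an offset is not the same as a time zone, using Intl.DateTimeFormat rather than Temporal. The localHour helper and the two zones are assumptions chosen for illustration: both zones are at UTC+01:00 in January, but only one of them changes offset in summer.

```javascript
// Hypothetical helper: the local hour (0-23) of an instant in a named time zone.
function localHour(instant, timeZone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    hour: "2-digit",
    hourCycle: "h23",
    timeZone,
  }).formatToParts(instant);
  return Number(parts.find((p) => p.type === "hour").value);
}

const winter = new Date(Date.UTC(2018, 0, 15, 12)); // 15 Jan 2018, 12:00 UTC
const summer = new Date(Date.UTC(2018, 6, 15, 12)); // 15 Jul 2018, 12:00 UTC

console.log(localHour(winter, "Europe/Berlin")); // 13
console.log(localHour(summer, "Europe/Berlin")); // 14
console.log(localHour(winter, "Africa/Lagos"));  // 13
console.log(localHour(summer, "Africa/Lagos"));  // 13
```

A time stamp carrying only +01:00 cannot tell these two zones apart, so the local time of a future event cannot be recomputed from it.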

Discussion about leap seconds. Whether "60" can be a value for seconds.

addison: Not sure if the value with a time zone also needs the offset block.

littledan: Can you submit a comment for that?

What should be the normative reference? ISO 8601?

addison: Instead of time formats, I'd recommend skeletons.

littledan: And Intl.DisplayNames?

addison: Is this a separate spec?

DisplayNames provides lists of names of languages, territories, etc.

littledan: There is no API defined for it yet. Still under discussion. Looking at the use cases.
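
No API existed at the time of this discussion. For orientation only, the sketch below uses Intl.DisplayNames as it later shipped in ECMA-402, which may differ from the designs under discussion here; the locales and codes are assumptions.

```javascript
// Localized names for region and language codes.
const regions = new Intl.DisplayNames(["en"], { type: "region" });
const languages = new Intl.DisplayNames(["en"], { type: "language" });

console.log(regions.of("JP"));   // Japan
console.log(languages.of("th")); // Thai

// The same codes, named in another display locale (output depends on locale data).
const jaRegions = new Intl.DisplayNames(["ja"], { type: "region" });
console.log(jaRegions.of("JP"));
```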

<addison> ACTION: addison: comment on temporal and 3339 etc etc in ECMA-402

<trackbot> Created ACTION-762 - Comment on temporal and 3339 etc etc in ecma-402 [on Addison Phillips - due 2018-10-29].

Summary of Action Items

[NEW] ACTION: addison: add personal opinion about "settings" serialization in ECMA-402 #3046
[NEW] ACTION: addison: comment on temporal and 3339 etc etc in ECMA-402
[NEW] ACTION: addison: file a bug on CLDR to add remaining grandfathered tags to canonical mapping rules in UTS#35
[NEW] ACTION: addison: post a comment on ECMA-402 #3047 supporting position

Summary of Resolutions

  1. support 3047
[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2018/10/22 15:29:22 $
