Internationalization Working Group Teleconference -- 21 Jun 2018

<addison> trackbot, prepare teleconference

<addison> ScribeNick: addison

Dial in info: https://www.w3.org/2017/09/i18n-meeting-info.html

that link is member-only

Action Items

https://www.w3.org/International/track/actions/open

action-726?

<trackbot> action-726 -- Richard Ishida to File utf-8 issue on html53 -- due 2018-06-07 -- OPEN

<trackbot> https://www.w3.org/International/track/actions/726

close action-726

<trackbot> Closed action-726.

action-727?

<trackbot> action-727 -- Addison Phillips to Invite stakeholders for html issue 1424 to next teleconference and say that we are doing this on the issue itself -- due 2018-06-21 -- OPEN

<trackbot> https://www.w3.org/International/track/actions/727

close action-727

<trackbot> Closed action-727.

HTML Issue 1424

<stpeter> scribenick: stpeter

addison: we have a few issues on HTML 5

<addison> https://github.com/w3c/html/issues/1424

addison: can someone introduce this?

chaals: in HTML we try to put in things that are interoperable and deployed
... the pairing model for ruby is only implemented in Firefox as far as we know
... if there are no other implementations we would remove it from the Recommendation version
... however we understand that there might be more implementations now or on the way
... also want to clear up the text that we have
... it's not clear how the various ruby features are differentiated

fantasai: some of this is stylistic

r12a: you don't need styling for the mono/group ruby distinction, just markup; for jukugo ruby you need styling rather than markup
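
As a rough illustration of the markup-only distinction r12a describes, the following Python sketch emits both forms. It assumes the interleaved form, in which each base run is followed directly by its rt element. The function names and the 東京 example are illustrative assumptions, not text from the HTML spec.

```python
from html import escape


def group_ruby(base: str, reading: str) -> str:
    """Group ruby: a single annotation spans the whole base text."""
    return f"<ruby>{escape(base)}<rt>{escape(reading)}</rt></ruby>"


def mono_ruby(pairs: list[tuple[str, str]]) -> str:
    """Mono ruby: every base character is followed by its own annotation."""
    body = "".join(f"{escape(b)}<rt>{escape(r)}</rt>" for b, r in pairs)
    return f"<ruby>{body}</ruby>"


print(group_ruby("東京", "とうきょう"))
print(mono_ruby([("東", "とう"), ("京", "きょう")]))
```

Jukugo ruby would presumably reuse the mono markup and differ only in styling, which this sketch does not attempt to model.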

chaals: is there a CSS thing that explains the different renderings?

<r12a> https://www.w3.org/International/articles/ruby/markup

fantasai: some of this needs to be explained in the HTML spec because people aren't used to thinking about ruby in semantic terms
... different markups are needed for different renderings

<r12a> https://www.w3.org/International/articles/ruby/markup#mono_group_etc

chaals: I'm preparing a pull request and will ask for feedback

fantasai: I wrote a blog post on this

r12a: there are examples in the last URL I posted

<xiaoqian> http://fantasai.inkedblade.net/weblog/2011/ruby/

chaals: what I'm finding difficult is that the markup in the current spec is the same for group vs. mono

xiaoqian: thanks for the link

chaals: as I read the issue we have a pairing implementation in Firefox and one in a print project (Antenna House)

fantasai: the Antenna House implementation is used in production

chaals: and we've heard that the Chrome folks are working on an implementation, too
... so that's probably enough to remove the threat of "at risk"
... so I think we can close this issue
... and continue working on a pull request to improve the text
... this would be an editorial cleanup and we'll run that by the i18n wg

addison: we had another HTML-related topic?

issue 1039 in html

<xfq_> https://github.com/w3c/html/issues/1039

chaals: my understanding of the issue is that we need to clear up what we say about not using UTF-8: those other encodings should work in predictable ways

r12a: I would gloss that to say that some legacy encodings shouldn't break if the browser follows the spec, and that encodings not covered by the Encoding spec will lead to problems if used; but you shouldn't actually use any legacy encodings, only UTF-8

addison: and you really shouldn't use some of those :-)

chaals: best to say "must not generate an encoding other than UTF-8" but if you receive other encodings you should handle them appropriately

addison: historically, pages without an explicit encoding are not in UTF-8
... assuming that those are in UTF-8 would break things
... there are also locale-based vagaries

chaals: so it sounds like we need to do more work on this?

addison: actually it's very specific in the spec
... all of this machinery hangs together
... what's weird is to say "must use UTF-8" when that's not the default encoding
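
The asymmetry discussed here (always generate UTF-8, but do not assume it on input) can be sketched in Python as below. This is only an illustration: the helper names are hypothetical, and the windows-1252 fallback (`cp1252` in Python) is an assumed legacy default; real fallbacks vary by locale.

```python
from typing import Optional

FALLBACK = "cp1252"  # assumed legacy default; actual defaults are locale-dependent


def decode_received(data: bytes, declared: Optional[str] = None) -> str:
    """Decode incoming bytes: honour a declared encoding, else use the fallback."""
    return data.decode(declared or FALLBACK, errors="replace")


def encode_generated(text: str) -> bytes:
    """Anything we generate ourselves is always UTF-8."""
    return text.encode("utf-8")


legacy = b"caf\xe9"  # "café" in windows-1252, with no encoding declared
text = decode_received(legacy)
print(text)
print(encode_generated(text))

# Treating the same undeclared bytes as UTF-8 fails outright.
try:
    legacy.decode("utf-8")
except UnicodeDecodeError as exc:
    print("assuming utf-8 breaks:", exc.reason)
```

The last step shows why undeclared legacy pages cannot simply be treated as UTF-8, even when authors are told to use only UTF-8.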

r12a: this is the most practical way even though it's strange

<JcK> And before it was Windows-1252, it was ISO 8859-1. Not worth splitting these hairs as long as the text doesn't make a recommendation equivalent to requiring breaking things.

addison: as guidelines to authors I don't think anyone would object

chaals: we will continue to look at this and check it with you again

addison: wording it carefully so that we don't break things is the goal

chaals: we share that goal, just need to get the right words in the right places

addison: the pitfall is always the summary version, which is all some people might read

Validation of <input> type='date' not conforming to ISO 8601-2 (#3767)

<xfq_> https://github.com/w3c/html/issues/1340

<r12a> https://github.com/whatwg/html/issues/3767

<xfq_> https://github.com/whatwg/html/issues/3767

addison: this one is a sticky wicket, isn't it? and it has additional messiness because the underlying date types (milliseconds since epoch) can't represent unknown date fields
... blank fields not supported

Sangwhan: maybe we could delay this

addison: the question is not whether ISO 8601-2 is stable, but what the effect of taking this on would be; the side effects are the challenge here

chaals: there is real usage all over the place (e.g., counting milliseconds since epoch)
... lots of bad things will happen on first day of month, last day of month, etc.

r12a: this is not a theoretical problem - people need to enter these date formats

chaals: dates are harder than you think - both non-theoretical and difficult

r12a: maybe have a different date type for birth days?
... perhaps because only certain kinds of dates need this?

addison: might apply to things other than birth dates

fantasai: holidays in general might need this

chaals: a new date type is worth considering, but the challenge is getting people to use them
... for instance some people will leave off the year but include the month or day

Mallory: authors don't know beforehand whether you'll be entering a vague date

stpeter (not as scribe): I seem to recall that the vCard specs defined some relevant usages; see for instance https://tools.ietf.org/html/rfc6350#section-4.3.1
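
To make the "unknown fields" problem concrete, here is a minimal Python sketch that parses the truncated date shapes given as examples in RFC 6350 section 4.3.1 (19850412, 1985-04, 1985, --0412, ---12). The `PartialDate` class and `parse_partial` function are hypothetical names introduced for illustration. Only a complete date can be turned into milliseconds since the epoch, which is the limitation addison raised earlier.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Date shapes taken from the examples in RFC 6350 section 4.3.1.
_PATTERNS = [
    re.compile(r"(?P<y>\d{4})(?P<m>\d{2})(?P<d>\d{2})"),  # 19850412
    re.compile(r"(?P<y>\d{4})-(?P<m>\d{2})"),             # 1985-04
    re.compile(r"(?P<y>\d{4})"),                          # 1985
    re.compile(r"--(?P<m>\d{2})(?P<d>\d{2})"),            # --0412
    re.compile(r"---(?P<d>\d{2})"),                       # ---12
]


@dataclass(frozen=True)
class PartialDate:
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None

    def epoch_ms(self) -> Optional[int]:
        """Milliseconds since the epoch at UTC midnight, or None if a field is unknown."""
        if None in (self.year, self.month, self.day):
            return None
        dt = datetime(self.year, self.month, self.day, tzinfo=timezone.utc)
        return int(dt.timestamp() * 1000)


def parse_partial(text: str) -> PartialDate:
    """Parse one of the truncated shapes above; raise ValueError otherwise."""
    for pattern in _PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            got = {k: int(v) for k, v in match.groupdict().items()}
            return PartialDate(got.get("y"), got.get("m"), got.get("d"))
    raise ValueError(f"unrecognised date: {text!r}")


for sample in ("19850412", "1985-04", "--0412", "---12"):
    parsed = parse_partial(sample)
    print(sample, parsed, parsed.epoch_ms())
```

A birthday entered without a year parses without trouble here but has no timestamp, so any API built on a single epoch value would need a separate representation for it.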

chaals: the uncertainty might arise in the middle of a date
... if there's not enough apparent demand to support an uncertain date type, web authors will handle things themselves (e.g., in JS or polyfill)

addison: needs a good deal of thought about how to do this across the web architecture

Sangwhan: could address this in a revision of web forms

Mallory: we've seen people get this front-end validation wrong (e.g., with telephone numbers)

chaals: the short answer is we're probably not going to solve this today

addison: I tend to agree with Sangwhan
... thanks to the HTML folks for joining us today

RTL non-Semitic text

addison: shall we talk about the substantive topic or info share?

r12a: I don't think we'll solve this problem today either

<r12a> https://github.com/w3c/csswg-drafts/issues/2754

r12a: this started with someone saying Chinese is written RTL, how do we do that?
... this got confusing because it was suggested that one could do this with vertical text and one character per column
... this has been suggested before, but I think it's not the right approach
... normally one recommends using BDO to override the direction of the characters which are inherently LTR
... but there's a wrinkle because Latin numbers would go RTL and BDO needs to be applied there as well to make them go LTR (same for Latin-script acronyms etc.)
... problem is that each character has direction - it would be cleaner to change the default direction for a page or a range of characters
... things are clunky now because you can't do what's desirable
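
The wrinkle with numbers can be sketched at the plain-text level with Unicode override controls, which have roughly the effect of nested bdo elements. This is a hedged illustration only: `force_rtl` is a hypothetical helper, and the ASCII-only run pattern is an assumption; real text would need script-aware segmentation.

```python
import re

RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Assumed definition of a run that must keep reading left to right:
# ASCII letters and digits, optionally joined by simple separators.
_LTR_RUN = re.compile(r"[0-9A-Za-z]+(?:[ .,:/-]+[0-9A-Za-z]+)*")


def force_rtl(text: str) -> str:
    """Override the whole string to RTL, re-overriding Latin/digit runs to LTR."""
    inner = _LTR_RUN.sub(lambda m: f"{LRO}{m.group(0)}{PDF}", text)
    return f"{RLO}{inner}{PDF}"


sample = force_rtl("公元2018年")
# Show the code points so the invisible controls are visible.
print(" ".join(f"U+{ord(ch):04X}" for ch in sample))
```

Every Latin or numeric run needs its own nested override, which is the clunkiness described above; changing the default direction for a range of characters would avoid it.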

addison: might make it script-based, but also want to pull in the common script
... it's complicated

r12a: this is not just Chinese - can see the same thing in other places

addison: a lot of older scripts are not deterministically directional
... this is an interesting problem and sounds like it could use some attention
... might need to educate people that bidi overrides are not the right tool for the job

JcK: the bidi stuff intrinsically favors scripts that are usually or always in one direction or the other

addison: this is relatively a corner case, but perhaps only because it's hard to do

JcK: some of the examples adduced are at least a century or two old
... I've heard from colleagues in China that they've mostly thrown up their hands about addressing these corner cases in the computer age

r12a: much of the feedback we receive is for historical scripts

JcK: I wasn't suggesting this problem doesn't need to be solved - my concern is layering even more complexity on top of something that's already complex

fantasai: should this be handled in CSS, not semantics?

addison: this is almost another writing mode

r12a: I think you need to bake in ranges here?

addison: my tendency is to say scripts, not ranges

<fantasai> .special, .special * { unicode-bidi: bidi-override }

<fantasai> .special { direction: rtl; }

<fantasai> ACTION r12a and fantasai to write an article on LTR scripts historically written RTL

<trackbot> Created ACTION-729 - And fantasai to write an article on ltr scripts historically written rtl [on Richard Ishida - due 2018-06-28].

<addison> ACTION: richard: write an article about RTL usage for non-semitic text particularly historic scripts

<trackbot> Created ACTION-730 - Write an article about rtl usage for non-semitic text particularly historic scripts [on Richard Ishida - due 2018-06-28].

<addison> richard: reminds that line breaking article wants comments!

AOB?

<addison> stpeter: TC39 has some i18n stuff, will info share next time
