15:52:43 RRSAgent has joined #i18n
15:52:43 logging to https://www.w3.org/2018/03/08-i18n-irc
15:52:48 Zakim has joined #i18n
15:52:52 trackbot, prepare teleconference
15:52:55 RRSAgent, make logs world
15:52:58 Meeting: Internationalization Working Group Teleconference
15:52:58 Date: 08 March 2018
15:53:06 Chair: Addison Phillips
15:53:17 Agenda: https://lists.w3.org/Archives/Member/member-i18n-core/2018Mar/0000.html
15:53:26 agenda+ Agenda
15:53:30 agenda+ Action Items
15:53:34 agenda+ Info Share
15:53:38 agenda+ Radar
15:53:54 agenda+ What Time is This Meeting At?
15:54:06 agenda+ Recommended characters and possibly RFC5981bis
15:54:18 agenda+ Unicode-XML and Bidi Controls
15:54:39 agenda+ IMSC visiting us!
15:54:51 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html addison
15:57:57 JcK has joined #i18n
16:00:55 nigel has joined #i18n
16:01:27 present+ JcK
16:02:03 present+
16:02:05 present+
16:02:09 present+ Fuqiao
16:02:20 present+
16:03:20 agenda+
16:03:26 agenda?
16:03:29 agenda+ Encoding
16:03:35 agenda?
16:03:43 Present+ Nigel
16:04:17 pal has joined #i18n
16:05:38 zakim, take up agendum 1
16:05:38 agendum 1. "Agenda" taken up [from addison]
16:05:41 agenda?
16:05:53 No
16:06:16 zakim, take up agendum 8
16:06:16 agendum 8. "IMSC visiting us!" taken up [from addison]
16:06:43 -> https://github.com/w3c/imsc/issues/236 IMSC Issue 236
16:07:20 scribenick: stpeter
16:09:08 r12a: background ... IMSC uses Unicode characters, glyphs come out of fonts, rendering algos/engines are needed for complex scripts at times before glyphs are assigned; important in this discussion to be clear on terminology of character/codepoint vs. glyphs
16:09:33 JcK: are you talking about single code points or multiple that might result in a single grapheme?
16:09:46 r12a: single code points for this discussion
16:10:31 https://www.w3.org/TR/ttml-imsc1.0.1/#recommended-unicode-code-points-per-language
16:11:19 Katy has joined #i18n
16:12:07 q+
16:12:10 pal: purpose is to provide guidance regarding subtitles; enhance chance that if author chooses text it will be supported by the user agent and properly rendered
16:12:45 ack addison
16:12:52 pal: the intent is not to disallow certain code points or to require a rendering engine to not render certain code points
16:13:05 addison: I think this is an extremely tricky thing to specify
16:13:26 addison: first, implementers might see this as a required set, the only thing they have to support, etc.
16:14:03 addison: for example, you wouldn't necessarily have enough code points to properly render Arabic
16:14:10 pal: actually we have the common code points
16:14:29 addison: doesn't deal with the need for more glyphs in your font
16:14:40 pal: that's why worded in terms of code points, not glyphs
16:15:02 addison: naive implementation would have glyph per code point
16:15:07 pal: should we add a note about that?
16:15:45 addison: most people build a system there's an instance of it for Arabic users or whatever script is in play
16:15:53 q+
16:16:15 addison: second point, CLDR has sets of characters like this by language (exemplar sets)
16:16:33 addison: it might be helpful to reference CLDR instead of defining your own
16:17:08 pal: we do reference CLDR - recommended set is a union of CLDR and ???
16:17:10 ack r12a
16:17:25 r12a: I'm worried about implementers too, but this section is about authors
16:17:46 r12a: my worry is that implementers won't see this as clearly
16:18:42 r12a: make it clear that this is a guide for a minimum set and for real support you should go further
16:18:49 q+
16:19:24 r12a: also make it clear that implementers need to enable the display of the following sets of characters, not selecting those sets of characters
16:19:33 pal: output document should only contain those characters
16:19:49 addison: output document is displayed somewhere and needs to be displayed faithfully
16:20:01 addison: depends on how system that receives it is implemented
16:20:13 addison: shaping engine etc.
16:20:26 pal: annex is intended to be used by validator implementation
16:20:52 pal: validator that sees a character that's not in the recommended character set can flag a warning
16:20:59 addison: is this really a good idea?
16:21:02 behnam has joined #i18n
16:21:20 pal: what's a bad idea is showing unsupported characters
16:21:59 pal: realistically no implementation is going to support all Unicode code points
16:22:25 addison: some implementations support everything but rather obscure code points (plane 2 Chinese, ancient scripts, etc.)
16:22:31 q+
16:23:17 addison: what I see happen is trying to legislate fairly narrow character sets, whereas many rendering systems are more capable
16:23:36 pal: this is targeting not just browsers but embedded systems like TVs
16:23:55 pal: also, this has already proved useful
16:24:50 addison: implementers do have font and space limitations, but it's a slippery slope when recommending subsets of characters
16:24:55 ack r12a
16:25:18 r12a: I understand the intent, my concern is in how we describe that to people
16:25:32 r12a: e.g., if we said "these are the safe characters to use" makes more sense to me
16:25:48 q+ To ask what action we can take to address the remaining concerns.
16:26:06 r12a: this comes across as "these are the Hebrew (etc.) characters you should support" but these sets tend to grow to support new code points
16:26:21 pal: this is why we reference CLDR
16:26:49 r12a: unfortunately CLDR is not a panacea - it's missing things
16:26:58 pal: so let's fix CLDR
16:27:00 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html addison
16:27:17 pal: not displaying a character is way worse
16:27:55 r12a: the crux is specifying a safe set of characters for authors without implying that implementers should limit the sets of characters they support
16:28:06 q+
16:28:15 pal: what about starting the annex with that text?
16:28:25 r12a: that's the kind of thing I was looking for
16:28:49 q+
16:28:50 ack nigel
16:28:51 nigel, you wanted to ask what action we can take to address the remaining concerns.
16:29:27 nigel: the struggle here is understanding exactly what the concern is and coming up with a proposal to address the concern
16:29:34 nigel: this discussion is helping
16:29:43 nigel: any other concerns we can surface here?
16:29:45 ack JcK
16:31:00 JcK: I'm concerned about where this might be leading; displaying the wrong character is much worse than displaying parts of a string and not other parts (for instance)
16:31:20 JcK: part of the concern is that there are many edge cases which can't be handled by this kind of approach
16:32:19 JcK: e.g., if you get text in Hebrew script but another language then you might not have the right code points to display things properly
16:32:56 ack pal
16:32:56 JcK: there are traps here about writing this particular language with this particular script, but not other languages
16:33:33 pal: I captured another concern earlier about cautioning implementers that one code point != one glyph
16:34:08 r12a: if you're dealing with a complex script like Myanmar, there are more difficulties
16:34:40 addison: when people go font shopping, they can be satisfied with an inferior font and the rendering engine doesn't have the glyph that's necessary
16:34:48 pal: that's true regardless
16:35:36 q+
16:35:37 r12a: that's part of my concern - we shouldn't let implementers off the hook and stymie forward progress (yes, these are embedded systems that aren't updated often)
16:35:48 pal: hard to phrase this in a technical document
16:36:09 q+
16:36:16 addison: these things tend to ossify into a lowest common denominator or institutionalize some particular set of characters
16:36:26 ack addison
16:36:54 pal: I think we're safe in the sense that systems support all of Unicode - we're not trying to create a chokepoint for code points
16:37:14 addison: not at document level but at the validator and authoring tool levels
16:37:28 pal: that's why we don't reference a particular version of CLDR for instance
16:37:34 ack Katy
16:38:04 JcK: the fact that CLDR exists does not imply that CLDR is correct
16:38:34 Katy: even defining a list of safe characters can vary quite wildly
16:39:03 Katy: to clarify, managing author expectations is difficult here
16:39:31 Katy: not just glyph display but processing and the like
16:40:46 nigel: maybe clarify for authors that you can't just get a glyph but there is more complexity - there might be fallback fonts and such (not just safe characters)
16:40:54 nigel: is there a document we can reference?
16:41:00 q+
16:41:16 nigel: an informative document about rendering different characters correctly?
16:41:19 ack nigel
16:41:22 q+
16:41:56 addison: a different place to look might be the various font standards, which have introduced language codes that are supported
16:42:10 I heard r12a and katy express support for adding a note to explain that correct rendering of scripts goes beyond mapping code points to glyphs in a font
16:42:11 addison: there might be standardization there to look at - a different way of accomplishing the goal here
16:42:14 ack addison
16:42:17 ack r12a
16:42:21 q+
16:43:01 r12a: two questions: (1) the safe list here is presumably based on lowest common denominator for various devices?
16:43:17 pal: tables were built using a study of TV and motion picture content
16:43:34 pal: collecting all code points that were used in that context
16:43:45 r12a: (2) why are we not just referencing CLDR?
16:44:23 pal: there is a longstanding issue against CLDR to add a flag for text commonly appearing in subtitles
16:44:23 q?
16:44:34 q+
16:45:09 q+ to note that ossification is not a feature of the list of characters but a wider issue
16:45:09 r12a: I think what would help is to add some text cautioning against ossification
16:45:52 pal; [summarizes feedback received so far]
16:45:58 ack pal
16:46:03 s/pal;/pal:/
16:46:37 ack stpeter
16:46:37 pal: we can try to formulate text along those lines and come back for further feedback
16:47:30 stpeter: why not attack the problem at the CLDR level if they aren't properly supporting text needed in subtitles?
16:48:05 q+
16:48:11 pal: everyone's goal is to move this to CLDR
16:48:18 addison: we'd be happy to support that as well
16:48:42 addison: we do have a liaison agreement
16:49:33 pal: subtitles and captions are becoming a global requirement and there are unique needs here; great example is musical note character
16:49:34 ack nigel
16:49:34 nigel, you wanted to note that ossification is not a feature of the list of characters but a wider issue
16:50:04 nigel: this point about ossification is a tricky one; e.g., if you deploy player code to a device, updates might not be available
16:50:39 nigel: e.g., a downloadable font could be possible, but more work is needed to support the right characters
16:50:45 nigel: how do we phrase this?
16:50:54 addison: good question
16:50:55 ack r12a
16:51:01 https://github.com/w3c/imsc/issues/236#issuecomment-367713408
16:51:44 r12a: that link has some suggested text but it might not be exactly what we need here - encourage folks to re-read
16:52:02 pal: I'll try to craft text based on the terms we used in this call today
16:52:22 addison: would you like us to say something to the CLDR folks?
16:52:26 pal: +1
16:52:29 nigel: +1
16:52:50 pal: I plan to propose text soon for review by folks here
16:53:18 addison: any concerns about supporting the CLDR trac?
16:54:29 JcK: I'm nervous because it would be great to get down to one standard instead of two; at the same time, CLDR has been criticized for being opaque to folks with actual language expertise and not just character coding expertise
16:55:00 addison: I'll take an action to focus it on the issue at hand
16:55:11 action: addison: write to cldr on WG behalf about Trac 8915 including wording about getting exemplars right
16:55:13 Created ACTION-699 - Write to cldr on wg behalf about trac 8915 including wording about getting exemplars right [on Addison Phillips - due 2018-03-15].
16:55:36 pal: I will let you know when the proposed text is ready
16:55:54 action: addison: make pal's new draft part of homework
16:55:56 Created ACTION-700 - Make pal's new draft part of homework [on Addison Phillips - due 2018-03-15].
16:56:08 addison: anything else on this topic?
16:56:27 agenda?
16:57:01 zakim, take up agendum 5
16:57:02 agendum 5. "What Time is This Meeting At?" taken up [from addison]
16:57:28 +1
16:57:41 r12a: typically don't change time until UK changes to Summer Time
16:57:58 addison: in favor
16:58:07 (So no change for me then? That's good :-) )
16:59:08 rrsagent, draft minutes
16:59:08 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html r12a
16:59:11 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html addison
16:59:19 zakim, who is here?
16:59:19 Present: JcK, Bert, addison, Fuqiao, stpeter, Nigel
16:59:21 On IRC I see nigel, JcK, Zakim, RRSAgent, addison, xfq, stpeter, r12a, koji, bigbluehat, sangwhan, fantasai, dbaron, trackbot, Bert
16:59:24 present+ pal
16:59:33 present+ Katy
16:59:46 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html addison
17:00:57 rrsagent, draft minutes v2
17:00:57 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html r12a
17:02:03 s/ No//
17:02:14 s/ trackbot, prepare teleconference//
17:02:20 rrsagent, draft minutes v2
17:02:20 I have made the request to generate https://www.w3.org/2018/03/08-i18n-minutes.html r12a
17:03:40 zakim, bye
17:03:40 leaving. As of this point the attendees have been JcK, Bert, addison, Fuqiao, stpeter, Nigel, pal, Katy
17:03:40 Zakim has left #i18n
17:25:17 stpeter has joined #i18n
18:08:12 stpeter has joined #i18n
18:12:43 stpeter has joined #i18n
21:08:37 stpeter has joined #i18n
22:52:11 stpeter has joined #i18n