Internationalization Working Group Teleconference

19 Dec 2013


See also: IRC log


Present: Addison, Richard, Felix, Mati, David
Chair: Addison Phillips
Scribe: Addison Phillips



Action Items

Info Share

Next Multilingual Web Conference Announced


richard: can register now, confirmations next year
... don't forget to look at the ebooks list
... which is here:

<r12a> public-digipub@w3.org

richard: there are interesting discussions over there and we might contribute
... recent discussion about drop caps
... and a "Latin" layout requirements doc
... moves afoot to create a Chinese layout requirements document
... early days, but may become a group
... we might take the draft Indic document and help that along?
... needs an assist to move forwards
... great if we could get more Arabic speakers involved in writing layout requirements???

<fsasaki> http://www.w3.org/community/ld4lt/

felix: Linked Data for Language Technology

... new group working on this

UNKNOWN_SPEAKER: combine different kinds of multilingual resources
... some interest at IUC in stuff like named entity recognition (NER)

felix: site redesign for ML site

<fsasaki> http://www.multilingualweb.eu/projects/

felix: various projects listed that come out of Multilingual Web
... not all are active, particularly community groups
... but nice to see different things that are active

richard: MLW pages now under International on w3 site
... becoming more closely integrated

felix: important for sustainability

CSS Syntax

richard: looked at encoding recognition and such
... rely on encoding spec
... don't reference currently
... not sure what they'll do when they reference


http://www.w3.org/TR/css-syntax-3/#charset-rule The note in this section contains this text: -- where XXX is a sequence of bytes other than 22 (ASCII for ") -- This is unclear and looks odd. Please use hex notation and also use the name of the character in question. E.g.: ... where XXX is a sequence of bytes other than 0x22 (the ASCII character " U+0022 QUOTATION MARK)...
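To illustrate the byte pattern this comment refers to, here is a minimal Python sketch of recognizing a leading @charset declaration. It is not the spec's algorithm text: the function name charset_label is hypothetical, and the 1024-byte window is an assumption about how much of the stream gets inspected.

```python
import re

# Matches the literal bytes '@charset "' at the very start of the stream,
# then XXX (any run of bytes other than 0x22, U+0022 QUOTATION MARK),
# then the closing '";'.
CHARSET_RULE = re.compile(rb'\A@charset "([^\x22]*)";')


def charset_label(stylesheet_bytes: bytes):
    """Return the label from a leading @charset rule, or None if absent.

    Hypothetical helper for illustration only. Only the first 1024 bytes
    are inspected (an assumption made for this sketch).
    """
    match = CHARSET_RULE.match(stylesheet_bytes[:1024])
    if match is None:
        return None
    # The label is expected to be ASCII; replace anything else rather than fail.
    return match.group(1).decode("ascii", errors="replace")
```

Writing the excluded byte as \x22 in the pattern mirrors the hex notation requested in the comment, which avoids the ambiguity of a bare "22".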


<r12a> http://encoding.spec.whatwg.org/#decode

richard: assumes that you have an alternative if you don't have a BOM

addison: maybe that section needs more work? that handles http charset, but not sniffing the file?
... doesn't XML do a reasonable job here?
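For context on the point about needing an alternative when there is no BOM, the check can be sketched roughly as below. This is a simplified Python illustration and not the text of the encoding spec: the name decode_with_bom is hypothetical, and the caller is assumed to have already chosen a fallback encoding (for example from an HTTP charset), since the sketch does no other sniffing.

```python
# Byte order marks checked in order, paired with Python codec names.
BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
)


def decode_with_bom(data: bytes, fallback: str) -> str:
    """Decode data, letting a BOM override the caller's fallback encoding.

    Hypothetical helper for illustration only. If no BOM is present the
    fallback is used as-is, which is exactly the case where the caller
    must already have an alternative.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Strip the BOM bytes before decoding the rest of the stream.
            return data[len(bom):].decode(encoding, errors="replace")
    return data.decode(fallback, errors="replace")
```

Note that nothing here inspects the content of the file beyond its first few bytes, which is the gap raised above.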



addison: your version is clearer


Editorial: To make it clearer which are the steps referred to by "follow these steps", put the para starting with "First" and that starting with "Then" into an ordered list.

<r12a> http://dev.w3.org/csswg/css-syntax/#charset-rule

richard: there is a section on the charset rule
... statement at the end

<r12a> "The @charset rule has no effect on a stylesheet. "

richard: that's kind of weird

addison: it's not really clear what that means

CSS Text

<r12a> http://www.w3.org/International/track/products/15



richard: read my version carefully: it took a long time to write

"extending the rules in a language-specific way for how the grapheme cluster is formed"

scribe: but not changing what is defined to be a grapheme cluster

richard: a syllable in Devanagari is not a grapheme cluster by any definition at all
... may be 2 or more graphemes in a syllable
... make grapheme cluster mean the same thing everywhere and mean Unicode clusters
... use technical terms in the defined way

"apply additional rules for the selected text (beyond just grapheme clusters) when applying a given CSS Text operation"


This is not true where group ruby is concerned. The intercharacter breaks that would be allowed in the base text are not allowed within a run of ideographic characters that are spanned by a single ruby text element.




I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.

Text in the spec such as the definition of word-break: normal, which says "Words break according to their usual rules", would then provide a little more guidance to the implementor.


richard: only a small part of (CJK) line-breaking
... also, it doesn't really tell you how to do line breaking (or justification)





"Implementers are expected, to the extent possible, to make available appropriate justification behaviours based on the language of the paragraph e.g. character-dependent expansion rules for Japanese, using cursive elongation for Arabic, using ‘inter-word’ for English, keeping typographic syllables together in complex scripts, etc. Only where such linguistic tailorings have not yet been implemented should the browser use a justification method that i[CUT]



addison: always used in combination with inter-word

richard: took it to mean 'auto' if you want Indic stuff; inter-word and distribute are "forced"
... send comment as is? but then develop over email

addison: "this justification thing needs more work: work on what it selects/applies to and then how it is used"?


<matial> I need to leave. Bye.


richard: raised comment on WebVTT

addison: that's good
... trackered?

Next Call: 9 January

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-01-14 17:48:40 $