W3C

– DRAFT –
I18N ⇔ CSS

25 April 2023

Attendees

Present
Addison, Fantasai, Florian, Fuqiao, r12a
Regrets
-
Chair
addison
Scribe
fantasai, xfq_

Meeting minutes

https://www.w3.org/International/track/actions/open

florian: would like to add Korean line breaking to the agenda

Action items

addison: any progress on the action items?

<addison> action-1222?

<trackbot> action-1222 Elika Etemad to With florian triage richard's article into a list of potential generics -- due 2023-01-16 -- OPEN

florian: not from me

<addison> action-1223?

<trackbot> action-1223 Florian Rivoal to Triage all css properties to determine which are logical, physical, or na by default -- due 2023-01-16 -- OPEN

<addison> action-1230?

<trackbot> action-1230 Florian Rivoal to Make sure generics are comfortable to read in the content language -- due 2023-01-24 -- OPEN

<addison> (addison to clarify 1230)

ACTION: addison: follow up on custom property namespacing

<trackbot> Created ACTION-1264 - Follow up on custom property namespacing [on Addison Phillips - due 2023-05-02].

w3c/csswg-drafts#7129

<florian> w3c/csswg-drafts#4285

<florian> w3c/i18n-discuss#11

florian: Korean has two ways to break lines
… one like English, one like CJ
… CJ styling is more traditional
… they're used in different contexts
… this makes Korean somewhat unique
… if all the content is language tagged properly, all desirable behavior is available
… the default mode does the old-fashioned style
… the new style is not everywhere
… but it does exist
… we could just tell the authors to tag the content with the right language
… the word-break: keep-all is suitable
… we could have a new keyword: auto-for-every-language-but-do-different-things-for-korean
… I have an action to see if it's unique
… i.e., if it exists in other languages
… ethiopic
… the old-fashioned style of breaking everywhere
… the state of the technology is different
… afaik Korean is the only language in the world in this situation
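As a rough illustration of the two Korean styles florian describes, the Python sketch below lists the break opportunities in a string under each one. It is a simplified model under stated assumptions, not the UAX #14 or CSS algorithm: the function names and the `style` values "cj" and "word" are hypothetical, only precomposed Hangul syllables (U+AC00 to U+D7A3) count as Hangul, and punctuation rules are ignored.

```python
def is_hangul_syllable(ch: str) -> bool:
    """True for precomposed Hangul syllables (U+AC00..U+D7A3)."""
    return "\uac00" <= ch <= "\ud7a3"


def break_opportunities(text: str, style: str) -> list[int]:
    """Return each index i where a line may break before text[i].

    style="cj":   break after spaces and between adjacent Hangul syllables.
    style="word": break after spaces only, as in English.
    Both style names are hypothetical labels for this sketch.
    """
    if style not in ("cj", "word"):
        raise ValueError(f"unknown style: {style!r}")
    points = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev == " " and cur != " ":
            # A space always ends a word, in both styles.
            points.append(i)
        elif style == "cj" and is_hangul_syllable(prev) and is_hangul_syllable(cur):
            # CJ style: any syllable boundary is a break opportunity.
            points.append(i)
    return points


sample = "한국어 줄바꿈"
print(break_opportunities(sample, "word"))
print(break_opportunities(sample, "cj"))
```

For the sample string, the word style yields a single opportunity after the space, while the CJ style also allows a break between every pair of adjacent syllables.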

addison: this is mostly concerned about Hangul breaking

<fantasai> word-break: keep-hangul

<fantasai> florian: Although I said modern/new vs traditional/old, both styles are in use today

fantasai: kind of like vertical vs horizontal in japanese

florian: to a degree, yes
… either get the csswg to completely reject this idea
… if you go to klreq it has a brief sentence saying that both practices exist
… this is not a thing that needs generalization into 25 languages
… that's the problem statement
… it's indeed not a general problem, this is a specialized one

ACTION: addison: follow up with richard about i18n-discuss#11

<trackbot> Created ACTION-1265 - Follow up with richard about i18n-discuss#11 [on Addison Phillips - due 2023-05-02].

addison: I'll follow up with r12a

<fantasai> Looking for confirmation on 2 things:

addison: and the various Korean folks

<fantasai> 1. Need for this new breaking behavior

florian: I have a clear sense that they're both used

<fantasai> 2. No need for similar switches for other writing systems

addison: it's unclear what most people expect it to be

florian: it is a thing people do here and there

fantasai: I'd imagine CJ behavior is more common with justification, since it reduces the amount of space that needs to be introduced to justify the line

addison: I'll double check the Ethiopic script with r12a

florian: I think it is less a problem in ethiopic script, but I may be wrong
… it's a matter of how you choose the default

<fantasai> florian clarified that while both behaviors are well supported in CSS as opt-in, and defaults are not really changeable given compat, the question is how to handle unknown/untagged content

addison: we're talking about Korean line breaking, traditional syllable breaking
… and modern line breaking

florian: I'm looking for either support, or an explanation of not supporting it

addison: will that require page authors everywhere to invoke this sort of special mode?

florian: people can add it to css resets
… Korean authors might accept comments in other languages
… keep the value only for Korean

r12a: for Ethiopic my understanding is that the older orthography is not defunct yet
… not every person uses spaces these days

florian: my understanding is that the traditional writing style is still alive, but you have to deliberately want it
… I would like to know if Korean is the only writing system in this situation

r12a: what do the Unicode line breaking properties do?

florian: I've been working with Bloomberg, they have English/Japanese/Korean etc., and they want the users to see the expected behaviour; however, they don't know what the users would type

<fantasai> If we need something more generic, it can be keep(<scriptname>) / break(<scriptname>) where these are diffs against 'normal'.

florian: keep-word for Korean, but not for anything else

<fantasai> But if we don't need it to be generic, keep-hangul seems fine

florian: if it is true Korean is in this narrow space, it is easy to spec and implement, and it's trickier to find a name

fantasai: the new behaviour is basically 'normal' except that Hangul is reclassified to behave like Latin
… the other values we have now are to keep all the letters like Latin, or break all letters like CJK
… I'm not entirely convinced that we need something generic, but if we did, we could have keep(<scriptname>+) which reclassifies that script to behave like Latin; and break(<scriptname>+) which reclassifies that script to behave like CJK
… we can introduce 'keep-hangul' and if needed 'break-ethiopic'
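The reclassification idea fantasai outlines could be modelled roughly as below. This is a hypothetical sketch, not proposed spec text: the `DEFAULT_CLASS` table is a simplified assumption about 'normal' behaviour, the script is guessed from the first word of the Unicode character name, and `resolve_classes` and `can_break_between` are illustrative names only.

```python
import unicodedata

# Assumed, simplified defaults under 'normal': "keep" scripts break only at
# spaces (like Latin); "break" scripts may break between letters (like CJK).
DEFAULT_CLASS = {
    "latin": "keep",
    "ethiopic": "keep",
    "hangul": "break",
    "cjk": "break",
    "hiragana": "break",
    "katakana": "break",
}


def script_of(ch: str) -> str:
    """Crude script guess: first word of the Unicode name, lowercased."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0].lower() if name else "unknown"


def resolve_classes(keep: tuple[str, ...] = (), brk: tuple[str, ...] = ()) -> dict[str, str]:
    """Apply keep(...) / break(...) as diffs against the 'normal' table."""
    classes = dict(DEFAULT_CLASS)
    for script in keep:
        classes[script] = "keep"
    for script in brk:
        classes[script] = "break"
    return classes


def can_break_between(prev: str, cur: str, classes: dict[str, str]) -> bool:
    """True if a line may break between two adjacent characters."""
    if prev == " ":
        return cur != " "
    if cur == " ":
        return False
    # Unlisted scripts are treated as "keep" in this sketch.
    prev_class = classes.get(script_of(prev), "keep")
    cur_class = classes.get(script_of(cur), "keep")
    return prev_class == "break" or cur_class == "break"


normal = resolve_classes()
keep_hangul = resolve_classes(keep=("hangul",))
print(can_break_between("한", "국", normal))
print(can_break_between("한", "국", keep_hangul))
```

In this model 'keep-hangul' is simply `resolve_classes(keep=("hangul",))` and a possible 'break-ethiopic' is `resolve_classes(brk=("ethiopic",))`; every other script keeps its 'normal' behaviour.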

addison: the key thing is to be able to set default without language tagging

florian: you can already do that

florian: people can add <span class=hangul> and use JavaScript
… we basically need proof of 2 things: that this is the only case, and that it matters. I think it matters, and i18n can tell us if this is the only case
… I'm convinced that these two styles exist, but I'm not sure how prevalent they are
… that would be useful to document
… klreq only says a simple sentence

addison: we should raise it with the klreq folks and others

florian: do you want to make a parallel effort with the Ethiopic group, to ask whether they're satisfied with the default line breaking
… or whether we should switch to another state without breaking other languages

r12a: Sundanese break on syllable boundaries

florian: that's a related but different problem
… we're talking about languages that can do 2 things

ic character

<fantasai> w3c/csswg-drafts#7577 (comment)

fantasai: CSS needs to know the width of a CJK character
… we define it using the ic unit
… the spec currently uses 水 (water)
… there is a large controversy
… people want to change it to 永 (eternal)
… but water is more common
… there are a lot of discussions, maybe we should switch it to eternal
… there was a recent comment from someone asking to switch away from either of those characters to the ideographic space U+3000
… my recollection was that U+3000 was slightly narrower
… but I could be wrong
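To make the choice of reference character concrete, here is a minimal Python sketch of how an ic length might be resolved. The metrics table is invented for illustration and does not describe any real font; `ic_in_px` and its parameters are hypothetical names. The 1em fallback follows what the CSS Values and Units draft is understood to say when the advance measure cannot be determined.

```python
WATER = "\u6c34"              # 水, the reference glyph in the current spec
ETERNAL = "\u6c38"            # 永, the proposed alternative
IDEOGRAPHIC_SPACE = "\u3000"  # U+3000, suggested in the recent comment


def ic_in_px(advances_em: dict[str, float], font_size_px: float,
             reference: str = WATER) -> float:
    """Resolve 1ic to pixels from the reference glyph's advance measure.

    advances_em maps a character to its advance as a fraction of the em.
    If the font has no glyph for the reference character, assume 1em.
    """
    advance_em = advances_em.get(reference, 1.0)
    return advance_em * font_size_px


# Hypothetical font: full-width Han glyphs, slightly narrower U+3000.
made_up_font = {WATER: 1.0, ETERNAL: 1.0, IDEOGRAPHIC_SPACE: 0.875}

print(ic_in_px(made_up_font, 16.0))
print(ic_in_px(made_up_font, 16.0, reference=ETERNAL))
print(ic_in_px(made_up_font, 16.0, reference=IDEOGRAPHIC_SPACE))
```

With these made-up metrics, 水 and 永 resolve to the same length, and only a font with a narrower U+3000 would produce a different result.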

<florian> water = 水

<florian> eternity = 永

florian: we need a definition for ic, and the current spec uses water
… because it's extremely common
… but eternal is a standard of Asian calligraphy

fantasai: it's largely symbolic
… the country character is a box, but the box is narrower

<fantasai> xfq_: Technically water is the better choice for this, because it's more common

<fantasai> ... and as someone commented, it's in the more common bucket in fonts

<fantasai> ... Though that could maybe be changed

<fantasai> ... I don't expect any font not including the eternal character

<fantasai> ... because it's also very common, at least in Chinese and Japanese

<fantasai> ... and Korean doesn't use Hanja much

<fantasai> ... so I'm not sure about the frequency there

fantasai: water is the name of one of the days of the week

xfq_: in Japanese
… but not in Chinese

r12a: why do we have an ic unit, in addition to an em unit?

fantasai: we have an em unit, but not all fonts are square

florian: em is not the size of the letter m
… we have a ch unit which gives the size of a character in a monospace font
… we're designing a char for the CJK context
… is it true that U+3000 is narrow in a proportional font?

MS Mincho Proportional

MS PGothic

1. tate-chu-yoko width

2. ic unit width

3. ICFT measurement

(for drop-cap alignment)

fantasai: there are 3 purposes we currently use water for ^
… there have been a lot of people arguing about it in the issue

florian: it is spec'd this way and implemented this way, but people are complaining

fantasai: as far as I can tell the complaints are all symbolic

florian: request for i18n is either back us up that we're correct, or come up with something that's more useful

addison: I'd be more nervous to come back with more opinions

florian: we already had a decision, should we change it?

fantasai: it is testable, if you make a font with different widths for these characters; but won't make any practical difference because a normal font won't have any difference

xfq_: I haven't seen any non-experimental proportional CJK fonts that include proportional Han characters

florian: I think I've seen fonts where the punctuation characters are slightly narrower, but that's it

r12a: Jen from Apple posted that Safari Technology Preview will support custom counter styles
… not sure if they will support it on iOS

florian: it's the same engine, so I would be surprised if they don't

Summary of action items

  1. addison: follow up on custom property namespacing
  2. addison: follow up with richard about i18n-discuss#11
Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).

Diagnostics

Succeeded: s/Hangual/Hangul

Succeeded: i/fantasai/florian: Although I said modern/new vs traditional/old, both styles are in use today

Succeeded: s/@@1/I'd imagine CJ behavior is more common with justification, since it reduces the amount of space that needs to be introduced to justify the line/

Succeeded: s/except that @@/except that Hangul is reclassified to behave like Latin/

Succeeded: s/the value we have/the other values we have/

Succeeded: s/something generic/something generic, but if we did, we could have keep(<scriptname>+) which reclassifies that script to behave like Latin; and break(<scriptname>+) which reclassifies that script to behave like CJK/

Succeeded: s/if all the content is language tagged propperly/if all the content is language tagged properly, all desirable behavior is available/

Succeeded: s/泳/永/

Succeeded: s/it's/water is the name of/

Succeeded: s/testable/testable, if you make a font with different widths for these characters; but won't make any practical difference because a normal font won't have any difference/

Succeeded: s/propertional CJK fonts/proportional CJK fonts that includes proportional Han characters/

Succeeded: s/why do we have an ic unit, but not an em unit?/why do we have an ic unit, in addition to an em unit?

Maybe present: xfq_

All speakers: addison, fantasai, florian, r12a, xfq_

Active on IRC: addison, fantasai, florian, r12a, xfq_