Meeting minutes
Introductions
<addison> Richard Ishida
Martin Dürst
Eemeli Aro
Florian Rivoal
Bert Bos
Fuqiao Xue
Daisuke Shiohara
Ryusei Saijiki
Bobby Tung
Addison Phillips
https://
Agenda Parking Lot
<r12a> oops
addison: the parking lot is just requests that we have
DOM localization
eemeli: a couple of different items on dom localization
… should this be a thing that should be done?
… and if we do, would this be going into the html spec eventually?
… separately from that where do we work on and incubate and standardize the representation of a message resource as a file format?
… does not belong in the html spec
… is this group a right place to incubate some or all parts of this?
… or should that technically be happening somewhere else?
… it is sounding to me like it would be much more beneficial to talk about dom l10n more in depth after tomorrow's breakout session
florian: i would appreciate a 5-minute intro
addison: i would suggest maybe if this is necessary do it a little later
… there is no way to localize a web app built-in
… everybody rolling their own little l10n thing
… eemeli and i have been working on MF2 for a while
… would like to see MF2 be a native participant in the web
r12a: tomorrow morning there's also the wcag and non-latin language breakout
… i can go to that
… i would be useful there
… i don't know whether anybody else needs to be at the wcag one
addison: i suspect it's going to take more than one conversation
IRI vulnerability, IRI status in general (cf. RFC3987, WHATWG URL)
addison: the errata piece is a short conversation
… the larger conversation is we haven't finished this work
… WHATWG URL etc.
r12a: the ICANN UA Expert Group is looking at what standards need to be addressed
… you came up obviously with the IRI stuff
… just FYI
eemeli: @@1
eemeli: is there any representation of what is missing from the URL spec for it to be a full successor to the IRI spec?
martin: i think it depends; for web browsers there's probably not much that is missing
… on the other hand there are things like how exactly should bidi work?
… that's a very difficult problem
addison: I know mark davis is working on linkification
… if that were to turn to a standard of some sort
… you would want harmony on that
martin: it's very clearly a problem
… if there's an easy solution somebody would easily do it
… but the problem is that there's no solution
… and in browsers @@ non-ASCII copy that out it turns into percent escaping
addison: the address bar is a special place
… the challenge is that the address bar is not the only place where urls need to go
martin: it's more like a UI issue
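A minimal sketch of the percent-escaping round trip martin mentions, assuming Python's standard `urllib.parse` module. The sample path is hypothetical and the snippet only illustrates the general mechanism, not what any particular browser does when copying from the address bar.

```python
from urllib.parse import quote, unquote

# Hypothetical IRI path containing non-ASCII characters.
iri_path = "/wiki/日本語"

# quote() percent-escapes the UTF-8 bytes of each non-ASCII character
# (the "/" separators are left alone by default).
escaped = quote(iri_path)

# unquote() reverses the escaping, decoding the bytes as UTF-8.
restored = unquote(escaped)

print(escaped)   # ASCII-only form, as seen when copying out of a browser
print(restored)  # human-readable form
```

The escaped form is what software exchanges; how the readable form is displayed is the UI question raised above.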
eemeli: feels like the easiest here is to consider all of the slashes to be directionally ltr
… and to break up parts according to those
martin: the IRI spec currently says something but browsers do it a little bit differently
addison: the url standard has some text about presentation, though maybe not complete
eemeli: can we identify the pieces that ought to be added to the url standard so that we could possibly even deprecate the iri spec?
… martin would you be interested in putting something on a list of what is missing from the URL standard?
martin: i can do that
ACTION: martin: create a list of gaps in URL standard
<gb> Cannot create action. Validation failed. Maybe martin is not a valid user for w3c/i18n-actions?
ACTION: duerst: create a list of gaps in URL standard
<gb> Cannot create action. Validation failed. Maybe duerst is not a valid user for w3c/i18n-actions?
ACTION: addison: remind @duerst to create a list of gaps in URL standard
<gb> Created action #196
<gb> Action 196 remind @duerst to create a list of gaps in URL standard (on aphillips, duerst) due 2025-11-17
CSS
addison: we should think about how best to engage
https://
addison: i think that meeting at one time was very helpful
addison: we have shared interest
florian: i think the problem is not shared interest
… my lack of attendance of our sync up meetings
… i don't believe i can realistically be that champion
… tho i wish i could
addison: as long as css uses some mechanism like our agenda+ tag
… or something to say this one is currently active and so interaction would be useful
r12a: i18n is part of the architecture, not an add-on
florian: i completely agree with that
… i don't think there is a general neglect of the i18n aspects
… but i18n questions can be of various levels of complexity
addison: we do tend to notice when there is action on something
… at least i'm looking for new pending issues
… we'll need to make sure joel gets that message
… there are some higher level things like physical versus logical
… on our radar for multiple years
… needs to get done
r12a: it needs to be done because customers for css are people from all around the world
… one thing we could look at is there's a champion in the csswg for i18n
… the champion doesn't have to know anything in great detail about i18n
… but they need to be aware when those discussions need to take place
… and they need to engage discussion
florian: when we design something new do we take i18n considerations into account properly? i would say the answer is yes we do
… the logical vs physical thing is that the language as a whole has a gap
… maybe it's insufficient attention
r12a: for me the problem is not so much the detailed work
… but having somebody monitoring what's happening and trying to facilitate discussions
… we don't have any meetings scheduled anymore
addison: we didn't accomplish anything
r12a: it's not "let's do extra i18n things"
… it's a case of making sure that CSS meets the needs of the world
… it's a part of the process of building CSS
Requirements for the layout of rosters
<gb> Issue 268 Requirements for the layout of rosters (by xfq) [未來工作/future] [i:justification]
xfq: common in traditional media, alignment by column
… aligned in three characters, and for two character cell there is space in middle
… how to do this with CSS
xfq: dot in second line connects two, in half size
… for third column it should align at the first
florian: for current CSS, text-justify
… justify by name but not by character; names need to be marked up with a span or similar
florian: flexbox might also work potentially
xfq: ideally alignment should not be done by ideographic character, but by system spacing
r12a: names should have minimum length, and to be aligned by system
florian: I think you can get close to this with either flex or grid
… either will have different shortcomings
martin: maybe clreq can see how far they get with grid/flex
… and CSSWG can help
r12a: what would help is if you could put the actual text for that black box in the github issue
xfq: I can do that
CSS
https://
Ruby
r12a: i want to show people what we have in terms of finding issues related to a particular language such as japanese and ruby and so on
<r12a-again> https://
<r12a-again> https://
r12a: this is the i18n homepage
… if you look under language enablement there's a link called language enablement index
… if you scroll down that page
<r12a-again> https://
[r12a shows the page]
<r12a-again> https://
<r12a-again> https://
<r12a-again> https://
<r12a-again> https://
florian: i know that there is a wealth of interconnected information in these sets of pages
… it's never quite been clear to me on when i'm supposed to go there
r12a: if you want to check reqs go to the reqs page
… if you want to check tests go to the tests page
Fonts
bobby: limitations about local fonts in CSS
… cannot use system fonts like Kai and Fangsong
… important in Chinese typography
… used for emphasis etc.
… some system fonts only have one weight
… we do not use italics
… we cannot use synthetic oblique fonts
… we change typeface for emphasis
… if there's no way to use a local font
… there is no way to indicate emphasis
… we talked about this in clreq calls
… finally we have a new generic() function in CSS fonts L4
… but I don't know when it will be implemented in browsers
… that's the problem
… it's related to the CSS-i18n champion issue we just talked about
bobby: we have documented this in clreq
xfq: they are documented in the gap analysis
eemeli: i can forward this to the right people
r12a: who should we talk to?
eemeli: Henri
… does quite a bit of work with characters
<atsushi> xfq: we had discussed within i18n, and have more than 10 trackers on this
<atsushi> r12a: let's take this up again when florian is back
<atsushi> xfq: for ruby, Murata-san is working on it, and will join tomorrow?
Glossary and the normative approach
https://
<atsushi> xfq: have discussed this glossary for a while, ready to merge?
<gb> Pull Request 95 Update the definition of 'Mojibake' (by xfq)
<atsushi> xfq: #95 for Mojibake
https://
<gb> CLOSED Action 95 write endorsement of html ruby markup extensions (on aphillips) due 2024-05-02
[xfq introduces the PR]
martin: it's not an issue of encoding, but an issue of decoding
bobby: we can still find some old web pages on the web that use shift-jis, and when you open them with modern browsers
… they decode it with utf-8
… and you cannot read anything
r12a: tofu is a different thing
… lack of glyph
martin: or lack of fonts
r12a: how to say mojibake in chinese?
bobby: 乱码
Luànmǎ
乱 means disorder, confused
码 means encoding
<bobby> https://
xfq: specdev uses Mojibake
… i don't think any other spec uses it
r12a: so it's an informative term and we can have Luànmǎ too
… maybe it's not so important for the specs
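A minimal sketch of the decoding mismatch bobby and martin describe, assuming Python's built-in `shift_jis` codec. The sample text is hypothetical, and `errors="replace"` is only a stand-in for however a given browser handles invalid byte sequences.

```python
# Hypothetical page text, stored on disk in the legacy Shift_JIS encoding.
original = "日本語のページ"
legacy_bytes = original.encode("shift_jis")

# Decoding the same bytes with the wrong decoder (UTF-8) produces mojibake;
# invalid byte sequences become U+FFFD replacement characters here.
garbled = legacy_bytes.decode("utf-8", errors="replace")

# Decoding with the matching decoder recovers the text, which is why
# martin frames this as a decoding problem rather than an encoding one.
recovered = legacy_bytes.decode("shift_jis")

print(garbled)
print(recovered)
```

Tofu, by contrast, would appear even when decoding is correct, because it comes from a missing glyph or font rather than from the bytes.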
<gb> https://github.com/w3c/i18n-glossary/pull/89
[xfq introduces the PR]
ok to merge
<gb> Pull Request 88 Update the definition of 'Bidirectional isolate' and 'Bidi isolation' (by xfq)
<atsushi> [xfq shows on screen for discussion on text/change in PR]
ok to merge
https://
<gb> Pull Request 91 Update the definition of 'First-strong detection' (by xfq)
martin: maybe say something like first-strong is used when auto is set
xfq: i can add a link
r12a: "then uses that to guess at the appropriate base direction for the string as a whole" is missing from the new def
martin: "guess" is the core here
… it should be used when the directionality is not known yet
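A simplified sketch of first-strong detection as discussed here, assuming Python's `unicodedata` module. The function name is hypothetical, and the sketch deliberately omits details of the full algorithm, such as skipping characters inside isolates.

```python
import unicodedata

def first_strong_direction(text):
    """Guess a base direction from the first strongly directional character.

    Returns "ltr", "rtl", or None when the string has no strong character
    (for example, a string made only of digits and punctuation).
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic, ...)
            return "rtl"
    return None

print(first_strong_direction("123 שלום hello"))
print(first_strong_direction("hello שלום"))
print(first_strong_direction("123"))
```

The result is only a guess, as martin notes: a right-to-left sentence that happens to start with a Latin brand name would be detected as left-to-right.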
Open issues and PRs
<atsushi> xfq: jumping into pending issues
<gb> Pull Request 701 Update qa-i18n (by xfq)
<atsushi> xfq: raised PR while ago, adding line to list of i18n targets
martin: "Keyboard usage" to "Keyboard layout and usage"
r12a: "Accessibility requirements" is too vague
… maybe things like "readability requirements" and "legal requirements"
… "script-specific readability requirements"
<gb> Pull Request 702 Add a brief mention of security issues (by xfq)
https://
eemeli: "inserting it into HTML"
… from a reader's point of view there's a little ambiguity of what "inserting it into HTML" means
… the way you're using it is correct
… but it is easy to misunderstand
r12a: and it's only the syntax characters
… if i say hello in another language
… you don't need to escape it
https://
<gb> Pull Request 705 Use "text content" instead of "content" (by xfq)
r12a: we can remove "There are many character encodings to choose from."
Mention layout mirroring for bidi
<gb> Pull Request 163 Mention layout mirroring for bidi (by xfq)
https://
<Bert> xfq: There were some comments and I made a pull request ^^
eemeli: "It should preferably automatic" -> "It should be preferably automatic"
<Bert> eemeli: Typo: missing "be"
<Bert> martin: what languages use sloping in both directions (in section 9.4)?
<Bert> r12a_: In Hebrew it is a choice
Add some best practices to string-search
<gb> Pull Request 28 Add some best practices (by xfq)
<r12a_> https://
<r12a_> https://
<Bert> xfq: This is about string searching. I added that UAs should by default offer case-insensitive searching, using Unicode case folding.
<Bert> eemeli: Section 5.18 of Unicode 17
<Bert> xfq: Current string search document doesn't refer to Unicode directly, but does point to charmod-norm, which does.
<Bert> ... That might be enough.
<Bert> eemeli: Maybe better to link directly and reduce need for clicks.
<Bert> ... Another typo: s/forms forms/character forms/
<Bert> r12a_: Or maybe just characters, instead of character forms.
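A minimal sketch of case-insensitive matching with Unicode case folding, assuming Python's `str.casefold()` and `unicodedata.normalize()`. The function names are hypothetical; the normalize, fold, normalize shape follows the general pattern of canonical caseless matching and is not taken from the string-search document itself.

```python
import unicodedata

def fold(text):
    # Normalize, apply full Unicode case folding, then normalize again,
    # so canonically equivalent strings compare equal after folding.
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())

def caseless_contains(haystack, needle):
    # Plain substring search over the folded forms.
    return fold(needle) in fold(haystack)

print(caseless_contains("Die STRASSE ist lang", "straße"))
print(caseless_contains("Résumé attached", "RÉSUMÉ"))
print(caseless_contains("resume attached", "résumé"))
```

Case folding on its own leaves accents significant, so the third search does not match; ignoring diacritics is a separate decision.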
<Bert> xfq: Another patch is to add ‘User agents MAY normalize numeric values to their ASCII forms (0-9) in string searching operations.’
<Bert> eemeli: Is that about characters that represent numbers?
<Bert> xfq: Yes
<Bert> hsivonen: Is this document normative?
<Bert> ... Performance difference on long documents for collator-based search.
<hsivonen> https://
<Bert> ... There is a request to me to write about this. Haven't written it yet. Let me paste something ^^
<Bert> ... Firefox doesn't do some things from this list.
<Bert> xfq: I'll read through that document.
<Bert> hsivonen: Firefox probably doesn't want to add a checkbox to fold numbering systems, but it could be treated as an accent difference.
<Bert> xfq: That's why I wrote ‘may’.
<Bert> ... as in ‘MAY provide an option for diacritics-sensitive search’
<Bert> r12a_: About ASCII digits: also search for the value if the number is not decimal?
<Bert> eemeli: It is a ‘may’
<Bert> ... Allow implementers to think about what can be done.
<Bert> r12a_: I've been searching a lot and keep finding things that I don't want. Such as finding é when I really want e.
<Bert> eemeli: In many languages, letters with accent are really different and you don't want to mix them.
<Bert> xfq: Maybe the ‘should‘ in the diacritics rule should be a ‘may’ then.
<Bert> r12a_: For me that applies to digits, too: I often do want to search for the character, not for anything with that value.
<Bert> hsivonen: Accent-sensitive search per language, e.g., for Finnish.
<Bert> ... I have only once seen a complaint about that.
<Bert> ... That is what Firefox does.
<Bert> ... Chrome and Safari search accent-insensitive.
<Bert> ... But if your UI language is Finnish or Swedish, then accents that are analyzed to form a separate base letter are not ignored.
<Bert> ... Need documenting what the cases and languages are. I've been asked to write it up, but haven't done so yet.
<Bert> eemeli: So we should change the ‘should’ (in ‘SHOULD ignore diacritics’) to a ‘may’.
<Bert> xfq: UAs may provide different UIs.
<Bert> Looking at example in the spec of Dürst vs Duerst for German.
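A rough sketch of the optional behaviours discussed above (digit folding, diacritic-insensitive search, and language-specific exceptions), assuming Python's `unicodedata` module. All function and parameter names are hypothetical, the `keep` list for Finnish is illustrative only, and none of this describes how Firefox, Chrome, or Safari actually implement search.

```python
import unicodedata

def ascii_digits(text):
    # Map any Unicode decimal digit (Arabic-Indic, fullwidth, ...) to 0-9.
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)

def strip_marks(text, keep=""):
    # Remove combining marks, except on precomposed letters listed in `keep`
    # (letters a language treats as separate base letters).
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)

def search_key(text, ignore_diacritics=False, fold_digits=False, keep=""):
    # Both options default to off, mirroring the 'may' wording in the notes.
    # `keep` is checked before case folding, so it must list both cases.
    if fold_digits:
        text = ascii_digits(text)
    if ignore_diacritics:
        text = strip_marks(text, keep)
    return unicodedata.normalize("NFC", text).casefold()

FINNISH_KEEP = "äöåÄÖÅ"  # illustrative assumption, not a vetted list

print(search_key("Dürst", ignore_diacritics=True))
print(search_key("sää", ignore_diacritics=True, keep=FINNISH_KEEP))
print(search_key("١٢٣", fold_digits=True))
```

Stripping marks turns "Dürst" into "durst", not "duerst", so the German example needs language-specific handling beyond this sketch.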
<Bert> Introduction of some observers: Nicolò, Andreu and Itoe
DOM localization
<Bert> eemeli: Breakout about this tomorrow.
<Bert> ... Pretty easy to get 90 or 95% of the way.
<Bert> ... But the last few percent can be hard, depending on the model you start with.
<Bert> ... We have quite a bit of experience with localization of UI/UX.
<Bert> ... Localization is more than translation.
<Bert> ... Automated translation is pretty decent these days, but it is different for words in a UI. What does ‘accept’ on a button mean?
<Bert> ... Still need human translators.
<Bert> ... Goal is to have the web platform support localization, i.e., HTML.
<Bert> ... Compare how CSS is attached to elements.
<Bert> ... A lot of work has been done in Unicode on MessageFormat.
<Bert> ... What does a single message look like? How do you format it?
<Bert> ... We need an imperative way to do localization, as well as a declarative way.
<Bert> ... This needs work in HTML, but also work on a file format for holding the information. JSON probably not good enough.
<Bert> ... There are various formats in use, including JSON or XML-based.
<Bert> ... Question for this group is how much of the incubation for this should happen here?
<Bert> ... Firefox has a lot of experience with this and we have a system for building the frontend this way.
<Bert> florian: It says ‘DOM’ localization. You mean a system where the localization happens in one document, with one URL?
<Bert> eemeli: There is no single correct solution. Can be via a URL or some other state.
<Bert> florian: You mentioned the Firefox UI, which doesn't have a URL.
<Bert> eemeli: Not a visible URL, but internally it has a similar identifier.
<Bert> florian: So also for different versions of a local document, as for an app?
<Bert> eemeli: Yes.
<Bert> ... Breakout session is tomorrow morning.
CSS
Ruby
r12a: there's markup and CSS for ruby
… ruby is used in Chinese and Japanese
… Korean and Mongolian a little bit
… a couple of years ago, you had a bit of money to develop an HTML add-on spec
… it'd be great to know what the progress is
[florian talks about the funding issue]
florian: my plan is this month to chase up what has happened during the horizontal review
… which is a little bit from i18n and not from anyone else
… my understanding is that firefox implements all that we have in the spec
… amazon kindle implements some of what we have in this spec
… the part that would be pushed to a level 2
… it's basically the rtc part of the markup
… and the multi-layered ruby
… within this month or so enough work to actually call for CR
… on the CSS side of things there remains plenty of work to do
… there's diminishing returns to working on the spec
… before impls catch up
… the CSS spec pretty much picks the html set extension we've been talking about
… i don't know how productive it is if you're too far ahead of the impls
r12a: i don't think we can expect much movement on the impls before we have the markups
… we have it in draft form, it is published as a WD
r12a: would it be possible to create L1 and L2 at the same time?
florian: yes
[Discuss how and when to draft L2]
florian: for example, for rtc, my plan will be to leave it in L1, marked at risk until we're forced to try to go to REC
[Discuss if we ever want to go to REC]
florian: while we have 2 impls
… one of them is not a browser
… that means this text will not be accepted by whatwg
… 2. we have some maintenance work to do on it
… tts representation of ruby
… will likely need some additional attributes with some new values
… we would offer pull requests against the HTML spec to keep that subset in sync
r12a: it just seems to me that the going to REC bit adds extra time and effort
… and knocks things out of the spec that haven't been implemented
… we can do CR and push implementers to implement it
… then it just makes life simpler
r12a: rtc is not that much more to do
… it's in the parser already
[Discuss how to test it in Amazon Kindle]
CSS
bobby: four Chinese typeface styles
… Chinese do not have italics
… we change typefaces for emphasis
… like switching between Hei (sans-serif) and Kai
… we can list all Kai system fonts
… that's stupid, but works
… CSS fonts L4 introduced a generic() function
xfq: @@2
florian: the font fingerprinting problem is more than tricky
… it's hard
r12a: I seem to remember we were getting closer
florian: I think we were getting pretty close in terms of allowing people to do various things
… some of which might be the right one
<gb> Issue 11775 [meta][css-fonts-4] Index of local font issues: fingerprinting, I18n, privacy (by svgeesus) [css-fonts-4] [i18n-tracker] [meta] [privacy-tracker]
<bobby> https://
bobby: another case
… Unihan Extension G
… very recent new block in Unicode
… the Jigmo font has glyphs from Extension G
… but if it's a local font
… Safari can't load it
… and it's a large font
<xfq> I made a demo a while ago: https://
florian: if we want to talk about multiple things with CSSWG, probably do not start with this issue
… it will consume all the time
<gb> Issue 11257 [css-text-decor] Control the line height / proximity of text containing emphasis marks (by xfq) [css-text-decor-3] [css-text-decor-4] [i18n-needs-resolution] [i18n-jlreq] [i18n-clreq] [i18n-klreq] [i18n-mlreq]
<gb> CLOSED Issue 10844 [css-overflow] Line-clamp and approaches to ellipsis insertion (by frivoal) [css-overflow-4] [Closed Accepted by CSSWG Resolution] [i18n-tracker] [Needs Testcase (WPT)] [i18n-jlreq] [i18n-alreq] [topic: line-clamp]
Andreu: CSSWG #10844
<gb> Issue 10844 not found
Andreu: this is a closed issue
… but I do not agree with Addison's comments in the issue
florian: I'll introduce line-clamp
… there already exists something in CSS which people often confuse this with
… we're not talking about the thing that lets you add a dot dot dot at the end of a line
… that exceeds its box
… when a line is too long and it overflows in the inline direction
florian: the solid agreement we have between the i18n and CSS WGs is that the chopping should happen logically, not physically
… when we're doing this in multiple lines
… but
… the removal of extra content to make room for the ellipsis is logical
… but physically where does the ellipsis go?
Andreu: the ellipsis indicates that the text is truncated
… does it indicate that the embedding level is truncated or does it indicate that the paragraph level is truncated?
… I showed several examples to Arabic and Hebrew speakers
… including multiple nested levels of Hebrew and English
… they did seem to agree that it would be better to place the ellipsis at the visual end of the line
… the way the CSSWG has resolved on this is in agreement with what Andreu wants to do
… I don't think it is in conflict with what i18n WG has said as a formal resolution
… however
… the last comment that Addison left
… seems to suggest another way
eemeli: I think if you've got user research, even if it's informal
… that is strongly indicative that speakers think paragraph level makes more sense
… that sounds very believable to me
… this feels like a thing that what the humans expect does not necessarily match what logic might dictate
… or you can argue the logic either way
Andreu: I was trying to implement Addison's suggestion in Chrome
… this is completely alien to the way that Chrome or other browsers do things
… because it's just at the wrong level
… at the wrong place in the layout stage
Bert: it depends on what kind of symbol you use
… ellipsis vs arrow
… if you end with a hyphenated word
r12a: be careful when you're saying hyphenation
… do you mean words with hyphens in between
… or do you mean end of line?
florian: currently we use the same logic as what we use for line breaking
… we're trying to reuse the existing mechanism of CSS
… avoid reinventing them poorly
Andreu: in my impl
… I had just been assuming that you can compute the answers ahead of time
Bert: hanging ellipsis?
florian: separate question
eemeli: if you're a human dealing with this
… @@2
r12a: Arabic language does not use hyphenation
… but Arabic script used for Uyghur
… you'll find lots of hyphenation
r12a: Persian doesn't
<gb> Issue 8 Hebrew Hyphen (by r12a) [i:segmentation] [s:hebr]
breakout sessions
r12a: wcag is trying to create readability guidelines
… like leaving a certain amount of spacing between lines
… it works for english
… but not necessarily for other scripts
… they put together a task force that is looking at how they can extend wcag guidelines so that it meets the needs of people who use different scripts
… they're struggling a bit in terms of how they're gonna capture that info
… they've tried to choose 5 scripts
… latin, cyrillic