W3C

– DRAFT –
Internationalization Working Group - TPAC 2025

09 November 2025

Attendees

Present
Addison, Andreu, atsushi, Bert, Bobby, Daisuke Shiohara, Eemeli, Florian, Fuqiao, Itoe, Martin, Nicolò, Ryusei Saijiki
Regrets
-
Chair
addison
Scribe
xfq, addison, atsushi, Bert

Meeting minutes

Introductions

<addison> Richard Ishida

Martin Dürst

Eemeli Aro

Florian Rivoal

Bert Bos

Fuqiao Xue

Daisuke Shiohara

Ryusei Saijiki

Bobby Tung

Addison Phillips

https://www.w3.org/events/meetings/86ea031d-776b-426e-aa2a-bdf6ba6d50af/

Agenda Parking Lot

<r12a> oops

addison: the parking lot is just requests that we have

DOM localization

eemeli: a couple of different items on dom localization
… should this be a thing that should be done?
… and if we do, would this be going into the html spec eventually?
… separately from that where do we work on and incubate and standardize the representation of a message resource as a file format?
… does not belong in the html spec
… is this group a right place to incubate some or all parts of this?
… or should that technically be happening somewhere else?
… it is sounding to me like it would be much more beneficial to talk about dom l10n more in depth after tomorrow's breakout session

florian: i would appreciate a 5-minute intro

addison: i would suggest maybe if this is necessary do it a little later
… there is no way to localize a web app built-in
… everybody rolling their own little l10n thing
… eemeli and i have been working on MF2 for a while
… would like to see MF2 be a native participant in the web

r12a: tomorrow morning there's also the wcag and non-latin language breakout
… i can go to that
… i would be useful there
… i don't know whether anybody else needs to be at the wcag one

addison: i suspect it's going to take more than one conversation

IRI vulnerability, IRI status in general (cf. RFC3987, WHATWG URL)

addison: the errata piece is a short conversation
… the larger conversation is we haven't finished this work
… WHATWG URL etc.

r12a: the ICANN UA Expert Group is looking at what standards need to be addressed
… you came up obviously with the IRI stuff
… just FYI

eemeli: @@1

eemeli: is there any representation of what is missing from the URL spec for it to be a full successor to the IRI spec?

martin: i think it depends; for web browsers there's probably not much that is missing
… on the other hand there are things like how exactly should bidi work?
… that's a very difficult problem

addison: I know mark davis is working on linkification
… if that were to turn to a standard of some sort
… you would want harmony on that

martin: it's very clearly a problem
… if there's an easy solution somebody would easily do it
… but the problem is that there's no solution
… and in browsers @@ non-ASCII copy that out it turns into percent escaping
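
The percent escaping martin mentions can be illustrated with a minimal Python sketch. It assumes the standard library's urllib.parse as a stand-in for what a browser does on copy, and borrows a non-ASCII path from a URL pasted later in these minutes; it is not a description of any browser's implementation.

```python
from urllib.parse import quote, unquote

# Example path with non-ASCII characters (illustrative choice).
path = "/wiki/亂碼"

# quote() replaces each non-ASCII character with %XX escapes of its
# UTF-8 bytes; "/" is left untouched by default.
escaped = quote(path)
print(escaped)

# unquote() reverses the escaping and restores the readable form.
restored = unquote(escaped)
print(restored == path)
```

The escaped form is pure ASCII, which is what ends up in the clipboard in the scenario described; the readable form is recoverable, but only by software that decodes it.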

addison: the address bar is a special place
… the challenge is that the address bar is not the only place where urls need to go

martin: it's more like a UI issue

eemeli: feels like the easiest here is to consider all of the slashes to be directionally ltr
… and to break up parts according to those

martin: the IRI spec currently says something but browsers do it a little bit differently

addison: the url standard has some text about presentation, maybe not complete

eemeli: can we identify the pieces that ought to be added to the url standard so that we could possibly even deprecate the iri spec?
… martin would you be interested in putting something on a list of what is missing from the URL standard?

martin: i can do that

ACTION: martin: create a list of gaps in URL standard

<gb> Cannot create action. Validation failed. Maybe martin is not a valid user for w3c/i18n-actions?

ACTION: duerst: create a list of gaps in URL standard

<gb> Cannot create action. Validation failed. Maybe duerst is not a valid user for w3c/i18n-actions?

ACTION: addison: remind @duerst to create a list of gaps in URL standard

<gb> Created action #196

<gb> Action 196 remind @duerst to create a list of gaps in URL standard (on aphillips, duerst) due 2025-11-17

CSS

addison: we should think about how best to engage

https://github.com/w3c/i18n-activity/issues?q=is%3Aissue%20state%3Aopen%20label%3Awg%3Acss ==> 156 open

addison: i think that meeting at one time was very helpful

https://github.com/w3c/i18n-activity/issues?q=is%3Aissue%20state%3Aopen%20label%3AAgenda%2BI18N%2BCSS

addison: we have shared interest

florian: i think the problem is not shared interest
… my lack of attendance of our sync up meetings
… i don't believe i can realistically be that champion
… tho i wish i could

addison: as long as css uses some mechanism like our agenda+ tag
… or something to say this one is currently active and so interaction would be useful

r12a: i18n is part of the architecture , not an add-on

florian: i completely agree with that
… i don't think there is a general neglect of the i18n aspects
… but i18n questions can be of various levels of complexity

addison: we do tend to notice when there is action on something
… at least i'm looking for new pending issues
… we'll need to make sure joel gets that message
… there are some higher level things like physical versus logical
… on our radar for multiple years
… needs to get done

r12a: it needs to be done because customers for css are people from all around the world
… one thing we could look at is there's a champion in the csswg for i18n
… the champion doesn't have to know anything in great detail about i18n
… but they need to be aware when those discussions need to take place
… and they need to engage discussion

florian: when we design something new do we take i18n considerations into account properly? i would say the answer is yes we do
… logical vs physical thing is the language as a whole has a gap
… maybe it's insufficient attention

r12a: for me the problem is not so much the detailed work
… but having somebody monitoring what's happening and trying to facilitate discussions
… we don't have any meetings scheduled anymore

addison: we didn't accomplish anything

r12a: it's not "let's do extra i18n things"
… it's a case of making sure that CSS meets the needs of the world
… it's a part of the process of building CSS

Requirements for the layout of rosters

w3c/clreq#268

<gb> Issue 268 Requirements for the layout of rosters (by xfq) [未來工作/future] [i:justification]

xfq: common in traditional media, alignment by column
… aligned in three characters, and for two character cell there is space in middle
… how to do this with CSS

xfq: dot in second line connects two, in half size
… for third column it should align at the first

florian: for current CSS, text-justify
… justify by name but not by character, names need to be marked up with span or similar

florian: flexbox might also work potentially

xfq: ideally alignment should not be done by ideographic character, but by system spacing

r12a: names should have minimum length, and to be aligned by system

florian: I think you can get close to this with either flex or grid
… either will have different shortcomings

martin: maybe clreq can see how far they get with grid/flex
… and CSSWG can help

r12a: what would help is if you could put the actual text for that black box in the github issue

xfq: I can do that

CSS

https://github.com/w3c/i18n-activity/issues?q=is%3Aissue%20state%3Aopen%20label%3Awg%3Acss

Ruby

r12a: i want to show people what we have in terms of finding issues related to a particular language such as japanese and ruby and so on

<r12a-again> https://www.w3.org/International/

<r12a-again> https://www.w3.org/TR/typography/

r12a: this is the i18n homepage
… if you look under language enablement there's a link called language enablement index
… if you scroll down that page

<r12a-again> https://www.w3.org/TR/arab-lreq/#vertical_text

[r12a shows the page]

<r12a-again> https://www.w3.org/TR/jpan-lreq/#inline_notes

<r12a-again> https://www.w3.org/TR/hani-lreq/#inline_notes

<r12a-again> https://www.w3.org/TR/kore-lreq/#inline_notes

<r12a-again> https://www.w3.org/TR/mong-lreq/#inline_notes

florian: i know that there is a wealth of interconnected information in these sets of pages
… it's never quite been clear to me on when i'm supposed to go there

r12a: if you want to check reqs go the reqs page
… if you want to check tests go to the tests page

Fonts

bobby: limitations about local fonts in CSS
… cannot use system fonts like Kai and Fangsong
… important in Chinese typography
… used for emphasis etc.
… some system fonts only have one weight
… we do not use italics
… we cannot use synthetic oblique fonts
… we change typeface for emphasis
… if there's no way to use a local font
… there is no way to indicate emphasis
… we talked about this in clreq calls
… finally we have a new generic() function in CSS fonts L4
… but I don't know when it will be implemented in browsers
… that's the problem
… it's related to the CSS-i18n champion issue we just talked about

bobby: we have documented this in clreq

xfq: they are documented in the gap analysis

eemeli: i can forward this to the right people

r12a: who should we talk to?

eemeli: Henri
… does quite a bit of work with characters

<atsushi> xfq: we had discussed within i18n, and have more than 10 trackers on this

<atsushi> r12a: let's take this up again when florian is back

<atsushi> xfq: for ruby, Murata-san is working on it, and will join tomorrow?

Glossary and the normative approach

https://github.com/w3c/i18n-glossary/pulls

<atsushi> xfq: have discussed this glossary for a while, ready to merge?

w3c/i18n-glossary#95

<gb> Pull Request 95 Update the definition of 'Mojibake' (by xfq)

<atsushi> xfq: #95 for Mojibake

https://github.com/w3c/i18n-glossary/pull/95/files

<gb> CLOSED Action 95 write endorsement of html ruby markup extensions (on aphillips) due 2024-05-02

[xfq introduces the PR]

martin: it's not an issue of encoding, but an issue of decoding

bobby: we can still find some old web pages on the web that use shift-jis and when you open it with modern browsers
… they decode it with utf-8
… and you cannot read anything
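
The decoding mismatch described here can be reproduced with a minimal Python sketch. The sample string is an arbitrary choice for illustration, and the sketch only mimics the effect with Python's codecs; it is not how a browser picks an encoding.

```python
# Sample text (arbitrary illustration), stored as Shift_JIS bytes,
# as an old page might be.
original = "文字化け"
raw = original.encode("shift_jis")

# Decoding those bytes as UTF-8 is the mismatch: invalid byte
# sequences are replaced with U+FFFD and the text becomes unreadable.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the encoding that was actually used recovers the text.
right = raw.decode("shift_jis")

print(repr(wrong))
print(right)
```

The bytes are untouched in both cases; only the choice of decoder differs, which is martin's point that this is a decoding issue rather than an encoding issue.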

r12a: tofu is a different thing
… lack of glyph

martin: or lack of fonts

r12a: how to say mojibake in chinese?

bobby: 乱码

Luànmǎ

乱 means disorder, confused

码 means encoding

<bobby> https://zh-yue.wikipedia.org/wiki/亂碼

xfq: specdev uses Mojibake
… i don't think any other spec uses it

r12a: so it's an informative term and we can have Luànmǎ too
… maybe it's not so important for the specs

w3c/i18n-glossary#89

<gb> https://github.com/w3c/i18n-glossary/pull/89

[xfq introduces the PR]

ok to merge

w3c/i18n-glossary#88

<gb> Pull Request 88 Update the definition of 'Bidirectional isolate' and 'Bidi isolation' (by xfq)

<atsushi> [xfq shows on screen for discussion on text/change in PR]

ok to merge

https://github.com/w3c/i18n-glossary/pull/91/files

<gb> Pull Request 91 Update the definition of 'First-strong detection' (by xfq)

martin: maybe say something like first-strong is used when auto is set

xfq: i can add a link

r12a: "then uses that to guess at the appropriate base direction for the string as a whole" is missing from the new def

martin: "guess" is the core here
… it should be used when the directionality is not known yet
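
The "guess" under discussion can be sketched in a few lines of Python. This is a simplified sketch: the function name and the default are assumptions, and it ignores details of the full Unicode Bidirectional Algorithm such as skipping characters inside isolates.

```python
import unicodedata

def first_strong_direction(text, default="ltr"):
    """Guess a base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (incl. Arabic letters)
            return "rtl"
    # No strong character found (e.g. only digits or punctuation).
    return default

print(first_strong_direction("123 שלום"))
print(first_strong_direction("hello שלום"))
```

The second call shows why this is only a guess: a string that is mostly Hebrew but happens to start with a Latin word is classified as left-to-right.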

Open issues and PRs

<atsushi> xfq: jumping into pending issues

w3c/i18n-drafts#701

<gb> Pull Request 701 Update qa-i18n (by xfq)

<atsushi> xfq: raised PR while ago, adding line to list of i18n targets

martin: "Keyboard usage" to "Keyboard layout and usage"

r12a: "Accessibility requirements" is too vague
… maybe things like "readability requirements" and "legal requirements"
… "script-specific readability requirements"

w3c/i18n-drafts#702

<gb> Pull Request 702 Add a brief mention of security issues (by xfq)

https://deploy-preview-702--i18n-drafts.netlify.app/questions/qa-escapes.en.html#security

eemeli: "inserting it into HTML"
… from a reader's point of view there's a little ambiguity of what "inserting it into HTML" means
… the way you're using it is correct
… but it is easy to misunderstand

r12a: and it's only the syntax characters
… if i say hello in another language
… you don't need to escape it
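
r12a's point that only the syntax characters need escaping can be shown with a minimal Python sketch. It uses the standard library's html.escape as one example of an escaping routine; the sample strings are made up for illustration.

```python
import html

# Text containing HTML syntax characters: these must be escaped
# before the text is placed in markup.
markup_like = '<b>"x" & y</b>'
print(html.escape(markup_like))

# Ordinary text in another language contains no syntax characters,
# so escaping leaves it unchanged.
greeting = "שלום"
print(html.escape(greeting) == greeting)
```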

https://github.com/w3c/i18n-drafts/pull/705/files

<gb> Pull Request 705 Use "text content" instead of "content" (by xfq)

r12a: we can remove "There are many character encodings to choose from."

Mention layout mirroring for bidi

w3c/bp-i18n-specdev#163

<gb> Pull Request 163 Mention layout mirroring for bidi (by xfq)

https://deploy-preview-163--bp-i18n-specdev.netlify.app/#typ_bidi_styling

<Bert> xfq: There were some comments and I made a pull request ^^

eemeli: "It should preferably automatic" -> "It should be preferably automatic"

<Bert> eemeli: Typo: missing "be"

<Bert> martin: what languages use sloping in both directions (in section 9.4)?

<Bert> r12a_: In Hebrew it is a choice

Add some best practices to string-search

w3c/string-search#28

<gb> Pull Request 28 Add some best practices (by xfq)

<r12a_> https://r12a.github.io/scripts/hebr/he.html#fontstyle

<r12a_> https://r12a.github.io/scripts/arab/arb.html#letterforms

<Bert> xfq: This is about string searching. I added that UAs should by default offer case-insensitive searching, using Unicode case folding.
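
A minimal Python sketch of case-insensitive matching via case folding follows. The helper name is hypothetical, and str.casefold is used as an approximation of Unicode full case folding; a real user agent would also need to handle normalization and map match positions back to the original text.

```python
def contains_caseless(haystack, needle):
    """Return True if needle occurs in haystack, ignoring case via case folding."""
    return needle.casefold() in haystack.casefold()

# Case folding maps "ß" to "ss", so this matches; lower() alone would not.
print(contains_caseless("Hauptstraße 5", "STRASSE"))
print("STRASSE".lower() in "Hauptstraße 5".lower())
```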

<Bert> eemeli: Section 5.18 of Unicode 17

<Bert> xfq: Current string search document doesn't refer to Unicode directly, but does point to charmod-norm, which does.

<Bert> ... That might be enough.

<Bert> eemeli: Maybe better to link directly and reduce need for clicks.

<Bert> ... Another typo: s/forms forms/character forms/

<Bert> r12a_: Or maybe just characters, instead of character forms.

<Bert> xfq: Another patch is to add ‘User agents MAY normalize numeric values to their ASCII forms (0-9) in string searching operations.’

<Bert> eemeli: Is that about characters that represent numbers?

<Bert> xfq: Yes
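
One way such a normalization could work is sketched below in Python. The helper name is hypothetical, and the sketch assumes that "numeric values" means characters with the Unicode decimal digit property; the proposed text does not say which characters are in scope.

```python
import unicodedata

def to_ascii_digits(text):
    """Replace decimal digit characters from any script with ASCII 0-9."""
    out = []
    for ch in text:
        # decimal() returns the digit value for decimal digit characters,
        # or the supplied default (None) for everything else.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)

print(to_ascii_digits("٢٠٢٥"))   # Arabic-Indic digits
print(to_ascii_digits("２０"))    # fullwidth digits
```

Characters that carry a numeric value without being decimal digits are left alone by this sketch, so it would not by itself cover numbers written in non-decimal systems.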

<Bert> hsivonen: Is this document normative?

<Bert> ... Performance difference on long documents for collator-based search.

<hsivonen> https://docs.google.com/document/d/1nUCQxSCCIdfBas5l-jGu58O38FaCLuvlsBFAjvXrgNM/edit?tab=t.0

<Bert> ... There is a request to me to write about this. Haven't written it yet. Let me paste something ^^

<Bert> ... Firefox doesn't do some things from this list.

<Bert> xfq: I'll read through that document.

<Bert> hsivonen: Firefox probably doesn't want to add a checkbox to fold numbering systems, but it could be treated as an accent difference.

<Bert> xfq: That's why I wrote ‘may’.

<Bert> ... as in ‘MAY provide an option for diacritics-sensitive search’

<Bert> r12a_: About ASCII digits: also search for the value if the number is not decimal?

<Bert> eemeli: It is a ‘may’

<Bert> ... Allow implementers to think about what can be done.

<Bert> r12a_: I've been searching a lot and keep finding things that I don't want. Such as finding é when I really want e.

<Bert> eemeli: In many languages, letters with accent are really different and you don't want to mix them.

<Bert> xfq: Maybe the ‘should‘ in the diacritics rule should be a ‘may’ then.

<Bert> r12a_: For me that applies to digits, too: I often do want to search for the character, not for anything with that value.

<Bert> hsivonen: Accent-sensitive search per language, e.g., for Finnish.

<Bert> ... I have only once seen a complaint about that.

<Bert> ... That is what Firefox does.

<Bert> ... Chrome and Safari search accent-insensitive.

<Bert> ... But if your UI language is Finnish or Swedish, then accents that are analyzed to form a separate base letter are not ignored.

<Bert> ... Need documenting what the cases and languages are. I've been asked to write it up, but haven't done so yet.

<Bert> eemeli: So we should change the ‘should’ (in ‘SHOULD ignore diacritics’) to a ‘may’.

<Bert> xfq: UAs may provide different UIs.

<Bert> Looking at example in the spec of Dürst vs Duerst for German.
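
A naive form of diacritics-insensitive matching can be sketched in Python as below. The helper name is hypothetical, and this decompose-and-strip approach is only one simple technique; it is not what any of the browsers mentioned are known to do.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("é"))
print(strip_diacritics("Dürst"))
```

Stripping turns "Dürst" into "Durst", not "Duerst", and it would equally flatten letters that some languages treat as distinct base letters, which is the kind of language-specific behaviour the discussion above is concerned with.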

<Bert> Introduction of some observers: Nicolò, Andreu and Itoe

DOM localization

<Bert> eemeli: Breakout about this tomorrow.

<Bert> ... Pretty easy to get 90 or 95% of the way.

<Bert> ... But the last few percent can be hard, depending on the model you start with.

<Bert> ... We have quite a bit of experience with localization of UI/UX.

<Bert> ... Localization is more than translation.

<Bert> ... Automated translation is pretty decent these days, but it is different for words in a UI. What does ‘accept’ on a button mean?

<Bert> ... Still need human translators.

<Bert> ... Goal is to have the web platform support localization, i.e., HTML.

<Bert> ... Compare how CSS is attached to elements.

<Bert> ... A lot of work has been done in Unicode on MessageFormat.

<Bert> ... What does a single message look like? How do you format it?

<Bert> ... We need an imperative way to do localization, as well as a declarative way.

<Bert> ... This needs work in HTML, but also work on a file format for holding the information. JSON probably not good enough.

<Bert> ... There are various formats in use, including JSON or XML-based.

<Bert> ... Question for this group is how much of the incubation for this should happen here?

<Bert> ... Firefox has a lot of experience with this and we have a system for building the frontend this way.

<Bert> florian: It says ‘DOM’ localization. You mean a system where the localization happens in one document, with one URL?

<Bert> eemeli: There is no single correct solution. Can be via a URL or some other state.

<Bert> florian: You mentioned the Firefox UI, which doesn't have a URL.

<Bert> eemeli: Not a visible URL, but internally it has a similar identifier.

<Bert> florian: So also for different versions of a local document, as for an app?

<Bert> eemeli: Yes.

<Bert> ... Breakout session is tomorrow morning.

CSS

Ruby

r12a: there's markup and CSS for ruby
… ruby is used in Chinese and Japanese
… Korean and Mongolian a little bit
… a couple of years ago, you had a bit of money to develop an HTML add-on spec
… it'd be great to know what the progress is

[florian talks about the funding issue]

florian: my plan is this month to chase up what has happened during the horizontal review
… which is a little bit from i18n and not from anyone else
… my understanding is that firefox implements all that we have in the spec
… amazon kindle implements some of what we have in this spec
… the part that would be pushed to a level 2
… it's basically the rtc part of the markup
… and the multi-layered ruby
… within this month or so enough work to actually call for CR
… on the CSS side of things there remains plenty of work to do
… there's diminishing returns to working on the spec
… before impls catch up
… the CSS spec pretty much picks the html set extension we've been talking about
… i don't know how productive it is if you're too far ahead of the impls

r12a: i don't think we can expect much movement on the impls before we have the markups
… we have it in draft form, it is published as a WD

https://github.com/w3c/i18n-activity/issues?q=is%3Aissue%20state%3Aopen%20label%3As%3Ahtml-ruby-extensions

r12a: would it be possible to create L1 and L2 at the same time?

florian: yes

[Discuss how and when to draft L2]

florian: for example, for rtc, my plan will be to leave it in L1, marked at risk until we're forced to try to go to REC

[Discuss if we ever want to go to REC]

florian: while we have 2 impls
… one of them is not a browser
… that means this text will not be accepted by whatwg
… 2. we have some maintenance work to do on it
… tts representation of ruby
… will likely need some additional attributes with some new values
… we would offer pull requests against the HTML spec to keep that subset in sync

r12a: it just seems to me that the going to REC bit adds extra time and effort
… and knocks things out of the spec that haven't been implemented
… we can do CR and push implementers to implement it
… then it just makes life simpler

r12a: rtc is not that much more to do
… it's in the parser already

[Discuss how to test it in Amazon Kindle]

CSS

bobby: four Chinese typeface styles
… Chinese do not have italics
… we change typefaces for emphasis
… like switching between Hei (sans-serif) and Kai
… we can list all Kai system fonts
… that's stupid, but works
… CSS fonts L4 introduced a generic() function

xfq: @@2

florian: the font fingerprinting problem is more than tricky
… it's hard

r12a: I seem to remember we were getting closer

florian: I think we were getting pretty close in terms of allowing people to do various things
… some of which might be the right one

w3c/csswg-drafts#11775

<gb> Issue 11775 [meta][css-fonts-4] Index of local font issues: fingerprinting, I18n, privacy (by svgeesus) [css-fonts-4] [i18n-tracker] [meta] [privacy-tracker]

<bobby> https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_Extension_G

bobby: another case
… Unihan Extension G
… very recent new block in Unicode
… the Jigmo font has glyphs from Extension G
… but if it's a local font
… Safari can't load it
… and it's a large font

<xfq> I made a demo a while ago: https://xfq.github.io/large-webfont/

florian: if we want to talk about multiple things with CSSWG, probably do not start with this issue
… it will consume all the time

w3c/csswg-drafts#11257

<gb> Issue 11257 [css-text-decor] Control the line height / proximity of text containing emphasis marks (by xfq) [css-text-decor-3] [css-text-decor-4] [i18n-needs-resolution] [i18n-jlreq] [i18n-clreq] [i18n-klreq] [i18n-mlreq]

w3c/csswg-drafts#10844

<gb> CLOSED Issue 10844 [css-overflow] Line-clamp and approaches to ellipsis insertion (by frivoal) [css-overflow-4] [Closed Accepted by CSSWG Resolution] [i18n-tracker] [Needs Testcase (WPT)] [i18n-jlreq] [i18n-alreq] [topic: line-clamp]

Andreu: CSSWG #10844

<gb> Issue 10844 not found

Andreu: this is a closed issue
… but I do not agree with Addison's comments in the issue

florian: I'll introduce line-clamp
… there already exists something in CSS which people often confuse this with
… we're not talking about the thing that lets you add a dot dot dot at the end of a line
… that exceeds its box
… when a line is too long and it overflows in the inline direction

florian: we have solid agreement between the i18n and CSS WGs that the chopping should happen logically not physically
… when we're doing this in multiple lines
… but
… the removal of extra content to make room for the ellipsis is logical
… but physically where does the ellipsis go?

Andreu: the ellipsis indicates that the text is truncated
… does it indicate that the embedding level is truncated or does it indicate that the paragraph level is truncated?
… I showed several examples to Arabic and Hebrew speakers
… including multiple nested levels of Hebrew and English
… they did seem to agree that it would be better to place the ellipsis at the visual end of the line
… the way the CSSWG has resolved on this is in agreement with what Andreu wants to do
… I don't think it is in conflict with what i18n WG has said as a formal resolution
… however
… the last comment that Addison left
… seems to suggest another way

eemeli: I think if you've got user research, even if it's informal
… that is strongly indicative that speakers think paragraph level makes more sense
… that sounds very believable to me
… this feels like a thing that what the humans expect does not necessarily match what logic might dictate
… or you can argue the logic either way

Andreu: I was trying to implement Addison's suggestion in Chrome
… this is completely alien to the way that Chrome or other browsers do things
… because it's just at the wrong level
… at the wrong place in the layout stage

Bert: it depends on what kind of symbol you use
… ellipsis vs arrow
… if you end with a hyphenated word

r12a: be careful when you're saying hyphenation
… do you mean words with hyphens in between
… or do you mean end of line?

florian: currently we use the same logic as what we use for line breaking
… we're trying to reuse the existing mechanism of CSS
… avoid reinventing them poorly

Andreu: in my impl
… I had just been assuming that you can compute the answers ahead of time

Bert: hanging ellipsis?

florian: separate question

eemeli: if you're a human dealing with this
… @@2

r12a: Arabic language does not use hyphenation
… but Arabic script used for Uyghur
… you'll find lots of hyphenation

r12a: Persian doesn't

w3c/hlreq#8

<gb> Issue 8 Hebrew Hyphen (by r12a) [i:segmentation] [s:hebr]

breakout sessions

r12a: wcag is trying to create readability guidelines
… like leaving a certain amount of spacing between lines
… it works for english
… but not necessarily for other scripts
… they put together a task force that is looking at how they can extend wcag guidelines so that it meets the needs of people who use different scripts
… they're struggling a bit in terms of how they're gonna capture that info
… they've tried to choose 5 scripts
… latin, cyrillic

Summary of action items

  1. martin: create a list of gaps in URL standard
  2. duerst: create a list of gaps in URL standard
  3. addison: remind @duerst to create a list of gaps in URL standard
Minutes manually created (not a transcript), formatted by scribe.perl version 248 (Mon Oct 27 20:04:16 2025 UTC).

Diagnostics

Succeeded: s/directionally rtl/directionally ltr/

Succeeded: s/thiknn/think

Succeeded: s/canbe/can be

Succeeded: s/theh/the/

Succeeded: s/apot/a part

Succeeded: s/issue/PR/

Succeeded: s/directions/directions (in section 9.4)

Succeeded: s/Firefox probably doesn't want to add a checkbox to ignore accents./Firefox probably doesn't want to add a checkbox to fold numbering systems, but it could be treated as an accent difference./

Succeeded: s/then non-spacing accents are not ignored/then accents that are analyzed to form a separate base letter are not ignored/

Succeeded: s/you're/your

Succeeded: s/word/words/

Succeeded: s/ffonts/fonts/

Succeeded: s/tryiing/trying

Maybe present: boby, r12a, xfq

All speakers: addison, Andreu, Bert, bobby, boby, eemeli, florian, martin, r12a, xfq

Active on IRC: addison, atsushi, Bert, Bert`, bobby, eemeli, florian, hsivonen, r12a, r12a-again, r12a_, xfq