W3C

– DRAFT –
WAI Adapt Task Force Teleconference

20 March 2023

Attendees

Present
EA, janina, Lionel, matatk, Roy
Regrets
-
Chair
Sharon
Scribe
Lionel, matatk

Meeting minutes

[introductions all 'round]

Registry meetings and updates

janina: We've a FPWD of the Registry, which made it possible for Adapt's Symbols module to go to CR.
… This is about marking up general content on the web, including e.g. digital books.
… The Symbols module allows authors to associate a Bliss concept ID with content, and the Registry provides the ability to look up concepts. We have some things to work on, including
… internationalization of the registry (so concepts can be looked up in one's preferred language), and how we can manage updates, and how other symbol sets could be added.
… Today we want to primarily talk about i18n.

Sharon: We have some questions around languages.

janina: Our core question is: do the symbols change at all if the language changes?
… If so, then we need to ensure the structure of our registry adapts.

Russell: The short answer is no. What can happen is a symbol may be replaced for one reason or another, e.g. we come to a better understanding of cultural biases, and replace a symbol.
… In this case, the replacement symbol will have a new ID.
… The other symbol would be marked as deprecated (which we flag as "OLD").
… There may be synonym symbols (as in other languages).
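
The replacement behaviour Russell describes could be sketched as follows. This is only an illustration, not the registry's actual structure: the IDs, the `Concept` fields and the `replaced_by` link from an "OLD" entry to its successor are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Concept:
    concept_id: int
    gloss: str
    status: str = "CURRENT"            # "OLD" marks a deprecated symbol
    replaced_by: Optional[int] = None  # hypothetical link to the newer ID


# Hypothetical entries: 1001 was replaced by 1002, which has a new ID.
REGISTRY: Dict[int, Concept] = {
    1001: Concept(1001, "friend", status="OLD", replaced_by=1002),
    1002: Concept(1002, "friend"),
}


def resolve(concept_id: int, registry: Dict[int, Concept] = REGISTRY) -> Concept:
    """Follow replacement links until a non-deprecated concept is reached."""
    seen = set()
    concept = registry[concept_id]
    while concept.status == "OLD" and concept.replaced_by is not None:
        if concept.concept_id in seen:
            raise ValueError("replacement cycle in registry")
        seen.add(concept.concept_id)
        concept = registry[concept.replaced_by]
    return concept
```

Under these assumptions, content annotated with the old ID would still resolve to a current symbol, while the old entry stays in the table for reference.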

Shirley: It's the ordering of the symbols in the sentence that would be affected by language.

Annalu: Adding symbols to text is referred to as "symbolization". This is not a language translation exercise, but adding symbols to convey the concepts.
… E.g. the word "match" could be linked to the tool for lighting a candle, or to a football match. The key issue for symbolization is to correctly understand the concepts behind the symbols.
… This is where the idea of "content coding" came about.
… i.e. giving each concept a unique number.

Sharon: We did an exercise where the language changed direction, and the ordering of symbols changed, but would they change direction too?

Annalu: We've done quite a lot of work involving Hebrew and Arabic. Some symbols _do_ change direction, but some don't.

EA: In particular, numbers/numeric content. I sent some examples to Lionel. Sometimes the positions of adjectives and adverbs change too.
… Also, spellings can change, based on gender, and other factors.
… There are definitely different images that would be used based on the culture (e.g. "friends" would have different people in groups in a symbol).

Annalu: Bliss demonstrates the complexities involved.

EA: Even the issue of whether you show two people (as friends) or three (as friends) is dependent on the details of the particular Arabic word.

EA: You also have to simplify the sentence; can't just perform a direct look-up. I sent some examples to Lionel.
… e.g. of a consent form.
… When talking about Thai, Chinese, etc., you're really talking about different concepts.

Annalu: We're trying to _support_ people to understand the text, via the symbols, not to replace the text.

janina: I'm hearing several issues here that strike me as UA requirements on how you take these and present them to users. Other issues, e.g. AAC representation of some concepts might be radically different from the textual representation.
… I think those are somewhat apart from what we're trying to achieve, which is that content publishers may be able to go in and annotate spans of their content with BCI values, and users that rely on AAC would,
… in addition to having the text, which will still be there, would have the symbols.

janina: Our primary concern is for publishers to prescribe the symbols. We may experiment with other approaches like AI in future, but we're focusing on situations where content is being published in some "standard" orthography, and the symbols are an optional extra.

janina: Does this sound like an accurate summary?

Lionel: Thanks all for being here, really important discussion.

Lionel: I suggest we stay with EA's example, as it's clarifying. I understand: we've proposed a registry, as stated, as we feel AAC symbols could be applied more broadly in content.
… Microsoft recently announced their "immersive reader" which you may've heard of, that supplies symbols to support content.
… In the example, there's a difference with the concept of "friends" as to how many would be in the group, and what gender they'd be.
… Our question is how would we add other languages to the concept look-up.
… Thanks Annalu for the concept of "symbolization". So we have a dictionary term "friends" and an entry for "friends" in the registry, which points to a BCI number (which could be shown with a Bliss or other symbol).
… E.g. I have content that mentions "friends" and I wish to decorate it with the proper symbol. The registry gives me ID 1234, so I use this. If I am looking up the concept in the registry in Saudi Arabia, in Arabic, how do I add this to the registry, and what number would I get?

Annalu: What we've been using are the WordNet concept codes. In our work, the concept is the same, but the culture makes it different.

Annalu: We need to think of the concept, and separate that from the gloss and the culture we're in.

<Zakim> janina, you wanted to note that some languages will have a different word for male friends vs female friends, at least a different inflection

janina: I think this is an excellent point and we need to find a way to manage it accordingly. I think I'm hearing it comes from expectations from the language itself. I think in some cases, the word may be a different word, or may have a different inflection?
… I'm thinking of Slavic languages, for instance. I think we could have the concepts enumerated that way. Not sure if we want to have another index value. I think it's not so much that the symbol changes, but that we have a way to express this in the table of symbols.
… Our next step, having spoken with you, will be for APA to reach out to our W3C i18n colleagues.
… All of W3C's specs go through certain reviews (including i18n, accessibility, security, privacy) so ours will be reviewed too.
… This will include awareness of other cultures.
… As we reach a more solid understanding from you, we'll involve the W3C i18n group, and you're welcome to join in that process. This is a new application of this tech.

matatk: Seems we can have ways to extend the registry to account, for some concepts, for things like language (implying culture), cardinality, etc. Appreciate your help in explaining these considerations, and ensuring we cover them.

Russell: Thinking of the Bliss symbol for "friend", it's gender-neutral, "person" + "goodness". There are ways in Bliss to account for numbers changing.
… Also technology changes. E.g. "we taped that" to mean "we recorded it". A Bliss symbol may've had a pictograph of a tape recorder, or telephone, but they will change over time.

EA & matatk: E.g. floppy disk save icon

janina: None of this is static.

Shirley: The complexity of language itself... as soon as you encode a symbol for something, you have framed it in a certain way.

EA: AAC is expressed in 2D but there's so much more information/context.

Lionel: Focusing on "friends" and tackling the internationalization of the word, for lookup in our registry.
… There could, as you say, in Arabic, be different words: one for two friends, one for three.
… Accepting that the symbols work in context, I feel we're on safe ground, as we're relying on the content _author_ to pick the symbol.

Lionel: In sign language, "friends" in one language can mean "married" in another.

Lionel: So if you're using the AT in an Arabic country, you can use the most apt symbol, but in other places it would use a different one.

janina: There may be gaps here, but we can tackle those another day. I think we are approaching the 90% use case.

janina: We want to evolve the spec(s) in future.

Annalu: Symbol sets will need to order themselves in a certain way in order to be used confidently in this fashion.

Annalu: Other companies use systems which may match to the Bliss numbers.

EA: Agree with Annalu; we know that to map the symbol sets, even those that are free and available under CC licence, each has their unique ID for _their_ labels and _their_ glosses. They're not mapped against Bliss concepts. It's a job that can be done, and AI may be able to help.
… They would need to be mapped, and checked.
… As Annalu says, it's a task that would be marvellous to achieve. The difficulty will be with commercial sets; the companies would need to open up their databases.
… They're mainly set to work on someone's specific device.
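
The mapping-and-checking task EA describes could be pictured roughly as below. This is a sketch only: the symbol set name, local IDs, BCI numbers and the `checked` flag are invented for illustration and do not describe any real symbol set's database.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mapping:
    symbol_set: str  # hypothetical name of a third-party symbol set
    local_id: str    # that set's own ID for its symbol
    bci_id: int      # the Bliss concept the symbol is mapped to
    checked: bool    # True once a person has reviewed the mapping


# Hypothetical mappings; the second one (perhaps machine-suggested)
# has not yet been checked.
MAPPINGS: List[Mapping] = [
    Mapping("example-set", "friends_01", 1234, checked=True),
    Mapping("example-set", "match_02", 2001, checked=False),
]


def symbols_for(bci_id: int, mappings: List[Mapping],
                include_unchecked: bool = False) -> List[str]:
    """Return local symbol IDs mapped to a Bliss concept.

    Unchecked mappings are left out unless explicitly requested.
    """
    return [
        m.local_id
        for m in mappings
        if m.bci_id == bci_id and (m.checked or include_unchecked)
    ]
```

Keeping a review flag on each mapping is one way to let automated suggestions (e.g. from AI) sit alongside human-verified ones without being used by default.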

Lionel: Excited by this; sounds like you're saying that starting with BCI would be good, but there may be an intermediate step to map to other sets and, whilst it's challenging, it's do-able.

EA: Yes, and e.g. Easy Reading EU project has shown this is possible. An AAC web browser project was also done several years ago.

Lionel: I propose that we can internationalize the gloss, as you term it (we should use the right term, we were calling this a "label").

Lionel: We can translate the glosses into other languages. We can work on an interim lookup.
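
An interim lookup of this kind might look like the sketch below, which keeps the concept ID fixed and attaches glosses per language. Everything here is assumed for illustration: 1234 is the placeholder number from the "friends" example, the other IDs, the language tags and the translated glosses are invented, and the real registry may be structured differently.

```python
from typing import Dict, List, Tuple

# Hypothetical data: concept ID -> language tag -> glosses.
GLOSSES: Dict[int, Dict[str, List[str]]] = {
    1234: {"en": ["friends"], "fr": ["amis"], "ar": ["أصدقاء"]},
    2001: {"en": ["match"]},  # the tool for lighting a candle
    2002: {"en": ["match"]},  # a football match
}


def build_index(glosses: Dict[int, Dict[str, List[str]]]) -> Dict[Tuple[str, str], List[int]]:
    """Invert the table so a (language, gloss) pair finds concept IDs."""
    index: Dict[Tuple[str, str], List[int]] = {}
    for concept_id, by_lang in glosses.items():
        for lang, words in by_lang.items():
            for word in words:
                index.setdefault((lang, word.casefold()), []).append(concept_id)
    return index


def lookup(index: Dict[Tuple[str, str], List[int]], lang: str, word: str) -> List[int]:
    """Return every concept ID whose gloss matches; the author picks one."""
    return index.get((lang, word.casefold()), [])


INDEX = build_index(GLOSSES)
```

In this sketch a search in any language returns the same concept number, and an ambiguous gloss such as "match" returns several candidates, leaving the choice to the content author as discussed above.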

From Annalu: https://discovery.dundee.ac.uk/en/publications/building-a-lexical-database-for-an-interactive-joke-generator

janina: Next steps? The next one seems to be outreach to i18n, a "get acquainted" meeting. Everyone here is welcome to participate. They'll help us get this right, and we'll need their sign-off.

Sharon: Thanks everyone, I've learnt a lot, and we really appreciate your time.

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).

Diagnostics

Succeeded: s/examplse/examples

Maybe present: Annalu, Russel, Russell, Sharon, Shirley

All speakers: Annalu, EA, janina, Lionel, matatk, Russel, Russell, Sharon, Shirley

Active on IRC: EA, janina, Lionel, matatk, Roy, Sharon