W3C

– DRAFT –
Accessible Platform Architectures Working Group Teleconference

25 August 2021

Attendees

Present
janina, jasonjgw, Joshue, Joshue108, Judy, scott_h, scott_h_, SteveNoble
Regrets
-
Chair
jasonjgw
Scribe
SteveNoble

Meeting minutes

XAUR has been published - thanks to Josh for all his work!

Joint working group meetings and break-out sessions planned for TPAC 2021.

Janina: The TPAC schedule has changed slightly since last week

janina: May want to consider a breakout session where we introduce all our work, or perhaps have targeted cross-meetings

janina: the issue was that some may not want to attend an hour-long meeting when they are only interested in one publication

jasonjgw: Perhaps automotive accessibility is one area for a breakout

Ted: Yes, would be supportive of a breakout on automotive accessibility

Ted: sent some proposed language to APA list for a breakout

Joshue108: Will update the wiki to reflect this additional area

jasonjgw: Natural language doesn't have a home in a specific working group

<Raja> If you scheduled it, which week?

<JPaton> John_Paton present+

janina: Proposed breakout on remote meetings

judy: Having a breakout for remote meetings would be good

judy: We will push for a time that makes it possible for Scott to attend

<Joshue108> TPAC agenda update with Auto A11y breakout

jasonjgw: Looks like we have agreed to automotive, remote meetings, and synchronization meetup with timed text

jasonjgw: Any others?

Synchronization Accessibility User Requirements.

jasonjgw: We have two new issues which have been opened

jasonjgw: One was a suggestion for a reference, and the other was an issue around ASR accuracy

janina: Would prefer to handle these issues before publication

janina: Section on audio description - we may need to figure out the preferred descriptor as in video description

janina: Could include some connection to text descriptions as well, which is one of the reasons why video description is perhaps better

Joshue108: The comments on the BBC paper were about timing and accuracy

Joshue108: Comment on ASR also around accuracy

Raja: Support elimination of discussion around accuracy

Raja: The use of subtitles is a common term, but there is an SDH acronym used

Judy: we are doing captions differently so that we have both ASR and human CART captioning as well

<Joshue108> [SAUR] Claims about ASR accuracy #229

<Joshue108> https://github.com/w3c/apa/issues/229

<Joshue108> https://raw.githack.com/w3c/apa/main/saur/#captions-in-live-media

<jasonjgw> Steve: what is stated in the document is strongly grounded in the research - free of editorial opinion/commentary. Additional research would be welcome on this point. There may be additional references.

<Raja> Caption/SDH latency sensitivity for those who listen and read versus those who read only has several studies backing it, including the BBC paper

<jasonjgw> Steve: the research indicates that ASR is much more timely than human captioning.

<Raja> I can find a couple more if that helps

<Raja> what is new is the discussion of ASR latency. But since ASR is not usable as primary access, I am uncertain if ASR discussion is worth it

<jasonjgw> Steve notes that synchronization of human-created captions can easily be achieved with prerecorded material, but live environments raise issues of latency vs. accuracy. If we only consider latency, then ASR is clearly the better solution, which is why accuracy needs to be considered - revealing the necessity of the trade-off.

<jasonjgw> Steve notes continued improvements in ASR accuracy, as documented in research findings.

<jasonjgw> Steve further notes the large data sets available to contemporary machine learning systems (e.g., from smart speaker/speech-based agent usage in large populations).

<jasonjgw> Steve: ASR is limited in its ability to recognize speech patterns in specific individuals and specific accents.

<janina> https://developer.mozilla.org/en-US/docs/Web/Guide/Audio_and_video_delivery/Adding_captions_and_subtitles_to_HTML5_video

<jasonjgw> Steve considers it important to cite this research and to include the discussion in the document.

<Raja> https://verbit.ai/the-differences-between-subtitles-closed-captions-and-sdh/

janina: We need to reference the best language in current use of "subtitles" vs "captions"

<JPaton> Joshua I am

<JPaton> +1 to Raja's suggestion that SDH is likely widely recognised

Raja: ASR is improving, but still problematic, especially over the phone

Raja: Don't believe that ASR is equivalent to a live human

<janina> https://raw.githack.com/w3c/apa/main/saur/#caption-synchronization

JPaton: In the UK, subtitling is the more standard term

JPaton: Speech-to-text has different levels, such as having respeakers which provide better recognition results

<Raja> Respeaking adds a human in the loop

<Raja> When I use interpreters + asr, that's similar.

<Raja> I can't use ASR by itself since it is so variable

Judy: The need for quality metrics for ASR is a question

<Zakim> Judy, you wanted to request that we consider adding reference to quality metrics

<JPaton> JPaton: 3 levels of speech to text: ASR, respeaking with ASR and STTR (palantypists etc)

John to contribute some language around the 3 levels

<Raja> Metrics work for a single environment. For multiple environments, how do you measure the variability?

Minutes manually created (not a transcript), formatted by scribe.perl version 136 (Thu May 27 13:50:24 2021 UTC).

Diagnostics

No scribenick or scribe found. Guessed: SteveNoble

Maybe present: JPaton, Raja, Ted