Meeting minutes
XAUR has been published - thanks to Josh for all his work!
Joint working group meetings and break-out sessions planned for TPAC 2021.
Janina: The TPAC schedule has changed slightly over the last week
janina: May want to consider a breakout session where we introduce all our work, or perhaps have targeted cross-meetings
janina: the issue was that some may not want to attend an hour-long meeting when they are only interested in one publication
jasonjgw: Perhaps automotive accessibility is one area for a breakout
Ted: Yes, would be supportive of a breakout on automotive accessibility
Ted: sent some proposed language to APA list for a breakout
Joshue108: Will update the wiki to reflect this additional area
jasonjgw: Natural language doesn't have a home in a specific working group
<Raja> If you scheduled it, which week?
<JPaton> John_Paton present+
janina: Proposed breakout on remote meetings
judy: Having a breakout for remote meetings would be good
judy: We will push for a time that makes it possible for Scott to attend
<Joshue108> TPAC agenda update with Auto A11y breakout
jasonjgw: Looks like we have agreed to automotive, remote meetings, and synchronization meetup with timed text
jasonjgw: Any others?
Synchronization Accessibility User Requirements.
jasonjgw: We have two new issues which have been opened
jasonjgw: One was a suggestion for a reference, and the other was an issue around ASR accuracy
janina: Would prefer to handle these issues before publication
janina: Section on audio description - we may need to figure out the preferred descriptor as in video description
janina: Could include some connection to text descriptions as well, which is one of the reasons why video description is perhaps better
Joshue108: The comments on the BBC paper were about timing and accuracy
Joshue108: Comment on ASR also around accuracy
Raja: Support elimination of discussion around accuracy
Raja: The use of subtitles is a common term, but there is an SDH acronym used
Judy: we are doing captions differently so that we have both ASR and human CART captioning as well
<Joshue108> [SAUR] Claims about ASR accuracy #229
<Joshue108> https://
<Joshue108> https://
<jasonjgw> Steve: what is stated in the document is strongly grounded in the research - free of editorial opinion/commentary. Additional research would be welcome on this point. There may be additional references.
<Raja> Caption/SDH latency sensitivity for those who listen and read versus those who read only has several studies backing it, including the BBC paper
<jasonjgw> Steve: the research indicates that ASR is much more timely than human captioning.
<Raja> I can find a couple more if that helps
<Raja> What is new is the discussion of ASR latency. But since ASR is not usable as primary access, I am uncertain if ASR discussion is worth it
<jasonjgw> Steve notes that synchronization of human-created captions can easily be achieved with prerecorded material, but live environments raise issues of latency vs. accuracy. If we only consider latency, then ASR is clearly the better solution, which is why accuracy needs to be considered - revealing the necessity of the trade-off.
<jasonjgw> Steve notes continued improvements in ASR accuracy, as documented in research findings.
<jasonjgw> Steve further notes the large data sets available to contemporary machine learning systems (e.g., from smart speaker/speech-based agent usage in large populations).
<jasonjgw> Steve: ASR is limited in its ability to recognize speech patterns in specific individuals and specific accents.
<jasonjgw> Steve considers it important to cite this research and to include the discussion in the document.
<Raja> https://
janina: We need to reference the best language in current use of "subtitles" vs "captions"
<JPaton> Joshua I am
<JPaton> +1 to Raja's suggestion that SDH is likely widely recognised
Raja: ASR is improving, but still problematic, especially over the phone
Raja: Don't believe that ASR is equivalent to a live human
<janina> https://
JPaton: In the UK, subtitling is the more standard term
JPaton: Speech-to-text has different levels, such as having respeakers which provide better recognition results
<Raja> Respeaking adds a human to the loop
<Raja> When I use interpreters + ASR, that's similar.
<Raja> I can't use ASR by itself since it is so variable
Judy: The need for quality metrics for ASR is a question
<Zakim> Judy, you wanted to request that we consider adding reference to quality metrics
<JPaton> 3 levels of speech to text: ASR, respeaking with ASR, and STTR (palantypists etc.)
John to contribute some language around the 3 levels
<Raja> Metrics work for a single environment. For multiple environments, how do you measure the variability?