W3C

- DRAFT -

Silver XR Subgroup

17 Aug 2020

Attendees

Present
jeanne, CharlesHall, Joshue108, Crispy, bruce_bailey, kirkwood
Regrets
Chair
MikeCrabb
Scribe
Joshue108

Contents

  Topics
    1. any changes to Functional Outcomes based on Spencer's presentation?
    2. Writing tests for Captioning and XR
  Summary of Action Items
  Summary of Resolutions

<CharlesHall> #silver

<jeanne> https://www.w3.org/2017/08/telecon-info_silver-xr

any changes to Functional Outcomes based on Spencer's presentation?

<jeanne> https://w3c.github.io/silver/subgroups/xr/captioning/functional-outcomes.html

<CharlesHall> 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited

<CharlesHall> Conveys information about the sound in addition to the text of the sound (for example, sound source, duration, and direction) so users know the necessary information about the context of the sound in relation to the environment it is situated in

<CharlesHall> they are written like testable statements for people

<CharlesHall> Joshue108: technology should be mediating complexity

CH: To take Josh's point and get back to the question - I'd say no.

There are ways to meet the functional need that aren't persistent.

JK: Regulating the speed of info coming in is good - but that's also up to the user agent.

<CharlesHall> Provides captions and caption meta-data in alternative formats (for example, second screen or braille display) to allow users the opportunity to move caption and meta-data to alternative displays. For example, this benefits users without sound and vision, users who need assistive technology to magnify portions of the view, or users who have limited reach.

<CharlesHall> Provides customisation of caption style and position to support people with limited vision or color perception. Customisation options can benefit all users.

<CharlesHall> https://w3c.github.io/silver/subgroups/xr/captioning/functional-outcomes.html

<jeanne> Outcome 1: We Need Captions

<jeanne> 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited. User agents must support the display and control of captions.

+1

<jeanne> 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited. User agents support the display and control of captions.

<kirkwood> applications?

<jeanne> 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited. User agents and APIs support the display and control of captions.

<CharlesHall> 5) Provides customisation of caption timing to support people with limited manipulation, strength, or cognition

JS: As the XAUR updates, we should start sketching out guidelines based on the XAUR.

There is also the COGA work in the subgroup.

CH: There is also location- or context-sensitive captioning, etc.

That's not specified there

CH: There are triggers.

JS: We could add this as an editor's note?

CH: That's fine.

Writing tests for Captioning and XR

<jeanne> https://www.w3.org/WAI/WCAG21/quickref/#captions-prerecorded

RESOLUTION: change FO1 to include user agents and API. 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited. User agents and APIs support the display and control of captions.

RESOLUTION: change FO#2 to include distance. Conveys information about the sound in addition to the text of the sound (for example, sound source, duration, distance and direction) so users know the necessary information about the context of the sound in relation to the environment it is situated in

<CharlesHall> Joshue108: context sensitive reflow

OK - just to expand - customisable, context-sensitive reflow of captions, subtitles and text content in XR environments.

<jeanne> FO#1 - Silver XR needs to define what a key sound effect is, recognizing that it is largely an artistic decision that will vary by use. A videogame that protects against cheating would need different key sound effects from an educational video exploring an environment.

<CharlesHall> Functional Outcome 1, Need Captions. Technique: provide captions via a track. Test: enable captions in the environment and verify they are present.
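
[Editor's sketch, not part of the minutes: a minimal illustration of that technique/test pair for web-delivered media, assuming the environment exposes a standard HTMLMediaElement with a captions text track. The selector and function name are illustrative only.]

    // Hedged sketch: enable a captions track and verify one is present,
    // mirroring "Technique: provide captions via a track; Test: enable
    // captions in the environment and verify they are present".
    function enableCaptions(video: HTMLVideoElement): boolean {
      for (const track of Array.from(video.textTracks)) {
        if (track.kind === "captions") {
          track.mode = "showing"; // ask the user agent to display the cues
          return true;            // test passes: a captions track exists
        }
      }
      return false;               // test fails: no captions track provided
    }

    const video = document.querySelector<HTMLVideoElement>("video");
    console.log("captions present:", video ? enableCaptions(video) : false);

[In a non-web XR runtime the same test would still apply: turn captions on through whatever track mechanism the platform provides and confirm cues appear.]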

<jeanne> FO#2: Meta data of sound effects: - Disney has examples of distance and direction of sound
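
[Editor's sketch, not part of the minutes: a non-authoritative illustration of what such sound-effect meta-data could carry; the field names, units, and values are assumptions, not from any Silver draft.]

    // Hypothetical shape for caption meta-data about one sound effect,
    // covering the FO#2 fields: source, duration, distance, direction.
    interface SoundEffectCaptionMeta {
      text: string;             // caption text, e.g. "door slams"
      source: string;           // what produced the sound
      durationSeconds: number;  // how long the sound lasts
      distanceMeters: number;   // how far the sound is from the user
      directionDegrees: number; // bearing relative to the user's view
    }

    const example: SoundEffectCaptionMeta = {
      text: "door slams",
      source: "wooden door",
      durationSeconds: 1.2,
      distanceMeters: 4,
      directionDegrees: 270, // behind and to the left of the user
    };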

<bruce_bailey> https://www.fcc.gov/consumers/guides/closed-captioning-television

<jeanne> Jeanne notes that the above link is for FO#1

<bruce_bailey> Accurate: Captions must match the spoken words in the dialogue and convey background noises and other sounds to the fullest extent possible.

<bruce_bailey> Synchronous: Captions must coincide with their corresponding spoken words and sounds to the greatest extent possible and must be displayed on the screen at a speed that can be read by viewers.

<bruce_bailey> Complete: Captions must run from the beginning to the end of the program to the fullest extent possible.

CH: One of the ways to do that is to use text

<bruce_bailey> Properly placed: Captions should not block other important visual content on the screen, overlap one another or run off the edge of the video screen.

<jeanne> FO#4 Context sensitive reflow - the size and directionality are maintained even if the user moves or changes the size.
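
[Editor's sketch, not part of the minutes: one possible reading of "maintained even if the user moves" in XR terms, using made-up names and a simple angular-size heuristic; this is an assumption, not a defined technique.]

    // Hedged sketch: keep a caption's direction and apparent size stable
    // relative to its sound source as the user moves through the scene.
    interface Vec3 { x: number; y: number; z: number; }

    function captionPlacement(user: Vec3, source: Vec3, baseScale: number, baseDistance: number) {
      const dx = source.x - user.x;
      const dy = source.y - user.y;
      const dz = source.z - user.z;
      const distance = Math.hypot(dx, dy, dz);
      const yaw = Math.atan2(dx, dz);                       // direction toward the sound source
      const scale = baseScale * (distance / baseDistance);  // keeps apparent (angular) size roughly constant
      return { yaw, distance, scale };
    }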

<bruce_bailey> Phrase we use in 508 is "information and data"

+1 to Charles

JOC: We could add a new Outcome: We need text descriptions of sound effects, and then Outcome 3: We need more advanced meta data descriptions of sound effects.

https://www.w3.org/TR/raur/

<CharlesHall> Functional Outcome 1, Need Captions. Technique: provide captions via a track. Test: enable captions in the environment and verify they are present.

<CharlesHall> thanks

<kirkwood> had to drop for a 10

Summary of Action Items

Summary of Resolutions

  1. change FO1 to include user agents and API. 1) Translates speech and key sound effects into alternative formats (e.g. captions) so media can be understood when sound is unavailable or limited. User agents and APIs support the display and control of captions.
  2. change FO#2 to include distance. Conveys information about the sound in addition to the text of the sound (for example, sound source, duration, distance and direction) so users know the necessary information about the context of the sound in relation to the environment it is situated in
[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version (CVS log)
$Date: 2020/08/17 14:02:38 $
