This Wiki page is edited by participants of the HTML Accessibility Task Force. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Task Force participants, WAI, or W3C. It may also have some very useful information.

Track Kinds


Background

  • Support for multiple in-band tracks in HTML raises the question of how to determine what each track is. This could be done with external metadata, but this implies the page has a means to obtain and interpret that metadata, which rules out a number of architectures involving generic scripts and web page components.
  • Increasing interest in and standardization of adaptive streaming means that multiple in-band audio and video tracks become viable, because it is possible to transport only the active tracks.

Current status (4/28/11)

  • A getKind() method is included in the W3C editor’s draft
  • A specific set of values is defined in the draft, based on some of the values supported by the Ogg container format. This list is repeated below for reference
  • 3GPP have asked W3C if they will define kind values for accessibility purposes and have provided us with a list of the values they have defined for other (non-accessibility) purposes
  • MPEG is expected to align with 3GPP
  • This page is a work in progress and is being actively studied by the HTML Accessibility Task Force

Questions from 3GPP

In [1] they ask:

  1. whether our hope to recommend use of W3C ‘role’ names, in our specification, seems achievable and reasonable, in your opinion;
  2. your thinking on the set of names;
  3. your schedule for defining at least a stable initial set of names;
  4. whether you will define a URN to identify the set you define.

Proposed answers (proposed by Mark)

  1. Yes, except that we call them "kinds"
  2. (to be decided - see below)
  3. Initial set to be decided by LC deadline (May 22nd?)
  4. (to be decided - don't see why not)

The rationale for defining values in W3C, rather than simply reflecting what each container supports, is that scripts should not need to know which container format a track came from in order to interpret the values. The values should be defined in a “container-independent” way.
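As an illustration of this point (not drawn from any specification), the translation from container-specific role names to container-independent kind values would live in the user agent, not in page scripts. The following minimal TypeScript sketch uses a few Ogg and 3GPP values from the tables later on this page, together with the kind values from the editor's draft; the table and function names are hypothetical.

    // Hypothetical sketch of the mapping a user agent might apply while demuxing.
    // Page scripts would only ever see the container-independent values on the right.
    const OGG_ROLE_TO_KIND: Record<string, string> = {
      "video/main": "main",
      "video/alternate": "alternative",
      "video/sign": "sign",
      "audio/main": "main",
      "audio/dub": "translation",
      "audio/audiodesc": "description",
    };

    const GPP_VALUE_TO_KIND: Record<string, string> = {
      "main": "main",
      "alternate": "alternative",
      "commentary": "alternative",
      "dub": "translation",
    };

    // A script calling getKind() never needs to know which of these tables was used.
    function kindFromContainer(container: "ogg" | "3gpp", value: string): string | undefined {
      return container === "ogg" ? OGG_ROLE_TO_KIND[value] : GPP_VALUE_TO_KIND[value];
    }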

Kinds in the W3C Editor's draft

These values are returned by the getKind() method on an audio or video track and are documented here: http://dev.w3.org/html5/spec/video.html#dom-tracklist-getkind

  • alternative: A possible alternative to the main track, e.g. a different take of a song (audio), or a different angle (video).
  • description: An audio description of a video track.
  • main: The primary audio or video track.
  • sign: A sign-language interpretation of an audio track.
  • translation: A translated version of the main track.
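As a concrete, non-normative illustration, here is a short TypeScript sketch of how a page script might use these values to switch on an audio description track. The TrackList and MultipleTrackList shapes are assumed from the editor's draft cited above (length, getKind(index), getLabel(index), enable(index)); the exact member names may differ, and the helper function itself is hypothetical.

    // Track list shapes assumed from the editor's draft; details may differ.
    interface TrackList {
      readonly length: number;
      getKind(index: number): string;   // e.g. "main", "description", "sign", ...
      getLabel(index: number): string;  // human-readable label for UI purposes
    }
    interface MultipleTrackList extends TrackList {
      isEnabled(index: number): boolean;
      enable(index: number): void;
      disable(index: number): void;
    }

    // Hypothetical helper: enable the first in-band audio description track, if any.
    function enableAudioDescription(audioTracks: MultipleTrackList): boolean {
      for (let i = 0; i < audioTracks.length; i++) {
        if (audioTracks.getKind(i) === "description") {
          audioTracks.enable(i);
          return true;
        }
      }
      return false; // no description track is present in this resource
    }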

Kinds that have been suggested

This section collects all the "kinds" that have been suggested, together with their sources. There may be duplicates; one purpose of the list is to identify them.

  • video/main (Ogg "roles" [2]): The main video track. Proposal: same as main.
  • video/alternate (Ogg "roles" [2]): A possible alternative to the main track, e.g. a different camera angle. Proposal: same as alternative.
  • video/sign (Ogg "roles" [2]): A sign-language video track. Proposal: same as sign.
  • audio/main (Ogg "roles" [2]): The main audio track. Proposal: same as main.
  • audio/dub (Ogg "roles" [2]): The audio track but with speech in a different language from the original. Proposal: same as translation.
  • audio/audiodesc (Ogg "roles" [2]): An audio description recording for the vision-impaired. Proposal: same as description.
  • audio/music (Ogg "roles" [2]): A music track, e.g. when music, speech and sound effects are delivered in different tracks. Proposal: not required.
  • audio/speech (Ogg "roles" [2]): A speech track, e.g. when music, speech and sound effects are delivered in different tracks. Proposal: not required.
  • audio/sfx (Ogg "roles" [2]): A sound effects track, e.g. when music, speech and sound effects are delivered in different tracks. Proposal: not required.
  • main (3GPP Liaison [1]): This stream is part of the main program content. Proposal: same as main.
  • supplementary (3GPP Liaison [1]): For a main program that is audio, a supplementary video stream might provide, for example, dynamic graphics. Proposal: not required.
  • alternate (3GPP Liaison [1]): Such a stream might provide a different camera viewpoint (3GPP strongly recommend the provision of further annotations to clarify the nature of the alternative). Proposal: same as alternative.
  • commentary (3GPP Liaison [1]): Proposal: use alternative.
  • dub (3GPP Liaison [1]): An alternative audio stream that contains a non-original language. Proposal: same as translation.
  • captions (3GPP Liaison [1]): Proposal: add to specification.
  • subtitles (3GPP Liaison [1]): Proposal: add to specification.
  • sign language (a11y TF): Proposal: same as sign.
  • Captions (a11y TF): As in burnt-in captions. Proposal: add to specification.
  • Different camera angles (a11y TF): Proposal: same as alternative.
  • Video mosaic (a11y TF): Very specific use case. Proposal: handle at page level.
  • Language dub (a11y TF): Proposal: same as translation.
  • Audio descriptions (a11y TF): Proposal: same as description above.
  • Commentary (a11y TF): As in director’s commentary. Proposal: use alternative.
  • Clear audio (a11y TF): See http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements#Clear_audio. Proposal: add to specification.
  • Highcontrast (David Singer): Requirements unclear. Proposal: needs further study.
  • Lowcontrast (David Singer): Requirements unclear. Proposal: needs further study.
  • Colour blindness adjustments (David Singer): Requirements unclear. Proposal: needs further study.
  • Cognitive adjustments (David Singer): Requirements unclear. Proposal: needs further study.
  • Repetitive stimulus safe (David Singer): Not clear whether this is a kind rather than some other kind of property. Proposal: needs further study.

New kinds proposed for the specification

This section documents the kinds which are proposed to be added to the specification.

  • captions: Video with open ("burned in") captions.
  • subtitles: Video with open ("burned in") subtitles.
  • clearaudio: An alternative audio track in which sounds other than dialog and other important non-speech information are attenuated. See http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements#Clear_audio
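If these values are adopted, scripts could test for them in exactly the same way as for the existing kinds. A hedged TypeScript sketch follows; the preference flag and function are invented for the example, and the minimal track-list shape is assumed from the editor's draft.

    // Hypothetical: prefer a "clearaudio" track when the user has asked for it,
    // otherwise fall back to the main audio track.
    function chooseAudioTrackIndex(
      audioTracks: { length: number; getKind(index: number): string },
      wantsClearAudio: boolean
    ): number {
      let mainIndex = -1;
      for (let i = 0; i < audioTracks.length; i++) {
        const kind = audioTracks.getKind(i);
        if (wantsClearAudio && kind === "clearaudio") return i;
        if (kind === "main" && mainIndex < 0) mainIndex = i;
      }
      return mainIndex; // -1 if no main track was found
    }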

The general rationale for exposing kind values is that they give user agents and script developers a means to determine what to do with a track: chiefly, whether a track should be turned on in addition to the main track, and whether it carries accessibility or internationalization content that the UA or script can then enable. Kind values may also be used to present suitable UI elements for enabling and disabling tracks. Human-readable information that might assist user selection would go into getLabel() instead.
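To illustrate that split between machine-readable and human-readable information, here is a small hedged TypeScript sketch that combines getKind() and getLabel() to build a track selection menu; the track-list shape is again assumed from the editor's draft, and the menu structure is invented for the example.

    // Hypothetical menu entry combining machine- and human-readable track metadata.
    interface TrackMenuEntry {
      index: number;
      kind: string;   // machine-readable: drives default enable/disable behaviour
      label: string;  // human-readable: shown to the user in the menu
    }

    function buildTrackMenu(
      tracks: { length: number; getKind(index: number): string; getLabel(index: number): string }
    ): TrackMenuEntry[] {
      const entries: TrackMenuEntry[] = [];
      for (let i = 0; i < tracks.length; i++) {
        entries.push({ index: i, kind: tracks.getKind(i), label: tracks.getLabel(i) });
      }
      return entries;
    }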

References

[1] http://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_64/Docs/S4-110502.zip

[2] http://wiki.xiph.org/SkeletonHeaders#Role