This Wiki page is edited by participants of the HTML Accessibility Task Force. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Task Force participants, WAI, or W3C. It may also have some very useful information.

Media Multitrack Issue Tracking


Open issues in multitrack

We received the following change proposals.

This document records issues that cut across all of these proposals, in order to develop a consensus and then, potentially, to file individual bugs with proposed changes.

Discovery of inband tracks

Currently the proposed changes allow either:

  • Discovery and switching of just the audio tracks (CP1).
  • Discovery of the audio and video tracks, with mutually exclusive switching between the video tracks and multiple selection of the audio tracks (CP2).
  • Control of in-band tracks declared with markup and referenced with Media Fragments, but no discovery of in-band tracks (CP3 and CP4).


  • It is required to support multiple and exclusive selection of all track types (within named subsets). This is addressed in CP3 and CP4, but only for out-of-band tracks.
  • It is required to discover in-band tracks and their kinds and names, for cases where the page author is not aware in advance, or by other means, of what tracks are provided in the resource.
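
To make the discovery requirement concrete, here is a minimal sketch of how a script might filter discovered tracks by type and kind. The DiscoveredTrack shape, the discoverInBand function and the sample data are all assumptions made for illustration; none of them come from the change proposals.

```typescript
// Hypothetical shapes: none of these names come from the change proposals.
type TrackType = "audio" | "video" | "text";

interface DiscoveredTrack {
  id: string;
  type: TrackType;
  kind: string;    // e.g. "captions", "description", "sign"
  label: string;   // human-readable name
  inBand: boolean; // true if the track is carried inside the media resource
}

// Return the in-band tracks of a given type, optionally narrowed by kind.
function discoverInBand(
  tracks: readonly DiscoveredTrack[],
  type: TrackType,
  kind?: string,
): DiscoveredTrack[] {
  return tracks.filter(
    (t) => t.inBand && t.type === type && (kind === undefined || t.kind === kind),
  );
}

// Example resource: the page author did not know about these tracks in advance.
const resourceTracks: DiscoveredTrack[] = [
  { id: "a1", type: "audio", kind: "main", label: "English", inBand: true },
  { id: "a2", type: "audio", kind: "description", label: "English description", inBand: true },
  { id: "v1", type: "video", kind: "main", label: "Programme", inBand: true },
  { id: "v2", type: "video", kind: "sign", label: "Sign translation", inBand: true },
  { id: "t1", type: "text", kind: "captions", label: "English captions", inBand: false },
];

const signTracks = discoverInBand(resourceTracks, "video", "sign");
console.log(signTracks.map((t) => t.label)); // labels a track menu could show
```

With kinds and names exposed this way, a page could build a track menu without any markup describing the tracks.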

Position and styling of tracks

Positioning and styling are not consistent across in-band tracks, out-of-band tracks and mediagroup.


  • Lift the restriction that limits the position of text tracks to the video viewport.
  • Develop a ::track pseudo-selector that can target style at any type of track, whether in-band or out-of-band.

Mutually exclusive and multiple tracks

Scenarios exist in all modalities for both exclusive (replacement) switching and multiple (additive) selection of tracks:

  • audio: description, commentary (additive); dubbing, clear audio (replacement)
  • video: sign translation, PiP (additive); high-contrast, alternate-angle (replacement)
  • text: captions (replacement); subtitles (additive)


  • Solutions should support multiple and exclusive selection of all track types within named subsets, both in-band and out-of-band. This is partially addressed in CP3 and CP4 for out-of-band tracks.
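
The difference between replacement and additive selection within named subsets can be sketched as follows. This is only an illustration under assumed names: SelectableTrack, Policy, enableTrack and the subset names are hypothetical and are not part of any change proposal.

```typescript
// Hypothetical model: "subset" and "policy" are illustrative names only.
type Policy = "exclusive" | "additive";

interface SelectableTrack {
  id: string;
  subset: string; // named subset the track belongs to
  enabled: boolean;
}

// Enable one track. In an exclusive subset this replaces the current choice;
// in an additive subset the other tracks keep their state.
function enableTrack(
  tracks: readonly SelectableTrack[],
  policies: ReadonlyMap<string, Policy>,
  id: string,
): SelectableTrack[] {
  const target = tracks.find((t) => t.id === id);
  if (target === undefined) {
    throw new Error(`unknown track: ${id}`);
  }
  const exclusive = policies.get(target.subset) === "exclusive";
  return tracks.map((t) => {
    if (t.id === id) return { ...t, enabled: true };
    if (exclusive && t.subset === target.subset) return { ...t, enabled: false };
    return t;
  });
}

const policies = new Map<string, Policy>([
  ["dubbing", "exclusive"],   // replacement switching
  ["commentary", "additive"], // additive selection
]);

let trackState: SelectableTrack[] = [
  { id: "dub-en", subset: "dubbing", enabled: true },
  { id: "dub-fr", subset: "dubbing", enabled: false },
  { id: "director", subset: "commentary", enabled: true },
  { id: "cast", subset: "commentary", enabled: false },
];

trackState = enableTrack(trackState, policies, "dub-fr"); // turns dub-en off
trackState = enableTrack(trackState, policies, "cast");   // director stays on
console.log(trackState.filter((t) => t.enabled).map((t) => t.id));
```

The same function serves audio, video and text tracks, which is the point of the requirement: the policy belongs to the subset, not to the track type.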

Consistent handling of track types video, audio, text.


  • Currently there are special rules for handling text types, but this approach is flawed in that a text track must download completely (or fail) before processing can proceed. In order to support live generated streams of captions being synchronised in a multi-track presentation, the network handling of text should be unified with that of binary media tracks.
  • A mechanism is needed to promote network object sharing between tracks that are explicitly named in the markup and share the same underlying media file.

Example: a media source is the URL (possibly expanded into a fragment) plus type and media information.

An HTMLNetworkSource is the network-connected object that actually gets the bits.

interface HTMLNetworkSource {
 // network state
 const unsigned short NETWORK_EMPTY = 0;
 const unsigned short NETWORK_IDLE = 1;
 const unsigned short NETWORK_LOADING = 2;
 const unsigned short NETWORK_NO_SOURCE = 3;
 readonly attribute unsigned short networkState;
          attribute DOMString preload;
 readonly attribute TimeRanges buffered;
 void load();
 // ready state
 const unsigned short HAVE_NOTHING = 0;
 const unsigned short HAVE_METADATA = 1;
 const unsigned short HAVE_CURRENT_DATA = 2;
 const unsigned short HAVE_FUTURE_DATA = 3;
 const unsigned short HAVE_ENOUGH_DATA = 4;
 readonly attribute unsigned short readyState;
 // error state
 readonly attribute MediaError error;
};

interface HTMLMediaSource {
 readonly attribute HTMLNetworkSource netSrc;
          attribute DOMString src;  // can be a fragment URI
          attribute DOMString type;
          attribute DOMString media;
};
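
One possible way to share a network object between media sources that point into the same file is to key it on the URL with the fragment removed. The sketch below assumes this; NetworkSource, NetworkSourcePool, acquire and the example URLs are hypothetical names chosen for illustration, and the class is only a stand-in for the HTMLNetworkSource interface above.

```typescript
// Hypothetical stand-in for the HTMLNetworkSource interface sketched above.
class NetworkSource {
  readonly resourceUrl: string;
  constructor(resourceUrl: string) {
    this.resourceUrl = resourceUrl;
  }
}

// Hands out one NetworkSource per underlying resource, ignoring the fragment.
class NetworkSourcePool {
  private readonly sources = new Map<string, NetworkSource>();

  acquire(src: string): NetworkSource {
    const hash = src.indexOf("#");
    const key = hash === -1 ? src : src.slice(0, hash);
    let source = this.sources.get(key);
    if (source === undefined) {
      source = new NetworkSource(key);
      this.sources.set(key, source);
    }
    return source;
  }

  get size(): number {
    return this.sources.size;
  }
}

const pool = new NetworkSourcePool();
// Two tracks in the same file, addressed with fragment URIs.
const mainSrc = pool.acquire("http://example.com/movie.ogv#track=video");
const signSrc = pool.acquire("http://example.com/movie.ogv#track=sign");
const captionSrc = pool.acquire("http://example.com/captions.vtt");

console.log(mainSrc === signSrc, pool.size);
```

Here the two fragment URIs resolve to the same object, so the file would be fetched once, while the separate caption file gets its own network source.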

Media fetching.


  • Unification of the text track and media track network object.
  • Sharing of the network object between multiple tracks using in-band resources.
  • Simplify the IDL to unify the three track types.


Base class for all track type objects.

interface HTMLMediaTrack {
 readonly attribute HTMLMediaSource mediaSource;  // may be null ?
 const unsigned short OFF = 0;
 const unsigned short INACTIVE = 1;
 const unsigned short ACTIVE = 2;
          attribute unsigned short mode;
          attribute Function onmodechange;
};

A text track

interface TextTrack : HTMLMediaTrack {
 readonly attribute TextTrackCueList cues;
 readonly attribute TextTrackCueList activeCues;
          // event raised if a cue becomes active/inactive
          // with target being the activated/deactivated TextTrackCue
          attribute Function oncueenter;
          attribute Function oncueexit;
};

A video track

interface VideoTrack : HTMLMediaTrack {
 // information about the video here. For element see VideoElement
 readonly attribute unsigned long videoWidth;
 readonly attribute unsigned long videoHeight;
};

An audio track

interface AudioTrack : HTMLMediaTrack {
          attribute boolean muted;
          attribute double volume;
};

A collection of tracks

interface MediaTracksCollection {
 readonly attribute unsigned long length;
 getter HTMLMediaTrack (in unsigned long index);
 HTMLMediaTrack getTrackById(in DOMString id);
};
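
As a rough illustration of how a script might use such a collection, the sketch below models the mode attribute, the onmodechange handler and getTrackById in TypeScript. It assumes that lookups return null when nothing matches and that the handler receives the track itself. The method name item stands in for the anonymous index getter. These choices are assumptions and are not stated in the IDL.

```typescript
// Mode constants mirror the HTMLMediaTrack sketch; everything else is hypothetical.
const OFF = 0;
const INACTIVE = 1;
const ACTIVE = 2;
type Mode = typeof OFF | typeof INACTIVE | typeof ACTIVE;

class MediaTrack {
  readonly id: string;
  onmodechange: ((track: MediaTrack) => void) | null = null;
  private currentMode: Mode = OFF;

  constructor(id: string) {
    this.id = id;
  }

  get mode(): Mode {
    return this.currentMode;
  }

  // Setting a different mode notifies the handler, if one is registered.
  set mode(value: Mode) {
    if (value === this.currentMode) return;
    this.currentMode = value;
    if (this.onmodechange !== null) this.onmodechange(this);
  }
}

class MediaTracksCollection {
  private readonly tracks: MediaTrack[];

  constructor(tracks: MediaTrack[]) {
    this.tracks = [...tracks];
  }

  get length(): number {
    return this.tracks.length;
  }

  // Stands in for the anonymous index getter in the IDL.
  item(index: number): MediaTrack | null {
    return this.tracks[index] ?? null;
  }

  getTrackById(id: string): MediaTrack | null {
    return this.tracks.find((t) => t.id === id) ?? null;
  }
}

const collection = new MediaTracksCollection([
  new MediaTrack("captions-en"),
  new MediaTrack("sign"),
]);
const changes: string[] = [];
const captions = collection.getTrackById("captions-en");
if (captions !== null) {
  captions.onmodechange = (t) => changes.push(`${t.id}:${t.mode}`);
  captions.mode = ACTIVE;
  captions.mode = ACTIVE; // no change, so no second notification
}
console.log(changes);
```

Because every track type derives from the same base, the same collection and the same mode handling would apply to text, video and audio tracks alike.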

Linking of timeline approaches

The markup that relates media elements to one another is not agreed. The potential approaches are:

  • inline markup
  • implicit synchronisation for in-band
  • using a timeline attribute to point to master element
  • using a mediagroup attribute on all

A mediagroup provides for a simpler API; however, it should:

  • Always exist for any media item, even if the item is in an anonymous singleton mediagroup (created by the implementation).
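
The idea that every media item always belongs to a group can be sketched as follows. This is a minimal illustration: MediaItem, resolveGroups and the "named:" and "anon:" key prefixes are assumptions, not proposed API.

```typescript
// Hypothetical model of elements that may or may not carry a mediagroup attribute.
interface MediaItem {
  id: string;
  mediagroup?: string;
}

// Group items by mediagroup name. An item without the attribute is placed in
// its own anonymous singleton group, so every item belongs to exactly one group.
function resolveGroups(items: readonly MediaItem[]): Map<string, MediaItem[]> {
  const groups = new Map<string, MediaItem[]>();
  let anonymous = 0;
  for (const item of items) {
    // The prefixes are an arbitrary way to keep anonymous keys distinct
    // from author-supplied names.
    const key = item.mediagroup !== undefined
      ? `named:${item.mediagroup}`
      : `anon:${anonymous++}`;
    const members = groups.get(key);
    if (members === undefined) {
      groups.set(key, [item]);
    } else {
      members.push(item);
    }
  }
  return groups;
}

const groups = resolveGroups([
  { id: "movie", mediagroup: "feature" },
  { id: "sign", mediagroup: "feature" },
  { id: "trailer" },
]);
console.log(Array.from(groups.keys()));
```

Script that controls playback can then always address a group, whether the author declared one or not.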

API to control a collection

  • controls on main element
  • controller object (implied or explicit)

IDL of track types.

  • consistent display state/activity between media and text types

Media issues


  • Ad insertion and syncing. A timed text track, sign video, etc. would be authored against the basic media, but in many online cases there is a requirement to insert (or prepend) additional, unrelated ad media. We need to be able to ignore this material for sync purposes, and possibly to switch to other sync material.
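
One way to ignore inserted ad material for sync purposes is to subtract the ad time that has already played from the presentation time. The sketch below assumes that the ad positions are known to the script. AdBreak, toContentTime and the sample timings are hypothetical.

```typescript
// Hypothetical description of an ad spliced into the presentation timeline.
interface AdBreak {
  start: number;    // presentation time at which the ad begins, in seconds
  duration: number; // length of the ad, in seconds
}

// Map a presentation time to the time on the original (ad-free) media,
// which is the timeline the captions or sign video were authored against.
// Returns null while an ad is playing, when there is nothing to sync to.
function toContentTime(presentationTime: number, ads: readonly AdBreak[]): number | null {
  let adSecondsBefore = 0;
  const ordered = [...ads].sort((a, b) => a.start - b.start);
  for (const ad of ordered) {
    if (presentationTime < ad.start) break;
    if (presentationTime < ad.start + ad.duration) return null;
    adSecondsBefore += ad.duration;
  }
  return presentationTime - adSecondsBefore;
}

const ads: AdBreak[] = [
  { start: 0, duration: 30 },   // pre-roll
  { start: 330, duration: 20 }, // mid-roll, after 300 s of content
];

console.log(toContentTime(10, ads));  // null: inside the pre-roll
console.log(toContentTime(45, ads));  // 15: fifteen seconds into the content
console.log(toContentTime(400, ads)); // 350: both ads are subtracted
```

A null result is also the point at which an implementation could switch to other sync material for the ad itself.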