This Wiki page is edited by participants of the HTML Accessibility Task Force. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Task Force participants, WAI, or W3C. It may also have some very useful information.
TTML Mapping to Requirements
Technical Requirements Mapping for TTML
TTML is a self-contained XML format for describing the synchronized display of formatted text that can be associated with a given timeline.
Its mapping to the media accessibility requirements (initial draft) is:
|Reference||Requirement Brief Description||Types of Technologies affected||Technology addresses requirement|
|(DV-1)||Provide an indication that descriptions are available, and are active/non-active.||audio rendering, user interface, API, user preferences, markup||TTML does not specifically address audio rendering, but it has markup to indicate text as description, and is capable of embedding SSML or similar.|
|(DV-2)||Render descriptions in a time-synchronized manner, using the media resource as the timebase master.||audio rendering, synchronization||TTML rendering can be based on a media resource using the media clock mode.|
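A minimal sketch of this (times and cue text are invented for illustration): the ttp:timeBase parameter on the root tt element ties all begin/end values to the media resource's timeline.

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:timeBase="media" xml:lang="en">
  <body>
    <div>
      <!-- begin/end are offsets on the media resource's timeline -->
      <p begin="00:00:05.000" end="00:00:07.500">First cue</p>
    </div>
  </body>
</tt>
```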
|(DV-3)||Support multiple description tracks (e.g., discrete tracks containing different levels of detail).||API, multitrack, synchronization, navigation, markup, user interface||Each track can be a separate TTML file, or enclosed as separate divs in a single file.|
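For example, two levels of description could be sketched as separate divs in one document (the xml:id values are invented; the fragment assumes the ttm prefix is bound to the TTML metadata namespace):

```xml
<body>
  <!-- brief descriptions -->
  <div xml:id="descBrief" ttm:role="description">
    <p begin="5s" end="7s">A city street at night.</p>
  </div>
  <!-- more detailed descriptions of the same scenes -->
  <div xml:id="descDetailed" ttm:role="description">
    <p begin="5s" end="9s">A rain-soaked city street at night, lit by shop signs.</p>
  </div>
</body>
```

A player would select one div (e.g. by xml:id) and ignore the other.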
|(DV-4)||Support recordings of real human speech as part of a media resource, or as an external file.||synchronization, multitrack, API, markup||N/A|
|(DV-5)||Allow the author to independently adjust the volumes of the audio description and original soundtracks.||audio rendering, API, user interface||N/A|
|(DV-6)||Allow the user to independently adjust the volumes of the audio description and original soundtracks, with the user's settings overriding the author's.||user preferences, API, user interface||N/A|
|(DV-7)||Permit smooth changes in volume rather than stepped changes. The degree and speed of volume change should be under provider control.||audio rendering, user interface, API||N/A|
|(DV-8)||Allow the author to provide fade and pan controls to be accurately synchronised with the original soundtrack.||audio rendering, user interface||N/A|
|(DV-9)||Allow the author to use a codec which is optimised for voice only, rather than requiring the same codec as the original soundtrack.||codecs||N/A|
|(DV-10)||Allow the user to select from among different languages of descriptions, if available, even if they are different from the language of the main soundtrack.||markup, API, user interface||N/A|
|(DV-11)||Support the simultaneous playback of both the described and non-described audio tracks so that one may be directed at separate outputs (e.g., a speaker and headphones).||user interface, audio rendering||N/A|
|(DV-12)||Provide a means to prevent descriptions from carrying over from one program or channel when the user switches to a different program or channel.||synchronization||N/A|
|(DV-13)||Allow the user to relocate the description track within the audio field, with the user setting overriding the author setting. The setting should be re-adjustable as the media plays.||user preferences, audio rendering||N/A|
|(DV-14)||Support metadata, such as copyright information, usage rights, language, etc.||cue format, in-band cues, multitrack||TTML supports arbitrary metadata.|
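An illustrative fragment (title and notice invented) using the TTML metadata vocabulary alongside the generic metadata element:

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xml:lang="en">
  <head>
    <metadata>
      <ttm:title>Example programme</ttm:title>
      <ttm:copyright>Copyright 2011 Example Broadcaster</ttm:copyright>
      <!-- elements from foreign namespaces (e.g. usage rights) may be added here -->
    </metadata>
  </head>
  <body/>
</tt>
```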
|Text video description|
|(TVD-1)||Support presentation of text video descriptions through a screen reader or braille device||cue format, audio rendering, visual rendering, synchronization, API, markup, speech synthesis||TTML has markup to indicate text as description, and is capable of embedding SSML or similar.|
|(TVD-1) cont||support playback speed control and voice control and synchronization points with the video.||user interface, speech synthesis, cue format, audio rendering, synchronization, API, markup||TTML is inherently synchronised with external media.|
|(TVD-2)||TVDs need to be provided in a format that contains start time, text per description cue (the duration is determined dynamically, though an end time could provide a cut point)||cue format||TTML would need to adopt the excl time container mode for this.|
|(TVD-2) cont||TVDs need to be provided in a format that contains possibly a speech-synthesis markup to improve quality of the description||speech synthesis||TTML can embed SSML in its own namespace.|
|(TVD-2) cont||TVDs need to be provided in a format that contains accompanying metadata labeling for speakers, language, etc.||cue format, audio rendering, speech synthesis||TTML contains specific metadata for this, and can add arbitrary metadata.|
|(TVD-3)||Where possible, provide a text or separate audio track privately to those that need it in a mixed-viewing situation, e.g., through headphones.||audio rendering||it could be possible to use the TTML region mapping for descriptions to achieve this.|
|(TVD-4)||Where possible, provide options for authors and users to deal with the overflow case: continue reading, stop reading, and pause the video.||cue format, rendering, user interface||TTML timing can describe these situations.|
|(TVD-5)||Support the control over speech-synthesis playback speed, volume and voice, and provide synchronisation points with the video.||user interface, audio rendering, speech synthesis, synchronization||N/A (use embedded SSML)|
|Extended video descriptions|
|(EVD-1)||Support detailed user control as specified in (TVD-4) for extended video descriptions.||cue format, rendering, user interface, API||N/A|
|(EVD-2)||Support automatically pausing the video and main audio tracks in order to play a lengthy description.||rendering, user interface, API||N/A|
|(EVD-3)||Support resuming playback of video and main audio tracks when the description is finished.||rendering, API||N/A|
|Clear audio|
|(CA-1)||Support speech as a separate, alternative audio track from other sounds.||synchronization, multitrack, API||N/A|
|(CA-2)||Support the synchronisation of multitrack audio either within the same file or from separate files - preferably both.||synchronization, multitrack, API, markup||N/A|
|(CA-3)||Support separate volume control of the different audio tracks.||user interface, API||N/A|
|(CA-4)||Support pre-emphasis filters, pitch-shifting, and other audio-processing algorithms.||audio rendering, API, user interface||N/A|
|Content navigation by content structure|
|(CN-1)||Provide a means to structure media resources so that users can navigate them by semantic content structure, e.g. through adding a track to the video that contains navigation markers (in table-of-content style). This means must allow authors to identify ancillary content structures. Support keeping all media representations synchronised when users navigate.||Multitrack, synchronisation, api, markup, navigation, user interface.||N/A|
|(CN-2)||The navigation track should provide for hierarchical structures with titles for the sections.||cue format, multi track, synchronisation, api, markup, navigation, user interface.||N/A|
|(CN-3)||Support both global navigation by the larger structural elements of a media work, and also the most localized atomic structures of that work, even though authors may not have marked-up all levels of navigational granularity.||Multitrack, synchronisation, api, markup, navigation, user interface.||N/A|
|(CN-4)||Support third-party provided structural navigation markup.||cue format, Multitrack, synchronisation, api, markup, navigation, user interface.||N/A|
|(CN-5)||Keep all content representations in sync, so that moving to any particular structural element in media content also moves to the corresponding point in all provided alternate media representations (captions, described video, transcripts, etc) associated with that work.||Multitrack, synchronisation, api, markup, navigation, user interface.||N/A|
|(CN-6)||Support direct access to any structural element, possibly through URIs.||Multitrack, api||N/A|
|(CN-7)||Support pausing primary content traversal to provide access to such ancillary content in line.||Multitrack, synchronisation, api, user preferences, markup, user interface||N/A|
|(CN-8)||Support skipping of ancillary content in order to not interrupt content flow.||Multitrack, synchronisation, api, user preferences, markup, user interface||N/A|
|(CN-9)||Support access to each ancillary content item, including with "next" and "previous" controls, apart from accessing the primary content of the title.||Multitrack, synchronisation, api, markup, navigation, user interface||N/A|
|(CN-10)||Support that in bilingual texts both the original and translated texts can appear on screen, with both the original and translated text highlighted, line by line, in sync with the audio narration.||cue format, in-band cues, multi track, synchronisation, api, rendering (video), internationalization, navigation, user preferences, markup, user interface||N/A|
|Captioning|
|(CC-1)||render time-synchronized cues along the media timebase||cue format, in-band cues, synchronisation, API, rendering, user preferences||TTML can use media time, and can specify cues at frame intervals or in absolute time.|
|(CC-2)||allow erasures, i.e. times when no text cues are active||cue format, in-band cues||Example:
<p begin='1s' dur='1s'>c1 </p> <p begin='3s' dur='1s'>c2 </p>
Regions have a property to determine whether the background should remain visible.|
|(CC-3)||allow gap-less cues||cue format||Example (consecutive cues in a seq time container):
<p dur='1s'>c1 </p>
<p dur='1s'>c2 </p>|
|(CC-4)||specify a character encoding||cue format, internationalisation||TTML is XML and, in addition to the encodings required by XML, can specify any IANA-registered character encoding.|
|(CC-5)||positioning on all parts of the screen, inside and outside the video viewport||cue format, rendering, user preferences||region supports origin and extent attributes. These can be logical, e.g. in terms of percentages of the video frame (values greater than 100% and less than 0% are supported for off-video positions), or in pixel-precise measures.|
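A sketch of region positioning (region names invented; assumes the tts prefix is bound to the TTML styling namespace):

```xml
<head>
  <layout>
    <!-- a band inside the lower part of the video frame -->
    <region xml:id="bottomBand" tts:origin="10% 80%" tts:extent="80% 15%"/>
    <!-- an origin at or beyond 100% places the region outside the video -->
    <region xml:id="belowVideo" tts:origin="0% 100%" tts:extent="100% 15%"/>
  </layout>
</head>
```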
|(CC-6)||display of multiple text cues at the same time||cue format, rendering, user preferences||TTML supports multiple regions, and multiple cues within a region.|
|(CC-7)||display of multiple text cues also in ltr or rtl languages||cue format, rendering, internationalisation||writing direction is specifiable on a per-element basis.|
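For instance (assuming the tts prefix is bound to the TTML styling namespace):

```xml
<p begin="1s" end="4s">
  <span tts:direction="ltr">left-to-right text</span>
  <span tts:direction="rtl" tts:unicodeBidi="embed">نص من اليمين إلى اليسار</span>
</p>
```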
|(CC-8)||allow explicit line breaks||cue format, rendering||TTML supports the br element as well as the wrapOption style attribute (wrap/noWrap).|
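For example:

```xml
<!-- an explicit break, with automatic wrapping suppressed -->
<p begin="2s" end="4s" tts:wrapOption="noWrap">
  first caption line<br/>
  second caption line
</p>
```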
|(CC-9)||allow a range of font faces and sizes||cue format, rendering, user preferences||TTML supports the fontFamily and fontSize style properties.|
|(CC-10)||allow background colors and background opacity||cue format, rendering, user preferences||TTML supports the backgroundColor style property on any element, which can be specified as an rgba value. It also supports an overall opacity on regions.|
|(CC-11)||allow text colors and opacity||cue format, rendering, user preferences||TTML supports the color style property on any element, which can be specified as an rgba value.|
|(CC-12)||allow thicker outline or a drop shadow on text||cue format, rendering, user preferences||TTML supports the textOutline style property, which has both a thickness and a blur radius.|
|(CC-13)||enable/disable continuation of background color on erasures||cue format, rendering, user preferences||region supports the showBackground style property, which controls when region backgrounds should be present. This is animatable.|
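A sketch (region name invented): showBackground can be set statically on a region and also animated with set.

```xml
<region xml:id="captionArea"
        tts:backgroundColor="black"
        tts:showBackground="whenActive">
  <!-- after 10s, keep the background visible even during erasures -->
  <set begin="10s" tts:showBackground="always"/>
</region>
```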
|(CC-14)||allow cue text rendering effects, e.g. paint on, pop on, roll up, appear||cue format, rendering, user preferences||TTML timing can support these down to the character level if required; examples of each are given in the W3C TTML test suite.|
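As an illustration of a paint-on style effect (times invented), spans inside a cue can be given their own begin times while the enclosing paragraph stays visible:

```xml
<p begin="0s" end="4s">
  <span begin="0s">Words </span><span begin="1s">appear </span><span begin="2s">in turn</span>
</p>
```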
|(CC-15)||support bottom 1/12 rendering rule||cue format, rendering, user preferences||regions can be specified using a percentage of the video frame. Text alignment within regions can be to the after edge.|
|(CC-16)||support mixed language cues||cue format, rendering, internationalisation||TTML supports xml:lang on any text element, and is Unicode based.|
|(CC-17)||support mixed language cue files||cue format, rendering, internationalisation, API||TTML supports xml:lang on any text element, including the root (where it is required), and is Unicode based.|
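For example:

```xml
<p begin="1s" end="4s" xml:lang="en">
  Hello, or <span xml:lang="fr">bonjour</span>
</p>
```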
|(CC-18)||support furigana, ruby and other common typographical conventions||cue format, rendering, internationalisation||TTML styling can support the visual effects. It is XML based and can include elements from foreign namespaces for such markup.|
|(CC-19)||support full range of typographical glyphs, layout and punctuation marks||cue format, rendering, internationalisation||TTML supports full Unicode and is font based.|
|(CC-20)||support semantic markup of mixed language cues||cue format, rendering, internationalisation, speech synthesis||TTML supports xml:lang on any text element, and is Unicode based.|
|(CC-21)||support semantic markup of different speakers||cue format, rendering, speech synthesis||TTML has the <ttm:agent> element to describe different speaking agents, and the ttm:agent attribute to refer to these.|
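An illustrative sketch (agent id and name invented; assumes the ttm prefix is bound to the TTML metadata namespace):

```xml
<head>
  <metadata>
    <ttm:agent type="character" xml:id="narrator">
      <ttm:name type="alias">NARRATOR</ttm:name>
    </ttm:agent>
  </metadata>
</head>
<!-- later, in the body, cues refer back to the agent: -->
<body>
  <div>
    <p begin="1s" end="3s" ttm:agent="narrator">Once upon a time</p>
  </div>
</body>
```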
|(CC-22)||support the same API for in-band and external cue formats||cue format, API||TTML is in active use on the internet today for both in-band (MediaRoom) and out-of-band captioning (e.g. BBC iPlayer).|
|(CC-23)||synchronized display of cue text and media data||cue format, API, synchronisation||TTML can use media time, and can specify cues at frame intervals or in absolute time (ms).|
|(CC-24)||support user activation/deactivation of cue tracks||API, synchronisation, user preferences||N/A - this is a user agent requirement.|
|(CC-25)||support edited and verbatim caption alternatives||API, synchronisation, user preferences||These can be as separate files, or as separate divs within one file|
|(CC-26)||support several cue tracks in different languages||API, synchronisation, user preferences||These can be as separate files, or as separate divs within one file|
|(CC-27)||support live captioning||cue format, API, synchronisation, user preferences||TTML can be produced in real time (e.g. from CEA-608 caption data). There are a variety of mechanisms for delivering data serially in real time using XML; although this is outside the scope of the TTML format specifically, it is in active use for in-band systems.|
|Enhanced captions/subtitles|
|(ECC-1)||support metadata markup of cue segments||cue format||TTML has the <metadata> element, which can include arbitrary data.|
|(ECC-2)||support hyperlinking on cue segments||cue format||Hyperlinks (e.g. in the HTML namespace) could be included in TTML as foreign-namespace elements, or as metadata.|
|(ECC-3)||support extended cue times and overlap handling||cue format, synchronisation, user preferences||TTML timing can be of arbitrary length. Cues can overlap in time in parallel time containers.|
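For example, in a par time container (the default) cues may overlap:

```xml
<div>
  <p begin="0s" end="10s">a long-running cue</p>
  <p begin="4s" end="6s">a second cue shown while the first is still active</p>
</div>
```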
|(ECC-4)||support pausing on extended cue times or parallel display||cue format, synchronisation, user preferences||Since TTML is based on media time, this requires the media time to be paused and restarted; this could be handled by a proposed HTML-specific attribute on cues which causes the player to pause the media.|
|(ECC-5)||allow users to specify their reading speed to deal with extended cues||cue format, synchronisation, user preferences||If TTML is synced to the media, then if the media slows down the cues will too. If the media is paused, the cues will remain indefinitely. TTML could be synced to another external clock if required, although the relationship between that clock and the media clock would need to be specified elsewhere.|
|Sign translation|
|(SL-1)||Support sign-language video either as a track as part of a media resource or as an external file.||multitrack, synchronisation, API, rendering||N/A|
|(SL-2)||Support the synchronized playback of the sign-language video with the media resource.||synchronisation||N/A|
|(SL-3)||Support the display of sign-language video either as picture-in-picture or alpha-blended overlay, as parallel video, or as the main video with the original video as picture-in-picture or alpha-blended overlay. Parallel video here means two discrete videos playing in sync with each other. It is preferable to have one discrete <video> element contain all pieces for sync purposes rather than specifying multiple <video> elements intended to work in sync.||user interface, rendering, user preferences, markup||N/A|
|(SL-4)||Support multiple sign-language tracks in several sign languages.||internationalisation||N/A|
|(SL-5)||Support the interactive activation/deactivation of a sign-language track by the user.||user interface, user preferences||N/A|
|Transcripts|
|(T-1)||Support the provisioning of a full text transcript for the media asset in a separate but linked resource, where the linkage is programmatically accessible to AT.||linkage||TTML is capable of long-form documents, and animating within them (e.g. to highlight the current word or sentence).|
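One possible sketch of sentence highlighting in a transcript (times and colour invented; assumes the tts prefix is bound to the TTML styling namespace), using set animations on spans:

```xml
<p>
  <span>First sentence of the transcript.
    <set begin="0s" end="4s" tts:backgroundColor="yellow"/>
  </span>
  <span>Second sentence, highlighted as the narration reaches it.
    <set begin="4s" end="9s" tts:backgroundColor="yellow"/>
  </span>
</p>
```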
|(T-2)||Support the provisioning of both scrolling and static display of a full text transcript with the media resource, e.g. in an area next to the video or underneath the video, which is also AT accessible.||linkage, rendering, user interface||This can be controlled using TTML timing, or left to the user agent.|
|Access to interactive controls / menus|
|(KA-1)||Support operation of all functionality via the keyboard on systems where a keyboard is (or can be) present, and where a unique focus object is employed. This does not forbid and should not discourage providing mouse input or other input methods in addition to keyboard operation. (UAAG 2.0 4.1.1)||user interface (NOTE: This means that all interaction possibilities with media elements need to be keyboard accessible; e.g., through being able to tab onto the play, pause, mute buttons, and to move the playback position from the keyboard.)||N/A|
|(KA-2)||Support a rich set of native controls for media operation, including but not limited to play, pause, stop, jump to beginning, jump to end, scale player size (up to full screen), adjust volume, mute, captions on/off, descriptions on/off, selection of audio language, selection of caption language, selection of audio description language, location of captions, size of captions, video contrast/brightness, playback rate, content navigation on same level (next/prev) and between levels (up/down) etc. This is also a particularly important requirement on mobile devices or devices without a keyboard.||user interface, user preferences, API (NOTE: This means that the @controls content attribute needs to provide an extended set of control functionality including functionality for accessibility users.)||N/A|
|(KA-3)||All functionality available to native controls must also be available to scripted controls. The author would be able to choose any/all of the controls, skin them and position them.||API (NOTE: This means that new IDL attributes need to be added to the media elements for the extra controls that are accessibility related.)||N/A|
|(KA-4)||It must always be possible to enable native controls regardless of the author preference to guarantee that such functionality is available and essentially override author settings through user control. This is also a particularly important requirement on mobile devices or devices without a keyboard.||user interface, linkage (NOTE: This could be enabled through a context menu, which is keyboard accessible and its keyboard access cannot be turned off.)||N/A|
|(KA-5)||The scripted and native controls must go through the same platform-level accessibility framework (where it exists), so that a user presented with the scripted version is not shut out from some expected behaviour.||API, linkage (NOTE: This is below the level of HTML and means that the accessibility platform needs to be extended to allow access to these controls. )||N/A|
|Granularity level control for structural navigation|
|(CNS-1)||All identified structures, including ancillary content as defined in "Content Navigation" above, must be accessible with the use of "next" and "previous," as refined by the granularity control.||multitrack, synchronization, api, navigation, markup, user interface||N/A|
|(CNS-2)||Users must be able to discover, skip, play-in-line, or directly access ancillary content structures.||multitrack, synchronization, api, navigation, markup, user interface||N/A|
|(CNS-3)||Users need to be able to access the granularity control using any input mode, e.g. keyboard, speech, pointer, etc.||user interface||N/A|
|(CNS-4)||Producers and authors may optionally provide additional access options to identified structures, such as direct access to any node in a table of contents.||multitrack, synchronization, markup, navigation, user interface||N/A|
|Time-scale modification|
|(TSM-1)||The user can adjust the playback rate of the time-based media tracks to between 50% and 250% of real time.||user interface, user preference||N/A|
|(TSM-2)||Speech whose playback rate has been adjusted by the user maintains pitch in order to limit degradation of the speech quality.||user interface||N/A|
|(TSM-3)||All provided alternative media tracks remain synchronized across this required range of playback rates.||synchronisation||N/A|
|(TSM-4)||The user agent provides a function that resets the playback rate to normal (100%).||user interface||N/A|
|(TSM-5)||The user can stop, pause, and resume rendered audio and animation content (including video and animated images) that last three or more seconds at their default playback rate. (UAAG 2.0 4.9.6)||user interface||N/A|
|Production practice and resulting requirements|
|(PP-1)||Support existing production practice for alternative content resources, in particular allow for the association of separate alternative content resources to media resources. Browsers cannot support all forms of time-stamp formats out there, just as they cannot support all forms of image formats (etc.). This necessitates a clear and unambiguous declared format, so that existing authoring tools can be configured to export finished files in the required format.||synchronisation, cue format||N/A|
|(PP-2)||Support the association of authoring and rights metadata with alternative content resources, including copyright and usage information.||cue format, multitrack||N/A|
|(PP-3)||Support the simple replacement of alternative content resources even after publishing. This is again dependent on authoring practice - if the content creator delivers a final media file that contains related accessibility content inside the media wrapper (for example an MP4 file), then it will require an appropriate third-party authoring tool to make changes to that file - it cannot be demanded of the browser to do so.||multitrack, cue format||N/A|
|(PP-4)||Typically, alternative content resources are created by different entities to the ones that create the media content. They may even be in different countries and not be allowed to re-publish the other one's content. It is important to be able to host these resources separately, associate them together through the Web page author, and eventually play them back synchronously to the user.||synchronisation||N/A|
|Discovery and activation/deactivation of available alternative content by the user|
|(DAC-1) (part a)||(a)The user has the ability to have indicators rendered along with rendered elements that have alternative content (e.g., visual icons rendered in proximity of content which has short text alternatives, long descriptions, or captions).||user interface, linkage||N/A|
|(DAC-1) (part b)||(b) In cases where the alternative content has different dimensions than the original content, the user has the option to specify how the layout/reflow of the document should be handled. (UAAG 2.0 3.1.1).||user interface, linkage||N/A|
|(DAC-2)||The user has a global option to specify which types of alternative content by default and, in cases where the alternative content has different dimensions than the original content, how the layout/reflow of the document should be handled. (UAAG 2.0 3.1.2).||rendering, audio rendering, user interface (Note: Media queries have been proposed as a way of meeting this need, along with the use of CSS for layout.)||N/A|
|(DAC-3)||The user can browse the alternatives and switch between them.||user interface, navigation||N/A|
|(DAC-4)||Synchronized alternatives for time-based media (e.g., captions, descriptions, sign language) can be rendered at the same time as their associated audio tracks and visual tracks (UAAG 2.0 3.1.3).||synchronisation, multitrack||N/A|
|(DAC-5)||Non-synchronized alternatives (e.g., short text alternatives, long descriptions) can be rendered as replacements for the original rendered content (UAAG 2.0 3.1.3).||linkage||N/A|
|(DAC-6)||Provide the user with the global option to configure a cascade of types of alternatives to render by default, in case a preferred alternative content type is unavailable (UAAG 2.0 3.1.4).||user preferences||N/A|
|(DAC-7)||During time-based media playback, the user can determine which tracks are available and select or deselect tracks. These selections may override global default settings for captions, descriptions, etc. (UAAG 2.0 4.9.8)||user interface, user preferences||N/A|
|(DAC-8)||Provide the user with the option to load time-based media content such that the first frame is displayed (if video), but the content is not played until explicit user request. (UAAG 2.0 4.9.2)||user interface, (autostart)||N/A|
|Requirements on making properties available to the accessibility interface|
|(API-1)||The existence of alternative-content tracks for a media resource must be exposed to the user agent.||user interface||N/A|
|(API-2)||Since authors will need access to the alternative content tracks, the structure needs to be exposed to authors as well, which requires a dynamic interface.||API||N/A|
|(API-3)||Accessibility APIs need to gain access to alternative content tracks no matter whether those content tracks come from within a resource or are combined through markup on the page.||multitrack, synchronisation, API, linkage||N/A|
|Requirements on the use of the viewport|
|(VP-1)|| It must be possible to deal with three different cases for the relation between the viewport size, the position of media and of alternative content:
If alternative content has a different height or width than the media content, then the user agent will reflow the (HTML) viewport. (UAAG 2.0 3.1.4).
|rendering, user interface, linkage (NOTE: This may create a need to provide an author hint to the Web page when embedding alternate content in order to instruct the Web page how to render the content: to scale with the media resource, scale independently, or provide a position hint in relation to the media. On small devices where the video takes up the full viewport, only limited rendering choices may be possible, such that the UA may need to override author preferences.)||N/A|
|(VP-2)|| The user can change the following characteristics of visually rendered text content, overriding those specified by the author or user-agent defaults (UAAG 2.0 3.6.1). Note: this should include captions and any text rendered in relation to media elements, so as to be able to magnify and simplify rendered text:
||rendering (NOTE: This should be achievable through UA configuration or even through something like a greasemonkey script or user CSS which can override styles dynamically in the browser.)||N/A|
|(VP-3)||Provide the user with the ability to adjust the size of the time-based media up to the full height or width of the containing viewport, with the ability to preserve aspect ratio and to adjust the size of the playback viewport to avoid cropping, within the scaling limitations imposed by the media itself. (UAAG 2.0 4.9.9)||rendering (NOTE: This can be achieved by simply zooming into the Web page, which will automatically rescale the layout and reflow the content.)||N/A|
|(VP-4)||Provide the user with the ability to control the contrast and brightness of the content within the playback viewport. (UAAG 2.0 4.9.11)||user interface (NOTE: This is a user-agent device requirement and should already be addressed in the UAAG. In live content, it may even be possible to adjust camera settings to achieve this requirement. It is also a "SHOULD" level requirement, since it does not account for limitations of various devices.)||N/A|
|(VP-5)||Captions and subtitles traditionally occupy the lower third of the video, where controls are also usually rendered. The user agent must avoid overlapping of overlay content and controls on media resources. This must also happen if, for example, the controls are only visible on demand.||rendering (NOTE: If there are several types of overlapping overlays, the controls should stay on the bottom edge of the viewport and the others should be moved above this area, all stacked above each other. )||N/A|
|Requirements on the parallel use of alternate content on potentially multiple devices in parallel|
|(MD-1)||Support a platform-accessibility architecture relevant to the operating environment. (UAAG 2.0 2.1.1)||linkage||N/A|
|(MD-2)||Ensure accessibility of all user-interface components including the user interface, rendered content, and alternative content; make available the name, role, state, value, and description via a platform-accessibility architecture. (UAAG 2.0 2.1.2)||user interface, linkage||N/A|
|(MD-3)||If a feature is not supported by the accessibility architecture(s), provide an equivalent feature that does support the accessibility architecture(s). Document the equivalent feature in the conformance claim. (UAAG 2.0 2.1.3)||??||N/A|
|(MD-4)||If the user agent implements one or more DOMs, they must be made programmatically available to assistive technologies. (UAAG 2.0 2.1.4) This assumes the video element will write to the DOM.||API||N/A|
|(MD-5)||If the user can modify the state or value of a piece of content through the user interface (e.g., by checking a box or editing a text area), the same degree of write access is available programmatically (UAAG 2.0 2.1.5).||API||N/A|
|(MD-6)||If any of the following properties are supported by the accessibility-platform architecture, make the properties available to the accessibility-platform architecture (UAAG 2.0 2.1.6):||??||N/A|
|(MD-7)||Ensure that programmatic exchanges between APIs proceed at a rate such that users do not perceive a delay. (UAAG 2.0 2.1.7).||API||N/A|