Transcript synchronization

From Research Questions Task Force

Purpose

This page summarizes Task Force discussion of synchronization requirements for transcripts and other media alternatives that may not be readily associated explicitly with media content. The aim is to inform possible additions to the Web platform capabilities that were developed during the evolution of HTML 5 and have not been revisited since.

Concept

According to the Media Accessibility User Requirements, section 2.9, a transcript combines the information provided in captions and video descriptions. That is, the text serves as a complete alternative to the media resource, including its auditory and visual components. An example of a transcript would be the complete script of a stage play or a screenplay.

Issues Identified by the Task Force

  • Programmatic association of transcripts with audio and video content (whether via markup or by embedding the transcript in the media resource).
  • Potential benefits of transcripts to users who are deaf, including but not restricted to those who are deaf-blind.
  • Which levels of synchronization (e.g., paragraph-level, sentence-level, or word-level synchronization of transcript with audio) have accessibility-related use cases, and how strong those use cases are.
  • The potential for broader benefits of programmatic association and synchronization to the general population, for example in facilitating searching for text in media resources and in navigating directly to the corresponding time points in the audio or video.
  • Synchronization of text in multiple languages with audio or video content (e.g., original text and translation, generally at the phrase/clause/sentence level), which is potentially useful in language learning activities.
  • Extension of the synchronization issues considered above beyond transcripts to further scenarios, including the association of musical scores with video or audio content (e.g., in a video recording of an operatic performance), where the score is represented in one of the standard XML formats.
  • Synchronization of video (with audio track), text, and sign language interpretations of both the video's audio track and the text.
  • Synchronization of the audio track of a video with multiple sign language interpretations, presented in different sign languages (e.g., American Sign Language, British Sign Language).
  • What types of programmatic association and synchronization are currently supported in applicable standards, and what precedents have been established by recent practice in this regard.
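
To make the granularity and search-and-navigate issues above more concrete, the following is a minimal sketch of how a timed transcript might be modeled. It is not drawn from any standard or from the Task Force discussion: the `Segment` structure, the `search` and `segment_at` functions, and the sample timings are all hypothetical, chosen only to illustrate paragraph-, sentence-, and word-level association of text with time points.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One timed unit of transcript text (hypothetical structure)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str
    level: str    # "paragraph", "sentence", or "word"


def search(segments: List[Segment], query: str, level: str = "sentence") -> List[float]:
    """Return the start times of segments at `level` whose text contains `query`."""
    needle = query.casefold()
    return [
        s.start
        for s in segments
        if s.level == level and needle in s.text.casefold()
    ]


def segment_at(segments: List[Segment], time: float, level: str = "sentence") -> Optional[Segment]:
    """Return the segment at `level` that is active at `time`, or None."""
    for s in segments:
        if s.level == level and s.start <= time < s.end:
            return s
    return None


# Illustrative data only; the timings are invented for this sketch.
transcript = [
    Segment(0.0, 4.0, "Welcome to the lecture.", "sentence"),
    Segment(0.0, 0.6, "Welcome", "word"),
    Segment(4.0, 9.5, "Today we discuss synchronization.", "sentence"),
]
```

In this sketch, `search` corresponds to finding text in a media resource and obtaining time points a player could seek to, while `segment_at` corresponds to highlighting the transcript text that matches the current playback position. Finer levels require more timing data per unit of text, which is one reason the strength of the use cases at each level matters.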

Potential Outcomes

  • Contributions to metadata disclosing the attributes of media resources relevant to accessibility.
  • Possible additions to the capabilities of Web technologies in connection with transcripts and related considerations.