ChangeProposal/ISSUE-194/TranscriptURL V2

From HTML WG Wiki
Introduction of a @transcript=URL attribute

Author: Media Subgroup of HTML Accessibility Task Force

Discussions by: John Foliot, Janina Sajka, Edward O'Connor, Eric Carlson, Charles McCathieNevile, Silvia Pfeiffer

Editors: Silvia Pfeiffer, John Foliot


This is a proposal to address the need for programmatically linked video and audio transcripts in HTML5 through the introduction of a @transcript attribute on HTML5 media elements that contains a URL (preferably an absolute URL).

It is based on an analysis of use cases for video transcripts as provided by the TranscriptElement CP. An agreement was reached to leave support for interactive transcripts (UC3) to HTML.Next. Further, UC4 can be solved in the way suggested by the transcript-IDREFs CP.

The remaining use cases are proposed to be solved by a combination of ARIA attributes and a new @transcript attribute for media elements (particularly for the embedding case).


Issue 194 asks for a mechanism for associating a full transcript with an audio or video element.

In this CP we analyze the different use cases for transcripts and the need for programmatic linkage to the media elements. We then provide markup examples that solve the use cases, thus identifying the gap that remains to solve with new markup.

The use cases

UC1: A full text transcript is provided with the media resource in a separate but linked resource.


UC2: A full text transcript is provided as text on the same page of the media resource.

Examples of non-timed transcripts published underneath the video on-page:

Further, these use cases need to satisfy the requirements listed in the transcript-IDREFs CP.

The need for programmatic linkage between the video and its transcript

Before addressing the two mentioned use cases, it is important to understand what kind of programmatic linkage is desirable.

What is not desirable:

  • rendering of the transcript inside the boundaries of the video player (e.g. as an alternative to the video or an overlay within the video boundaries) - there is not sufficient space and it removes the possibility to watch the video while reading the transcript.
  • having a button on the video player that links away from the video to a page that doesn't also contain the video - it removes the possibility to watch the video while reading the transcript.
Example: (poor experience when following the off-page link)
  • having a button on the video element that scrolls down to where the transcript is displayed - it takes the eyes off the video.

What is desirable/acceptable:

  • the transcript is rendered outside the video player, e.g. in a different region on the same page near the video player.
  • having a link underneath the video player that allows the user to download the transcript and read it in parallel with watching the video.
Example: (note the download link)
  • having a button underneath the video player that opens a section of the Web page that contains the transcript nearby.
  • having a button on the video player that links to a page that contains the video itself with the transcript.
Example: almost all embeddable players link back to their original page where transcripts reside (e.g. YouTube, DotSub)
  • having a button on the video player that unhides a section on the page to reveal the transcript nearby. Note that this use case is typically identical to the button underneath the video player, since the video player in this instance is made up of more complex elements that just look like they are part of the video player.


Some trends are revealed in the above examples that influence the required linkage between video and transcript.

The majority of use cases show a visible transcript on the same page as the video player.

This is the preferred means of publishing transcripts. It makes it easy for the user to consume the transcript together with the video. This works for all users.

There are some different ways of rendering included in this case:

  • text in plain view
 <video poster="poster.jpg" controls aria-label="video with transcript" aria-describedby="transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
   This is where the full transcript goes.

COMMENTS: This is a misappropriation of aria-describedby and will result in a horrible user experience for non-sighted users today and in the immediate future. Currently, in screen readers that support aria-describedby, the 'description' is concatenated to the other accessible data (i.e. the accessible name as provided by aria-label). Even if a switching mechanism were provided that allowed the end user to query or not query the aria-describedby 'description', a transcript is not a description of the media resource; it is an alternative rendering/format of the media resource. The description should describe the movie. For example, accessible name/label: "Oceans 11", accessible description: "Danny Ocean (George Clooney) wants to score the biggest heist in history. He combines an eleven member team, including Frank Catton, Rusty Ryan and Linus Caldwell. Their target? The Bellagio, the Mirage and the MGM Grand. All casinos owned by Terry Benedict. It's not going to be easy, as they plan to get in secretly and out with $150 million.", and not "Scene 1: Rusty Ryan is seen walking across the room. ['yada yada yada']".

  • text in plain view, but rendered from a separate html document
 <video poster="poster.jpg" controls aria-label="video with transcript" aria-describedby="transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html"></iframe>
  • button toggles the text in/out with JS
 <video poster="poster.jpg" controls aria-label="video with transcript" aria-describedby="transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <button id="transcript">Click to view transcript</button>
 <div id="unhide_on_click" hidden>
     This is where the full transcript goes.
 </div>
  • link to open the transcript in a separate window/tab
 <video poster="poster.jpg" controls aria-label="video with transcript" aria-describedby="transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" target="_blank" href="transcript.html">Transcript (HTML)</a>
  • link to download the transcript
 <video poster="poster.jpg" controls aria-label="video with transcript" aria-describedby="transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" href="transcript.doc">Download Transcript (doc)</a>

COMMENTS: All of these other examples repeat the problem of using aria-describedby as the means of linking the transcript to the media resource programmatically, and so do not meet the user requirements of non-sighted users (or any user relying on ARIA-aware technologies).

For the sighted user, there is no programmatic linkage required, since they can immediately discover whether a transcript is available or not and consume it adequately.

For the vision-impaired user, the screen reader should make an announcement that a transcript is available. This can be done using @aria-label and other accessibility-only attributes.

COMMENT: There is currently no ARIA attribute that solves this use case, nor any other "accessibility-only" attribute, which is why @transcript is proposed.

No new attribute is required to solve this use case.

The use case where a video has a transcript, but not on the same page: typically happens when embedding

When a publisher decides to publish a video transcript and a video, there is always a page that contains both. Sometimes the transcript is behind a (download) link, but there is always a Web page that connects the two. This is covered above.

However, when such videos can also be embedded on other sites (like YouTube videos or DotSub videos), the video player can end up being the only thing that moves to the new page, because the publisher wants to avoid the space requirements.

The DotSub player has a version where a richer HTML snippet than just the video player is embedded and this richer HTML snippet also embeds the transcript. However, there is a video-only embed player and that doesn't have a link or reference to the transcript. The YouTube player similarly doesn't have a link to the transcript.

Both the DotSub and the YouTube video player, however, have a link on their video player that links back to the video's home page, which contains (amongst other things) the transcript.

In order for transcripts to remain discoverable in this situation, we need a means to take the link to the transcript along with the video into the new site. This may or may not result in a visual representation of the link in the video controls. In either case, it is important that the transcript remains discoverable.

A new attribute is suggested to solve this use case: @transcript=URL.

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="">
   <source type="video/mp4" src="video.mp4">
   <source type="video/webm" src="video.webm">
 </video>

The URL should preferably be an absolute URL, such that it survives embedding. It should preferably link to a page that contains the video and the transcript together - linking to pages that only contain the transcript should be avoided.
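
As an illustration of that recommendation, an embed snippet might look like the following. This is only a sketch: @transcript is the attribute proposed here and is not assumed to be implemented anywhere, and every media.example.com URL (including the #transcript fragment) is a hypothetical placeholder. All resource URLs are absolute so that the snippet keeps working when pasted into a page on another site.

```html
<!-- Hypothetical embed snippet; all media.example.com URLs are placeholders. -->
<video poster="https://media.example.com/poster.jpg" controls
       aria-label="video with transcript"
       transcript="https://media.example.com/talk.html#transcript">
  <!-- Absolute URLs: relative ones would resolve against the embedding page. -->
  <source type="video/mp4" src="https://media.example.com/video.mp4">
  <source type="video/webm" src="https://media.example.com/video.webm">
</video>
```

Here the transcript URL points at the video's home page, which is assumed to show the video and the transcript together, rather than at a transcript-only document.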


The introduction of a new transcript attribute on media elements should create a transcript link that all users can use, since blind users and sighted users are affected equally.

Different options for rendering are:

  • Browsers can decide to include the URL in the rendered controls bar of a video or audio element when player controls are active. This could be a button or could be a menu entry in a settings menu.
  • Web developers should similarly be encouraged to add a link in any custom media element controls that they create.
  • Browsers can decide to include the URL in the context menu of the media element.
  • Screen readers should announce the availability of the transcript URL and provide the user with a means to follow the link.

Preferably all these options will be made available.
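
For the custom-controls option, a page author could surface the link with a few lines of script. The sketch below is one possible approach, not a prescribed pattern: the ids talk, talk-controls and talk-play and the media.example.com URL are hypothetical, and the script reads the content attribute with getAttribute() because no user agent is assumed to expose a transcript IDL attribute.

```html
<!-- No "controls" attribute: the page supplies its own controls below. -->
<video id="talk" poster="poster.jpg" aria-label="video with transcript"
       transcript="https://media.example.com/talk.html#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
</video>
<div id="talk-controls">
  <button type="button" id="talk-play">Play/Pause</button>
</div>
<script>
  var video = document.getElementById("talk");
  var controls = document.getElementById("talk-controls");

  // Minimal custom control: toggle playback.
  document.getElementById("talk-play").addEventListener("click", function () {
    if (video.paused) { video.play(); } else { video.pause(); }
  });

  // Read the proposed attribute; skip the link if it is missing or blank.
  var transcriptUrl = video.getAttribute("transcript");
  if (transcriptUrl && transcriptUrl.trim() !== "") {
    var link = document.createElement("a");
    link.href = transcriptUrl.trim(); // surrounding spaces are permitted
    link.textContent = "Transcript";
    controls.appendChild(link);
  }
</script>
```

Because the result is an ordinary link inside the controls, it is reachable by sighted users and screen reader users alike.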


N.B. The spec changes described below are intended to fully describe the sorts of changes necessary, but the exact form of the changes to be made is left to the discretion of the editor(s). (This is not a diff that can be blindly applied to the specification. Should the editor(s) find this description difficult to apply unambiguously, the author of this Change Proposal volunteers to work with them and the WG to resolve any such ambiguities identified.)

New section on transcripts

Add a section defining the @transcript attribute.

Transcripts for media elements may be provided, either directly in the text of the page, indirectly by linking to an external document with an <a> element, or by transclusion with an <iframe> element.

Where a transcript or transcript link is not visible on the page, but still available - e.g. in the case of embedded video - a @transcript attribute on the media element may be used.

The @transcript attribute may be specified to indicate a different Web location at which a transcript for the media element is available. This should preferably be a Web page that also contains the video for a better viewing experience.

If the attribute is specified, the attribute's value is a URL.

Modifications to the-video-element and the-audio-element

attribute DOMString transcript;

  • Add "transcript" to the list of common media element attributes below the IDL:

The media element attributes, src, preload, autoplay, mediagroup, loop, muted, controls, and transcript apply to all media elements.

The transcript content attribute on media elements gives the address of a Web resource that contains a text transcript for the media element. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

The transcript IDL attribute on media elements must reflect the content attribute of the same name.

media . transcript
Returns the address of the resource containing the transcript.
Returns the empty string when there is no transcript address.

  • Further non-normative text should be added along the lines of:

The transcript content attribute should preferably contain an absolute URL to a Web page that contains both the media element and the transcript in plain text. This ensures that users can consume the transcript together with the media element's content. An absolute URL further reduces the risk of losing the reference when the media element's code is embedded elsewhere.

This user interface should include features to begin playback, pause playback, seek to an arbitrary position in the content (if the content supports arbitrary seeking), change the volume, change the display of closed captions or embedded sign-language tracks, select different audio tracks or turn on audio descriptions, link to transcripts and show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window). Other controls may also be made available.

Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls) and to link to a transcript, but such features should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu.


Positive Effects

  • By programmatically associating transcripts with media elements, we enable users, both assistive technology users and otherwise, to more reliably be made aware of existing transcripts for media resources.
  • It's easy to update existing content to use this markup pattern, so it's easy for authors to adopt this technique.
  • Where multiple transcripts in different languages are available, the linked Web page should contain a means to select between the different transcripts and allow watching the video while reading one of the transcripts. This proposal builds on existing HTML features to make this possible.
  • For UAs that don't support the <video> or <audio> elements, transcripts should be linked to inside the media element. Thus this proposal makes use of existing fallback mechanisms.
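
The fallback mechanism mentioned in the last point can be sketched as follows. The media.example.com URL is a hypothetical placeholder, and the link text is only an example; the same URL is given in the proposed @transcript attribute and in the fallback link.

```html
<video poster="poster.jpg" controls aria-label="video with transcript"
       transcript="https://media.example.com/talk.html#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <!-- Fallback content: rendered only by UAs that do not support <video>. -->
  <p>
    Your browser cannot play this video.
    <a href="https://media.example.com/talk.html#transcript">Read the transcript</a>.
  </p>
</video>
```

UAs that support the media element ignore the fallback paragraph and would rely on the attribute instead.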

Negative Effects

  • An additional attribute is added to media elements, which browser vendors have to support.

Conformance Class Changes

  • The @transcript attribute is allowed on <audio> and <video> elements.


  • UAs might not implement this mechanism, thus causing us to drop it from the specification in due course. However, since video and audio elements have controls and UAs are in the process of implementing menus to support captions, audio descriptions, and chapters, adding an additional menu entry may not be as objectionable.
  • Authors might not adopt this mechanism. Since the technique is simple, this risk is low.

Obsoleting Change Proposals

This Change Proposal obsoletes the following CPs:

These CPs may remain (TBD):

Original Bug:

  • Bug 12964 - <video>: Declarative linking of full-text transcripts to video and audio elements