ChangeProposal/ISSUE-194/TranscriptURL

From HTML WG Wiki


Introduction of a @transcript=URL attribute

Author: Media Subgroup of HTML Accessibility Task Force

Discussions by: John Foliot, Janina Sajka, Edward O'Connor, Eric Carlson, Charles McCathieNevile, Silvia Pfeiffer

Editors: Silvia Pfeiffer, John Foliot


Rationale

This is a proposal to address the need for programmatically linked video and audio transcripts in HTML5 through introduction of a @transcript attribute on HTML5 media elements that contains a URL (preferably an absolute URL).

It is based on an analysis of use cases for video transcripts as provided by the TranscriptElement CP. An agreement was reached to leave support for interactive transcripts (UC3) to HTML.Next. Further, UC4 can be solved in the way suggested by the transcript-IDREFs CP.

The remaining use cases are proposed to be solved by a combination of existing aria attributes and a new @transcript attribute for media elements (particularly for the embedding case).

After long discussions, this proposal (the "transcript URL proposal") and the "IDREFs proposal" are the only two remaining proposals that aim to introduce a change to support transcripts in HTML5. They differ only in what the @transcript attribute contains: a URL in this proposal, a list of IDREFs in the other. A further "no change" proposal is also still on the table, aiming to provide more time to experiment with transcripts.

Summary

Issue 194 asks for a mechanism for associating a full transcript with an audio or video element.

In this CP we analyze the different use cases for transcripts and the need for programmatic linkage to the media elements. We then provide markup examples that solve the use cases, thus identifying the gap that remains to solve with new markup.

It is important to note that the "IDREFs change proposal" and this proposal address the same use cases and are based on the same research.

The use cases

UC1: A full text transcript is provided with the media resource in a separate but linked resource.

Examples:


UC2: A full text transcript is provided as text on the same page of the media resource.

Examples of non-timed transcripts published underneath the video on-page:


Further, these use cases need to satisfy the requirements listed in the transcript-IDREFs CP.

The need for programmatic linkage between the video and its transcript

Before addressing the two mentioned use cases, it is important to understand what kind of programmatic linkage is desirable.

It is important to understand how transcripts are being published. In all analysed examples, videos are published in one screen area and transcripts are published in another screen area, typically below or beside the video. There is generally a defined screen area for the transcript, which is also where a choice between transcripts in different languages is made. The most usable way to publish a transcript is with the video above or beside it: the user can then watch the video while reading the transcript, or at least seek to locations in the video while reading. The least usable video transcripts are plain text pages that are reached by linking away from the page with the video, since the user loses all context.

The following list came about in discussions and is supported with examples from the Web.

What is not desirable:

  • rendering of the transcript inside the boundaries of the video player (e.g. as an alternative to the video or an overlay within the video boundaries) - there is not sufficient space and it removes the possibility to watch the video while reading the transcript.
  • having a button on the video player that links away from the video to a page that doesn't also contain the video - it removes the possibility to watch the video while reading the transcript.
Example: http://www.pm.gov.au/videos (poor experience when following the off-page link)
  • having a button on the video element that scrolls down to where the transcript is displayed - it takes the eyes off the video.

What is desirable/acceptable:

  • the transcript is rendered outside the video player, e.g. in a different region on the same page near the video player.
Example: http://www.state.gov/secretary/rm/2012/05/189592.htm
  • having a link underneath the video player that allows the user to download the transcript and read it in parallel to watching the video.
Example: http://www.pm.gov.au/videos (note the download link)
  • having a button underneath the video player that opens a section of the Web page that contains the transcript nearby.
Example: http://dotsub.com/view/ef3d7b6c-eab5-478a-a51c-d27166d27dcc
  • having a button on the video player that links to a page that contains the video itself with the transcript.
Example: almost all embeddable players link back to their original page where transcripts reside (e.g. YouTube, DotSub)
  • having a button on the video player that unhides a section on the page to reveal the transcript nearby. Note that this use case is typically identical to the button underneath the video player, since the video player in this instance is made up of more complex elements that just look like they are part of the video player.
Example: http://www.ted.com/talks/lang/mr/eli_pariser_beware_online_filter_bubbles.html

Analysis

Some trends are revealed in the above examples that influence the required linkage between video and transcript.

The majority of use cases shows a visible transcript on the same page as the video player.

This is the preferred means of publishing transcripts. It makes it easy for the user to consume the transcript together with the video. This works for all users, not just sighted users and not just users of AT.

There are some different ways of rendering included in this case:

  • text in plain view
 <video poster="poster.jpg" controls aria-label="video with transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <p>
   This is where the full transcript goes.
 </p>
  • text in plain view, but rendered from a separate html document
 <video poster="poster.jpg" controls aria-label="video with transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>
  • button toggles the text in/out with JS
 <video poster="poster.jpg" controls aria-label="video with transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <button id="transcript">Click to view transcript</button>
 <div id="unhide_on_click" hidden>
   <h4>Transcript</h4>
   <p>
     This is where the full transcript goes.
   </p>
 </div>
  • link to open the transcript in a separate window/tab
 <video poster="poster.jpg" controls aria-label="video with transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" target="_blank" href="transcript.html">Transcript (HTML)</a>
  • link to download the transcript
 <video poster="poster.jpg" controls aria-label="video with transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" href="transcript.doc">Download Transcript (doc)</a>


For the sighted user, there is no programmatic linkage required, since they can immediately discover whether a transcript is available or not and consume it adequately.

For the vision-impaired user, the screen reader should make an announcement that a transcript is available. This can be done using @aria-label and other accessibility-only attributes.

No new attribute is required to solve this use case.

The use case where a video has a transcript, but not on the same page: typically happens when embedding

When a publisher decides to publish a video transcript and a video, there is always a page that contains both. Sometimes the transcript is behind a (download) link, but there is always a Web page that connects the two. This is covered above.

However, when such videos can also be embedded on other sites (like YouTube videos or DotSub videos), the video player can end up being the only thing that moves to the new page, because the publisher wants to avoid the space requirements.

The DotSub player has a version where a richer HTML snippet than just the video player is embedded and this richer HTML snippet also embeds the transcript. However, there is a video-only embed player and that doesn't have a link or reference to the transcript. The YouTube player similarly doesn't have a link to the transcript.

Both the DotSub and the YouTube video player, however, have a link on their video player that links back to the video's home page, which contains (amongst other things) the transcript.

In order for transcripts to remain discoverable in this situation, we need a means to take the link to the transcript along with the video into the new site. This may or may not result in a visual representation of the link in the video controls. In either case, it is important that the transcript remains discoverable.

A new attribute is suggested to solve this use case: @transcript=URL .

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://example.com/video.html">
   <source type="video/mp4" src="video.mp4">
   <source type="video/webm" src="video.webm">
 </video>


The URL should preferably be an absolute URL, such that it survives embedding. It should preferably link to a page that contains the video and the transcript together - linking to pages that only contain the transcript should be avoided.
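
The effect of embedding on a relative versus an absolute value can be illustrated with a small sketch. It assumes that a user agent would resolve the attribute against the URL of the document containing the element, as is done for other URL-valued attributes; the helper name and the example.com and other.example addresses are hypothetical.

```python
from urllib.parse import urljoin


def resolve_transcript(document_url: str, transcript_attr: str) -> str:
    """Resolve a transcript attribute value against the page containing the element."""
    # Assumption: relative values are resolved against the containing document,
    # and surrounding spaces are ignored.
    return urljoin(document_url, transcript_attr.strip())


original_page = "http://example.com/video.html"    # hypothetical publisher page
embedding_page = "http://other.example/blog/post"  # hypothetical embedding page

# A relative value changes meaning once the markup is copied to another page.
relative_on_original = resolve_transcript(original_page, "#transcript")
relative_on_embed = resolve_transcript(embedding_page, "#transcript")

# An absolute value resolves to the same address on both pages.
absolute_value = "http://example.com/video.html#transcript"
absolute_on_original = resolve_transcript(original_page, absolute_value)
absolute_on_embed = resolve_transcript(embedding_page, absolute_value)
```

In this sketch the relative value ends up pointing at a fragment of the embedding page, where no transcript exists, while the absolute value keeps pointing at the publisher's page.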


Proposal

Putting the two use cases above together, we reach a common proposal for introduction of a @transcript attribute that contains a URL. This URL points to the transcript either on the same page using a fragment ID or on another page.

In combination with the @aria-label attribute, the availability of the transcript is made discoverable to AT users.


  • text in plain view
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <p>
   This is where the full transcript goes.
 </p>
  • text in plain view, but rendered from a separate html document
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>
  • button toggles the text in/out with JS
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <button id="transcript">Click to view transcript</button>
 <div id="unhide_on_click" hidden>
   <h4>Transcript</h4>
   <p>
     This is where the full transcript goes.
   </p>
 </div>
  • link to open the transcript in a separate window/tab
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" target="_blank" href="transcript.html">Transcript (HTML)</a>
  • link to download the transcript
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <a id="transcript" href="transcript.doc">Download Transcript (doc)</a>
  • "embedded" video with link to page that has transcript
 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://original.domain/original.page#transcript">
   <source type="video/mp4" src="video.mp4">
   <source type="video/webm" src="video.webm">
 </video>

This design fulfills the basic need for programmatic association of transcripts with media elements, and it's possible to link to same-document transcripts as well as external resources.
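
How a consumer of the attribute might tell the same-document case from the external case can be sketched as follows. This is only an illustration under the assumption that the value is resolved like any other URL; the function name and return labels are hypothetical and not part of the proposal.

```python
from typing import Tuple
from urllib.parse import urldefrag, urljoin


def locate_transcript(document_url: str, transcript_attr: str) -> Tuple[str, str]:
    """Classify a transcript value as a same-document fragment or an external page."""
    resolved = urljoin(document_url, transcript_attr.strip())
    target, fragment = urldefrag(resolved)
    current, _ = urldefrag(document_url)
    if target == current and fragment:
        # The transcript lives on this page: the fragment names the element.
        return ("same-document", fragment)
    # Otherwise the transcript is on another page (e.g. the video's home page).
    return ("external", resolved)


on_home_page = locate_transcript(
    "http://this.domain/this.page",
    "http://this.domain/this.page#transcript",
)
when_embedded = locate_transcript(
    "http://other.example/post",  # hypothetical embedding page
    "http://original.domain/original.page#transcript",
)
```

On the home page the value identifies the element with the ID "transcript"; on the embedding page the same kind of value is treated as a link to follow.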

It is worth noting that when copying the video element in all use cases, the transcript URL remains as part of the video element and is not lost.

It is further worth noting that the transcript itself can be provided in or below any element, including <a>, <area>, <iframe>, <article>, any type of header element, or any other element (including a possible future <transcript> element that may contain interactive transcripts). The URL is able to point to any such element.

This technique is simple to author and easily extends on existing transcript publication patterns.


Rendering

The introduction of a new transcript attribute on media elements creates a transcript link that all users can use, since blind users and sighted users are affected equally.

Different options for rendering are:

  • Browsers can decide to include the URL in the rendered controls bar of a video or audio element when player controls are active. This could be a button or could be a menu entry called "Transcript" in a settings menu. Since it's a single URL with a clear semantic meaning ("transcript"), internationalization is simple for this one-word entry.
  • Web developers should similarly be encouraged to add a link (called "transcript") in any custom media element controls that they create.
  • Browsers can decide to include the URL in the context menu of the media element under a menu item of "Transcript".
  • Screen readers should announce the availability of the transcript URL and provide the user with a means to follow the link.

Preferably all these options will be made available.

Details

Note: The spec changes described below are intended to fully describe the sorts of changes necessary, but the exact form of the changes to be made is left to the discretion of the editor(s). (This is not a diff that can be blindly applied to the specification. Should the editor(s) find this description difficult to apply unambiguously, the author of this Change Proposal volunteers to work with them and the WG to resolve any such ambiguities identified.)

New section on transcripts

Add a section defining the @transcript attribute.

Transcripts for media elements may be provided, either directly in the text of the page, indirectly by linking to an external document with an <a> element, or by transclusion with an <iframe> element.

Where a transcript or transcript link is not visible on the page, but still available - e.g. in the case of embedded video - a @transcript attribute on the media element may be used.

The @transcript attribute may be specified to indicate a different Web location at which a transcript for the media element is available. This should preferably be a Web page that also contains the video for a better viewing experience.

If the attribute is specified, the attribute's value is a URL.

Modifications to the-video-element and the-audio-element

attribute DOMString transcript;

  • Add "transcript" to the list of common media element attributes below the IDL:

The media element attributes, src, preload, autoplay, mediagroup, loop, muted, controls, and transcript apply to all media elements.

The transcript content attribute on media elements gives the address of a Web resource that contains a text transcript for the media element. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

The transcript IDL attribute on media elements must reflect the content attribute of the same name.

media . transcript
Returns the address of the resource containing the transcript.
Returns the empty string when there is no transcript address.
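
The behaviour described for the attribute can be approximated outside a browser. The following sketch, which uses Python's standard HTML parser, collects the transcript value of every media element, returning the empty string where the attribute is absent and ignoring surrounding spaces. It is an approximation only: it does not resolve or validate the URL, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class TranscriptCollector(HTMLParser):
    """Collect the transcript attribute of every <video> and <audio> element."""

    def __init__(self) -> None:
        super().__init__()
        self.transcripts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("video", "audio"):
            value = dict(attrs).get("transcript")
            # Empty string when the attribute is absent; strip the spaces
            # that may surround the URL in the content attribute.
            self.transcripts.append((value or "").strip())


def media_transcripts(markup: str) -> List[str]:
    """Return one transcript value per media element, in document order."""
    collector = TranscriptCollector()
    collector.feed(markup)
    collector.close()
    return collector.transcripts
```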

  • Further non-normative text should be added along the lines of:

The transcript content attribute should preferably contain an absolute URL to a Web page that contains both the media element and the transcript in plain text. This ensures that users can consume the transcript together with the media element's content. An absolute URL further reduces the risk of losing the reference when embedding the media element's code.

This user interface should include features to begin playback, pause playback, seek to an arbitrary position in the content (if the content supports arbitrary seeking), change the volume, change the display of closed captions or embedded sign-language tracks, select different audio tracks or turn on audio descriptions, link to transcripts and show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window). Other controls may also be made available.

Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), and to link to a transcript but such features should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu.

Impact

Positive Effects

  • By programmatically associating transcripts with media elements, we enable users, both assistive technology users and otherwise, to more reliably be made aware of existing transcripts for media resources.
  • It's easy to update existing content to use this markup pattern, so it's easy for authors to adopt this technique.
  • It's possible to link to same-document transcripts as well as external resources.
  • Where multiple transcripts in different languages are available, the linked Web page should contain a means to select between the different transcripts and allow watching the video while reading one of the transcripts. This proposal builds on existing HTML features to make this possible.
  • For UAs that don't support the <video> or <audio> elements, transcripts should be linked to inside the media element. Thus this proposal makes use of existing fallback mechanisms.

Negative Effects

  • An additional attribute is added to media elements, which browser vendors have to support.

Conformance Class Changes

  • The @transcript attribute is allowed on <audio> and <video> elements.

Risks

  • UAs might not implement this mechanism, thus causing us to drop it from the specification in due course. However, since video and audio elements have controls and UAs are in the process of implementing menus to support captions, audio descriptions, and chapters, adding one more menu entry may not be objectionable. Since it is a single URL and a single menu entry, UA adoption can be simple. In comparison, the IDREFs proposal requires parsing of the elements that the IDREFs link to, making it much more complex to implement with no added value.
  • Authors might not adopt this mechanism. Since the technique is simple, this risk is low.

Obsoleting Change Proposals

This Change Proposal obsoletes the following CPs:

These CPs remain:

Original Bug:

  • Bug 12964 - <video>: Declarative linking of full-text transcripts to video and audio elements


Comparing to the IDREFs change proposal

In the (withdrawn) Introduction of a transcript element Change Proposal, ten requirements for transcripts were defined. As the IDREFs proposal does, we examine here the merits of each requirement and how the mechanisms in the two remaining Change Proposals fare, since our views differ.

R1 Discoverability

This is a requirement that transcripts be both human-discoverable and machine-discoverable.

This proposal fulfills this requirement using a combination of the @aria-label and @transcript attributes. The @transcript attribute allows both sighted and non-sighted users to find the transcript following the link. Non-sighted users are additionally made aware of the existence of the link using ARIA. Similarly, machines find the transcript by following the URL. Existing user agents are no worse off with this proposal. In addition, discoverability is not lost when the video element markup is embedded on other pages.

The IDREFs proposal provides for similar discoverability for machines, which can follow an IDREF to an element on the same page. In fact, discoverability is of comparable quality between the two proposals when the transcript is provided on the same page. While the IDREFs proposal does not speak about how to announce the availability of the transcript to screen readers, it can be expected that it either also relies on an ARIA attribute, or it requires screen readers to be extended to interpret the @transcript attribute to announce its availability. This is therefore an identical feature.

However, when the transcript is provided on a different page to the page that the video is published on, the IDREFs proposal becomes a lot more fragile.

Firstly, since it relies on the transcript being provided on the same page, an off-page transcript needs to be provided through a separate element (in the examples it's an <a> element) that has to be provided somewhere on the page (and may need to be hidden if the page author does not want it shown). Thus, getting to the transcript requires resolution of a double indirection, something that the browsers will need to check for.

Secondly, if the video element gets embedded into another page, discoverability may suffer with the IDREFs proposal because the IDREF points to another element on the page. The only means around this is to put the <a> element with the indirection to the transcript inside the video element. Then it won't get lost when embedding.

Putting all these together for the IDREFs proposal makes for this implementation:

Here is the page on which the video element is published with the transcript below:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="foo">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <a href="http://this.domain/this.page#transcript" hreflang=en id=foo>English Transcript</a>
 </video>
 <h4 id="transcript">Transcript</h4>
 <p>
   This is where the full transcript goes.
 </p>

Then this will survive embedding in another page:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="foo">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <a href="http://this.domain/this.page#transcript" hreflang=en id=foo>English Transcript</a>
 </video>

This markup expects web developers to understand the double indirection and to provide anchors inside the video element that provide a URL to the actual transcript. The transcript-URL does not have such a double indirection and uses the URL directly on the video element.

So, in comparison the transcript-URL proposal has this markup:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <p>
   This is where the full transcript goes.
 </p>

When embedded in another page, it survives:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>

In addition, it poses no restriction on what markup is provided for fallback to older clients. It could include the transcript URL inside the video element if that is what the author would like, but it does not have to.

Summary: both proposals fulfill R1, but the IDREFs proposal is more fragile and has a double indirection that Web developers have to understand and browsers have to interpret to be able to provide discoverability of the transcript on the video element. Thus, the URL proposal provides a preferable solution.
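
The difference in lookup steps can be sketched with Python's standard HTML parser. This is a simplified illustration, not an implementation of either proposal: it assumes a single video element, a single IDREF, and that the referenced element is an <a> carrying an href; all function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class _Index(HTMLParser):
    """Record the video's transcript attribute and the href of every element with an id."""

    def __init__(self) -> None:
        super().__init__()
        self.transcript_attr: Optional[str] = None
        self.href_by_id: Dict[str, Optional[str]] = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video":
            self.transcript_attr = attributes.get("transcript")
        if "id" in attributes:
            self.href_by_id[attributes["id"]] = attributes.get("href")


def _index(markup: str) -> _Index:
    index = _Index()
    index.feed(markup)
    index.close()
    return index


def resolve_idref(markup: str) -> Optional[str]:
    """IDREFs style: attribute -> element with that id -> its href (two lookups)."""
    index = _index(markup)
    if index.transcript_attr is None:
        return None
    # Returns None when the referenced id is not present on this page.
    return index.href_by_id.get(index.transcript_attr)


def resolve_url(markup: str) -> Optional[str]:
    """URL style: the attribute value already is the address (one lookup)."""
    return _index(markup).transcript_attr


# Embedded copies of the video element only; the rest of the original page is gone.
idref_anchor_left_behind = '<video transcript="foo"><source src="video.mp4"></video>'
idref_anchor_inside = (
    '<video transcript="foo"><source src="video.mp4">'
    '<a href="http://this.domain/this.page#transcript" id=foo>English Transcript</a>'
    '</video>'
)
url_style = (
    '<video transcript="http://this.domain/this.page#transcript">'
    '<source src="video.mp4"></video>'
)
```

With these inputs, the IDREF resolves only when the anchor travels inside the video element; when the anchor was left on the original page, the reference dangles.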


R2 Choice to consume

This requires that users have the ability to control whether or not they consume a transcript. Both proposals fulfill this requirement.


R3 Rich text transcripts

This is a requirement that transcripts may be expressed in various rich text formats (such as HTML), and not just in plain text. Both proposals fulfill this requirement.


R4 Design aesthetics

This has two sub-requirements: one, that how transcripts are displayed be styleable by authors, and two, that it must be possible to expose transcripts in custom video controls. Both proposals fulfill each of these requirements.


R5 Embeddable

This requires that it be possible for transcripts to be expressed as an external document, while also embedded into the document which contains the media element. Both proposals fulfill this requirement (with <iframe>).

A further aspect of this requirement, which has not fully been expressed, is that the transcript link not be lost when the video is embedded elsewhere. This has been addressed in R1. It is noteworthy that the URL transcript proposal provides for a simple solution here without a double indirection, while the IDREFs proposal requires special markup inside the video element and a double indirection to allow embedding without loss of linkage. Or simply said: if an IDREF points to a section on the page that is not part of the video element, it will end up as a dangling reference (or worse: potentially pointing to something else on the embedding page) and the browser has no means to determine that this is wrong.


R6 Fullscreen support

This requires that it be possible for transcripts to "go fullscreen with the media element." The meaning of this is that when the video element goes fullscreen, the transcript should not be lost. Since both proposals allow the reference to the transcript to be included in the video's controls or a context menu and these do not get lost when going fullscreen, both proposals fulfill this requirement.


R7 Retrofitting

Existing pages publish transcripts for media elements through a visible transcript on the same page as the media player. Often this transcript is lost when the video is embedded in a different page (for example, this is the case for all YouTube videos). Thus, when we retrofit existing publications, we want both to continue supporting the publication of transcripts on the same page and to retain the link to the transcript when embedding. These requirements were a key motivation of this Change Proposal.

Like the IDREFs proposal, this proposal does not restrict any publication means of transcripts on the same page. This continues to work in UAs that do not support the <video> element or those that support it, but not the @transcript attribute. Existing author behaviour is not affected in either case.

However, the IDREFs proposal falls down when videos are embedded, but transcripts are not, which is a very common use case, too. The IDREFs proposal does not provide a good solution to this use case, but focuses specifically on pages where the transcripts are published on the same page.

The IDREFs proposal suggests that this proposal is fragile because Web developers may put into the URL just the relative link to the element on the page that contains the transcript, thus losing it when embedding. This is a possibility for the URL proposal, which needs to be overcome with education, examples, and validators that point out problems when relative URLs are used in the @transcript attribute. However, it is only a risk, whereas in the IDREFs proposal it is a certainty, since all links are relative. Thus, the IDREFs proposal is much more prone to this problem than this URL proposal.

Summary: the URL proposal fulfills R7 better than the IDREFs proposal, since it does not fail easily for the classic use case of embedding, which is the main need and the main advantage over existing transcript publication methods.
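
The kind of validator check mentioned under R7 could look roughly like the following sketch. It is an assumption about what such a check might do, not a description of any existing validator; the function name and warning texts are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def check_transcript_value(value: str) -> List[str]:
    """Return warnings for a transcript attribute value (empty list if none)."""
    warnings: List[str] = []
    url = value.strip()  # the value may be surrounded by spaces
    if not url:
        warnings.append("transcript must be a non-empty URL")
        return warnings
    parts = urlsplit(url)
    # Treat a value as absolute only if it has both a scheme and a host.
    if not (parts.scheme and parts.netloc):
        warnings.append(
            "relative URL: the reference may break when the element is embedded elsewhere"
        )
    return warnings
```

A value such as "#transcript" or "transcript.html" would be flagged, while "http://this.domain/this.page#transcript" would pass without warnings.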


R8 No link duplication

This requirement asks that transcript links are not duplicated. However, there are a multitude of use cases of transcripts and what this requirement asks for is to not duplicate the URL on the same page.

To compare the URL and the IDREFs proposals properly, one has to start with the case where a video is published with a transcript on the same page and made embeddable without losing the transcript link. Thus, we need to compare the IDREFs proposal example that has the link in an anchor inside the video element to the transcript-URL proposal:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="foo">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <a href="http://this.domain/transcript.html" hreflang=en id=foo>English Transcript</a>
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>

This contains a duplication of links. Thus, a better way to publish it would be:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="foo">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <a href="http://this.domain/this.page#transcript" hreflang=en id=foo>English Transcript</a>
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>


In comparison, the transcript-URL proposal looks like this:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>

It could also be published like this:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/transcript.html">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <iframe src="transcript.html">
 </iframe>

but that would make the transcript less usable when embedded because it links back to just the page with the transcript instead of the video and the transcript. This is therefore to be avoided.

In either case, the IDREFs proposal has to deal with the same question as the transcript-URL proposal: which URL is included in the anchor element inside the video element? Since the transcript-URL proposal always makes the same suggestion, namely to link to the current page with a fragment identifying the element that holds the transcript, it is less prone to URL duplication.

Summary: the transcript-URL proposal fulfills R8 better than the IDREFs proposal.


R9 Multiple transcripts

Transcripts may be available in several languages - this indeed happens in current transcript publication sites, including YouTube and TED. TED and YouTube have found two different means of dealing with this situation.

YouTube changes the language of the displayed transcript when the language of the captions that are displayed is changed. This makes sense for YouTube, because the transcripts are created from the captions and are just a different means of displaying the captions. This is, however, not a general requirement - in fact, it is preferable for transcripts to also contain scene descriptions and not just the spoken transcription, because that allows users to read the transcript without having to watch the video (a particularly important use case for deaf-blind users).

TED has found a different solution to this problem: where the transcript is displayed, there is also a menu that allows the choice of language that is displayed. This makes the transcript one semantic entity on a page and independent of the video element. It is therefore a preferable means of dealing with transcripts.

In the transcript-URL proposal, there is no restriction on the language in which the transcript is published: Web sites can continue to publish their transcripts in multiple languages and to provide a menu for changing the displayed language. The proposal simply provides programmatic linkage between the video element and the transcript; any specification of language has to be provided by the elements through which the transcript is published.

The IDREFs proposal is essentially the same in this respect: it also does not allow a language to be specified in the @transcript attribute. It does, however, allow multiple IDREFs to be specified, which could all point at elements in different languages (they don't have to, but they could). It could even support creation of a menu of transcripts in different languages on the video element, but the UA would have to parse the elements that the IDREFs point to in order to obtain this information. Thus, this is possible:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="foo1 foo2 foo3">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
  <a href="http://this.domain/this.page#transcript_en" hreflang=en id=foo1>English Transcript</a>
  <a href="http://this.domain/this.page#transcript_de" hreflang=de id=foo2>German Transcript</a>
  <a href="http://this.domain/this.page#transcript_fr" hreflang=fr id=foo3>French Transcript</a>
 </video>
 <h4 id="transcript">Transcript</h4>
 <p id="transcript_en" lang="en">This is the actual transcript.</p>
 <p id="transcript_de" lang="de" hidden>Dies ist das Transkript.</p>
 <p id="transcript_fr" lang="fr" hidden>Ceci est le transcripte.</p>
 <button>Display English transcript</button>
 <button>Display German transcript</button>
 <button>Display French transcript</button>

However, the only difference is that the choice of language happens in the video element rather than on the linked-to page. Once a user has followed the link to the page and wants to change the language, the choice still has to be offered on the page itself. This duplicates the selection mechanism. The key issue here is that the transcript is not handled as semantic content in its own right, but rather as content dependent on the video.

Also note that the IDREFs proposal requires the language markup to be duplicated: once on the link from which the transcript menu is created and once on the transcript elements themselves. This is another consequence of the double indirection.
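Because the language is stated twice, the two copies can drift apart. The following is a minimal Python sketch of a consistency check, not part of either proposal. The names `LangCollector`, `mismatched_languages` and `SAMPLE` are hypothetical, and the check assumes that every link targets a fragment on the same page.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LangCollector(HTMLParser):
    """Collect hreflang per link and lang per element that carries an id."""

    def __init__(self):
        super().__init__()
        self.links = []   # (fragment id, hreflang) for every <a href>
        self.langs = {}   # element id -> lang attribute

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            fragment = urldefrag(attrs["href"]).fragment
            self.links.append((fragment, attrs.get("hreflang")))
        if "id" in attrs and "lang" in attrs:
            self.langs[attrs["id"]] = attrs["lang"]


def mismatched_languages(markup):
    """Return (target id, hreflang, lang) wherever link and target disagree."""
    collector = LangCollector()
    collector.feed(markup)
    collector.close()
    return [
        (fragment, hreflang, collector.langs[fragment])
        for fragment, hreflang in collector.links
        if fragment in collector.langs and hreflang != collector.langs[fragment]
    ]


# Deliberately inconsistent sample: the link claims English, the target is German.
SAMPLE = """
<a href="http://this.domain/this.page#transcript_de" hreflang=en id=foo2>German Transcript</a>
<p id="transcript_de" lang="de" hidden>Dies ist das Transkript.</p>
"""

print(mismatched_languages(SAMPLE))
```

With a single @transcript URL there is only one place where the language is declared, so a check of this kind is not needed.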

In contrast, the transcript-URL attribute regards the transcript as a semantically rich piece of content in its own right that only requires a linkage to the video element. Thus, the above example becomes:

 <video poster="poster.jpg" controls aria-label="video with transcript" transcript="http://this.domain/this.page#transcript">
  <source type="video/mp4" src="video.mp4">
  <source type="video/webm" src="video.webm">
 </video>
 <h4 id="transcript">Transcript</h4>
 <p id="transcript_en" lang="en">This is the actual transcript.</p>
 <p id="transcript_de" lang="de" hidden>Dies ist das Transkript.</p>
 <p id="transcript_fr" lang="fr" hidden>Ceci est le transcripte.</p>
 <button>Display English transcript</button>
 <button>Display German transcript</button>
 <button>Display French transcript</button>

Also note that, because there is only a single link on the video element, there is no need to provide descriptive text: the link is simply always the word "transcript" (internationalized to the browser's default language). No further metadata is required, since the target transcript publication page provides all the required metadata and markup.
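How a UA might turn the attribute into that single menu entry can be sketched in a few lines of Python. This is an illustration under stated assumptions, not specified behaviour: the function name `transcript_menu_entry` is hypothetical, the fixed label stands in for the UA's localized string, and resolution against the page URL follows the usual rules for URL attributes.

```python
from urllib.parse import urljoin


def transcript_menu_entry(page_url, transcript_attr):
    """Return the (label, url) pair a UA could offer, or None without @transcript."""
    if not transcript_attr or not transcript_attr.strip():
        return None  # no attribute, so no programmatic link to offer
    # "Transcript" stands in for the UA's localized label (assumption).
    return ("Transcript", urljoin(page_url, transcript_attr.strip()))


# The same markup pasted into a page on another site (hypothetical URL).
embedding_page = "http://other.example/embed.html"
print(transcript_menu_entry(embedding_page, "http://this.domain/this.page#transcript"))
print(transcript_menu_entry(embedding_page, "#transcript"))
```

An absolute value yields the same entry wherever the markup is pasted, while a relative value such as `#transcript` resolves against the embedding page instead. This is why an absolute URL is preferable.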

Finally, it has been mentioned that this means that transcripts are handled differently from captions. This is intended and desirable. Captions are part of the video and are rendered on top of it, while a transcript is content that can stand alone in its own right. Transcripts therefore have to be handled as a separate concept, and the language choice should be made where the transcript is, not where the video is.

Summary: Both proposals support the publication of multiple transcripts in different languages, but the transcript-URL proposal works more directly with existing means of publishing transcripts, requires less new markup to do so, and therefore imposes fewer parsing requirements on the UA. The IDREFs proposal unnecessarily conflates the concept of "linking a transcript (area) to the video" with the concept of "choosing between different transcripts". Thus, the transcript-URL proposal fulfills R9 better than the IDREFs proposal.

R10 Stand alone transcripts

UAs can render stand-alone transcript documents, which are not programmatically associated with media elements, in the same way under both proposals. In both proposals, if the @transcript attribute is absent, there is no means to link programmatically between the video element and a transcript (which is potentially on another page). In both proposals, if the transcript is published on the same page as the video, it is available to all users, whether or not their browser supports the <video> element. The transcript-URL proposal further suggests the use of @aria-label to make AT users aware of the availability of the transcript, but this means is also available to the IDREFs proposal. So, both proposals satisfy this requirement in the same way.

Conclusion

The IDREFs proposal is a more fragile solution to the requirement of programmatically associating transcripts with video elements. To survive cross-document copy and paste operations, it requires extra markup inside the video element to contain the link that the transcript-URL proposal places directly into the @transcript attribute. To add menu elements inside the video controls that allow users to link directly to the transcript, it requires the browser to parse those elements inside the <video> element that point to the transcripts themselves, thus requiring the UA to resolve a double indirection. The latter is particularly fragile since it requires the Web developer to create markup inside the video element that serves both as fallback content and as a menu for the video element, thus overloading an area that has one use with a second use.

One example from the IDREFs proposal gives a good idea of the challenges that UAs face when adding menu elements to the video element:

 <video src=video.mp4 transcript="foo bar">
   <div id=foo lang=en>English transcript goes here</div>
   <p>A <a href=bar.html hreflang=de id=bar>German language transcript</a> is available as well.
 </video>

In this example, the UA would need to parse the text inside the video element to come up with a menu that reads:

  • English transcript (with a link of "#foo" and an hreflang="en")
  • German language transcript (with a link of "bar.html" and an hreflang="de")
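The double indirection can be made concrete with a minimal Python sketch that derives such a menu from the example markup. It is an illustration only, not UA code: `IdIndex`, `transcript_menu` and `LANGUAGE_NAMES` are hypothetical names, the language table stands in for the UA's localization data, and the label for an inline transcript is generated from its lang attribute, which is an assumption about how a UA might name it.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a UA's localized language names.
LANGUAGE_NAMES = {"en": "English", "de": "German", "fr": "French"}


class IdIndex(HTMLParser):
    """Index every element that carries an id, plus the text of <a> elements."""

    def __init__(self):
        super().__init__()
        self.transcript_refs = []  # IDREFs listed in video@transcript
        self.by_id = {}            # id -> {"tag", "attrs", "text"}
        self._open_link = None     # id of the <a> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self.transcript_refs = (attrs.get("transcript") or "").split()
        if "id" in attrs:
            self.by_id[attrs["id"]] = {"tag": tag, "attrs": attrs, "text": ""}
            if tag == "a":
                self._open_link = attrs["id"]

    def handle_data(self, data):
        if self._open_link is not None:
            self.by_id[self._open_link]["text"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._open_link = None


def transcript_menu(markup):
    """Return (label, link, language) menu entries for video@transcript."""
    index = IdIndex()
    index.feed(markup)
    index.close()
    menu = []
    for ref in index.transcript_refs:      # first indirection: IDREF -> element
        element = index.by_id.get(ref)
        if element is None:
            continue                       # dangling IDREF: nothing to offer
        attrs = element["attrs"]
        if element["tag"] == "a" and attrs.get("href"):
            # second indirection: element -> URL of the off-page transcript
            menu.append((element["text"].strip(), attrs["href"], attrs.get("hreflang")))
        else:
            # inline transcript: link to the element itself, label from its lang
            lang = attrs.get("lang")
            name = LANGUAGE_NAMES.get(lang)
            label = name + " transcript" if name else "Transcript"
            menu.append((label, "#" + ref, lang))
    return menu


EXAMPLE = """
<video src=video.mp4 transcript="foo bar">
  <div id=foo lang=en>English transcript goes here</div>
  <p>A <a href=bar.html hreflang=de id=bar>German language transcript</a> is available as well.
</video>
"""

for entry in transcript_menu(EXAMPLE):
    print(entry)
```

Even in this small sketch the UA has to index the fallback content, distinguish inline transcripts from links, and invent a label for the inline case. None of these steps are needed when the attribute holds the URL directly.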

While possible, resolution of such double indirection is not something HTML has required elsewhere before, and it is particularly dependent on good authoring practice. (Note, e.g., that the bar.html link will not survive copy-and-paste to a different domain. Also note that if the English transcript is also meant to be published on the same page as the video, the div will need to be duplicated.)

In essence, the IDREFs proposal optimizes for a single use case: the one where the transcript is published on the same page as the video. This use case does not even require a solution, since the transcript is on the same page as the video and therefore discoverable. For the second use case, the one where the transcript is published on another page, the IDREFs proposal requires a double indirection: the attribute has to point to another element, which in turn contains the off-page link. Thus, the transcript-URL proposal is an optimisation of the IDREFs proposal.

Summary: The transcript-URL proposal fulfills our requirements better than the IDREFs proposal.