From HTML WG Wiki
Mint a transcript attribute for the programmatic association of transcripts with media elements
In order to programmatically associate media elements with transcripts, we should use a transcript="" attribute which may take zero or more IDREFs to elements elsewhere in the document.
This is for ISSUE-194 (full-transcript).
There are two proposals for adding a media transcript feature to HTML: this proposal (henceforth "our proposal", "this proposal", or "the IDREF proposal") and the Introduction of a @transcript=URL attribute proposal (henceforth "the URL proposal"). As these proposals are quite similar (and are based on the same research) I have separated out rationale which is equally applicable to both proposals, and have separately provided rationale for the differences that remain between the two proposals.
Rationale common to both proposals
When transcripts of media files are available, they are useful to all users. Users of Assistive Technology (AT) obviously benefit from transcripts, but transcripts are also useful to other users.
Consider a video of a college lecture. Students can save time by reading the transcript instead of watching the video. It's also much easier for students to search or to skim for specific content in the transcript than to do so with the media file itself. Given this, it is important for transcript links to be readily exposed to all users.
Transcripts need to be programmatically associated with media elements in order for a UA to expose the presence of the transcript in its media controls, in a context menu, or in some other way, and also so that AT can expose the transcript to its users.
Both proposals aim to address two use cases:
UC1 linked transcripts: full text transcripts are provided with the media resource in separate but linked resources.
UC2 same-document transcripts: full text transcripts are provided as text on the same page of the media resource.
Examples of transcripts published underneath the video on-page:
Design of a transcript attribute which takes multiple IDREFs
We can associate the media element with visible transcripts (or links to them) somewhere else in the document. To do this, we would add a transcript="" attribute to the media elements which would take a space-separated set of IDREFs. For each such IDREF, if the ID is that of an <a>, <area>, or <iframe> element, the document pointed to by the href="" (in the <a> and <area> cases) or src="" (in the <iframe> case) attribute is taken to be the transcript of the media. If the element with the given ID is not an <a>, <area>, or <iframe> element, the element itself is taken to be the media's transcript.
<video src=video.mp4 transcript="foo bar"></video> <p>Transcripts are available in <a href=foo.html hreflang=en id=foo>English</a> and <a href=bar.html hreflang=de id=bar>German</a>.
or, in the same-document case,
<video src=video.mp4 transcript="foo"></video> <div id=foo>Transcript goes here</div>
This design fulfills the basic need for programmatic association of transcripts with media elements, and it's possible to link to same-document transcripts as well as external resources.
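The resolution rule described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposed UA algorithm: the Element class and the resolve_transcripts function are hypothetical names, the document is modelled as a flat list of elements, link elements (<a>, as in the example markup, and <area>) are resolved through href, <iframe> through src, IDREFs that match nothing are silently skipped, and a link element lacking its URL attribute is treated as an inline transcript.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Hypothetical, simplified stand-in for a DOM element."""
    tag: str
    id: Optional[str] = None
    attrs: dict = field(default_factory=dict)


def resolve_transcripts(transcript_value, elements):
    """Resolve a transcript="" value to a list of (kind, target) pairs.

    kind is "external" (target is a URL string) or "inline" (target is
    the referenced element itself). Unmatched IDREFs are skipped, which
    is an assumption of this sketch.
    """
    by_id = {}
    for el in elements:
        # Assumption: the first element carrying a given ID wins.
        if el.id is not None and el.id not in by_id:
            by_id[el.id] = el

    resolved = []
    # str.split() splits on any whitespace, approximating "space-separated".
    for idref in transcript_value.split():
        el = by_id.get(idref)
        if el is None:
            continue
        if el.tag in ("a", "area") and "href" in el.attrs:
            resolved.append(("external", el.attrs["href"]))
        elif el.tag == "iframe" and "src" in el.attrs:
            resolved.append(("external", el.attrs["src"]))
        else:
            resolved.append(("inline", el))
    return resolved
```

Called with the value "foo bar" and the two links from the first example, the sketch yields the two linked documents in attribute order; called with the same-document example, it yields the <div> itself.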
This technique is fairly straightforward to author; it is no harder than the existing <input id> pattern. This technique closely matches existing content which contains transcript links, so it's exceptionally easy to update existing content which publishes transcripts to use this markup pattern.
The simplest way to ensure that the transcript link is readily exposed to all users (including users of older UAs and ATs) is to encourage or even mandate that authors include this link directly in the visible text of the document, or directly as (part of) the constituent text of the document. Relying on UAs to expose transcript links in a context menu could be problematic on touch devices (which lack context menus). Relying on UAs to expose such links in their default video controls means that users suffer when Web site authors use custom video controls and fail to expose the transcript in their custom controls.
Comparing the two proposals
In the (withdrawn) Introduction of a transcript element Change Proposal, ten requirements for transcripts were defined. Let's examine the merits of each requirement and how the mechanisms in the two remaining Change Proposals fare.
R1 Discoverability
This is a requirement that transcripts be both human-discoverable and machine-discoverable. Our mechanism fulfills this requirement. In the URL proposal, authors may use a direct URL in the transcript="" attribute. If authors do this, sighted users will not be able to discover the transcript in several circumstances:
- in existing User Agents (which do not implement a transcript mechanism),
- in future User Agents which expose transcript="" to AT but do not provide transcript access in their default media controls,
- and in future User Agents which expose transcript="" to AT and expose transcript access in their default media controls, on sites which use custom media controls that do not provide transcript access.
Summary: the IDREF proposal fulfills R1 better than the URL proposal.
R2 Choice to consume
This requires that users have the ability to control whether or not they consume a transcript. Both proposals fulfill this requirement.
R3 Rich text transcripts
This is a requirement that transcripts may be expressed in various rich text formats (such as HTML), and not just in plain text. Both proposals fulfill this requirement.
R4 Design aesthetics
This has two sub-requirements: one, that how transcripts are displayed be styleable by authors, and two, that it must be possible to expose transcripts in custom video controls.
Our proposal encourages transcript links to be visible; authors are familiar with their ability to style visible page content.
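The URL proposal encourages transcript links to be directly present in the transcript="" attribute. While it is possible to display and style attribute content with CSS (for example with the content property and the ::after pseudo-element), it is far more difficult to do so.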
Both proposals make it possible to expose transcripts within custom video controls. However, in the URL proposal, authors are encouraged to directly include a link in the transcript="" attribute. Bare URLs lack descriptive titles and language metadata; custom controls exposing such transcripts would lack any way for the user to know which one to choose (see also R9).
Summary: the IDREF proposal fulfills R4 better than the URL proposal.
R5 Embeddable
This requires that it be possible for transcripts to be expressed as an external document, while also embedded into the document which contains the media element. Both proposals fulfill this requirement (with <iframe>).
R6 Fullscreen support
This requires that it be possible for transcripts to "go fullscreen with the media element." To the extent that I can make sense of this requirement, both mechanisms fulfill it. That is, there is nothing in either mechanism that forbids or prevents this.
R7 Retrofitting
As noted in the URL proposal, the vast majority of existing pages which publish transcripts for media elements show a visible transcript on the same page as the media player. It should be as easy as possible to alter such pages to programmatically associate the visible transcript with the media element. This requirement was a key motivation of the design of the mechanism advocated in this Change Proposal.
Our mechanism readily exposes transcript links to users, which helps it work well in UAs that do not support the <video> element, and also in UAs that support <video> but not transcript="".
The URL proposal matches existing author behavior less closely, and so increases the authorial effort required to retrofit existing pages. While it's possible in the URL proposal to link to in-page transcripts, because absolute URLs are allowed in its transcript="" attribute, authors are much more likely to simply link directly to the transcript, thus either duplicating the link (see R8) or failing to provide the transcript to users of older browsers (see R1).
Summary: the IDREF proposal fulfills R7 better than the URL proposal.
R8 No link duplication
Our mechanism fulfills this requirement. In fact, this requirement was a key motivation of the design of the mechanism advocated in this Change Proposal.
In the URL proposal, authors who wish to provide a visible transcript link will most likely duplicate the link. We should avoid duplicating the link to the transcript, because such duplicated data tends to bit-rot, thus harming accessibility. [Çelik, Doctorow]
Summary: the IDREF proposal fulfills R8 better than the URL proposal.
R9 Multiple transcripts
Transcripts may be available in several languages; the mechanism we come up with should straightforwardly allow authors to link to multiple transcripts. Our mechanism fulfills this requirement. In fact, this requirement was a key motivation of the design of the mechanism advocated in this Change Proposal.
You can link to many transcripts, and you can use the existing hreflang="" attribute to hint to the UA about the language that each transcript is in. Because the association is from the media element to the transcript elements, it's especially easy for UAs to find all of the media element's transcripts (without having to process the entire DOM).
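As an illustration of how a UA or a custom control might use the hreflang="" hints when several transcripts are listed, here is a minimal Python sketch. The function name pick_transcript, the (language, target) pair representation, and the matching rule (compare only the primary language subtag, then fall back to the first listed transcript) are all assumptions made for this example; neither proposal specifies a selection algorithm.

```python
def pick_transcript(candidates, preferred_languages):
    """Pick a transcript target for a user.

    candidates: list of (language_tag, target) pairs in attribute order;
        language_tag may be None when no language hint is present.
    preferred_languages: the user's languages, most preferred first.
    Returns the chosen target, or None if there are no candidates.
    """
    def primary(tag):
        # "de-AT" -> "de"; a deliberately simple matching rule.
        return tag.split("-")[0].lower()

    for wanted in preferred_languages:
        for lang, target in candidates:
            if lang and primary(lang) == primary(wanted):
                return target

    # Assumption: with no language match, offer the first listed transcript.
    return candidates[0][1] if candidates else None
```

With the English and German links from the earlier example, a user whose preferences are German then English would be offered bar.html, while a user with no matching preference would be offered foo.html.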
The URL proposal does not allow for the provisioning of multiple transcripts. Even if it were altered to do so, its mechanism fails to provide language, title, or other such metadata for each transcript. This harms the user's ability to choose the correct transcript, and the UA or page author's ability to expose multiple transcripts in custom or built-in video controls.
Summary: the IDREF proposal fulfills R9; the URL proposal does not.
R10 Stand alone transcripts
Our mechanism fulfills this requirement; that is, it is possible for UAs to render transcript documents which are not programmatically associated with media elements.
In the URL proposal, authors may use a direct URL in the transcript="" attribute. If authors do this, the transcript link will not be available in browsers that do not support or do not render audio or video elements.
Summary: the IDREF proposal fulfills R10 better than the URL proposal.
Surviving cross-document copy-and-paste operations
The programmatic association of the <video> with its transcripts might not be maintained through a cross-document copy-and-paste operation, though this is primarily a function of the distance in the DOM between the media element and the element representing the transcript, and not the actual form of programmatic association.
To completely avoid the copy/paste problem, the elements pointed to by transcript="" could be contained within the media element's subtree. Such content will be displayed in browsers which do not support HTML5 media elements, thus serving users of such browsers.
<video src=video.mp4 transcript="foo bar"> <object …> <!-- Fallback player for browsers that don't support media elements --> </object> <div id=foo lang=en>English transcript goes here</div> <p>A <a href=bar.html hreflang=de id=bar>German language transcript</a> is available as well. </video>
Sites which provide for the embedding of media often provide a <textarea> for easily copying their embed markup. Such sites can include markup using whatever mechanism we decide on, thus reducing the impact of the copy-paste problem even further.
Summary: the URL proposal handles the copy-paste scenario somewhat better than the IDREF proposal, but this is not a serious problem in practice.
The IDREF design fulfills our requirements better than the URL proposal.
N.B. The spec changes described below are intended to fully describe the sorts of changes necessary, but the exact form of the changes to be made is left to the discretion of the editor(s). (This is not a diff that can be blindly applied to the specification. Should the editor(s) find this description difficult to apply unambiguously, the author of this Change Proposal volunteers to work with them and the WG to resolve any such ambiguities identified.)
New section on transcripts
Add a section defining the transcript="" attribute.
Transcripts for media elements may be provided, either directly in the text of the page, indirectly by linking to an external document with an <a> or <area> element, or by transclusion with an <iframe> element. To programmatically associate such a transcript with a media element, a transcript="" attribute on the media element may be used.
The media element can be associated with zero or more transcripts, known as the media element's transcripts, by using the transcript attribute.
Except where otherwise specified by the following rules, a media element has no transcript.
The transcript attribute may be specified to indicate a transcript with which the media element is to be associated. If the attribute is specified, the attribute's value, when split on spaces, must be a list of IDs of elements in the same Document as the media element. If the attribute is specified and there is an element in the Document whose ID is equal to one of the entries in the transcript attribute, then that element is one of the media element's transcripts.
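The authoring requirement above (every entry in the attribute's value, when split on spaces, must be the ID of an element in the same Document) lends itself to a simple conformance check. The Python sketch below is an illustration only: the function names are hypothetical, the document is reduced to a collection of its element IDs, and "spaces" is assumed to mean the HTML space characters (space, tab, line feed, form feed, carriage return).

```python
import re


def split_on_spaces(value):
    """Split an attribute value on runs of HTML space characters."""
    return [token for token in re.split("[ \t\n\f\r]+", value) if token]


def unresolved_transcript_idrefs(transcript_value, document_ids):
    """Return the IDREFs in a transcript="" value that match no element ID.

    An empty result means the value satisfies the rule sketched here;
    an empty attribute value (zero IDREFs) is treated as valid.
    """
    known = set(document_ids)
    return [idref for idref in split_on_spaces(transcript_value)
            if idref not in known]
```

A validator built along these lines could report each unresolved IDREF to the author, much as existing checkers do for a for="" attribute that points at a missing ID.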
Modifications to the-video-element and the-audio-element
In #the-video-element, update the note beginning with the sentence "In particular, this content is not intended to address accessibility concerns". Specifically, change the sentence
For users who would rather not use a media element at all, transcripts or other textual alternatives can be provided by simply linking to them in the prose near the <video> element.
to reference this new mechanism.
Make a similar edit to the same note in #the-audio-element.
Positive Effects
- By programmatically associating transcripts with media elements, we enable users, both assistive technology users and otherwise, to more easily access transcripts.
- It's easy to update existing content to use this markup pattern, so it's easy for authors to adopt this technique.
- We avoid duplicating the link to the transcript, thus preventing the link presented to AT users from falling out of sync with the link presented to others.
- You can link to many transcripts, and you can use the existing hreflang="" attribute to hint to the UA about the language that each transcript is in.
- It's possible to link to same-document transcripts as well as external resources.
- It degrades well in UAs that don't support the <video> and <audio> elements, as well as in UAs that support <video> and <audio>, but have not yet been updated to support programmatically associated transcripts.
Negative Effects
- It's more difficult to programmatically associate a transcript link than it is to simply include the link in prose near a media element. Therefore it's reasonable to expect content authors to not bother with the programmatic association. (This is true for all methods of programmatically associating a transcript with a media element.)
Conformance Classes Changes
The transcript="" attribute is allowed on <video> and <audio> elements.
Risks
- UAs might not implement this mechanism, thus causing us to drop it from the specification in due course.
- Authors might not adopt this mechanism.