The Hypertext CG plans to hold an informal gathering or two on the subject of Accessibility of Media Elements in HTML 5. The media elements are audio and video, together with their supporting elements such as source.
The current specification of the timed media elements in HTML5 takes a fairly hard-nosed approach to what is presented as timed media: only what is inside the media files selected from the sources. There is currently no provision for linking or synchronizing other material, and no discussion of how to make the media accessible. This needs addressing.
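As a concrete illustration of the point above, here is a minimal sketch of the markup under discussion (the file names are invented for the example). The user agent selects the first source it can play, and everything presented to the user comes from inside that media file; there is no standard hook for attaching captions, descriptions, or a transcript:

```html
<!-- Sketch only: the browser picks the first playable source. -->
<video controls>
  <source src="talk.ogv" type="video/ogg">
  <source src="talk.mp4" type="video/mp4">
  <!-- Fallback content for user agents without video support -->
  Your browser does not support the video element.
</video>
```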
We would like to understand the 'landscape' and put in place good architectural support in general, as well as make sure that specific solutions exist for the more pressing problems. We anticipate working, in public, to develop proposals for any changes to specifications that might be suggested by the work, and also to develop a cohesive 'best practices' document that shows how those provisions can be used, by authors, by user agents (browsers), and by users, to address the issues we identify.
We are aware that good accessibility rests on four legs (at least):
It is easy to fail on any one of these, and good accessibility is then not achieved.
Accessibility provisions for timed media might themselves be timed (e.g. captions) or un-timed (e.g. a readable screenplay or transcript). We wish to consider both categories.
The questions we would like to address include, but are not limited to, the following:
We are all aware of captioning for those who cannot hear the audio; less common is audio description of video, for those who cannot see. The BBC recently had some content with optional sign-language overlays. Issues can also arise from viewer susceptibility (e.g. flashing video and photosensitive epilepsy, color-vision deficiencies, and so on).
We are all aware of the existence, for example, of screen readers and perhaps even Braille output devices. We've seen tags in other parts of HTML that are there to support accessibility, and frameworks such as ARIA. Are there existing good practices that naturally extend to Timed Media?
There have been ongoing debates about whether 'unique' provisions for accessibility (functions with no other purpose) are desirable. We do not intend to have this philosophical debate, but it would be useful to hear of related problems and opportunities that help make the debate irrelevant. For example, the provision of a transcript or separately accessible captions, in text form, makes indexing and searching content much easier. Are there problems like this that we can address that will make it more likely that authors build accessible timed media?
Much of the work and research in this area has been done for isolated, analog systems (classic television). Instead, we now have digital content presented in a rich context (web content). What new opportunities and solutions does this open up?
The work of the W3C on a common Timed Text format, and the existence of general frameworks such as ARIA (Accessible Rich Internet Applications), suggest that there are pieces of the solution space we should consider. What are they?
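One such piece of the solution space, the W3C Timed Text work, expresses captions as timed text cues in an XML document separate from the media file itself. A minimal sketch (the cue text and timings here are invented, and element details vary between drafts of the format) looks roughly like this:

```xml
<!-- Sketch of a Timed Text caption document: each p carries
     its own begin/end times, independent of the media file. -->
<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="0s" end="3s">Hello, and welcome.</p>
      <p begin="3s" end="6s">[door slams]</p>
    </div>
  </body>
</tt>
```

Because the captions live in a separate, text-based document, they can be indexed, searched, and restyled independently of the audio and video, which is exactly the kind of opportunity noted earlier.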
We are aware that there are a number of pioneering organizations in this area. The BBC's work with sign-language has already been noted; workflows for captioning content have been developed in a number of places. There have been script-based experiments on captioning. What are some of these systems and experiments, and what can we learn from them?
We think that at least the following communities and groups might be affected:
If you feel prepared to attend, present, and work cooperatively on the problem outlined in the Scope section, please respond to the questionnaire as soon as possible. There is no registration fee, but registration is required. W3C membership is not required in order to participate in the gathering.
To attend the gathering, you must come prepared to present on one of the questions in this document, or another suitable question, drawing on your experience or expertise to help inform the discussion and make progress on proposing solutions.
We expect the gathering to spend perhaps two-thirds of the time on these presentations, with short Q&A for each. Then we may have a panel session or two, or moderated discussion, to address focused questions. As stated in the introduction, we are looking for a framework and solutions with good 'longevity', simplicity, and efficacy, that will be embraced by the standards community, content authors, user agent developers, and end users. This is ambitious but achievable, we believe, and opportunities such as this to 'get it right from the start' come up all too rarely.
This gathering is done under the auspices of the HyperText Coordination Group.
Philippe Le Hégaret, W3C
This informal gathering will last one day, and the first one will be held in the Bay Area on November 1st at Stanford University. The meeting place is 20 minutes away from the TPAC 2009 hotel (see directions).
Stanford University
Tresidder Union building, 2nd floor
459 Lagunita Dr,
Palo Alto, CA
There are numerous way-finding signs on campus for both Tresidder Union and the Faculty Club, which is next door. Free weekend parking is available in the lot across from Tresidder Union.