SMIL Accessibility -- My Impressions
by George Kerscher, 2/25/98

W3C working group meetings are confidential, and it is important to protect the intellectual property and the software development of participating W3C members. However, it is also important to report on developments to related W3C and other working groups. For this reason, this report lacks details that would infringe on the SMIL working group's rights. At the same time, I intend to provide sufficient detail for discussions within the WAI and other groups.

DAISY CONSORTIUM

The DAISY Consortium is moving forward with SMIL implementations. Many implementations under development relate to the "Digital Talking Book" (DTB) application. Teams at the Japanese Society for the Rehabilitation of Persons with Disabilities (JSRD), at Recording For the Blind & Dyslexic (RFB&D), and at least three vendors to the DAISY Consortium are all working on SMIL implementations in both the authoring and the playback components. At the recent interoperability testing, the DTB was tested and reviewed. There are no major outstanding issues with this application of SMIL.

GENERAL ACCESSIBILITY OF SMIL

Many people are concerned about the accessibility of multimedia presentations. Accessibility issues were discussed at length at the recent W3C SMIL interoperability meeting. These discussions and demonstrations resulted in some proposed changes to the specification. While it was rather late to introduce changes, the group felt it was important to introduce some mechanisms for accessibility purposes. After more discussion, we realized that these enhancements actually benefit everybody.

I demonstrated my adaptive equipment to the SMIL team members who were present. I focused on explaining 1) keyboard control and general navigation, and 2) alternatives to various media types that are not inherently accessible. The demonstration was primarily an awareness-raising session.
The group was interested and asked many questions. Some of the members had previous experience with captioning, but the issues surrounding vision-related disabilities were new to them. After discussion, some new constructs were introduced, and two people were asked to prepare a proposal for the mechanisms that will better support accessibility. These recommendations were introduced in the following weeks, and four additional attributes were introduced to aid accessibility. I will not go into details here, but I can say that it now seems clear how sophisticated multimedia presentations can be made accessible. For the first time it became clear to me how this would work.

ACCESSIBLE MULTIMEDIA PRESENTATION

NOTE: I would like to clarify the difference between a multimedia presentation and general software applications that use multimedia. SMIL presentations use multimedia components to deliver prepared presentations. This is NOT the same as an application that uses sound and graphics; SMIL has no control over what an application might do. SMIL schedules and delivers media types using open specifications. In other words, don't expect video games to fall into the category of a SMIL presentation.

Media types fall into several categories: video, audio, spoken audio, text, images, and animation. Under the newly proposed enhancements, it will be possible to specify that media types belong to a group, and the user will be able to enable and disable certain characteristics within each group. For example, in the spoken audio group, a person with a hearing disability could choose to turn on captioning. The specification already provides for synchronization down to 1/30 of a second, and with the introduction of this enabling switch, the user could turn on captioning for this media type. Of course, the author would need to provide this textual information when preparing the presentation.
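As a rough sketch, such a user-controlled caption switch might look like the following in SMIL markup. The element and attribute names here (textstream, system-captions) follow the test-attribute mechanism that was under discussion at the time, and the file names are purely illustrative assumptions:

```xml
<par>
  <!-- the spoken audio of the presentation -->
  <audio src="narration.au"/>
  <!-- timed caption text, rendered only when the user's
       preference for captions is switched on -->
  <textstream src="captions.rt" system-captions="on"/>
</par>
```

Because the test attribute is evaluated against the player's preference settings, the same document serves both audiences: with captions off, only the audio plays; with captions on, the synchronized text stream is rendered as well.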
Video and animations are best made accessible to persons who are blind through video descriptions. For the video and animation media types, the option to turn on descriptions would enable all videos and animations in the SMIL presentation to be described as the default mode for that user; each media object then inherits this characteristic from its group. We are defining a mechanism to set user preference settings that accompany multimedia presentations.

Atemporal media types such as images and text require an additional behavior. The specification would provide for alt text and long descriptions of images and other data types. When the alternative setting is turned on for this group, the alternative text is exposed either through human narration or through standard text. The difference here is that the system must pause while the end user reviews the information in the atemporal element.

EXAMPLE

It may be instructive to explain how I envision an accessible multimedia presentation using SMIL. Imagine the screen divided into four essential parts: upper left quarter, upper right quarter, lower right quarter, and lower left quarter. The user first sets the preferences of the SMIL player to display captioning and to enable non-visual use. The controls all have keyboard equivalents: pausing, resuming, making selections within a media object, and changing the focus (active point) are all available through standard keyboard conventions for the operating system the player is implemented under.

The SMIL presentation starts with a news logo and its audio icon played in the upper right quarter. This goes to black, and a video of a newscaster standing in front of a world map begins. The captioning, below the video, echoes the newscaster's words in text, synchronized with his script. The video description briefly describes the map behind the newscaster's desk.
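The four-quarter screen and the opening scenes just described might be declared with SMIL layout regions along these lines. This is only a sketch: the region names, dimensions, and source file names are my own illustrative assumptions, not part of any presentation that exists.

```xml
<smil>
  <head>
    <layout>
      <root-layout width="640" height="480"/>
      <region id="upper-left"  left="0"   top="0"   width="320" height="240"/>
      <region id="upper-right" left="320" top="0"   width="320" height="240"/>
      <region id="lower-left"  left="0"   top="240" width="320" height="240"/>
      <region id="lower-right" left="320" top="240" width="320" height="240"/>
    </layout>
  </head>
  <body>
    <seq>
      <!-- opening: news logo with its audio icon -->
      <par>
        <img   src="news-logo.gif" region="upper-right" dur="5s"/>
        <audio src="logo-icon.au"/>
      </par>
      <!-- first newscaster, with captions and a video description
           that are rendered only if the user's preferences enable them -->
      <par>
        <video      src="anchor1.mpg"     region="upper-right"/>
        <textstream src="anchor1-caps.rt" region="upper-right" system-captions="on"/>
        <audio      src="anchor1-desc.au"/>
      </par>
    </seq>
  </body>
</smil>
```

The seq element plays its children one after another, while each par element schedules its children in parallel, which is what synchronizes the captions and description with the video.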
In the lower left, another video opens of a second newscaster, who begins a discussion with the person in the other video. The presentation suggests that the two people are in different locations in the world. This video inherits the same characteristics as the other video, so it has captioning and video descriptions enabled. As they talk, a map (image) of the region in question is displayed in the upper left. At this point the videos halt. This happens automatically, because alternatives to the images were specified. The human-narrated long description is played, and when it ends, the presentation continues. (Note: it would also be possible to provide the map's description as text, which the person could read with a screen reader or a dynamic Braille device.)

In the lower right corner, images of buildings noted on the map are displayed, and as each one is placed on the screen, the system pauses so the user can examine the alt text or the long description of the image.

Next, the map in the upper left animates to show change over time. The video description explains the information being animated. It may be necessary for the video description to run longer than the animation; in this case, the last frame of the animation freezes while the description continues. When the description is over, the presentation continues.

Finally, all goes to black, the news logo with its audio icon is played, and the presentation ends.

NOTE: It is probable that a presentation with accessibility components enabled will take longer than one where they are not activated. This shows some of the power of a SMIL presentation: the presentation takes on different characteristics depending on whether certain features are turned on.

AUTHORING

It is clear from the description above that authors will need to create the content associated with the various media types. All the synchronization is specified in the SMIL presentation.
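For instance, the map image from the example above might carry its short and long alternatives as attributes the author supplies. The alt and longdesc names mirror the HTML convention for image alternatives; the file names are assumptions, and the pause-and-resume behavior described above is a function of the player's preference settings, not of this markup alone:

```xml
<!-- the map of the region under discussion; when the user's
     alternative setting for images is on, the player pauses the
     running media while the alt text or long description is
     reviewed, then resumes the presentation -->
<img src="region-map.gif" region="upper-left"
     alt="Map of the region under discussion"
     longdesc="region-map-description.txt"/>
```
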
The grouping and the attributes are all specified in the SMIL notation. It is up to the authors to provide the needed information.

AUTHORING TOOLS

It is essential that authoring tools make it easy for content creators to produce this type of information.

PLAYERS

Players need to be created with accessibility in mind. Keyboard equivalents and navigation alternatives must be considered.

George Kerscher, Project Manager (PM) to the DAISY Consortium
Recording For the Blind & Dyslexic
Email: kerscher@montana.com
Phone: 406/549-4687