Important note: This Wiki page is edited by participants of the RDWG. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Working Group participants, WAI, or W3C. It may also have some very useful information.

Smart Images


Images on the Web are evolving from static JPEG and GIF files to structured formats such as SVG and dynamic images produced using the canvas element. The potential exists for descriptive information about the image to be closely coupled with, and perhaps contained within, the image format itself. The concept of Smart Images is put forward as an approach where images directly contain sufficient information, exposed through a standard interface, to allow access in a user's preferred modality. Smart Images may incorporate structured text, sonification, and haptic information, which can be used to describe the image and to facilitate exploration and understanding via available technologies (e.g. screen readers, tactile embossers, devices incorporating haptic feedback, or 3D printers).


Page author(s): Markku T. Hakkinen, Educational Testing Service


Image accessibility, alternate formats, image description, sonification, haptics, structured images, interactive images, simulations


Images have posed challenges to users with disabilities from the very beginning of the Web. Though the W3C Web Content Accessibility Guidelines are clear in defining that images with informational content must provide alternate text (via the alt attribute), there remains uncertainty over how to provide longer descriptions, a key requirement for accessibility of complex images and diagrams. With the growing utilization of SVG and HTML5 Canvas for creating images on the Web, the question arises whether the alt attribute or the proposed long description alternatives are adequate as a mechanism for image description. SVG, for example, provides a framework for incorporating accessibility information [1] [2], and examination of how to incorporate WAI-ARIA in SVG is underway [3]. A W3C Community Group, Accessible Infographics, has been formed [4], and, external to W3C, the DIAGRAM Center [5] is a project examining several aspects of image description, including research and standards, and is defining a content model for image descriptions. Researchers in academia and industry are exploring a variety of approaches to providing accessibility to images, ranging from sonification to virtual tactile/haptic interactions, with potential application ranging from the near term to several years out.

Smart Images, as defined in this proposal, are structured images that may incorporate or be a component of interactive content, such as simulations, and that contain within the image object sufficient descriptive information to provide an accessible rendering in multiple explorable modalities, whether spoken, sonified, tactile, haptic, or 3D printed.

This proposed symposium will bring together researchers engaged in exploring Smart Images and accessibility and seek to identify developments likely to have impact on ongoing standardization efforts. In addition, opportunities for research and collaboration will be highlighted.


Images have been a part of the Web since its early days, as has a basic mechanism for providing a textual description in the form of the alt attribute. Textual description of images with informational content has, nonetheless, been a longstanding accessibility challenge. Though guidance on the use of the alt attribute is well established and content authors are better informed on its use, there remain numerous examples in the wild of images without alternate text descriptions or with inappropriate descriptions.

Many images require descriptions beyond what can be provided with the alt attribute. Examples include complex diagrams and charts. Longdesc, first proposed in 1997, has provided an at-times controversial mechanism to associate a rich text description with an image. Both authored content that uses it and user agent support for it have been limited, and the longdesc attribute was dropped from the HTML5 specification and subsequently reintroduced as an extension specification [6]. An alternate approach to providing image descriptions has been proposed using WAI-ARIA, with no clear consensus as to whether longdesc or ARIA is the better approach. Applicable research on presentation of and interaction with image descriptions has been a missing link in providing guidance to those promoting either of the two approaches.

While bitmapped images (e.g. JPEG, GIF, PNG) have been dominant formats on the Web, the native support of SVG in user agents is growing, as is the use of the HTML5 canvas element. With structured image formats, the possibility exists to incorporate accessibility information within the image itself and expose that information to users of assistive technology. Examples of SVG-based accessible graphics have been developed, with the goal that such images can be easily navigated using assistive technologies.
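As a rough illustration of accessibility information carried inside the image, the sketch below builds a tiny SVG bar chart in which the chart and each bar have their own title or desc element (both are standard SVG elements), then reads those texts back in document order. The chart data and the function names build_bar_chart and collect_descriptions are assumptions made for this example, not part of any cited work.

```python
import xml.etree.ElementTree as ET
from typing import List

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)


def build_bar_chart() -> ET.Element:
    """Build a two-bar chart whose parts carry their own descriptions."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "120", "height": "100"})
    ET.SubElement(svg, f"{{{SVG_NS}}}title").text = "Rainfall by month"
    ET.SubElement(svg, f"{{{SVG_NS}}}desc").text = "Bar chart with two bars."
    # Hypothetical sample data: (label, rainfall in mm).
    for i, (label, value) in enumerate([("January", 40), ("February", 65)]):
        group = ET.SubElement(svg, f"{{{SVG_NS}}}g")
        # Each bar is grouped with a title describing it.
        ET.SubElement(group, f"{{{SVG_NS}}}title").text = f"{label}: {value} mm"
        ET.SubElement(group, f"{{{SVG_NS}}}rect", {
            "x": str(10 + i * 50),
            "y": str(100 - value),
            "width": "40",
            "height": str(value),
        })
    return svg


def collect_descriptions(svg: ET.Element) -> List[str]:
    """Return the text of every title and desc element in document order."""
    wanted = {f"{{{SVG_NS}}}title", f"{{{SVG_NS}}}desc"}
    return [el.text for el in svg.iter() if el.tag in wanted and el.text]


if __name__ == "__main__":
    for line in collect_descriptions(build_bar_chart()):
        print(line)
```

A real user agent would expose this information through the platform accessibility API rather than a flat list; the flat list only shows that the description travels with the image structure.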

With images presented via SVG and canvas, the possibility exists for scripted interactivity to enable the accessible simulations commonly found in STEM-related educational Web content. For example, an author may create an interactive image demonstrating the states of matter. Molecules within a pressure vessel will exhibit changing behavior as a substance moves from a solid to a liquid to a gaseous state. As the state changes, can the appropriate description be made available to assistive technology? Can assistive technology programmatically query the number of molecules, their position, and velocity?
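One way to picture such a queryable image is a small model object that sits behind the rendering and answers questions about its content. The sketch below is purely hypothetical: the class StatesOfMatterImage, its methods, and the speed thresholds are invented for illustration and do not correspond to any existing API or to real physical constants.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Molecule:
    position: Tuple[float, float]
    velocity: Tuple[float, float]

    @property
    def speed(self) -> float:
        vx, vy = self.velocity
        return (vx ** 2 + vy ** 2) ** 0.5


class StatesOfMatterImage:
    """Hypothetical queryable model behind an interactive image."""

    # Illustrative thresholds on mean molecular speed (arbitrary units).
    LIQUID_ABOVE = 1.0
    GAS_ABOVE = 5.0

    def __init__(self, molecules: List[Molecule]) -> None:
        self.molecules = molecules

    def molecule_count(self) -> int:
        return len(self.molecules)

    def mean_speed(self) -> float:
        if not self.molecules:
            return 0.0
        return sum(m.speed for m in self.molecules) / len(self.molecules)

    def state(self) -> str:
        speed = self.mean_speed()
        if speed > self.GAS_ABOVE:
            return "gas"
        if speed > self.LIQUID_ABOVE:
            return "liquid"
        return "solid"

    def describe(self) -> str:
        """Text that could be handed to assistive technology on each change."""
        return (f"{self.molecule_count()} molecules in the vessel, "
                f"currently in the {self.state()} state.")


if __name__ == "__main__":
    image = StatesOfMatterImage([
        Molecule(position=(0.0, 0.0), velocity=(3.0, 4.0)),
        Molecule(position=(1.0, 1.0), velocity=(0.0, 0.0)),
    ])
    print(image.describe())
```

In this sketch the description is derived from the same state that drives the visual rendering, so the two cannot drift apart; how such a model would actually be exposed to assistive technology is one of the open questions raised here.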

A further step is the inclusion of multimodal information in images, including audio, textual, and tactile alternatives for visual elements. Sonification is one technique for exploring image content non-visually through either embedded audio or through tools that analyze the image and present an audio soundscape of the content. Haptic technologies are also being explored to allow shapes and textures to be perceived. The advent of 3D printers opens the possibility to create tangible objects that represent an image.
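To make the idea of sonification concrete, the following minimal sketch maps a series of data values (for example, the heights of points on a line graph) linearly onto a pitch range, so that larger values sound higher. The function name, the 220 to 880 Hz default range, and the linear mapping are assumptions chosen for simplicity; published sonification tools use a variety of mappings.

```python
from typing import List, Sequence


def values_to_frequencies(
    values: Sequence[float],
    low_hz: float = 220.0,
    high_hz: float = 880.0,
) -> List[float]:
    """Map each value linearly onto the range [low_hz, high_hz]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no variation to convey; use a single pitch.
        return [low_hz for _ in values]
    span = hi - lo
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]


if __name__ == "__main__":
    print(values_to_frequencies([0, 5, 10]))
```

The resulting frequencies would still need to be turned into audio by a synthesizer or audio API, which is outside the scope of this sketch.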


A number of research topics and questions can be raised:

  • Understanding requirements for image descriptions across disability types
    • Are image descriptions useful for everyone?
    • Can user profiles pre-select appropriate descriptions (if available)?
  • Image Description User Interface
    • How should image description availability be indicated?
    • How should image descriptions be presented (visually and non-visually)?
  • Structure, Navigation, and Synchronization of Descriptions
    • How important is navigability of the description?
    • For spoken presentation, should the corresponding part of the image be highlighted?
    • Should image descriptions be searchable?
    • How should structured images be navigated, and how should the structure be exposed?
    • Is the current ARIA specification sufficient or are extensions needed for image descriptions?
    • Mapping visual salience to structure?
    • Can eye tracking evidence inform the structuring of image descriptions?
  • Smart Images/Objects and Automatic Description
    • Should images have a standard API for querying "content"?
    • Can images be "self-describing"?
    • Can libraries of "objects" be used to create accessible images?
    • Can artificial intelligence and image processing techniques lead to automatic creation of descriptions?
  • Education and Images
    • Images, such as graphs, may require different descriptions based on the grade level of the student or whether the image is used for assessment. How can these different requirements be encoded?
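
The questions about user profiles and grade-specific descriptions can be sketched as a simple lookup: an image carries several candidate descriptions, each tagged with the audience it suits, and a profile selects among them. The Description fields and the select_description function below are hypothetical and only illustrate one possible encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Description:
    text: str
    min_grade: int
    max_grade: int
    for_assessment: bool


def select_description(
    descriptions: List[Description],
    grade: int,
    assessment: bool,
) -> Optional[Description]:
    """Return the first description matching the grade and usage context."""
    for candidate in descriptions:
        in_range = candidate.min_grade <= grade <= candidate.max_grade
        if in_range and candidate.for_assessment == assessment:
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical descriptions for one line graph.
    graph_descriptions = [
        Description("The line goes up from left to right.", 3, 5, False),
        Description("A line graph with x and y axes.", 3, 5, True),
        Description("The line has a positive slope of 2.", 6, 8, False),
    ]
    chosen = select_description(graph_descriptions, grade=4, assessment=True)
    print(chosen.text if chosen else "No matching description")
```

The assessment variant in the sample data deliberately withholds the trend, reflecting the concern that a description used in testing should not give away the answer.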


1. Accessibility Features of SVG. W3C Note 7 August 2000

2. W3C SVG 1.1 Appendix H: Accessibility Support

3. ARIA Markup

4. W3C Accessible Infographics Community Group

5. The Diagram Center Digital Image and Graphic Resources for Accessible Materials

6. HTML5 Image Description Extension (longdesc)

Altmanninger, K., & Wöß, W. (2006). Dynamically generated scalable vector graphics (SVG) for barrier-free web-applications. Computers Helping People with Special Needs, 128-135.

Cayton-Hodges, G. A., Marquez, E., van Rijn, P., Keehner, M., Laitusis, C., Zapata-Rivera, D., & Hakkinen, M. T. (2012, May). Technology Enhanced Assessments in Mathematics and Beyond: Strengths, Challenges, and Future Directions. Invitational Research Symposium on Technology Enhanced Assessments, Washington, DC

Fredj, Z. B., & Duce, D. A. (2007). GraSSML: accessible smart schematic diagrams for all. Universal Access in the Information Society, 6(3), 233-247.

Gardner, J., & Bulatov, V. (2004). Directly accessible mainstream graphical information. Computers Helping People with Special Needs, 626-626.

Giudice, N. A., Palani, H. P., Brenner, E., & Kramer, K. M. (2012, October). Learning non-visual graphical information using a touch-based vibro-audio interface. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility (pp. 103-110). ACM.

Goncu, C., & Marriott, K. (2011). GraVVITAS: generic multi-touch presentation of accessible graphics. Human-Computer Interaction–INTERACT 2011, 30-48.

Goncu, C., Marriott, K., & Hurst, J. (2010). Usability of accessible bar charts. Diagrammatic Representation and Inference, 167-181.

Summers, E., Langston, J., Allison, R., & Cowley, J. (2012). Using SAS/GRAPH to create visualizations that also support tactile and auditory interaction. In SAS Global Forum 2012.
