From HTML WG Wiki
Revision as of 10:11, 31 August 2010 by Mturvey


HTML5 Issue: Longdesc Retention


HTML5 obsoletes the longdesc HTML 4 attribute. The LONGDESC attribute's functions are:

  1. A discreet, direct, reusable programmatic mechanism for reaching a long description of an image.
  2. A method to reference a longer description of an image, without including the content in the main flow of a page.

In some circumstances, it may not be feasible to use one of the other available image long description techniques. For instance, longdesc currently provides a solution for describing the content of images to the blind when a visible description would be:

  • Visually apparent and redundant to a sighted person.
  • Unacceptable to the marketing department due to aesthetic considerations.

Without longdesc, the only direct ways of doing that are an explicit link on the image itself or a hidden text link.

This is HTML ISSUE-30


The purpose of the LONGDESC attribute is to specify "a link to a long description of the image" (HTML4: Objects, Images, and Applets). It is typically an accommodation for the blind and visually impaired.

It can be used to "provide information in a file designated by the longdesc attribute when a short text alternative does not adequately convey the function or information provided in the image" (Techniques for WCAG 2.0, H45: Using longdesc).
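
As a rough illustration of what the attribute looks like in markup, and of how a tool might detect it, the sketch below uses Python's standard html.parser module to list the images in a fragment that carry a longdesc. The sample markup, the file names and the class name LongdescFinder are hypothetical and are not taken from HTML4 or the WCAG techniques.

```python
from html.parser import HTMLParser


class LongdescFinder(HTMLParser):
    """Collect (src, alt, longdesc) for every <img> that has a longdesc."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        longdesc = attributes.get("longdesc")
        if longdesc:  # skip images with no longdesc, or with an empty one
            self.found.append(
                (attributes.get("src"), attributes.get("alt"), longdesc)
            )


# Hypothetical markup: one image with a long description, one without.
SAMPLE = """
<p><img src="sales.png" alt="Sales rose through 2009"
        longdesc="sales-description.html"></p>
<p><img src="logo.png" alt="Example Corp"></p>
"""

finder = LongdescFinder()
finder.feed(SAMPLE)
for src, alt, longdesc in finder.found:
    print(f"{src}: long description at {longdesc}")
```

Only the first image is reported, because the second has a short text alternative but no long description.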

An image long description may be redundant to sighted people, though sighted people with/without other disabilities may also benefit from access to long descriptions in some situations.


  1. A programmatic mechanism to reference a specific structured description, internal or external to the document.
  2. A way to inform users and authors that a description is present/available via the user agent (UAs could provide an option to reveal the content of describedby via a context menu, preference, or switch, etc.). This also affords a practical method for the curious and for developers who want a tool to check the describedby descriptive content and keep it up to date.
  3. A device independent way to access the descriptive content.
  4. An explicit provision that accessing descriptive content, whether internal or external to the document containing the image, does NOT take the user away from the user's position in the document containing the image where the verbose descriptor was invoked.
  5. A way to provide user control over exposition of the descriptor so that rendering of the image and its description is not an either/or proposition. A visual indicator of the description should NOT be a forced visual encumbrance on sighted users by default.
  6. A method to reference a longer description of an image, without including the content in the main flow of a page.
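
The distinction between a description that is internal and one that is external to the document can be checked mechanically. The following is a minimal sketch, assuming the description is referenced by a URL as longdesc does; the function name resolve_longdesc and the example.org URLs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def resolve_longdesc(document_url, longdesc):
    """Return (absolute_url, is_internal) for a longdesc value.

    is_internal is True when the description lives in the same document,
    i.e. the resolved URL differs from the document URL only by a fragment.
    """
    absolute = urljoin(document_url, longdesc)
    target, fragment = urldefrag(absolute)
    base, _ = urldefrag(document_url)
    return absolute, (target == base and fragment != "")


page = "http://example.org/report.html"  # hypothetical document URL
print(resolve_longdesc(page, "#chart-description"))
print(resolve_longdesc(page, "descriptions/chart.html"))
```

A user agent that knows the target is internal could move focus to the fragment and back again, which is one way of meeting the requirement that the user not lose their position in the document.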

In situations where images are not available to the user (because of disability, choice, or UA limitation) there is a need for a mechanism that presents equivalent content to the user, either as an alternative to the image or in a side-by-side exposition. Equivalent content is not, nor should it be, an either/or proposition, and its method of exposition should be subject to user control, as some user groups may need both the image and its detailed description in order to make sense of the image or - in the case of a user with an extremely small viewport - to follow the image's flow.


Recommendation and Polls

LONGDESC Issues & Change Proposals


Use Cases

  • Data Visualization i.e. Charts and Graphs
  • Diagrams
  • Cartoons
  • Drawings, Illustrations
  • Paintings
  • Photographs
  • Maps


The following consumer groups are amongst those who need semantically rich, structured and often lengthy descriptions of images whenever those images are not decorative or simply icons. Note that even for icons some text equivalent is necessary for such users to understand the document. However, for an icon the text equivalent may be a short and not necessarily structured phrase.

For the case of non-decorative images, the users in these groups cannot comprehend the meaning of the document through the typical visual interpretation of the images. Instead, when a user within these groups wants to comprehend the images and their role in the document, an alternate text or audible mechanism is needed to be able to read or listen to the content.

Blind Users

Blind users are unable to view the original image, so they need enough fallback content to replace the entire function of the image within the page. Providing a short summary of the image in cases where its overall content is complex may allow them to choose to read a more detailed description if the image is of interest.

Legally Blind Users

The term "legally blind" is ascribed to individuals with a visual acuity of 20/200 or less. This means that there is a wide spectrum of users, with a wide spectrum of visual acuity less than 20/200, covered by the term "legally blind": from those who can perceive only the starkest of contrasts (white upon black), to those who inhabit a grey shadow world, to those to whom the world is a whirl of mostly indistinct objects and colors. Some users classified as legally blind will be able to use and interact with some portions of a document instance's graphical layout, mostly through the use of screen-magnification software and keyboard input. The term "legally blind" also covers what is commonly known as "tunnel" or "straw" vision, meaning that the individual can only receive visual stimuli through an extremely restricted viewport; hence the name, "straw vision", for it has been described as having one's perception of the visual world limited to a viewport the size of the circumference of a drinking straw. The term also covers those who have no usable vision except for peripheral vision. Some "legally blind" users will rely on screen-magnification alone; others either need or prefer supplemental speech-synthesis.

Low Vision Users

Low vision users are a distinct class of users. They include those whose eyesight is deteriorating or has deteriorated, but whose visual acuity remains higher than 20/200. A low vision user can sometimes discern the graphical layout and composition of a page, but either cannot use a mouse effectively for navigation and interaction, or finds it less painful to use a version of a document instance that does not rely on graphics. This can vary from graphic to graphic, and according to various external circumstances (lighting, time of day, etc.). Despite its lack of general acknowledgement, "low vision user" is a term which covers a very broad spectrum of users.

Users on low bandwidth / high cost connections

Users who wish to conserve bandwidth may turn images off. The page should remain intelligible and should provide enough information about any missing images that they are able to decide whether to download a particular image.

Users with non-graphical UAs

Users with non-graphical UAs should still be able to understand the contents of a page.

Users of data-mining tools

Users of data mining tools such as search engines are not only interested in understanding the meaning of a single page that has meaningful content conveyed visually through images, but also want to perform queries, selections and other operations on many documents. Often they may be interested in operating on documents containing one or many images based on properties not easily determined by machine processing, such as subject matter, context, content and composition. While machine processing may be capable of processing images based on technical properties (colors, gamma, aspect ratio, bit density and depth, etc.), processing image content based on subject matter is still not possible.

While data mining tools may often be able to process text around an image to guess at its content and composition, this is a poor substitute for text descriptions aimed explicitly at describing an image and its place in a document. For example, an image of an eagle may be used in a document describing wildlife that makes no explicit mention of an eagle or even birds in general. The data mining analysis would have to expand its parameters to include wildlife to find this image of an eagle. However, such an expansion of the parameters would lead to an unnecessarily large set of data and hinder the goals of the user. So users in this group want standard mining of data relevant to images contained in HTML documents to yield desirable results for queries, selections and sorting based on the subject matter and the context of the images.

Authors trying to target users in any of these groups

Users, in a role as HTML authors, may want to produce content that can be consumed by any user in the above groups.

Longdesc Examples in the Wild

Examples Without a Link on the Image, No Fallback

Examples Without a Link on the Image, With Hidden Text Link

Examples Without a Link on the Image, With Visible Text Link

Examples Without a Link on the Image, With Hidden D Link

"D" links are a type of link text used in conjunction with a longdesc.

Examples Without a Link on the Image, With Visible D Link

Examples Without a Link on the Image, With Spacer Image Link

Examples With a Link on the Image to the LONGDESC URL

Examples With a Link on the Image to a Different URL, With Visible Text Link

Laws, Policies and Standards


Policy and Standards

Existing Support for LONGDESC

User Agents:

  • Beginning with Opera 10.1, longdesc is supported natively by Opera
  • JAWS for Windows (a widely used screen-reader for the Windows platform) has supported LONGDESC since JAWS version 5.
  • Window-Eyes - "Activating Long Description URLs: When a user encounters a long description on an image, they have the option of pressing ALT-ENTER to activate the URL specified in the long description. Window-Eyes will load the specified URL in a new instance of the browser window."
  • iCab natively supports LONGDESC and also endows the user with the ability to follow the CITE attribute, when defined for either Q or BLOCKQUOTE. (source: Sander Tekelenburg, post to public-html, 30 June 2007) Clarification: I didn't mean to imply that this has anything to do with its longdesc implementation. iCab always indicates an anchor when you follow a URL with a fragment identifier. It does, because the usual UA behaviour of merely scrolling there doesn't work if the anchor is at the bottom of a page. (source: Sander Tekelenburg, post to public-html, 1 July 2007)
  • In Netscape 6.2 and 7.0 for the Macintosh platform, the longdesc URL is a clickable link in the image properties window. The link opens in the same window, with no other visual indicators once the page loads. (source: Bill Mason, post to public-html, 30 June 2007)
  • There is a LONGDESC extension for Firefox. This extension can be used with either FireVox or Jon Gunderson's Firefox Accessibility Extension (FAE), which also makes any long descriptions available to the user via a menu.
  • There is also an old LONGDESC extension/widget for the Opera browser. Note that this extension also exposes and makes available the CITE attribute for Q and BLOCKQUOTE.
  • Firefox "View Image Info" displays the longdesc link.

Authoring tools:

  • Dreamweaver
  • XStandard

Proposed and Existing Solutions

<img longdesc="">


  • Provides a mechanism for using either an external resource or part of the existing page as a description
  • Already supported by some UAs
  • Content is invisible-by-default in image-supporting UAs. Designers have generally described the ability to do this as a feature.


  • Little correct use in practice
  • External content adds a maintenance burden (likely to get out of sync)
  • Long text alternative is not available to all users.
  • The longdesc attribute is only supported by a limited number of browsers and assistive technology
  • The content of the page referenced by the longdesc is not available to all users




  • aria-describedby is currently limited to text that appears in the same document as the image being described.
  • The content associated using aria-describedby, as currently implemented, is limited to unstructured text.


image maps



<img alt="">


  • Widely supported by UAs
  • Widely used by authors
  • Authoring "best practice" favours the use of @alt to provide equivalent content that can be included inline.
  • Provides potentially easy UA and user access to a brief summary description of embedded content
  • Widely implemented in policy requirements for accessibility


  • No rich markup, so it is only suitable for a single paragraph or idea to describe an image
  • By convention only used for a short summary - not suitable where more detail is needed
  • Often misunderstood/misused by authors
  • Incorrect, unhelpful and inconsistent implementations across UAs make @alt less useful for users and contribute to confusion amongst authors. See test case and recommendation

It will be assumed that other mechanisms can be combined with @alt to provide a short summary/long description pair

Based on a test case showing, among other things, non-interoperability of @alt implementations, and on discussion.

Proposal for Improving ALT

Add the following to the spec's definition of alt:

  • Authors must use no more than n characters (or words) as the value of the alt attribute. For longer alternatives authors may use longdesc.
  • UAs must make at least the first n characters (or words), of the alt attribute easily discoverable and available to users when the image is, for whatever reason, not presented. (For example, UAs may present the alt text in place of the image; or through a tooltip or in a status bar on hovering the indicator of the missing image; etc.)
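
The two rules in this proposal can be sketched as a pair of checks. This is only an illustration under assumptions: the proposal leaves n unspecified, so the value 20 used below is arbitrary, the sketch counts characters rather than words, and the function names are hypothetical.

```python
def alt_within_limit(alt, n):
    """Author-side rule: True if the alt text has no more than n characters."""
    return len(alt) <= n


def guaranteed_alt(alt, n):
    """UA-side rule: the leading part of alt that must stay discoverable.

    A UA is free to present more than this; n characters is the minimum.
    """
    return alt[:n]


alt_text = "Union Flag of the United Kingdom"
limit = 20  # arbitrary value for illustration; the proposal does not fix n

if not alt_within_limit(alt_text, limit):
    print("alt is longer than the limit; consider moving detail to longdesc")
print(guaranteed_alt(alt_text, limit))
```
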
Counter Argument in Favor of Unlimited ALT Text

While there is a need for brevity in ALT text, it must also be remembered that (a) the limit on the number of characters in a tooltip is potentially very high; and (b) requiring user agents to ignore the part of the ALT text that exceeds a smaller limit is potentially a problem. Yes, the goal in using ALT is to provide a quick alternative to, for example, a graphically defined link, but it isn't the graphic that the user is most interested in, but the functionality it represents; forcing user agents to discard ALT text longer than n characters would compromise the purpose of ALT text. (GregoryRosmaita)



  • Fallback to rich child content
  • Included in single document but not-visible to image-supporting UAs
  • Some browser support
  • Support in the most recent browsers is approaching feature completeness


  • Browser interoperability is poor
  • Authoring may be more difficult than using an IMG element



  • Potentially the same fallback mechanism as <object>
  • May be more intuitive for authors


  • No existing implementations
  • Hard to make backward-compatible; will not easily degrade gracefully.

<a href="fallback.html"><img></a>


  • Fallback accessible in all UAs
  • Works now, for everyone
  • Does not require software upgrades, user training or author outreach
  • All users are familiar with clicking on an image for more information
  • Accessibility should not be a secret


  • Not suitable for images that are also links
  • May confuse users not expecting the image to link to a description of itself
  • May discourage authors from using this technique due to not wanting to confuse users
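
Whether a page uses this technique can be detected by looking at the link, if any, that wraps each image. The sketch below is one possible approach using Python's standard html.parser; it sorts images that have a longdesc into three of the categories used under "Longdesc Examples in the Wild". The class name, the sample markup and the file names are hypothetical, and URLs are compared as plain strings without being resolved.

```python
from html.parser import HTMLParser


class ImageLinkClassifier(HTMLParser):
    """Classify each <img longdesc> by the link, if any, that wraps it."""

    def __init__(self):
        super().__init__()
        self._open_links = []  # href of each currently open <a> element
        self.results = []      # (src, category) pairs, in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a":
            self._open_links.append(attributes.get("href"))
        elif tag == "img" and attributes.get("longdesc"):
            href = self._open_links[-1] if self._open_links else None
            if href is None:
                category = "no link on the image"
            elif href == attributes["longdesc"]:  # plain string comparison
                category = "link on the image to the longdesc URL"
            else:
                category = "link on the image to a different URL"
            self.results.append((attributes.get("src"), category))

    def handle_endtag(self, tag):
        if tag == "a" and self._open_links:
            self._open_links.pop()


# Hypothetical markup covering the three cases.
SAMPLE = """
<a href="flag.html"><img src="flag.png" alt="Union Flag" longdesc="flag.html"></a>
<a href="home.html"><img src="logo.png" alt="Home" longdesc="logo.html"></a>
<img src="map.png" alt="Subway map" longdesc="map.html">
"""

classifier = ImageLinkClassifier()
classifier.feed(SAMPLE)
for src, category in classifier.results:
    print(f"{src}: {category}")
```

The second case shows the limitation noted in the list above: when the image is already a link to somewhere else, the link cannot also carry the description.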

<img>fallback</img> in XML serialization

For XML serialization we have the opportunity to specify UA conformance in advance.


  • Allows semantically rich and media rich fallback content
  • <img> is already familiar to authors
  • Works in existing visual XHTML UAs
  • Degrades gracefully in that visual UAs do not display the contents of the element.
  • Potentially compatible with authors authoring to XHTML2 conformance


  • "fallback" not displayed in any graphical UAs at present (just as with the current <img longdesc>)
  • XML and HTML serializations are forked for img fallback.
  • "Fallback" will need to be translated when converting an HTML5 document serialized as XML to the HTML serialization.

CSS3 "content" property

CSS3 draft


  • Allows semantically rich fallback content
  • Falls back gracefully in existing UAs that do not support images
  • Outside the scope of HTML specs, and can be used regardless of the status of HTML specifications


  • Does not work in most current graphical UAs
  • Hard to make backward compatible by degrading to <img> before degrading to rich content
  • Undermines the separation of concerns by including semantic data in the style sheet

CSS image replacement techniques

Using Background-Image to Replace Text


  • Requires no changes to HTML or UAs
  • Proven to work


  • Difficult to implement
  • May be inaccessible in some circumstances
  • Undermines the separation of concerns by including semantic data in the style sheet

<video> element for still image (single-frame video)



  • Allows semantically rich fallback content
  • Works in any HTML5 conforming UA
  • All of the advantages of <video> when used for <video> (see WhatWG archives)


  • Does not work in most current graphical UAs


Gregory J. Rosmaita's Original Rationale for Retention

There are many compelling arguments for the retention of the LONGDESC attribute, as defined in HTML 4.01 Strict. A mainstream argument for LONGDESC is that there is a moral and often legal need for it amongst academics, educational institutions, and government entities: as more and more course content migrates to the web or intranets, equal access demands that they provide a meaningful long description.

Academics constantly complain to me that if they are to teach students without vision or with very low vision, they need more than ALT or CAPTION -- they need to describe the subtleties of the image being presented as content for those who cannot see the content, and for those who have found a long description helpful, as a key to the symbolism contained in the image; or as a means of expounding on a static image of a map (such as of a migration, a battlefield, a schematic of a subway system, etc.).

The following is an example of the difference between a caption for an image, and a long description of that image. The image in question is an image of the British flag; the occasion? As an illustration accompanying an article on the 200th anniversary of the formation of the United Kingdom. Such a caption might read:

The Flag of Union has been the official flag of the United Kingdom since the Act of Union of 1807, which created the modern political entity known as the United Kingdom, which, this year, celebrates its 200th anniversary.

Now, compare that to the following LONGDESC:

The Flag of Union has been the official flag of the United Kingdom since the Act of Union of 1807, which created the modern political entity known as the United Kingdom, which, this year, celebrates its 200th anniversary.

The flag of the United Kingdom is commonly known as the Union Flag, or Union Jack. It is the national flag of the United Kingdom of Great Britain and Northern Ireland. The flag's design dates from January 1, 1801, as a symbol of the Act of Union of 1800, which merged the Kingdom of Ireland and the Kingdom of Great Britain (until 1707, the United Kingdoms of England and Scotland), to form the United Kingdom of Great Britain and Ireland.

The flag symbolically uses the national flags of England, Scotland, and Ireland to form a single flag comprised of:

  • the flag of Scotland, which bears Saint Andrew's cross: a white X on a blue field;
  • the flag of Ireland, which bears Saint Patrick's cross: a red X on a white field; and
  • the flag of England, which bears Saint George's cross: a red cross on a white field.

The flag of Scotland forms the bottom layer of the Union Flag. Over Saint Andrew's white cross, the red cross of Saint Patrick is superimposed, on top of which is a white-bounded red cross of Saint George.

Now that is a world of difference. A caption pre-supposes that one can also perceive the object being captioned (that is, put into context); just as a TABLE without a summary pre-supposes that one can also perceive the data sets being table-ized.

Just as the contents of the summary attribute can be reused to provide a visual rendering of the summary's contents, so too can a LONGDESC be yanked into an IFRAME (not my preference) or embedded as an OBJECT by the browser, so as to replace the image inline. (Note: the browser should offer at least the following choices: show images, show LONGDESC, show ALT text, but both ALT and LONGDESC should be available whether image loading is turned on or off, something over which, in a locked-down setting, the user may have no control)

LONGDESC would, perhaps, have more mainstream support if the attribute used to point to a long description was HREF and not LONGDESC -- even HREFDESC would have been a more widely implemented iteration of LONGDESC as it is defined in HTML 4.01 (where hrefdesc is an attribute such as hreflang).

Claiming that LONGDESC should be deprecated because it is not widely implemented is an unquestioning bending to the marketplace's will. It must be remembered that one of the major reasons it is not more widely supported by mainstream apps is simple "market realities" -- there aren't enough of us who need LONGDESC and summary to market to, and therefore, any additional work that would inherently increase the accessibility of the product isn't needed or is assumed to be exclusively a third-party slash assistive technology's responsibility.

Every day you age, your eyesight becomes a little less sharp than it once was, and if you survive to a ripe old age, you, too, may be dependent upon summaries for tables and long descriptions for graphical objects...

source: Gregory J. Rosmaita post to public-html, 24 June 2007

Usefulness of LONGDESC in the Digitization of Books & Historical Works

LONGDESC is indispensable for anyone attempting to perform serious academic work via the web. Increasingly, colleges and universities are incorporating online curricula into all aspects of learning -- on campus, off-campus, long-distance, etc. In many jurisdictions, this means ensuring equal access to all course content - consult: Policies Relating to Web Accessibility

What follows is an example drawn from real life:

When encountering a portrait of Lord Cornwallis, it isn't sufficient to simply caption the image "Portrait of Lord Cornwallis, ca. 1774" -- the student of the subject needs to know precisely how Lord Cornwallis is portrayed -- how old was he at the time of the portrait? What kind of hairstyle does he sport? What type of uniform? What do the buttons on the uniform signify? What is his rank, based on the epaulettes? What are the items that are included in the portrait, particularly those held by, or within reach of, the portrait's subject? All such items have both symbolic and highly specific meanings, all of which the painter assumed would be understood by the viewer.

Any reference material worth its weight in bytes must include LONGDESC so that the specifics of the image can be conveyed as completely and as thoroughly as would a careful, informed study of the actual portrait.

source: Gregory J. Rosmaita, post to public-html (24 June 2007)

Monika Trebo

I think the fact that something potentially useful (in the case of the longdesc for people with special needs) is not widely used or not used properly, should not be a reason to abandon it. Validation is not used widely either, to my knowledge about 95% of html out there is invalid, and none of us would consider dropping it.

The longdesc may not be used because people don't know about it and its proper use. Why don't we come up with a brief explanation, e.g. as part of the "Tips for Webmasters"?

An HTML editor which is widely used at Stanford prompts users to enter alt and longdesc when inserting images etc. It is an accessibility issue.

Source: post to public-html 22 June 2007

Strategies for Exposing LONGDESC

The first thing a screen-reader, or other assistive technology, must do when it discerns the presence of a longdesc target is: alert the user that it is there.

The second thing a screen-reader or other assistive technology must do when it discerns the presence of a longdesc target is to allow the user to activate that target, if that is the user's wish, so as to expose the contents of the longdesc document. Ever since it began to support longdesc, JAWS for Windows has alerted the user to the presence of a long description and prompted the user (in basic mode) to hit ENTER, whereupon the contents of the longdesc document associated with the image are displayed in a pop-up window. This is not the best solution, because the default for a lot of programs these days is to block all popups, and because the User Agent Accessibility Guidelines (UAAG) strongly discourage the opening of new browser instances without warning the user that it is about to do so, and without the option of opening in a new tab or in the viewport of the original document.

The last thing that needs to be done is to provide a mechanism to return to the document in which the described image is embedded.

Obviously, Step 1 is the responsibility of the assistive technology, but the under-the-hood mechanics of exposing descriptive content SHOULD be the user agent's responsibility. This issue is directly addressed in UAAG Guideline 2, Checkpoint 2.5 -- a Priority 1 checkpoint.

What is needed, therefore, is a normative list of recommended/expected actions that allow multi-modal interaction with the long description.

Treating LONGDESC as HREF isn't the only means of exposing the content of the long description page; the contents -- or the main portion thereof -- could be rendered inline instead of the image or in an IFRAME (which has its own accessibility issues) or any other number of means of exposure.

The key is that the UA should support LONGDESC natively, and allow the user a set of choices about exposing LONGDESC:

  • expose in new browser instance
  • expose in new browser tab
  • expose inline (insert content as object)
  • expose inline through the use of IFrame
  • expose the contents of the longdesc document in a side-bar, aligned with the image it describes

There are many other options. Provided a user knows what to do when encountering a long description, it matters not what assistive technology she is using, for there is an expected action in the case of browser x for exposing LONGDESC.

source: Gregory J. Rosmaita, post to public-html (24 June 2007)
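
The three steps and the list of user choices described in this section could be modelled roughly as follows. This is a sketch of the idea only, not a description of how any screen-reader or browser behaves; the names Exposure and exposure_steps and the file name are hypothetical.

```python
from enum import Enum


class Exposure(Enum):
    """User-selectable ways of exposing a long description."""
    NEW_WINDOW = "in a new browser instance"
    NEW_TAB = "in a new browser tab"
    INLINE_OBJECT = "inline, inserted as an object"
    INLINE_IFRAME = "inline, in an IFRAME"
    SIDEBAR = "in a side-bar aligned with the image"


def exposure_steps(longdesc, preference):
    """Return the three user-facing steps for one image, in order."""
    return [
        f"Alert the user that a long description is available ({longdesc}).",
        f"On request, expose the description {preference.value}.",
        "Offer a way back to the image in the original document.",
    ]


for step in exposure_steps("flag-description.html", Exposure.NEW_TAB):
    print(step)
```

Keeping the preference separate from the steps reflects the point made above: the expected action stays the same for the user, whichever form of exposure the user agent is configured to use.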


Further Research

Possible places to look for additional use cases and examples:


Simple Longdesc Examples

  • Test Results for Sander's Test of UA Support for LONGDESC
    1. using JAWS8 (version number: 8.0.2107) and MSIE7 (version number: 7.0.5730.11) on a WinXP Pro SP2 box, the longdesc is identified as available ("press enter for long description"), but when activated, simply causes a new browser instance to be generated, containing the entire contents of the original page (JAWS' clunky mechanism for exposing the target of a longdesc); I had to use JAWS' "List of Graphics" to access the UK Flag icon's longdesc, as it was not included in the tab-order.
    2. using JAWS8 (version number: 8.0.2107) with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20070515 Firefox/, JAWS identified the graphic as having a longdesc ("press enter for long description"), but doing so results in NO change of viewport, nor the opening of a new browser instance (JAWS' clunky mechanism for exposing longdesc); I had to use JAWS' "List of Graphics" to access the UK Flag icon, as it was not included in the tab-order.
    3. using JAWS8 (version number 8.0.2107) with Lynx32 (release 2.8.6rel.1 libwww-FM/2.14FM SSL-MM/1.4.1 OpenSSL/0.9.8d) the UK flag was identified as a graphic; when "Show Images" is set to "as links", the link defined for the UK flag links directly to the raw icon (in GIF format) without either indicating, exposing, or enabling the exposure of LONGDESC

More Complex Longdesc Examples

Diagram Examples

The CSS 2.1 Specification provides long descriptions of the diagrams. The box dimensions diagram has an associated long description. The spec uses "[D]" links beside each image.

Photography Examples

Screenshot Examples

Example based on Hixie's Mini FAQ About the Alternate Text of Images:

  • Image: Screenshot of dialog box from Netscape Communicator 4.x
  • Alternate text: The proxy settings dialog box has 'proxy.i.edu' in the 'host' field and '3128' in the 'port' field for every protocol.
  • Title: Screenshot showing Communicator 4.x Proxy settings for Indianapolis campus.
  • Long description: The image depicts the Proxy Settings dialog box for the Communicator 4.x application as it is set for the Indianapolis campus. The dialog box has four rows of edit boxes, labelled HTTP, FTP, GOPHER and HTTPS. Each row has two edit boxes, aligned in two columns, with the labels 'host' and 'port' at the top. Each 'host' field has the content 'proxy.i.edu' and each 'port' field has the content '3128'.
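
To show how the pieces of this example map onto attributes, the sketch below assembles an img tag from the alternate text and title given above. The long description itself would live in a separate document referenced by longdesc; the file names proxy-settings.png and proxy-settings-description.html and the helper name img_tag are hypothetical.

```python
from html import escape


def img_tag(src, alt, title, longdesc):
    """Build an <img> tag, escaping every attribute value."""
    attributes = {"src": src, "alt": alt, "title": title, "longdesc": longdesc}
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"'
        for name, value in attributes.items()
    )
    return f"<img {rendered}>"


tag = img_tag(
    src="proxy-settings.png",  # hypothetical file name
    alt="The proxy settings dialog box has 'proxy.i.edu' in the 'host' field "
        "and '3128' in the 'port' field for every protocol.",
    title="Screenshot showing Communicator 4.x Proxy settings for "
          "Indianapolis campus.",
    longdesc="proxy-settings-description.html",  # hypothetical file name
)
print(tag)
```

The quotation marks inside the alternate text are escaped as character references so that they cannot end the attribute value early.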

Related References


June 2007

Thread: Rationale for Preserving LONGDESC

Thread: Usefulness of LONGDESC & the Digitization of Books & Historical Works

Thread: Exposing LONGDESC

Thread: LONGDESC Wiki Page

Thread: LONGDESC: some current problems and a proposed solution added to the wiki

July 2007

Thread: LONGDESC: some current problems and a proposed solution added to the wiki

See also