21493 – Describe the longdesc link rot issue and suggest how to combat it

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: Describe the longdesc link rot issue and suggest how to combat it

Status:	CLOSED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML Image Description Extension
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Charles McCathieNevile
QA Contact:	HTML WG Bugzilla archive list

URL:	http://www.w3.org/TR/2013/WD-html-lon...
Whiteboard:
Keywords:	a11y, a11y_text-alt

Depends on:
Blocks:

Reported:	2013-04-01 16:46 UTC by Leif Halvard Silli
Modified:	2013-05-02 02:39 UTC
CC List:	3 users

See Also:

Description Leif Halvard Silli 2013-04-01 16:46:16 UTC

Link rot is a well known problem on the Web: http://en.wikipedia.org/wiki/Link_rot

The infamous Longdesc lottery article, and other contributions to the debate, have claimed that longdesc links are particularly prone to rot. One reason given for this is that "hidden metadata" is particularly prone to rot.

And sure enough, it is a waste of time, and a distraction for the user, if a link intended for accessibility leads to a 404 page or some such thing.

Therefore I suggest adding to the spec a section - or at least a note - about the longdesc attribute and link rot. This section should briefly describe the following:

* the link rot problem in general
* why it is an extra problem for the longdesc audience
* suggest methods for avoiding the problem:
 # Conformance checkers MAY check for dead links/404 messages.
 # UAs SHOULD 'highlight' the presence of a longdesc so that
   users - and authors - discover and use them, and thus react
   if they find that the longdesc links are rotten
 # Authors SHOULD be aware of the problem, and consider
   strategies for avoiding such rot:
  o use data URIs, to embed external descriptions in the page itself;
  o place descriptions in a fragment of the page itself;
  o manage descriptions the same way images are managed:
    http://xstandard.com/en/articles/advanced-image-management/
  o use stable image description databases/services
    - http://rebuildingtheweb.com/en/longdesc-replacement/ 
    - http://objectdescription.org
  o use backlinks, from longdesc resource back to described image,
    so that it is easy to 'round trip'
  o keep the URIs cool: http://www.w3.org/Provider/Style/URI.html
  o more advice: http://en.wikipedia.org/wiki/Link_rot#Combating
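
As an illustration of the first suggestion (a checker that looks for dead longdesc links), here is a minimal Python sketch. It is not taken from the spec or from any existing conformance checker; the names LongdescCollector, http_status and find_rotten_longdescs are hypothetical. It assumes that a longdesc pointing at a data URI or at a fragment of the same page cannot rot independently of the page, and that any HTTP status of 400 or above (or no response at all) counts as rot.

```python
import http.client
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LongdescCollector(HTMLParser):
    """Collect the non-empty longdesc value of every <img> element."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "longdesc" and value and value.strip():
                    self.values.append(value.strip())


def http_status(url):
    """Return the HTTP status for url, or None if no response was obtained."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a rotten link
    except (OSError, ValueError, http.client.HTTPException):
        return None  # unreachable host, malformed URL, and so on


def find_rotten_longdescs(html, base_url, status_of=http_status):
    """Return (longdesc value, status) pairs for links that look dead."""
    collector = LongdescCollector()
    collector.feed(html)
    page_url = urldefrag(base_url)[0]
    rotten = []
    for value in collector.values:
        if value.lower().startswith("data:"):
            continue  # description is embedded in the attribute itself
        target = urldefrag(urljoin(base_url, value))[0]
        if target == page_url:
            continue  # description lives in a fragment of the same page
        status = status_of(target)
        if status is None or status >= 400:
            rotten.append((value, status))
    return rotten
```

The status_of parameter exists only so that the lookup can be replaced, for example by a cached or offline table of known statuses; by default each external longdesc target gets one HEAD request. The sketch does not verify that a same-page fragment actually exists in the document.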

Btw, a good start for this section would be to point out that longdesc is in fact a link - a URI.

Comment 1 Charles McCathieNevile 2013-04-01 23:00:06 UTC

Yes, link rot is a problem. As is maintaining documents, paying for your domain name before someone takes it over, and various other things.

I don't think it is in the scope of this specification, but rather in the scope of best practices for website management.

I propose to close this as wontfix.

Comment 2 Leif Halvard Silli 2013-04-02 00:23:43 UTC

Many have in the past maintained and paid for domains and web sites but ignored maintaining the sites’ longdesc descriptions.

Thus, the kind of rot we see with once-functional longdesc links is not really the same kind of rot that we see with "normal" anchor element links. Also, due to the intimate relationship between a longdesc description and its image, a rotten img@longdesc is more akin to a rotten img@src than to a rotten anchor element.

It seems to me that it would benefit the cause to admit the problem. I don't know, but there might be as many longdesc links that have rotted as there are longdesc links which contain text instead of a URI.

But I of course don’t mind if, in this spec, the problem is described only very briefly, e.g. as an encouragement to authors to plan/implement robust longdesc solutions that are likely to survive site maintenance/upgrades/etc.

Comment 3 Charles McCathieNevile 2013-04-02 10:40:04 UTC

In my research rotted longdesc links are nowhere near as common as plain text descriptions, but of course looking at a different slice might give different results.

I think this is a basic problem of best practices in site maintenance. I don't see why the normative HTML spec should say "authors should be careful to re-check the values of attributes when they copy source code around", or "authors should think about how they are going to be sure that their content stays up to date" and I don't see why this spec should say that either.

Comment 4 Leif Halvard Silli 2013-04-02 14:27:07 UTC

(In reply to comment #3)
> In my research rotted longdesc links are nowhere near as common as plain
> text descriptions,

Hand-authored? If a gallery CMS spits out incorrect longdescs, the numbers quickly become high.

> but of course looking at a different slice might give
> different results.

The Longdesc lottery article places "not a valid URL" amongst the 96% of longdesc attributes where the longdesc URL is either empty or invalid. As such, your research, combined with what that article says, has made me pretty convinced that you are right: there are more longdesc attributes with non-valid URLs, which (how often?) are text, than there are longdesc attributes with rotten URLs.

> I
> don't see why the normative HTML spec should say "authors should be careful
> to re-check the values of attributes when they copy source code around",

It’s "authoring" whether one edits - or moves around - new or old documents/code. Thus there is no need to specifically point out "copy source code around" as an issue.

> or
> "authors should think about how they are going to be sure that their content
> stays up to date" and I don't see why this spec should say that either.

Made more stringent, that statement could be OK. Perhaps you would be OK with the following statement? Proposal:

   ]] NOTE: A longdesc URL is supposed to be usable regardless of
            the context (e.g. online, in e-mail or syndicated) in
            which the image occurs, and should be designed
            accordingly.[[
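
One possible reading of the proposed note is that a longdesc value should not depend on the base URL of the document the image happens to appear in. The Python sketch below checks for that under this assumption only; the function name is_context_independent is hypothetical, and treating absolute http(s) URLs and data URIs as the only context-independent forms is an interpretation, not something the note states.

```python
from urllib.parse import urlsplit


def is_context_independent(longdesc):
    """Return True if the value resolves the same way in any document.

    Relative references, fragment-only references and scheme-relative
    references are all resolved against the embedding document, so they
    can break once the image is syndicated or sent by e-mail.
    """
    parts = urlsplit(longdesc.strip())
    if parts.scheme == "data":
        return True  # the description travels with the attribute
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

For example, is_context_independent("http://example.org/desc/a.html") is true, while "desc/a.html" and "#a-desc" are both reported as context dependent. An absolute URL can of course still rot; this check only covers the dependence on context.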

Comment 5 Leif Halvard Silli 2013-04-03 02:11:11 UTC

If you say "won't fix", then I am going to disagree, but accept your decision.

Comment 6 Charles McCathieNevile 2013-05-01 10:16:10 UTC

The TF agreed to wontfix and is grateful that Leif already accepted this position.