Bug 21501 - Advice to conformance checkers section
Summary: Advice to conformance checkers section
Status: CLOSED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML Image Description Extension
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Charles McCathieNevile
QA Contact: HTML WG Bugzilla archive list
URL: http://www.w3.org/TR/2013/WD-html-lon...
Whiteboard:
Keywords: a11y, a11y_text-alt
Depends on: 23288
Blocks:
Reported: 2013-04-02 00:42 UTC by Leif Halvard Silli
Modified: 2013-09-19 14:00 UTC (History)

Description Leif Halvard Silli 2013-04-02 00:42:32 UTC
Suggest adding a section about advice to conformance checkers.

JUSTIFICATION: The spec would gain a lot of credibility if it contained good recommendations to conformance checkers, so that as many as possible of the errors made in the past would be caught via conformance checks.

Already, the spec allows conformance checkers to verify that the URL is valid and (soon) that it is non-empty - this is a huge improvement. But is it possible to get conformance checkers to do more than that?

Of course, as long as the URL is valid and non-empty, one cannot really issue error messages. But it ought to be possible to issue certain warnings.

SUGGESTED ADVICE to give in that section:

* Recommend to warn if the longdesc URL does not contain a
  #fragment URI *and* points to a  top level site.
  (Thus, a double criterion for this warning.)
  Justification: longdesc URLs that, e.g., point to top-level
  domains have been pointed out as an issue.
* Recommend to warn if the file suffix of the longdesc URI is
  identical with the file suffix of the @src attribute resource
  (Justification: This hints that the longdesc URI points to
   another image rather than to a description.)
* Recommend to run link checks - checks for rotten/broken links
  (404 messages etc)
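
The first two suggested warnings could be sketched roughly as follows. This is only an illustration in Python: the function name `longdesc_warnings` and the warning texts are hypothetical, not taken from the spec or from any validator, and the link check from the third bullet is left out because it needs network access.

```python
import posixpath
from typing import List
from urllib.parse import urlsplit


def longdesc_warnings(longdesc: str, src: str) -> List[str]:
    """Return heuristic warnings for one longdesc/src pair (hypothetical helper)."""
    warnings = []
    desc = urlsplit(longdesc)

    # Double criterion: no #fragment AND the URL targets the root of a site.
    points_to_root = desc.netloc and desc.path in ("", "/") and not desc.query
    if points_to_root and not desc.fragment:
        warnings.append("longdesc points to the top level of a site")

    # Same file suffix as the image hints that longdesc is another image.
    desc_ext = posixpath.splitext(desc.path)[1].lower()
    src_ext = posixpath.splitext(urlsplit(src).path)[1].lower()
    if desc_ext and desc_ext == src_ext:
        warnings.append("longdesc has the same file suffix as src")

    return warnings
```

For example, `longdesc_warnings("http://example.com/", "chart.png")` yields the top-level warning, while `longdesc_warnings("http://example.com/#desc", "chart.png")` yields none because only one of the two criteria holds.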

More/Fewer things?
Comment 1 Charles McCathieNevile 2013-04-02 10:35:06 UTC
I thought about this. My thinking was not to do it. What a conformance checker needs to know is:

- there should be zero or one instances of the description;
- the content should be one non-empty valid URL;

Which is already in the spec. I think the rest goes into best practices and techniques, and I don't think we should fill the normative spec with them.
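
As a rough illustration of those two baseline requirements, here is a minimal Python sketch. The class name `LongdescChecker` and the error texts are hypothetical, and "parseable by `urlsplit`" is only a loose stand-in for the spec's notion of a valid URL.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LongdescChecker(HTMLParser):
    """Collect baseline longdesc errors for img elements (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # attrs keeps duplicate attributes, so repeated longdesc is visible here.
        values = [value for name, value in attrs if name == "longdesc"]
        if len(values) > 1:
            self.errors.append("more than one longdesc on an img")
        for value in values:
            if value is None or not value.strip():
                self.errors.append("longdesc is empty")
                continue
            try:
                urlsplit(value.strip())  # approximation of URL validity
            except ValueError:
                self.errors.append("longdesc is not a parseable URL")
```

Feeding a document with `checker.feed(html_text)` fills `checker.errors`; an img without longdesc produces no entry, matching the "zero or one" rule.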

I propose to resolve as wontfix
Comment 2 Leif Halvard Silli 2013-04-02 11:51:47 UTC
(In reply to comment #1)
> I thought about this. My thinking was not to do it. What a conformance
> checker needs to know is:
> 
> - there should be zero or one instances of the description;

Hm ... Let me see if I understood that comment correctly:

1) if the description is internal to the page, then the validator can easily check that it points to a fragment that exists. The validator already performs that kind of check for many attributes, such as @aria-describedby: http://tinyurl.com/ce3z9t2 That said: Although it is already possible, the validator doesn't currently run such tests for longdesc: http://tinyurl.com/cwgqltf And it is even an issue that I forgot to mention in my proposals above. Consider it proposed, now!

2) if the description is a data URI, then the non-empty URL and the 'fragment that exists' issue is one and the same issue. Of course, this issue is already solved ... Though, since a valid data URI should/would also contain MIME information, data URIs allow conformance checkers to cry out if the data URI points to an image (instead of pointing to an accessible description). I forgot to add this in my proposals above. Consider it proposed, now!

3) if the description is on an external page, then validators are not known to run checks on whether linked resources exist etc. Plus, for longdesc, we have the extra issue that valid URLs have been misused to present irrelevant content. Those are the issues my proposal above seeks to fix.

> - the content should be one non-empty valid URL;
>
> Which is already in the spec. I think the rest goes into best practices and
> techniques, and I don't think we should fill the normative spec with them.

I disagree that adding advice to conformance checkers would step on the toes of best practice and techniques documents. Also, HTML5 contains advice to conformance checkers.
Comment 3 Leif Halvard Silli 2013-04-02 12:01:45 UTC
(In reply to comment #2)

Thus I add to my proposals above:

* when longdesc says longdesc="#fragmentInThisPage", 
  then conformance checkers should check that @longdesc
  points to a fragment that exists - validators
  already do this for @usemap: http://tinyurl.com/c7kxre7

* when longdesc contains a data URI, then some MIME types
  should cause a warning or perhaps even an error message
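
Both additional checks can be sketched in Python as below. All names (`FragmentChecker`, `data_uri_media_type`, `data_uri_warning`) are hypothetical, only `id` attributes are treated as fragment targets, and percent-decoding of fragments is ignored, so this is an approximation rather than what any validator actually does.

```python
from html.parser import HTMLParser


class FragmentChecker(HTMLParser):
    """Find same-page longdesc fragments that match no id (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        longdesc = attrs.get("longdesc") or ""
        if tag == "img" and longdesc.startswith("#"):
            self.targets.append(longdesc[1:])

    def missing_fragments(self):
        # Evaluated after the whole document is fed, so ids may follow the img.
        return [t for t in self.targets if t not in self.ids]


def data_uri_media_type(url):
    """Return the media type of a data: URL, or None for other URLs."""
    if not url.lower().startswith("data:"):
        return None
    header = url[5:].split(",", 1)[0]
    media_type = header.split(";", 1)[0].strip().lower()
    return media_type or "text/plain"  # default media type per RFC 2397


def data_uri_warning(url):
    """Warn when a longdesc data: URL declares an image media type."""
    media_type = data_uri_media_type(url)
    if media_type and media_type.startswith("image/"):
        return "longdesc data: URL has an image media type"
    return None
```

Which media types deserve a warning rather than an error is a policy choice; the sketch only flags `image/*`.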
Comment 4 Leif Halvard Silli 2013-04-03 10:06:22 UTC
The lottery article also has the following two issues, which should be easily checkable by the validator:

* [longdesc] points to the image itself (i.e. the same URL 
  as the src attribute)
* [longdesc] points to the page you're already on
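
A minimal Python sketch of these two checks follows. The function name `self_reference_warnings` and its parameters are hypothetical, and it assumes the checker knows the URL of the page being checked. A longdesc that carries a fragment into the current page is deliberately not flagged, since that is the same-page description case.

```python
from urllib.parse import urldefrag, urljoin


def self_reference_warnings(page_url, src, longdesc):
    """Warn when longdesc resolves to the image or to the page itself."""
    warnings = []
    # Resolve both attributes against the page URL before comparing.
    desc_abs = urljoin(page_url, longdesc)
    src_abs = urljoin(page_url, src)
    if desc_abs == src_abs:
        warnings.append("longdesc points to the image itself")

    desc_doc, desc_frag = urldefrag(desc_abs)
    page_doc, _ = urldefrag(page_url)
    # Same document and no fragment: the link leads nowhere new.
    if desc_doc == page_doc and not desc_frag:
        warnings.append("longdesc points to the current page")
    return warnings
```

Because content negotiation could in theory make either pattern legitimate, the sketch returns warnings rather than errors.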


A double question to you, Charles: 

(1) Are you _JUST_ opposed to having the validation advice I propose in the spec, but have nothing against, and would actually prefer, that the validator *do* check several of the things I propose?

(2) OR are you _BOTH_ opposed to the validation advice _AND_ opposed to any validation of the things I have proposed? (Needless to say, (1) would be easier for me to understand than (2).)


The Lottery article says that in 96% of the cases, the longdesc:

1. is blank
2. is not a valid URL
3. points to the image itself (i.e. the same URL as the src attribute)
4. points to the page you're already on
5. points to the root level of another domain
6. is the same as a parent link's href attribute (i.e. the longdesc is redundant because you could just follow the image link instead)

Assuming that the Lottery article lists those things in order of frequency (which seems likely to me), then the longdesc spec, as we know, covers validation of the 1st and 2nd issues above. But the 3rd and 4th issues - I don't see why we would need a Best Practice document before starting to give warnings about those. I agree that with the 5th issue, we are wading into problems that perhaps are harder to define than issues 3 and 4.

For the 6th issue, I don’t know what the validator should do at the moment, and the issue in this case is of another kind than issues 1 to 5.
Comment 5 Charles McCathieNevile 2013-04-03 11:07:08 UTC
I think a lot of what you are describing is best practice (although in some situations things that aren't best practice might nevertheless be legitimate - e.g. a short description might be fine in a text/plain data: url, but describing a complex infographic might well require a table or two of data, and enough structure to navigate around).

I am against filling the spec with stuff that is purely meant to answer the longdesc lottery article. For most users, that will be unnecessary extra content.

Another best practice would be to check that the language of the longdesc matches the language of the source document. And another would be to check if it negotiates to match the user's preferred language even if the page itself doesn't (this is one of the cool things you can do with external longdesc links).

But I am not interested in putting all that into the spec...

I'm coming to the idea that it would be useful to write a specific best practices document. Note that there are validators that linkcheck - I just don't know of any free products that do that.

Looking at the list:

We answer 1 and 2 (so did HTML 4)
3, 4 and 5 have legitimate use cases so should only throw warnings anyway. Describing the kind of warning and the use cases is definitely better in a Best Practices document IMHO.
6 is a non-problem. There is a clear difference between having a link somewhere on a page, and having an explicitly associated link, in terms of allowing user agents (which include things like search engines) to unambiguously associate the description with the image.
Comment 6 Leif Halvard Silli 2013-04-03 12:22:00 UTC
(In reply to comment #5)

> I am against filling the spec with stuff that is purely meant to answer the
> longdesc lottery article. 

I use that article as a source of information. I also assume that answering the issues it takes up will be good for the acceptance of the longdesc spec. Are there other articles that describe the problems longdesc has had any better?

> For most users, that will be unnecessary extra content.

Is this a spec for Web authors? For validation developers? For users? UA and AT implementors? I suppose that by 'user' you meant 'reader of this spec'. But this particular bug is only about readers that are conformance checker developers.

My question was: Do you consider it valid (sic) that validators check (most) of the things I have proposed? If so, I could file bugs against the validator. After all, the validator checks that @usemap=#foo points to a <map name=foo> on the same page, but it does not check that <a href=#foo> points to anything on the current page. So it is clearly a bit up to the validator developers what they find it makes sense to check. I agree with the validator devs that it is obvious to check that <map name=foo> exists for <img usemap=#foo>, but probably not sensible to check that <span id=foo> exists for <a href=#foo>. But if I would get no support if I asked the conformance checkers to add such checks, then I should not ask. The best support would be to have a reasonable spec to point to.

> Another best practice would be to check that the language of the longdesc
> matches the language of the source document. And another would be to check
> if it negotiates to match the user's preferred language even if the page
> itself doesn't (this is one of the cool things you can do with external
> longdesc links).

I guess, in theory, then for 

    <img src="image" longdesc="image" alt="alt" />, 

the longdesc could, through content negotiation, lead to image.html. That is why a warning is probably better than an error. Speaking about users - or simplistic authors - longdesc has been described as a *simple* solution. I believe that the *advanced* authors will be able to ignore the warning caused by longdesc="image".

As for language, where would you draw the line? If a Norwegian Bokmål page points to a Norwegian Nynorsk page, should it be an error? There are 8000 - 9000 language tags.

> But I am not interested in putting all that into the spec...

The best/optimal shouldn't be the enemy of the good/OK.

> I'm coming to the idea that it would be useful to write a specific best
> practices document. Note that there are validators that linkcheck - I just
> don't know of any free products that do that.

The NU validator checks that @usemap is used correctly - see above. The HTML4 validator does not. Certain things make sense to have the ordinary validator check. We should not need to pay for such checks. Etc.

(Btw, the NU validator also has an extra 'image report check' - it could of course be valid to place the warnings in that report.)


> 3, 4 and 5 have legitimate use cases so should only throw warnings anyway.

Warnings is what I suggest.

> Describing the kind of warning and the use cases is definitely better in a
> Best Practices document IMHO.

The point is to say in the spec that it is allowed to create warnings for things that are likely errors or useless. 

> 6 is a non-problem. There is a clear difference between having a link
> somewhere on a page, and having an explicitly associated link, in terms of
> allowing user agents (which include things like search engines) to
> unambiguously associate the description with the image.

I suppose that you assumed that the issue described in 6 was exactly equivalent to this:

   <a href=longdesc><img longdesc=longdesc src=a alt=alt></a>

With my imperfect experience of how screenreaders work, the longdesc in that example is currently useless, even if the screenreader supports longdesc. It is even outside what HTML4 envisaged, as HTML4 talked about longdesc pointing to a place *different* from where the parent <a href=foo> points.
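
Whatever a checker should ultimately say about it, detecting this pattern is mechanical. The following Python sketch is one way to do it; the class name `RedundantLongdescFinder` is hypothetical, and it compares the raw attribute values without resolving relative URLs.

```python
from html.parser import HTMLParser


class RedundantLongdescFinder(HTMLParser):
    """Record longdesc values equal to the enclosing link's href (illustrative)."""

    def __init__(self):
        super().__init__()
        self.href_stack = []  # href values of the currently open <a> elements
        self.redundant = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.href_stack.append(attrs.get("href"))
        elif tag == "img":
            longdesc = attrs.get("longdesc")
            if longdesc and self.href_stack and longdesc == self.href_stack[-1]:
                self.redundant.append(longdesc)

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            self.href_stack.pop()
```

Fed the markup from the example above, the finder records `"longdesc"` in its `redundant` list; an image outside the link, or with a different target, is not recorded.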

If you bring in search engines, then you are also opening up for a new kind of use of @longdesc - a legitimate use of longdesc in order to affect search engines, as there are obviously many descriptive images that are image links.
Comment 7 Leif Halvard Silli 2013-04-03 12:27:04 UTC
Note:

>  If you bring in search engines, then you are also opening up for a new kind
> of use of @longdesc - a legitimate use of longdesc in order to affect search
> engines, as there are obviously many descriptive images that are image links.

I did not mean to say that it is legitimate - I more meant to say that you would be legitimating longdesc as a way to affect search engines.
Comment 8 Charles McCathieNevile 2013-04-03 12:42:26 UTC
Using longdesc to affect search engines isn't new :)

Things that are not forbidden by the spec are allowed (and in practice things that are required by the spec may not happen, while things required not to happen may happen anyway).

In general I am not disagreeing with the suggestions, just resisting adding them all to the spec since I don't think they are critical.

And I am not going to enter into the discussion about what validation *should* be available for free. As representative of a company that provides these services, in some cases for free, I note that there is a concrete cost in performing them. 'nuff said.
Comment 9 Charles McCathieNevile 2013-04-12 16:53:48 UTC
It turns out that, in clarifying which requirements are relevant to whom, I have added some advice. I am still trying to keep it simple, and plan not to include a lot of information on the sort of heuristics that validators might want - I think the proper place for that is in ATAG techniques, plain technical articles, and so on.

I do not intend to add significant further advice based on this bug (this doesn't preclude that some specific piece of advice gets added later).
Comment 10 Charles McCathieNevile 2013-05-01 10:23:07 UTC
The TF agreed that the current editor's draft status is sufficient, and resolved the bug.