Warning:
This wiki has been archived and is now read-only.

TitleKeyContentMark

From HTML WG Wiki

Change Proposal by Leif Halvard Silli,
for ISSUE-206 meta-generator.
The location of this CP: http://www.w3.org/html/wg/wiki/TitleKeyContentMark

Recognize no-alt img elements with a key-content mark in the title attribute

Summary

Where the alt attribute has been omitted, the presence of a non-empty title attribute has a very positive effect on whether img elements get presented to end users. Not only does HTML5 (and ARIA) tell UAs and ATs to make use of title in that situation, but testing shows that support for this is also long-standing and broad. In fact, a no-alt img element with the title attribute set has almost the same level of AT support as img elements with a non-empty alt.

By contrast, when both alt and title are omitted, ATs (in particular JAWS, NVDA, VoiceOver with Firefox nightly, and VoiceOver when the image doesn’t load) treat the image as if it had the alt attribute set to the empty string.

Therefore, for the use case of a markup generator that lacks alternative text for a key-part-of-the-content image, HTML5 should recommend that developers make their markup generators check for the presence of the title attribute. When there is no non-empty title present, the generator should add one and fill it with a key-content mark.

A key-content mark is a carefully selected white-space character combination whose presence has two effects: (a) it allows ATs and UAs to safely assume that the image is key content, which ensures that the presence of the image is made known to the user, and (b) it makes the page formally validate. (The care taken in choosing the white-space characters relates to the key-content mark’s effect on ATs as well as to whether it prevents or causes the title tooltip to display on hover.)

Rationale

Alternative text — ideals and reality

The ideal alternative text is contextual, that is, adapted for the particular image. For that reason, it is also often assumed that the ideal alternative text has to be unique. However, the uniqueness expectation might be somewhat unrealistic or exaggerated: consider, for example, a photo album where many of the motifs are similar.

But the reality is that authors consider the uniqueness ideal too difficult and too much work, and thus often resort to dummy text. The motivation for doing so is linked to HTML’s traditional syntactical requirement that there must be some kind of alternative text, empty or non-empty, for validity. However, it would be wrong to say that it is purely linked to the validity question, as authors know that the validity question in turn is linked to the accessibility question.

The use of dummy text does however result in problems. Let us evaluate its 3 variants:

    1. The empty string as dummy text has the problem that HTML5 formally defines its semantics as equivalent to <img role=presentation>. Thus, this method silences both validators and screenreaders.
    2. The non-empty dummy string may cause user confusion: screenreaders, and at least some user agents (e.g. Opera when the alt is omitted), already make use of “dummy texts” (such as “image” or “graphic”) when they present a loaded or (in particular) non-loaded image to their users. It is therefore not useful to repeat the same announcement as alternative text as well. Another problem with non-empty dummy text is that it is (usually) tied to a single language. In contrast, the announcements of an AT or UA typically change with the locale.
    3. A special (and language-independent!) variant of the non-empty dummy text would be a white-space filled alt. For instance, one could place the key-content mark (aka “white-space”) directly in the alt attribute. This would have the advantage that it would silence the validator while not silencing screenreaders. Screenreaders would instead use their built-in announcement text to announce the image, and there would be no repetition. However, while this would work OK in all the screenreaders tested while writing this proposal, it would interfere with the ability of ATs and UAs to find a better way to present the image to the user. For instance, it would make Opera not render its alt repair text and instead render a white-space character as the image representation. Hence, the spec understandably does not endorse this method.

      As a conclusion, we could say that, for key-part-of-the-content images, none of the dummy alternative text methods can be recommended.

      However, when one creates a markup generator for use by third parties, it becomes necessary to have a strategy for solving the use case of an img for which the third party failed to add alternative text. And the advice of the spec is that markup generators simply omit the alt attribute.

      This advice thus has to be understood against the background that the spec, for the condition that “the src attribute is set and the alt attribute is not” (see the paragraphs on “what an img element represents”), describes what the element represents and tells UAs and ATs to, in such cases, offer “some sort of indicator that there is an image that is not being rendered”.

      So in theory, dropping the alt does solve the problems related to the non-empty dummy string, since the omission of the alt attribute is supposed to cause the image to be presented with the AT’s built-in dummy text, and nothing more. Thus, in theory, ATs would handle such an image the same way they handle an image with a white-space filled alt (the third option above).

      The reality is, however, that, on the one side, UAs indicate the lack of an image even for img elements with the alt set to the empty string, and, on the other side, many ATs treat no-alt images like they treat empty-alt images: they ignore them. (See the test page mentioned in the summary.) Why do they ignore them? Is it because 25% of img elements are lacking the alt attribute? Or is it only because they have not gotten around to doing The Right Thing™ yet?

      At any rate: the degree to which the cited spec text can be taken to mean that no-alt img elements should be given special treatment could be disputed. For example, it is not a particularly strong statement when the spec says that such an image “might be a key part of the content” and that the lack of the alt attribute “indicates that the image is a key part of the content”.

      Whatever the reason: a fundamental issue with the spec’s current advice to markup generators is that the lack of alt does not, generally, cause images to be presented. This in turn raises the question of whether it wouldn’t be better to evaluate the 3 dummy alternative text options above and pick the least bad method.

      The key-content mark advantage

      The spec does, however, say that when a no-alt image has a non-empty title attribute, then the title represents the image's textual content. And the cited tests show the back-compatibility story of no-alt img elements with a non-empty title to be almost on par with img elements with a non-empty alt attribute present! The cited testing also showed that the discoverability of such images can be further improved by adding role=img to them (hint: VoiceOver, when images fail to load).

      This is why this CP proposes that the spec include the title attribute in its microformat for no-alt img elements:

      1. When a title with the key-content mark is present, then ATs are triggered to announce the image using the AT’s built-in dummy text — and nothing more. Thus title plus key-content mark causes what the spec suggests to actually become a working reality.
      2. To use the title rather than the alt, means that we don’t interfere with the role of the alt attribute. And neither do we meddle with the AT or UA’s strategy for repairing for lack of alt.
        • For instance, the key-content mark ought to not interfere with how text browsers tend to render the file name whenever the alt is lacking.
        • And, in contrast to adding the key-content mark (aka ‘white-space characters’) in the alt attribute, adding it in the title attribute ought not to cause the image’s presence to become hidden the way it happens in textual browsers and GUI browsers if the alt attribute contains a space character.

      Thus, the main advantage of the markup generator advice in the key-content mark proposal is that it improves on the spec’s current advice to drop the alt altogether, making that advice reliable for all (tested) ATs.

      Important design details

      This CP takes into account several findings about ATs and UAs and how they react to the content of the title attribute:

      1. For alt and title, when their content is a single character, VoiceOver (at least when used with Safari) supports a “microformat” where, instead of “reading” the single (white-space) character (that is, representing it with silence), it presents the character by its Unicode name (e.g. it may say “zero width space” if the attribute’s sole content is a single ZERO WIDTH SPACE character). However, when more than one white-space character is present, it “reads”/represents them normally.
        • It thus makes sense to say that the key-content mark should consist of two white-space characters.
      2. Testing also showed that VoiceOver currently treats an image with a non-empty title as key content only if the element’s graphic did load and only if CSS generated content does not replace the graphical content. (This is evident from the above cited test page, but another, shorter test page perhaps makes it simpler to see.) Thus, if the graphic did not load or was replaced with non-graphic content via CSS, VoiceOver fails to announce it as an image. However, if one adds role=img to the img, VoiceOver will announce it as an image even when the image did not load or was replaced via CSS.
        • The CP thus recommends that generators add role=img as well. (I did consider a must, but some seem to be very opposed to requiring the role attribute. Also, strictly speaking, this is a VoiceOver bug. In addition, it might be relevant to use another role than img.)
      3. It could seem quite natural to simply define two SPACE characters (U+0020) as the key-content mark. However, this turns out to have a back-compatibility story without a happy end: when used together with Firefox, NVDA and JAWS treat title="<SPACE>" as equal to title="<THE EMPTY STRING>", meaning that the presence of the title would not have the desired effect of making them announce the presence of the image. JAWS does in fact have this problem even together with Internet Explorer. Finally, with the exception of Firefox (which never displays a tooltip whenever it perceives the content to be white-space!), the SPACE character always results in a visible tooltip. (In fact, quite a few of the other “normal” white-space characters, like the no-break space, have similar issues.)
        • The CP thus does not recommend using the SPACE character as the key-content mark character.
      4. Finally, one has to consider whether the white-space in the title attribute results in a visible tooltip that designers might object to. Extensive testing showed the MONGOLIAN VOWEL SEPARATOR character (U+180E) to be one of the few characters (perhaps the only one) to cause an invisible or near-invisible tooltip in 4 current browsers: Firefox, IE9, Chrome and Safari. Alas, in the prerelease of IE10, any character causes a visible tooltip, and in Opera the tooltip was noticeable too. Also, while a SPACE character (which cannot be used for other reasons, see above) would have caused Lynx to render the file name, the Mongolian vowel separator causes Lynx to use the title (thus, a space character) as fallback text. The file name is therefore not used as image indicator in Lynx when this character is the key-content mark, which is not quite ideal. Fortunately, the story is a little better in the text browsers W3M, Links, Elinks and netrik. However, Lynx, Opera and IE10 would arguably all improve their behavior if they implemented the semantics of the Mongolian vowel separator. Thus this character ought to be a “future safe” choice. And at any rate the problem is fixable via JavaScript. For example, the following script empties the title attribute on hover, if and when it contains two Mongolian vowel separators (also known as the key-content mark):
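        <script>
        // Empty the given attribute on every <elem> whose value is exactly
        // the key-content mark (two MONGOLIAN VOWEL SEPARATOR characters).
        function change_title(elem, attr) {
          var elems = document.getElementsByTagName(elem);
          for (var i = 0; i < elems.length; i++) {
            elems[i][attr] = elems[i][attr].replace(/^\u180E\u180E$/, '');
          }
        }
        // Run whenever the pointer moves over an element in the window.
        window.onmouseover = function() {
          change_title('img', 'title');
        };
        </script>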
        • The CP thus proposes two MONGOLIAN VOWEL SEPARATOR characters (U+180E) as the key-content mark.
      5. The Firefox behavior, where it doesn’t display a tooltip whenever it perceives the content to be solely white-space, seems worth keeping. If all UAs behaved that way, then we could eventually also pick a more easily typed key-content mark.
        • The CP thus proposes that the editors, if they agree, require browsers to not display a tooltip whenever the content is solely white-space.

      NB! In case, for instance, the Internationalization Working Group were to identify problems with using the MONGOLIAN VOWEL SEPARATOR character (U+180E) for the key-content mark, then, as long as the problems with choosing another character are evaluated and as long as the positive effect on ATs is not lost, this CP is by no means locked to this particular character. Thus, just for the record, on the test page with some of the alternatives, there are some other “hot” potential characters, none of which, however, seem to have quite as broad support as the proposed character.

      The alternative proposals

      The spec’s current solution

      As told: when alternative text for a key-part-of-the-content image is lacking, the spec recommends that markup generators “omit the alt attribute altogether”. If a meta generator string is present in the head element as well, then the possible validation errors resulting from the omitted alt are suppressed too. This is a solution with several problems:

      1. It adds a meaning to the meta generator string that has nothing to do with the reasons why generators insert it in the first place, a problem that has caused ISSUE-206 to be reopened.
      2. It does not discern between images that are the sole content of a link and other key-part-of-the-content images. Thus, despite that the spec describes how to generate alternative text for sole-content-of-a-link images, validators are forbidden from reporting failure to do so as an error.
      3. And, most importantly: because many ATs and UAs in fact treat such images like they treat images with the alt set to the empty string, the spec’s solution is currently almost equivalent to inserting an empty alt. Thus, it is a solution that does very little for end users. This is the case for NVDA and JAWS, but also, if the image fails to load, for VoiceOver. Even Firefox has a tendency to hide such images whenever the image doesn’t load. (This is due to its bugs related to the broken image icon, which tends to not render unless one adds a proprietary img:-moz-broken{} CSS rule for it.)

      Thus, the assumption of some, that there is a good back-compatibility story if the alt attribute is omitted, is a truth with many modifications.

      The relaxed/incomplete attribute proposal

      Two other change proposals suggest minting a new attribute — incomplete/relaxed – for such images. These proposals have a number of problems:

      1. They go against the established practice of using dummy strings —empty or non empty— to silence the validator.
      2. Their solution purely affects validation, and thus does not improve anything for end users.
      3. They build heavily on the no-alt pattern, the semantics of which are not very well understood: ATs and UAs often treat such images as if the element had an empty alt. It is also quite difficult to author with no-alt. For example, the developer of the BlueGriffon editor has said that he does not intend to allow users to enter no-alt: they must enter an empty or non-empty alt.

      In summary: these proposals’ independence from the meta generator string gives them an important validation advantage. But they have no direct effect on end users, and hence they are, in this detail, equal to the spec’s current microformat.

      Details

      In §4.8.1.1.12 "Guidance for markup generators," replace “or omit the alt attribute altogether, under the assumption that the image is a key part of the content” with the following:

      or, unless a title attribute already is present, add a title attribute with a key-content mark and omit the alt attribute altogether, under the assumption that the image is a key part of the content.

      NOTE: For improved compatibility, it is recommended to add a role attribute of value img as well.

      NOTE: The key-content mark is a white-space character combination whose only task is to cause user agents and assistive technologies to perceive a non-empty title attribute to be present, the reason being that some technologies are known to not present img elements whose alt attribute has been omitted to the user unless there is a non-empty title attribute as well. Currently, the key-content mark is considered to consist of two MONGOLIAN VOWEL SEPARATOR characters (U+180E), as this character does not cause a visual title tooltip to be produced in the majority of current user agents.
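
      The proposed generator guidance can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the helper name applyGeneratorGuidance and the use of a plain object for the img attributes are hypothetical, and treating an empty title like a missing one follows the wording of the summary ("no non-empty title") rather than any spec text.

```javascript
// The key-content mark as proposed in this CP: two MONGOLIAN VOWEL SEPARATOR
// characters (U+180E).
const KEY_CONTENT_MARK = "\u180E\u180E";

// Hypothetical helper, not part of any real generator API.
// Takes a plain object of img attributes and returns a new object that
// follows the proposed guidance for images without alternative text.
function applyGeneratorGuidance(attrs) {
  const result = Object.assign({}, attrs);

  // Alternative text (including a deliberately empty alt) was supplied:
  // the guidance for missing alternative text does not apply.
  if (typeof result.alt === "string") {
    return result;
  }

  // Assumption: a missing title and an empty title are treated alike.
  if (!result.title) {
    result.title = KEY_CONTENT_MARK;
  }

  // Recommended (not required) by the proposed NOTE: add role=img unless
  // some other role has already been chosen.
  if (!result.role) {
    result.role = "img";
  }

  return result;
}
```

      With this sketch, an image described only by { src: "photo.jpg" } gains a title holding the key-content mark and role=img, an image that already has a title keeps it, and an image with any alt attribute is left untouched.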

      In §4.8.1.1.13 "Guidance for conformance checkers," replace the second bullet (which starts with "The document has a meta element…") with the following text:

      Unless the image is the sole content of a link, then a missing alt attribute on an img element where there is a title attribute whose content is the key-content mark, shall not count as an error. However, except when other rules apply, then conformance checkers SHOULD make users aware that these elements are likely to be in lack of alternative text.
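
      A conformance checker could apply the proposed rule roughly as in the following sketch. The function name checkMissingAlt, the shape of its argument and the message strings are assumptions made for illustration. The "other rules" mentioned in the proposed text (for example, presence inside a figure element) are deliberately not modeled here.

```javascript
// The key-content mark as proposed in this CP.
const KEY_CONTENT_MARK = "\u180E\u180E";

// Hypothetical checker step. `img` is assumed to look like
// { attrs: { src, alt, title, ... }, soleContentOfLink: boolean }.
function checkMissingAlt(img) {
  const attrs = img.attrs;

  // An alt attribute is present (empty or not): this rule has nothing to say.
  if (typeof attrs.alt === "string") {
    return { level: "ok", message: "" };
  }

  // No alt, but the title is exactly the key-content mark and the image is
  // not the sole content of a link: not an error, yet worth a warning.
  if (attrs.title === KEY_CONTENT_MARK && !img.soleContentOfLink) {
    return {
      level: "warning",
      message: "img is likely to be lacking alternative text"
    };
  }

  // Everything else stays an error under this simplified sketch.
  return { level: "error", message: "img lacks an alt attribute" };
}
```

      Returning a warning rather than silence reflects the SHOULD in the proposed text: the document validates, but the author is still made aware of the likely gap.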

      Depending on the outcome of bug 18555

      • Add spec text to require a Firefox inspired behavior from all browsers: If white-space is the sole content of the title attribute, then the tooltip should be suppressed.
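
      The Firefox-inspired behavior could be expressed as a simple predicate, sketched below. Which characters count as white-space for this purpose is an assumption of the sketch: it uses JavaScript's \s class and adds U+180E and U+200B explicitly, because this CP treats them as white-space while a given regular expression engine may not. The function name shouldShowTooltip is hypothetical.

```javascript
// Assumption: the character set below is illustrative, not normative.
// \s covers the usual white-space characters; U+180E (MONGOLIAN VOWEL
// SEPARATOR) and U+200B (ZERO WIDTH SPACE) are listed explicitly.
const WHITE_SPACE_ONLY = /^[\s\u180E\u200B]*$/;

// Hypothetical predicate: should a browser show a tooltip for this title?
function shouldShowTooltip(title) {
  // No title attribute at all: nothing to show.
  if (title === null || title === undefined) {
    return false;
  }
  // Empty or white-space-only titles are suppressed.
  return !WHITE_SPACE_ONLY.test(title);
}
```

      Under this sketch, a title holding only the key-content mark or only SPACE characters produces no tooltip, while any title with visible text still does.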

      Impact

      Positive Effects

      • We improve the situation a tiny bit for end users by making the no-alt situation work in more assistive technologies.
      • We no longer imbue <meta name=generator> with effects incompatible with its historical and deployed usage;
      • The rule applies also to non-generators. This is good because, just as there is no definition of handcoding, there is no definition of "generators". For example, if one uses an advanced text editor to create a large image site, then one may make heavy use of a find and replace tool that bears a strong resemblance to a generator. It is not intuitive that one should have to add a meta generator before the key-content mark in the title attribute may make an img element valid.
      • We enable engineers of large Web applications to catch markup errors that they can do something about, without bothering them about markup errors they can't do anything about.
      • We build on the semantics of the attributes we have, instead of introducing new ones.
      • We build upon existing AT behavior: Screenreaders already treat images with a no alt as significant provided that there also is a non-empty title attribute.
      • We refine the img element's semantics.
      • A simpler rule for authors/developers: It is simpler to understand a rule which says that something good will happen if you add something — compared with a rule which says that something good happens if one omits something.
      • The solution may be perceived as a little bit complicated and involved, given that the character of the key-content mark is a seldom used (outside the Mongolian context), non-spacing white-space character, and also due to the possible tooltip effects in a few browsers. But this can be a good thing, as it increases the chance that it is done on purpose. Also compare the arguments in favor of a relaxed/incomplete attribute with an exceedingly long name just to make it inconvenient to use.
      • Except for when it contains the key-content mark, this CP does not touch the working group decision about the general non-conformance of img elements with no-alt but which has a non-empty title attribute instead. It seems defendable to deviate from that decision in this case since the title attribute with a key-content mark has a well defined semantics and good support (in fact, better support than what the spec and the alternative proposals have).
      • When present, then the authors do not need to think about it more: They can, later on, add an alt attribute, if they wish. Or they can add stuff to the title itself. It doesn’t disturb anyone.

      Negative Effects

      • For assistive technology users: no known negative effects
      • For Firefox: no known negative tooltip effects (as the key-content mark causes no tool tip)
      • For IE9, Chrome and Safari: a negligible negative tooltip effect (in the form of an extremely small (Chrome/Safari/IE9) and hidden (IE9) tooltip).
      • For Opera and IE10: A visible tooltip, which authors may feel they need to fix (with JavaScript), which should be OK to do.
        • Given that the key-content mark (&#x180e;&#x180e;) is made up of invisible, non-spacing characters, this has to be considered a bug.
      • For text browser users: When Lynx is set to render content as UTF-8, it renders the MONGOLIAN VOWEL SEPARATOR as a space character rather than falling back to the file name (as it does if the SPACE character is used), which is slightly unexpected and confusing.
        • (Though, actually, for Lynx, the behavior depends on whether it is set to use the ISO-8859-1 encoding - in which case the Mongolian vowel separator does cause the file name to be rendered, or the UTF-8 encoding — in which case it currently renders as a space.)
      • For Webkit/Chromium browsers: These tend to render the title attribute as the alternative text whenever alt is omitted.
        • Thus they would render the key-content mark as the textual content of the image. On the other side: these browsers also render a broken image icon, thus sighted users will nevertheless see the image.
        • Thus, if the title had been omitted, Webkit/Chromium users would have seen the very same thing. Which means that this CP should not affect Webkit/Chromium users.
      • If typed directly, then the key-content mark is, for the eye, inseparable from the empty string.
        • Authors should insert it using a numeric character reference: &#x180e;&#x180e;
      • This CP does not solve the following use case: <img title="Advisory text" src="i">. That is: a no-alt img element with a title attribute whose content is something other than the key-content mark remains invalid (unless other rules, such as presence inside the figure element, make it valid).
        • This disadvantage cannot be said to be important. (Furthermore, if validity is a concern, a markup generator could in such a case duplicate the title content in the alt.)
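
      Since the key-content mark is indistinguishable from the empty string when typed directly, a generator would normally write it out as numeric character references. A minimal sketch of such a serialization step follows. The helper names escapeAttributeValue and serializeImg are hypothetical, and the escaping is limited to what a double-quoted attribute value needs.

```javascript
// Hypothetical helper: escape a value for use inside a double-quoted
// attribute, writing U+180E as a visible numeric character reference.
function escapeAttributeValue(value) {
  return value
    .replace(/&/g, "&amp;")          // must come first, before references are added
    .replace(/"/g, "&quot;")
    .replace(/\u180E/g, "&#x180e;"); // makes the key-content mark visible in source
}

// Hypothetical helper: serialize a plain object of attributes as an img tag,
// keeping the attributes in the order in which they were given.
function serializeImg(attrs) {
  const parts = Object.keys(attrs).map(function (name) {
    return name + '="' + escapeAttributeValue(attrs[name]) + '"';
  });
  return "<img " + parts.join(" ") + ">";
}

// Example: a no-alt image carrying the key-content mark and role=img.
const markup = serializeImg({ src: "i", title: "\u180E\u180E", role: "img" });
```

      Here markup holds <img src="i" title="&#x180e;&#x180e;" role="img">, which makes the mark visible and reviewable in the page source.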

      Conformance Classes Changes

      • This change alters the following conformance classes:
        • Conforming documents,
        • Conformance checkers, and
        • Authoring tools and markup checkers.
        • Assistive Technology.

      Risks

      • That authors find the resulting tooltips intolerable.

      References

      Contributors

      • Leif Halvard Silli,
        with thanks for the input on my first CP, especially from John Foliot and Benjamin Hawkes-Lewis.