From HTML WG Wiki

HTML Issue: Abbr and Acronym


There has been a discussion of:

Points of Attention

  • The main problem with having only one tag concerns aural UAs: acronyms are pronounced as words (that is, identically to unmarked-up text), many abbreviations are spelled out, and others are hybrids, pronounced partly as a word and partly spelled out. There is also a class of abbreviations that are written in shorthand but spoken in full, such as "etc."
  • In all of the above cases, there may be user preferences for how a word is to be spoken, and these may not be the same for all speakers of a given language or dialect.
  • Most, if not all, speech synthesizers come equipped with a dictionary of common abbreviations, which allows them to read plain text correctly. If the only reason to mark up these words is to benefit aural UAs, does it really provide any benefit to mark up, for example, "etc."?

Further Points of Attention: Abbreviations Aren't Just for Speech Users

  1. What is needed is not only the expansion but also an aural styling rule, using the "speak" property outlined in Section 19 of CSS2, so that abbreviations and acronyms can be marked to be either pronounced as words or spelled out.
    A. Examples:
      • GUI
      • HTML
    B. Note that the CSS3 Speech Module contains a mechanism for phonetic spellings (so that, for example, "GUI" sounds like "gooey" instead of "guy") and much else of use to aural renderers. Note as well that Fire Vox, the self-voicing extension to Firefox, is implementing the CSS3 Speech Module.
  2. The second bulleted Point of Attention only strengthens the argument for mandatory declaration of a primary natural language for the entire document. If a speech synthesizer recognizes a switch in natural language, it should switch to that language (if supported) and apply the rules of the author-declared natural language. Likewise, a visual renderer can, in response to a natural language declaration, switch character sets, fonts, and even directionality to display the natural language correctly. This is equally important to users of refreshable braille displays: there is no universally agreed-upon international braille code, so most languages use language-specific braille rules to accommodate accented characters, to provide "Grade 2" braille (in which one can use contractions and word symbols, which vary from language to language), and to generate preceding characters for the end user (such as a dot 6 preceding a letter to indicate that it is capitalized, or two dot 6s preceding a word to indicate that the entire word is capitalized).
  3. It is not only for speech synthesis that an expansion is needed for such "common" abbreviations as etc. (e t c period), ibid., or e.g.; it is also needed by those for whom the "common" abbreviations have nothing in common with their first, second, or third natural language, and by those who have never encountered Latinisms in their study of English. To mark up "etc.", for example, one wouldn't include the lang="la" attribute, as "etc." is an accepted part of the natural language "en". Moreover, expanding "etc." to "et cetera" doesn't really help: the correct expansion for etc. is "and so on", for ibid. "in the same place", for i.e. "that is", for e.g. "for example", and so on. Don't forget that there will be plenty of mouseovers by non-native speakers of a declared natural language when they encounter unfamiliar or synonymic abbreviations, in which case the expansion would be rendered in a tooltip. Thus, this isn't just an aural rendering mechanism but an overall contextualizing mechanism.
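
The styling, language, and expansion points above can be sketched in markup. This is a minimal illustration under stated assumptions, not a proposal recorded in the discussion: the class names "initialism" and "word" are hypothetical, the "speak" values are the ones listed in Section 19 of CSS2, and whether any particular aural UA honors them is an assumption.

```html
<!DOCTYPE html>
<!-- lang="en" declares the primary natural language of the document -->
<html lang="en">
<head>
  <title>Abbreviation markup sketch</title>
  <!-- "aural" is the CSS2 media type for speech output -->
  <style media="aural">
    /* Hypothetical class: spell the text one letter at a time */
    abbr.initialism { speak: spell-out; }
    /* Hypothetical class: pronounce as an ordinary word (the initial value) */
    abbr.word { speak: normal; }
  </style>
</head>
<body>
  <p>
    The <abbr class="initialism" title="HyperText Markup Language">HTML</abbr>
    page describes a
    <abbr class="word" title="graphical user interface">GUI</abbr>
    with toolbars, menus, <abbr title="and so on">etc.</abbr>
  </p>
</body>
</html>
```

Here the title on "etc." carries the plain-language expansion rather than the Latin one, so a tooltip is useful to a reader who does not know the Latinism, and no lang="la" attribute is applied.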

Problem linked to

  • How can we treat the 'word art' question [1]?


This issue was raised in:

A very complete example has been given by Gregory J. Rosmaita in [1].

[1] alternative guides to content (semantic, visual, but not pronunciation) — Dailey, David P, 25 Mar 2007


  • No resolution yet.