AbbrAndInitialisms

From HTML WG Wiki


Issue: HTML Needs Initialism Elements

The issues addressed below affect a wide variety of users and, as has been repeatedly noted, have myriad implications for internationalization and accessibility, not to mention general usability.


Preliminary Thoughts on Abbreviations and Initialisms

The main point in any discussion about abbreviation markup is that it provides the user with a level of granularity that makes documents more accessible to all users. The user may choose to ignore abbreviations, or to expand them automatically or on demand; but no matter how abbreviation markup is ultimately rendered client-side, certain attributes not only enhance human understanding but also enable machine differentiation between types of abbreviations and their individual characteristics, regardless of rendering or implementation. Moreover, a more robust for/id association mechanism would allow authors to reuse expansions by pointing to an initial expansion -- or, preferably, to a site-wide dictionary of abbreviations and their expansions:

 
<link type="application/rdf+xml" rel="expansion" href="expansions.rdf" />
 

This would make abbreviation markup far easier for authors to implement, especially if they can define an expansion once and forget about it until an ATAG-compliant authoring tool prompts them to add an association between the text contained in an abbreviated element and the site's global expansions list. That ease, ultimately, will lead to wider use, as has been the case with the LINK element, which lets a universal stylesheet or multiple stylesheets be associated with a document instance.


Proposals

POINT 1: Abbreviations are Abbreviations are Abbreviations:

 
<abbr title="Street">St.</abbr>
versus
<abbr title="Saint">St.</abbr>
 

is the classic example in English, as is "Dr." -- the abbreviation for both Doctor and Drive.

Another obvious example is the French abbreviation for Mademoiselle. For those who know French primarily as a spoken language, French abbreviation conventions may cause confusion and comprehension problems, because the conventions familiar to the end user differ greatly from those used in a document instance. Additionally, the abbreviation "Mlle." sounds like "mwlee" when pronounced by a text-to-speech engine that doesn't support natural language switching on the fly -- or, far more often, when the author has failed to supply the lang attribute that would trigger automatic natural language switching by a text-to-speech engine or translation software, or the rendering of a set of glyphs that is "non-standard" from the end user's point of view:

 
<abbr lang="fr" title="Mademoiselle">Mlle.</abbr>
 

CONCLUSION 1: Abbreviations are therefore needed in canonical HTML.

POINT 2: Initialisms are Initialisms are Initialisms

There is a demonstrable need for an IABBR element, which would subsume the ACRONYM element of HTML 4.01.

No matter the rules governing the natural language expression of an initialism, initialisms can be sub-categorized by the following values of a REQUIRED "type" attribute; additions from anyone with a wider knowledge of non-Western European languages are strongly encouraged:

  • type="acronym"
  • type="initialism"
  • type="camelcase-abbr"
  • type="alpha-numeric"

(Note: The necessity of type="alpha-numeric" will be discussed later in this document)

IABBR would also require an "expressed-as" attribute, which takes one of the following values:

  • expressed-as="characters" (originally, expressed-as="letters")
  • expressed-as="word"
  • expressed-as="phrase"
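
The attributes and values listed above lend themselves to mechanical checking, for instance by an authoring tool. What follows is only a minimal sketch in Python, assuming exactly the "type" and "expressed-as" values proposed here and treating "title" as required as well (as the conclusions for Point 2 propose); the function name check_iabbr is hypothetical and IABBR itself is a proposal, not a standardized element.

```python
# Allowed values copied from the lists above. IABBR is a proposed element,
# so this checker is an illustration, not part of any specification.
ALLOWED = {
    "type": {"acronym", "initialism", "camelcase-abbr", "alpha-numeric"},
    "expressed-as": {"characters", "word", "phrase"},
}


def check_iabbr(attrs):
    """Return a list of problems found in one IABBR element's attributes."""
    problems = []
    for name, allowed in ALLOWED.items():
        value = attrs.get(name)
        if value is None:
            problems.append(f"missing required attribute: {name}")
        elif value not in allowed:
            problems.append(f"invalid {name} value: {value!r}")
    # title carries the expansion; an empty title is treated as missing
    if not attrs.get("title"):
        problems.append("missing required attribute: title")
    return problems


if __name__ == "__main__":
    print(check_iabbr({"type": "acronym", "expressed-as": "word",
                       "title": "Visually Impaired Computer Users' Group"}))
    print(check_iabbr({"type": "acronym"}))
```

An empty list means the attributes satisfy the proposal as sketched; any other result names what is missing or out of range.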

IABBR EXAMPLES:

IABBR would thus result - in its rudest form - in code such as:

 
<IABBR type="acronym" expressed-as="word"
title="Visually Impaired Computer Users' Group"
>VICUG</IABBR>
  

or

 
<IABBR type="camelcase-abbr" expressed-as="word"
title="SOund Navigation And Ranging">SONAR</IABBR>
 

or

 
<IABBR type="camelcase-abbr" title="HyperText Markup Language"
expressed-as="characters">HTML</IABBR>
 

or

 
<IABBR type="initialism" expressed-as="characters"
title="National Association for the Advancement of Colored Persons"
>NAACP</IABBR>
 

W3C would most likely fall under the "camelcase-abbr" typology, but it does raise the question: is there a need for an "alpha-numeric" type, or does changing the attribute value "letters" to "characters" cover such alpha-numeric initialisms as those illustrated by the following examples:

 
<IABBR type="alpha-numeric" expressed-as="characters"
title="World Wide Web Consortium">W3C</IABBR>
 

or

 
<IABBR type="alpha-numeric" expressed-as="characters"
title="The Minnesota Mining and Manufacturing Company"
>3M</IABBR>
 

On the other hand, the question remains of what to do with such antiquated initialisms as WWW: would one want it expressed as letters, or as its title, World Wide Web? Does this necessitate another value for the "expressed-as" attribute, namely, phrase?

 
<IABBR type="initialism" expressed-as="phrase"
title="World Wide Web">WWW</IABBR>
 

Open Questions:

  1. is "phrase" a synonym for "title", which is what one wants expressed in a case such as WWW, as discussed above; if so, why not just use the value "title" instead of "phrase" when coding the "expressed-as" attribute?
  2. originally "expressed" was "pronounced", but the Protocols & Formats Working Group discussed adding a qname, or another analogous, workable solution, so as to provide real, robust pronunciation guidance within the IABBR element;
  3. is there a need for type="camelcase" and type="camelcase-abbr"? Is SONAR a contraction of words that comprise a new single word formed of a camelcased phrase, or merely an abbreviation for "SOund Navigation And Ranging"?

Conclusions for Point 2

In summation, there would be an IABBR element covering all known permutations of what has, up until now, been marked up with the ACRONYM element. It would carry the REQUIRED attributes "type", "expressed-as", and "title", to semantically distinguish the type of initialism being expanded, notated, and/or pronounced/displayed.
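
To make the role of "expressed-as" concrete, here is a rough sketch in Python of how a speech renderer might choose what to hand to its synthesizer. It assumes the three values proposed above and nothing more; the function name speak_as is hypothetical, and a real user agent would also have to honor lang and any pronunciation guidance.

```python
def speak_as(text, title, expressed_as):
    """Return the string a speech renderer might pass to its synthesizer."""
    if expressed_as == "characters":
        # Spell the abbreviation out one character at a time, e.g. "H T M L"
        return " ".join(text)
    if expressed_as == "word":
        # Pronounce the abbreviation itself as a word, e.g. "SONAR"
        return text
    if expressed_as == "phrase":
        # Read the expansion held in the title attribute instead
        return title
    raise ValueError(f"unknown expressed-as value: {expressed_as!r}")


if __name__ == "__main__":
    print(speak_as("HTML", "HyperText Markup Language", "characters"))
    print(speak_as("SONAR", "SOund Navigation And Ranging", "word"))
    print(speak_as("WWW", "World Wide Web", "phrase"))
```

The same dispatch would apply to visual expansion on demand; only the "characters" branch is specific to speech.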


POINT 3: Building More Robust for/id Associations for Abbreviation Elements

No matter what form abbreviation and/or initialism elements take in canonical HTML, single or multiple abbreviation markup needs a strong and elastic for/id binding mechanism for reusability's (and the author's sanity's) sake.

The simplest means of strengthening the ABBR element is to use the for/id model to associate repeated instances of an ABBR, by marking the first instance with the explicit expansion, using the title attribute, as well as a unique identifier, provided by the id attribute. Subsequent repetitions of an ABBR thus defined would allow an author or authoring tool to use the for attribute to point at the initial expansion for that ABBR, as in the following example:

 
<p>
<ABBR id="a1" title="Doctor">Dr.</ABBR> Seuss
wrote children's books.  He lived on Seuss
<ABBR id="a2" title="Street">St.</ABBR>, which
had been renamed in his honor; its previous name
being <ABBR for="a1">Dr.</ABBR> Doolittle <ABBR
id="a3" title="Drive">Dr.</ABBR>
</p>

<p>
Seuss <ABBR for="a2">St.</ABBR> should not be
confused with Seuss <ABBR for="a3">Dr.</ABBR>,
formerly <ABBR id="a4" title="Saint">St.</ABBR>
Patrick's <ABBR id="a5" title="Place">Pl.</ABBR>,
which is the site of <ABBR for="a4">St.</ABBR>
Harold's Methodist Church, whose pastor is the
<ABBR title="Reverend" id="a6">Rev.</ABBR>
<ABBR for="a1">Dr.</ABBR> Paul Bunyon, author
of <CITE>This Pilgrim's Progress</CITE>.
</p>
 

A similar for/id binding should also be part of the IABBR element, so as to make sense of an article whose topic sentence is:

  The ADA has released an ADA-compliance recommendation for dentists and their patients with AIDS; a recommendation that grew out of the work of the AIDS' sub-committee on safety.

in which the first instance of ADA stands for "The American Dental Association" and the second for "The Americans with Disabilities Act", whilst the first instance of AIDS expands to "Acquired Immunodeficiency Syndrome" (or, if you prefer, "Acquired immune deficiency syndrome") and the second represents the "Association of Independent Dental Surgeons".
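
As an illustration of how such bindings might be consumed, the following Python sketch marks up the sentence above with IABBR elements and resolves every for reference to the expansion carried by the element with the matching id. Everything beyond the proposal itself is an assumption: the id values i1 to i4, the "type" and "expressed-as" values chosen for each instance, the invented follow-up sentence that reuses two expansions, and the class name ForIdResolver. It relies only on html.parser from the Python standard library.

```python
from html.parser import HTMLParser


class ForIdResolver(HTMLParser):
    """Collect (text, expansion) pairs for ABBR and IABBR elements."""

    TAGS = {"abbr", "iabbr"}  # HTMLParser reports tag names in lower case

    def __init__(self):
        super().__init__()
        self.by_id = {}    # id -> expansion taken from the title attribute
        self.found = []    # [text, title-or-None, for-or-None] per element
        self._open = None  # entry for the element currently being read

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            a = dict(attrs)
            if "id" in a and "title" in a:
                self.by_id[a["id"]] = a["title"]
            self._open = ["", a.get("title"), a.get("for")]
            self.found.append(self._open)

    def handle_data(self, data):
        if self._open is not None:
            self._open[0] += data

    def handle_endtag(self, tag):
        if tag in self.TAGS:
            self._open = None

    def resolved(self):
        # Resolve after parsing, so a for may point at any id in the document
        return [(text, title or self.by_id.get(ref))
                for text, title, ref in self.found]


# Hypothetical markup; the last sentence is invented to show reuse via for.
SAMPLE = (
    'The <iabbr id="i1" type="initialism" expressed-as="characters" '
    'title="The American Dental Association">ADA</iabbr> has released an '
    '<iabbr id="i2" type="initialism" expressed-as="characters" '
    'title="The Americans with Disabilities Act">ADA</iabbr>-compliance '
    'recommendation for dentists and their patients with '
    '<iabbr id="i3" type="acronym" expressed-as="word" '
    'title="Acquired Immunodeficiency Syndrome">AIDS</iabbr>; a '
    'recommendation that grew out of the work of the '
    '<iabbr id="i4" type="initialism" expressed-as="characters" '
    'title="Association of Independent Dental Surgeons">AIDS</iabbr>\' '
    'sub-committee on safety. The <iabbr for="i2">ADA</iabbr> applies to '
    'every <iabbr for="i1">ADA</iabbr> member.'
)

resolver = ForIdResolver()
resolver.feed(SAMPLE)
resolver.close()
for text, expansion in resolver.resolved():
    print(f"{text}: {expansion}")
```

Resolving only after the whole document has been read is a design choice of this sketch; it lets a for attribute point forward as well as backward, which the proposal neither requires nor forbids.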

Through a robust and elastic definition of the for/id mechanism to provide bindings between the abbreviated text and its gloss, an expansion associated with a particular abbreviation can not only be reused, but can also provide a means of clarification/differentiation in the case of homonymic (identically spelt or pronounced) abbreviations. It would also facilitate a site-wide means of associating unique abbreviations with their expansions, building upon the example of using LINK to point to an RDF assertion document containing explicit bindings between expansions and the abbreviations for which they stand. An author could thereby define an abbreviation once and reuse the content of the for attribute to provide expansions that are easily applied site-wide. And, since the assumption seems to be that the ideal model is to provide authors with a way of constructing semantically sensible markup to contain their content, this would translate into a simple interface in an authoring tool: every time ABBR is invoked for a string of text, the author could be prompted to reuse a previously defined expansion, or to provide a unique expansion, which would then be appended to the site-wide expansion resource.
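
The authoring-tool interaction described above might be sketched as follows. This is a hedged illustration in Python, not a design: an in-memory dictionary stands in for the site-wide expansion resource (reading and writing the RDF document is left out entirely), and the names ExpansionRegistry and choose_expansion, as well as the two callbacks, are hypothetical.

```python
class ExpansionRegistry:
    """In-memory stand-in for a site-wide expansion resource."""

    def __init__(self):
        self._expansions = {}  # abbreviation text -> list of known expansions

    def candidates(self, abbr):
        """Expansions already defined for this abbreviation (may be empty)."""
        return list(self._expansions.get(abbr, []))

    def add(self, abbr, expansion):
        """Append a new expansion unless it is already recorded."""
        known = self._expansions.setdefault(abbr, [])
        if expansion not in known:
            known.append(expansion)


def choose_expansion(registry, abbr, choose, ask_new):
    """Mimic the prompt: reuse a known expansion or define a new one.

    choose(options) returns one of the options, or None to ask for a new
    expansion; ask_new(abbr) returns the author's new expansion. Both are
    hypothetical callbacks standing in for the tool's user interface.
    """
    options = registry.candidates(abbr)
    picked = choose(options) if options else None
    if picked is None:
        picked = ask_new(abbr)
        registry.add(abbr, picked)  # appended to the site-wide resource
    return picked


if __name__ == "__main__":
    registry = ExpansionRegistry()
    # First use of "St.": nothing to reuse, so the author supplies "Street"
    print(choose_expansion(registry, "St.", lambda o: None, lambda a: "Street"))
    # Second use: the author declines "Street" and adds "Saint"
    print(choose_expansion(registry, "St.", lambda o: None, lambda a: "Saint"))
    print(registry.candidates("St."))
```

Keeping every expansion per abbreviation, rather than a single one, is what allows homonymic abbreviations such as "St." to coexist in the shared resource.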


Alternate Proposal: A Single Abbreviation Element for HTML5

For simplicity's sake, and to spare authors from deciding whether an abbreviated form is an abbreviation or an initialism, it has been suggested that a single abbreviated form element, with the addition of the "type" attribute values defined above for IABBR, be included in HTML5.

  • type="abbreviation" (this could be the default or implied value for ABBR, thereby obviating the need for a type="abbreviation" or type="abbr" or type="short")
  • type="acronym"
  • type="initialism"
  • type="camelcase-abbr"
  • type="alpha-numeric"

To address internationalization as well as accessibility concerns, a generalized ABBR element will also require an "expressed-as" attribute, which takes one of the following values:

  • expressed-as="characters" (originally, expressed-as="letters")
  • expressed-as="word"
  • expressed-as="phrase"

No matter what form abbreviation and/or initialism elements take in HTML5, single or multiple abbreviation markup needs a strong and elastic for/id binding mechanism for reusability's (and the author's sanity's) sake.

The simplest means of strengthening the ABBR element is to use the for/id model to associate repeated instances of an ABBR, by marking the first instance with the explicit expansion, using the title attribute, as well as a unique identifier, provided by the id attribute. Subsequent repetitions of an ABBR thus defined would allow an author or authoring tool to use the for attribute to point at the initial expansion for that ABBR.

Through a robust and elastic definition of the for/id mechanism to provide bindings between the abbreviated text and its gloss, an expansion associated with a particular abbreviation can not only be reused, but can also provide a means of clarification/differentiation in the case of homonymic (identically spelt or pronounced) abbreviations. It would also facilitate a site-wide means of associating unique abbreviations with their expansions, building upon the example of using LINK to point to an RDF assertion document containing explicit bindings between expansions and the abbreviations for which they stand. An author could thereby define an abbreviation once and reuse the content of the for attribute to provide expansions that are easily applied site-wide. And, since the assumption seems to be that the ideal model is to provide authors with a way of constructing semantically sensible markup to contain their content, this would translate into a simple interface in an authoring tool: every time ABBR is invoked for a string of text, the author could be prompted to reuse a previously defined expansion, or to provide a unique expansion, which would then be appended to the site-wide expansion resource.

Relation to DFN and VAR elements

DefiningTermsEtc is another proposal closely related to the present one. That proposal recommends treating DFN and ABBR, along with VAR, in parallel by establishing two new elements, DEFINE and BLOCKDEFINE. These two elements provide a container for defining a term or defining the expansion of an abbreviation. Rather than relying on the all-purpose title attribute, the contents of these elements provide the expansion of the abbreviated form or term.

One advantage of this is that abbreviations can be expanded with minimal markup. No id attribute (nor any other attribute) is required on each ABBR element. All of the expansion and pronunciation information is provided in the DEFINE and BLOCKDEFINE elements, which could be included in a separate site-wide or community-wide file.

 
<p>
<DEFINE abbr='Dr.' content='Doctor' ></DEFINE>
<DEFINE abbr='Dr.2' content='Drive' ></DEFINE>
<DEFINE abbr='St.' content='Street' ></DEFINE>
<DEFINE abbr='St.2' content='Saint' ></DEFINE>
<ABBR>Dr.</ABBR> Seuss
wrote children's books.  He lived on Seuss
<ABBR>St.</ABBR>, which
had been renamed in his honor; its previous name
being <ABBR>Dr.</ABBR> Doolittle <ABBR
variantof='Dr.2' >Dr.</ABBR></p>

<p>Seuss <ABBR>St.</ABBR> should not be
confused with Seuss <ABBR variantof='Dr.2' >Dr.</ABBR>,
formerly <ABBR variantof='St.2' >St.</ABBR>
Patrick's <ABBR define='Place' >Pl.</ABBR>,
which is the site of <ABBR variantof='St.2' >St.</ABBR>
Harold's Methodist Church, whose pastor is the
<ABBR define="Reverend" >Rev.</ABBR>
<ABBR>Dr.</ABBR> Paul Bunyon, author
of <CITE>This Pilgrim's Progress</CITE>.</p>
 

Other attributes are directed towards solving the same problems as those proposed here, just with different names and values. In this approach, editing becomes proportionally easier the more often the author reuses the same abbreviation expansions.
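
The lookup rules implied by the example above can be sketched in Python. This is one reading of that example rather than a statement of the DefiningTermsEtc proposal: an ABBR with no attributes is looked up under its own text, variantof selects a different DEFINE entry, and define supplies the expansion inline. The precedence among the three, the shortened sample markup, and the class name DefineResolver are all assumptions.

```python
from html.parser import HTMLParser


class DefineResolver(HTMLParser):
    """Resolve ABBR expansions from DEFINE elements (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.definitions = {}  # DEFINE abbr key -> content
        self.found = []        # [text, variantof-or-None, define-or-None]
        self._open = None      # entry for the ABBR currently being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "define" and "abbr" in a:
            self.definitions[a["abbr"]] = a.get("content")
        elif tag == "abbr":
            self._open = ["", a.get("variantof"), a.get("define")]
            self.found.append(self._open)

    def handle_data(self, data):
        if self._open is not None:
            self._open[0] += data

    def handle_endtag(self, tag):
        if tag == "abbr":
            self._open = None

    def resolved(self):
        out = []
        for text, variant, inline in self.found:
            # Assumed precedence: inline define, then variantof, then own text
            expansion = inline or self.definitions.get(variant or text)
            out.append((text, expansion))
        return out


# Shortened, hypothetical sample built from fragments of the example above.
SAMPLE = (
    "<DEFINE abbr='Dr.' content='Doctor'></DEFINE>"
    "<DEFINE abbr='Dr.2' content='Drive'></DEFINE>"
    "<p><ABBR>Dr.</ABBR> Doolittle <ABBR variantof='Dr.2'>Dr.</ABBR> and "
    "Patrick's <ABBR define='Place'>Pl.</ABBR></p>"
)

resolver = DefineResolver()
resolver.feed(SAMPLE)
resolver.close()
for text, expansion in resolver.resolved():
    print(f"{text}: {expansion}")
```

Because the definitions are gathered into a plain table, the same resolver would work unchanged if the DEFINE elements were read from a separate glossary file first.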

The same abbreviation expansions could be defined, along with other definitions of TERM, VAR and PROPERN (proper noun) elements, in a separate glossary document, in a similar manner to that discussed above:

 
<link type="application/xhtml+xml" rel="glossary" href="../../glossary.xhtml" />
 
  <section>
     <h>Dr.</h>
        <define abbr='Dr.' content='Doctor' />
        <define abbr='Dr.2' content='Drive' />
  </section>
  <section>
      <h>St.</h>
        <define abbr='St.' content='Street' />
        <define abbr='St.2' content='Saint' />
  </section>
