ProposedElements/Initialisms

Issue: XHTML2 and Initialism Elements

The issues addressed below affect a wide variety of users and, as has been repeatedly noted, have implications for internationalization and accessibility as well as general usability.

Preliminary Thoughts on Abbreviations and Initialisms

The main point in any discussion of abbreviation markup is that it gives the user a level of granularity that makes documents more accessible. The user may choose to ignore abbreviations, or to expand them automatically or on demand. However the markup is ultimately rendered client-side, certain attributes not only enhance human understanding but also let machines differentiate between types of abbreviations and their individual characteristics, regardless of rendering or implementation. Moreover, a more robust for/id association mechanism would allow authors to reuse expansions by pointing to an initial expansion or, preferably, to a site-wide dictionary of abbreviations and their expansions:

<link type="application/rdf+xml" rel="expansion" href="expansions.rdf" />

or

<link type="application/xhtml+xml" rel="glossary" href="../../glossary.xhtml" />

This makes abbreviation markup far easier for authors to implement, especially if they can do it once and forget about it until an ATAG-compliant authoring tool prompts them to associate the text of an abbreviated element with the site's global expansions list. That ease should ultimately lead to wider use, as has been the case with a universal stylesheet or multiple stylesheets associated with a document instance through the LINK element.


Discussion and Proposals

POINT 1: Abbreviations are Abbreviations are Abbreviations

"St." (Saint) versus "St." (Street) is the classic example in English, as is "Dr.", the abbreviation for both Doctor and Drive.

Another obvious example is the French abbreviation for Mademoiselle. For those who know French primarily as a spoken language, French abbreviation conventions may cause confusion and comprehension problems, because the conventions familiar to the end user differ greatly from those used in a document instance. Additionally, the abbreviation "Mlle." sounds like "mwlee" when pronounced by a text-to-speech engine that doesn't support natural language switching on the fly or, far more often, when the author has failed to supply a lang attribute. That attribute would trigger automatic natural language switching by a text-to-speech engine or translation software, or the rendering of a set of glyphs that is "non-standard" from the end user's point of view:

<abbr xml:lang="fr" title="Mademoiselle">Mlle.</abbr>

CONCLUSION 1: Abbreviations are therefore needed in XHTML2.


POINT 2: Initialisms are Initialisms are Initialisms

There is a demonstrable need for an IABBR element, which would subsume the ACRONYM element of HTML 4.01. Note that the original proposed name for this element was INIT, but it has been suggested that INIT is too ambiguous and might be confused with other uses of the term.

No matter the rules governing the natural-language expression of an initialism, initialisms can be sub-categorized by the following values of a REQUIRED type attribute; additions from anyone with a wider knowledge of non-Western European languages are strongly encouraged:

  • type="acronym"
  • type="initialism"
  • type="camelcase-abbr"
  • type="alpha-numeric"

(Note: The necessity of type="alpha-numeric" is discussed later in this document.)

IABBR would also require an expressed-as attribute, taking one of the following values:

  • expressed-as="characters" (originally, expressed-as="letters")
  • expressed-as="word"
  • expressed-as="phrase"


IABBR EXAMPLES

IABBR would thus result - in its rudest form - in code such as:

<iabbr type="acronym" expressed-as="word"
title="Visually Impaired Computer Users' Group">VICUG</iabbr>

or

<iabbr type="camelcase-abbr" expressed-as="word"
title="SOund Navigation And Ranging">SONAR</iabbr>

or

<iabbr type="camelcase-abbr" title="HyperText Markup Language"
expressed-as="characters">HTML</iabbr>

or

<iabbr type="initialism" expressed-as="characters"
title="National Association for the Advancement of Colored Persons"
>NAACP</iabbr>

W3C would most likely fall under the "camelcase-abbr" typology, but it does raise the question: is there a need for an "alpha-numeric" type, or does changing the attribute value "letters" to "characters" cover alpha-numeric initialisms such as those in the following examples?

<iabbr type="alpha-numeric" expressed-as="characters"
title="World Wide Web Consortium">W3C</iabbr>

or

<iabbr type="alpha-numeric" expressed-as="characters"
title="The Minnesota Mining and Manufacturing Company"
>3M</iabbr>

On the other hand, the question remains what to do with antiquated initialisms such as WWW: would one want that expressed as letters, or as the value defined for the title, "World Wide Web"? Does this necessitate another value for the expressed-as attribute, namely "phrase"?

<iabbr type="initialism" expressed-as="phrase"
title="World Wide Web">WWW</iabbr>

Open Questions:

  1. Is "phrase" a synonym for "title", which is what one wants expressed in a case such as WWW, as discussed above? If so, why not use the value "title" instead of "phrase" when coding the "expressed-as" attribute?
  2. Originally "expressed-as" was "pronounced", but discussion within the Protocols & Formats Working Group considered adding qname, or another analogous, workable solution, so as to provide real, robust pronunciation guidance within the iabbr element.
  3. Is there a need for both type="camelcase" and type="camelcase-abbr"? Is SONAR a new single word contracted from a camelcased phrase, or merely an abbreviation for "SOund Navigation And Ranging"?

Conclusions for Point 2

In summation, there would be an element, iabbr, which would cover all known permutations of what has, up until now, been marked up with the ACRONYM element. It would carry the REQUIRED attributes "type", "expressed-as", and "title" to semantically distinguish the type of initialism being expanded, notated, and/or pronounced/displayed.


POINT 3: Building More Robust for/id Associations for Abbreviation Elements

No matter what form abbreviation and/or initialism elements take in canonical HTML, single or multiple abbreviation markup needs a strong and elastic for/id binding mechanism for reusability's (and the author's sanity's) sake.

The simplest means of strengthening the ABBR element is to use the for/id model to associate repeated instances of an abbreviated form: the first instance is marked with the explicit expansion, using the title attribute, and with a unique identifier, provided by the id attribute. For subsequent repetitions of the abbreviated form, an author or authoring tool can use the for attribute to point at the initial expansion, as in the following example:

<p>
<abbr id="a1" title="Doctor">Dr.</abbr> Seuss
wrote children's books.  He lived on Seuss
<abbr id="a2" title="Street">St.</abbr>, which
had been renamed in his honor; its previous name
being <abbr for="a1">Dr.</abbr> Doolittle <abbr
id="a3" title="Drive">Dr.</abbr>
</p>

<p>
Seuss <abbr for="a2">St.</abbr> should not be
confused with Seuss <abbr for="a3">Dr.</abbr>,
formerly <abbr id="a4" title="Saint">St.</abbr>
Patrick's <abbr id="a5" title="Place">Pl.</abbr>,
which is the site of <abbr for="a4">St.</abbr>
Harold's Methodist Church, whose pastor is the
<abbr title="Reverend" id="a6">Rev.</abbr>
<abbr for="a1">Dr.</abbr> Paul Bunyon, author
of <cite>This Pilgrim's Progress</cite>.
</p>

A similar for/id binding should also be part of the IABBR element, so as to make sense of an article whose topic sentence is:

  The ADA has released an ADA-compliance recommendation for dentists and their patients with AIDS; a recommendation that grew out of the work of the AIDS' sub-committee on safety.

in which the first instance of ADA expands to "The American Dental Association" and the second to "The Americans with Disabilities Act", whilst the first instance of AIDS expands to "Acquired Immunodeficiency Syndrome" (or, if you prefer, "Acquired immune deficiency syndrome") and the second represents the "Association of Independent Dental Surgeons".
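
As a rough sketch only, the topic sentence above might be marked up under this proposal as follows. The id values are hypothetical, the choice of type and expressed-as values for ADA and AIDS is a guess, the leading "The" is trimmed from the titles so they read naturally in place, and the second paragraph is invented purely to show the for attribute. The sketch also assumes that an instance carrying for inherits type, expressed-as, and title from the element it points to, which the proposal does not state.

```xml
<p>
The <iabbr id="i1" type="initialism" expressed-as="characters"
title="American Dental Association">ADA</iabbr> has released an
<iabbr id="i2" type="initialism" expressed-as="characters"
title="Americans with Disabilities Act">ADA</iabbr>-compliance
recommendation for dentists and their patients with
<iabbr id="i3" type="acronym" expressed-as="word"
title="Acquired Immunodeficiency Syndrome">AIDS</iabbr>; a
recommendation that grew out of the work of the
<iabbr id="i4" type="acronym" expressed-as="word"
title="Association of Independent Dental Surgeons">AIDS</iabbr>'
sub-committee on safety.
</p>

<!-- hypothetical follow-on sentence, not part of the original article -->
<p>
Both the <iabbr for="i1">ADA</iabbr> and the
<iabbr for="i4">AIDS</iabbr> will review the recommendation.
</p>
```

Because each for value names exactly one id, a user agent can tell which ADA or AIDS is meant even though the abbreviated text is identical.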

Through a robust and elastic definition of the for/id mechanism to provide bindings between the abbreviated text and its gloss, an expansion associated with a particular abbreviation can not only be reused, but can also provide a means of clarification and differentiation in the case of homonymic (identically spelt or pronounced) abbreviations. It would also facilitate a site-wide means of associating unique abbreviations with their expansions, building upon the example of using LINK to point to an RDF assertion document containing explicit bindings between expansions and the abbreviations for which they stand. An author could thereby define an abbreviation once and reuse the content of the for attribute to provide expansions that could then be easily applied site-wide. And, since the assumption seems to be that the ideal model is to provide authors with a way of constructing semantically sensible markup to contain their content, it would translate into a simple interface in an authoring tool: every time ABBR is invoked for a string of text, the author could be prompted to reuse a previously defined expansion or to provide a unique expansion, which would then be appended to the site-wide expansion resource.


Alternate Proposal: A Single Abbreviation Element for XHTML2

For simplicity's sake, and to spare authors from deciding whether an abbreviated form is an abbreviation or an initialism, it has been suggested that XHTML2 include a single abbreviated-form element that takes the "type" attribute defined above for IABBR, with the following values:

  • type="abbreviation" (this could be the default or implied value for ABBR, thereby obviating the need for a type="abbreviation" or type="abbr" or type="short")
  • type="acronym"
  • type="initialism"
  • type="camelcase-abbr"
  • type="alpha-numeric"

To address internationalization as well as accessibility concerns, a generalized ABBR element will also require an "expressed-as" attribute, taking one of the following values:

  • expressed-as="characters" (originally, expressed-as="letters")
  • expressed-as="word"
  • expressed-as="phrase"
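
Under this alternate proposal, the earlier examples might collapse onto the one element as sketched below. This is only an illustration: it assumes that type falls back to "abbreviation" when omitted, as suggested above, and using expressed-as="phrase" to request the expansion of an ordinary abbreviation is this sketch's guess rather than anything the proposal settles.

```xml
<!-- ordinary abbreviation: type is left to its assumed default -->
<abbr xml:lang="fr" expressed-as="phrase"
title="Mademoiselle">Mlle.</abbr>

<!-- the IABBR examples, restated on the single element -->
<abbr type="acronym" expressed-as="word"
title="Visually Impaired Computer Users' Group">VICUG</abbr>

<abbr type="initialism" expressed-as="characters"
title="National Association for the Advancement of Colored Persons"
>NAACP</abbr>

<abbr type="alpha-numeric" expressed-as="characters"
title="World Wide Web Consortium">W3C</abbr>
```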

No matter what form abbreviation and/or initialism elements take in HTML5, single or multiple abbreviation markup needs a strong and elastic for/id binding mechanism for reusability's (and the author's sanity's) sake.

The simplest means of strengthening the ABBR element is to use the for/id model to associate repeated instances of an ABBR: the first instance is marked with the explicit expansion, using the title attribute, and with a unique identifier, provided by the id attribute. For subsequent repetitions of an ABBR thus defined, an author or authoring tool can use the for attribute to point at the initial expansion for that ABBR.

Through a robust and elastic definition of the for/id mechanism to provide bindings between the abbreviated text and its gloss, an expansion associated with a particular abbreviation can not only be reused, but provide a means of clarification/differentiation in the case of homonymic (identically spelt or pronounced) abbreviations. It would also facilitate a site-wide means of associating unique abbreviations with their expansion, building upon the example of using LINK to point to an RDF assertion document, containing explicit bindings between expansions and the abbreviations for which they stand, thereby allowing an author to define an abbreviation once and reuse the content of the for attribute to provide expansions site-wide.

<link type="application/rdf+xml" rel="expansion" href="expansions.rdf" />

or

<link type="application/xhtml+xml" rel="glossary" href="../../glossary.xhtml" />

Since the ideal model is to provide authors with a way of constructing semantically sensible markup to contain their content, it would translate into a simple interface in an authoring tool: every time ABBR, or one of its equivalents or parallels, is invoked for a string of text, the author could be prompted to reuse a previously defined expansion or to provide a unique expansion, which would then be appended to the site-wide expansion resource.
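
The proposal does not specify what the site-wide expansion resource contains, nor how a for value would reach into it. Purely as an illustration, a glossary.xhtml referenced through rel="glossary" might hold a definition list whose entries carry the id values that pages point at. The id names, the list structure, and the idea of resolving for against the linked glossary are all assumptions of this sketch.

```xml
<!-- glossary.xhtml (hypothetical content) -->
<dl>
  <dt><abbr id="dr-doctor" title="Doctor">Dr.</abbr></dt>
  <dd>Doctor</dd>
  <dt><abbr id="dr-drive" title="Drive">Dr.</abbr></dt>
  <dd>Drive</dd>
  <dt><abbr id="st-saint" title="Saint">St.</abbr></dt>
  <dd>Saint</dd>
  <dt><abbr id="st-street" title="Street">St.</abbr></dt>
  <dd>Street</dd>
</dl>

<!-- in a page that links to the glossary (hypothetical usage) -->
<p>
Turn left onto Elm <abbr for="st-street">St.</abbr>
</p>
```

With a layout along these lines, an authoring tool that appends a new expansion would only need to add one dt/dd pair to the shared resource.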

added 15 December 2008 by Gregory J. Rosmaita