This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 17907 - Need attribute to abbr to specify semantic type of abbreviation
Status: RESOLVED WONTFIX
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
 
Reported: 2012-07-18 07:15 UTC by contributor
Modified: 2012-08-25 16:43 UTC

Description contributor 2012-07-18 07:15:37 UTC
This was cloned from bug 15723 as part of operation convergence.
Originally filed: 2012-01-26 06:11:00 +0000

================================================================================
 #0   contributor@whatwg.org                          2012-01-26 06:11:53 +0000 
--------------------------------------------------------------------------------
Specification: http://dev.w3.org/html5/spec/Overview.html
Multipage: http://www.whatwg.org/C#top
Complete: http://www.whatwg.org/c#top

Comment:
There needs to be a way to indicate what type of abbreviation is wrapped in an
abbr element.

The problem I am trying to solve: Proper indication to screen readers of the
type of abbreviation so that the screen reader does not have to guess at what
it should do.

Initialisms should have each letter read, acronyms should have the
abbreviation read as a word, and shorthand should have the contents of the
title attribute read.

Take the following example:

<abbr title="HyperText Markup Language">HTML</abbr>

In that example, the title should only be read if the user has asked titles be
read.

<abbr title="Kentucky">KY</abbr>

In that example, a screen reader probably should replace KY with the contents
of the title regardless of whether or not the user has asked for titles to be
read. However, if that KY is in a postal address, then it probably should be
treated as an initialism and have the letters read but not the title.

What I suggest is that the <abbr /> element have an optional type attribute.

type="initialism" - Screen readers SHOULD read the contents one letter at a
time UNLESS the user has a preference to have the title read.

type="acronym" - Screen readers SHOULD read the contents as a word UNLESS the
user has a preference to have the title read.

type="title" - Screen readers SHOULD read the contents of the title attribute
if it is present.

When no "type" attribute is set, the screen readers are free to use whatever
logic they want to apply but SHOULD NOT read the title attribute UNLESS the
user has a preference to have the title read. MathML would probably be a good
example there, it's an initialism mixed with a word, but the way it is spelled
with mixed case should make it easy for a screen reader to figure out.
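
For illustration, the proposed values might look like this in markup. This is
only a sketch: the type attribute on abbr is the proposal made in this bug, not
an existing HTML attribute, and the NATO line is an added example that does not
come from the original report.

```html
<!-- Hypothetical markup: "type" on <abbr> is the attribute proposed here, not part of HTML. -->
<p>
  <!-- Proposed: read one letter at a time unless the user prefers titles. -->
  <abbr type="initialism" title="HyperText Markup Language">HTML</abbr>

  <!-- Proposed: read as a word unless the user prefers titles (illustrative example). -->
  <abbr type="acronym" title="North Atlantic Treaty Organization">NATO</abbr>

  <!-- Proposed: read the title attribute in place of the contents. -->
  <abbr type="title" title="Kentucky">KY</abbr>

  <!-- No type: the screen reader applies its own logic. -->
  <abbr title="Mathematical Markup Language">MathML</abbr>
</p>
```

Under the proposal, a screen reader would spell out HTML, say NATO as a word,
and read "Kentucky" in place of KY, while MathML is left to its own heuristics.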

-- Michael A. Peters <mpeters@domblogger.net> and Alice Wonder
<awonder@domblogger.net>

Posted from: 71.84.0.205
User agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.75 Safari/535.7
================================================================================
 #1   Michael A. Peters                               2012-01-26 06:38:48 +0000 
--------------------------------------------------------------------------------
I would like to note that this does NOT belong in an aural style sheet.

The problem is that an HTML5 article can be removed from the page where it first appears and embedded elsewhere, for example through syndication. The only way to bring the proper aural rendering of the abbreviations along with the article would be to define them in a style element within the article, but then presentation is no longer separate from markup.
================================================================================
 #2   theimp@iinet.net.au                             2012-07-05 02:47:47 +0000 
--------------------------------------------------------------------------------
The problem is that this information is the opinion of the author, whereas pronunciation should be up to the user. Also, this model cannot handle a large number of use cases.

Consider:

<abbr type="???" title="Joint Photographic Experts Group">JPEG</abbr>

What type value do I use so that this is pronounced the way that by far most people pronounce it: JAY-peg?

Or the next most common pronunciation by far, JAY-PEE-GEE (omitting the EE even when it is written in and even when the speaker knows the meaning, since most people who pronounce it this way do so because of exposure to the file extension ".jpg")?

This does not even begin to address syllable-based abbreviations, which are very much more common in other languages (German, Japanese).

Fundamentally, (abstract) pronunciation is "presentation", not content. The abbr element is intended to encompass the semantic fact that a term is abbreviated, not to specify how it is rendered in speech.

Possible solutions at this time and in the near future include stylesheets and metadata. In any case, this is probably something that is beyond the scope of HTML.

As for separating style from markup, this is a case where the usual logic works differently. Normally you don't style content as such; you style the Structure (i.e. <span> or whatever). In this case, you'd actually be styling the Content itself, and that's not a problem.

<span style="color:blue;">HTML</span>
<!-- What if I want the color to be Green? Presentation should be separated from Structure. -->

<abbr title="Hypertext Markup Language" style="speak-as:spell-out;">HTML</abbr>
<!-- Why would I ever change the pronunciation? The "Presentation" is intrinsically linked to the Content in this case. -->
================================================================================
Comment 1 Ian 'Hixie' Hickson 2012-08-25 16:43:00 UTC
Turns out that in practice this is mostly handled by dictionaries, so there ends up being no need for an explicit type on acronyms. For the cases where there is a need, I recommend using the speech CSS controls as you mention. (Note that pronunciations vary, e.g. "SQL" is pronounced as "sequel" by some people, and "ess queue elle" by others.)