This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12985 - HTML 5 and IPA TTS
Summary: HTML 5 and IPA TTS
Status: CLOSED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: Other All
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
Keywords: a11y
Reported: 2011-06-18 05:53 UTC by HTML WG bugbot
Modified: 2012-12-19 14:58 UTC

Description HTML WG bugbot 2011-06-18 05:53:09 UTC
public-html-comments posting from: "Michael A. Peters" <mpeters@shastaherps.org>
http://www.w3.org/mid/4DF0CF96.9090606@shastaherps.org

Hello,

I hope this is the right list; I admit I did not read every list 
description, given the daunting number of them.

I am using HTML5 even though it currently is not official.

Most of my content is standard English and probably is readable by 
various plugins that read web content. Some of my content includes 
taxonomic names, and some proper nouns that may not be properly read (e.g. 
Ft. Reading should be pronounced as if it said Ft. Redding).

One way to solve this would be to indicate pronunciation in an HTML 
attribute, e.g.:

<p>... They are similar in appearance to California Red-legged Frog 
(<span class="taxon" data-ipa="ˈ/rɑ nə/ /dreɪ ˈtoʊ ni/">Rana 
draytonii</span>) tadpoles but lack ... </p>

I'm currently using the custom attribute data-ipa for that.

If there is an "official" way, I did not find it. If there is not an 
official way, I think there should be.

If I ever provide a TTS service for my site myself, using a custom 
attribute will suffice, as I could code the translation to SSML or 
whatever, but current FOSS TTS (e.g. Festival) does not handle IPA, 
and the commercial solutions that do handle IPA are far out 
of my budget.
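
As a rough illustration of the translation to SSML mentioned above, the 
following Python sketch turns an HTML fragment into an SSML document, 
wrapping every element that carries a data-ipa attribute in SSML's 
<phoneme alphabet="ipa" ph="..."> element. The names IpaToSsml and 
html_to_ssml, the VOID set, and the IPA string in the usage line are 
assumptions made for illustration; they do not come from the original post. 
The sketch also assumes reasonably well-nested markup and simply drops all 
other HTML tags.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# HTML elements that never have an end tag (subset; an assumption of this sketch).
VOID = {"br", "hr", "img", "input", "wbr", "meta", "link"}


class IpaToSsml(HTMLParser):
    """Collect text, wrapping elements that carry data-ipa in SSML <phoneme>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.stack = []  # one bool per open element: did it open a <phoneme>?

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.parts.append(" ")  # treat void elements as a word break
            return
        ipa = dict(attrs).get("data-ipa")
        self.stack.append(bool(ipa))
        if ipa:
            self.parts.append(f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>')

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack.pop():
            self.parts.append("</phoneme>")

    def handle_data(self, data):
        self.parts.append(escape(data))  # escape &, < and > for XML output


def html_to_ssml(fragment, lang="en-US"):
    """Return an SSML 1.0 document for the given HTML fragment."""
    parser = IpaToSsml()
    parser.feed(fragment)
    parser.close()
    body = " ".join("".join(parser.parts).split())  # collapse whitespace
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">{body}</speak>'
    )


# Usage with an illustrative (not authoritative) IPA transcription:
print(html_to_ssml('<p>Frog (<span data-ipa="ˈrɑːnə">Rana</span>) tadpoles</p>'))
```

Whether a given TTS engine honours the phoneme element, and which IPA 
symbols it accepts, would still have to be checked against that engine.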

I'd like to at least provide something to give web browsers with a tts 
plugin a hint as to how to read it, and to do that, there needs to be a 
standard attribute.

Of course an attribute that specifies IPA pronunciation would only work 
in UTF-8 documents, but that's not a problem for most.

Thank you for any suggestions.
Comment 1 Tab Atkins Jr. 2011-06-18 17:14:53 UTC
You say that you're currently using a data-ipa attribute, but also that you don't currently offer any direct TTS functionality on your website.

What are you using the data-ipa attribute *for*, then?  If it's currently not being used for anything, and you're simply suggesting that it might be a good idea for some future time when you might add something that takes advantage of it, then it's probably not a good idea to add anything to the language yet.  We don't know what the best approach is, because nobody's experimented with things yet and discovered what works and what doesn't.
Comment 2 Jukka K. Korpela 2011-06-28 04:29:36 UTC
(In reply to comment #1)

> What are you using the data-ipa attribute *for*, then?  If it's currently not
> being used for anything, and you're simply suggesting that it might be a good
> idea for some future time when you might add something that takes advantage of
> it, then it's probably not a good idea to add anything to the language yet.

Describing pronunciation information is as such declarative, and it has the obvious potential use that speech-based user agents could use the information. From the perspective of authoring and markup language, this is sufficient for asking for a _language-defined_ method for providing the information.

> We
> don't know what the best approach is, because nobody's experimented with things
> yet and discovered what works and what doesn't.

The problem here is that there is considerable confusion even regarding the nature of pronunciation information - does it belong to content and markup, or to style sheets? Is the information to be provided for a few individual occurrences of words, or for words in general in a vocabulary-like manner? And so on.

There is a discussion of approaches to pronunciation information at
http://www.w3.org/html/wg/wiki/PronunciationSemantics
marked as related to ISSUE-49 (of something) and as CLOSED in 2008.

Despite the complexity of the topic, there is a fairly simple (and even obvious) possible approach: add an attribute, applicable to all elements, defined to contain the IPA presentation of the intended pronunciation of the textual content. If this becomes used by authors and utilized by browsing and other software, then new complexity (like allowing other phonetic notation systems as well, or specifying a way to indicate that a given pronunciation is to apply to all occurrences of a word) could be added later.

I'm sceptical of the practical usefulness of the idea. Few authors would feel the need to provide pronunciation information, and even fewer would provide it correctly (in the sense of describing the intended pronunciation properly by the rules of a phonetic notation). Browser vendors would have little motivation for investing in things like this, and even assistive software vendors might not be that interested, especially if they don't expect the feature to be used much (and properly) by authors.

On the other hand, HTML has had the lang attribute for a long time, and it is often mentioned as important in specifications (like WAI), yet ignored by almost all software that could make actual use of it (probably partly because it is known to be wrong information so often). A global attribute for pronunciation information (called "phonetic" or "pronounce" or something like that) would impose no support _requirements_ on browsers, and it could be defined in a simple manner.
Comment 3 Tab Atkins Jr. 2011-06-28 16:37:36 UTC
(In reply to comment #2)
> Despite the complexity of the topic, there is a fairly simple (and even
> obvious) possible approach: add an attribute, applicable to all elements,
> defined to contain the IPA presentation of the intended pronunciation of the
> textual content. If this becomes used by authors and utilized by browsing and
> other software, then new complexity (like allowing other phonetic notation
> systems as well, or specifying a way to indicate that a given pronunciation is
> to apply to all occurrences of a word) could be added later.
> 
> I'm sceptical of the practical usefulness of the idea. Few authors would feel
> the need to provide pronunciation information, and even fewer would provide it
> correctly (in the sense of describing the intended pronunciation properly by
> the rules of a phonetic notation). Browser vendors would have little motivation
> for investing in things like this, and even assistive software vendors might
> not be that interested, especially if they don't expect the feature to be used
> much (and properly) by authors.
> 
> On the other hand, HTML has had the lang attribute for a long time, and it is
> often mentioned as important in specifications (like WAI), yet ignored by
> almost all software that could make actual use of it (probably partly because
> it is known to be wrong information so often). A global attribute for
> pronunciation information (called "phonetic" or "pronounce" or something like
> that) would impose no support _requirements_ on browsers, and it could be
> defined in a simple manner.

If an idea has no known current use-cases, only theoretical future ones, we shouldn't add it.  In particular, we shouldn't add new features on the expectation that some time in future it will actually be used, because in the time between now and then the feature will likely become polluted.
Comment 4 Michael[tm] Smith 2011-08-04 05:36:12 UTC
mass-move component to LC1
Comment 5 Anne 2011-08-15 16:09:47 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: <http://dev.w3.org/html5/decision-policy/decision-policy.html>.

Status: Rejected
Change Description: no spec change
Rationale: Agreed with comment 3.