W3C Workshop on Internationalizing the Speech Synthesis Markup Language III — Summary

13-14 January 2007

On 13-14 January the Voice Browser Working Group held the third Workshop on Internationalizing SSML in Hyderabad, India, hosted by Bhrigus and IIIT Hyderabad.

The minutes of the workshop are available on the W3C Web server:
http://www.w3.org/2006/10/SSML/minutes.html

There were more than 15 attendees from India, Sri Lanka, Pakistan, Japan, Italy, the US, France, and the UK.

Motivation for internationalizing SSML includes:

It is estimated that within 3 years the World Wide Web will contain significantly more content from currently under-represented languages.
There is a great need for SSML to work for languages beyond those supported by the current version (SSML 1.0).
Some languages such as Mandarin Chinese or Hindi are difficult to input via a telephone keypad.
Many other languages would also benefit from a new "international" version of SSML, and it would help spread the Web to places where it is not so readily accessible.

This workshop was more narrowly focused than the previous workshops, specifically targeting languages of the Indian subcontinent. Topics discussed during the Workshop included:

Language-specific issues (echo expressions, word compounding, optional/missing diacritics, ...)
Alternative/mixed-language support (loan words, broader language/dialect/script support, mixed language text)
Pronunciation alphabets (non-IPA and syllable-based pronunciation alphabets)
Other items (proper name identification, say-as extensions)

The major "takeaways" are:

Current work on SSML 1.1 will address many of the needs of Indian language authors.
Word compounds must be treatable as a single lexical unit.
Authors should be able to indicate when special (e.g., expensive) processing should occur, such as word segmentation or diacritic restoration in Urdu.
Authors should have control over processor behavior when a requested voice cannot speak content in a given language; the mechanisms proposed in the first Working Draft of SSML 1.1 are still insufficient for proper development of multilingual applications.
Transliteration is common for Indian languages and is a transformation that must be performed before text normalization. Existing mechanisms in SSML 1.0 are insufficient to address this.

We have started to review these new topics and will continue to do so as work on SSML 1.1 proceeds at the next face-to-face meeting in Beijing.

Daniel C. Burnett and Kazuyuki Ashimura, Workshop Co-chairs

The Call for Participation, the Logistics, the Agenda, the Presentation guideline and the Minutes are also available.

Max Froumentin, Voice Activity Lead

$Id: summary.html,v 1.3 2007/02/05 18:31:37 ashimura Exp $