Voice Browser Working Group - Publications

Recommendations

- history

This document describes SCXML, or the "State Chart extensible Markup Language". SCXML provides a generic state-machine execution environment based on CCXML and Harel State Tables.

- history

The Call Control Extensible Markup Language (CCXML) provides declarative markup to describe telephony call control. CCXML can be used in conjunction with a dialog system such as VoiceXML.

- history
1 translation for Speech Synthesis Markup Language (SSML) Version 1.1
日本語

The Voice Browser Working Group has sought to develop standards to enable access to the Web using spoken interaction. The Speech Synthesis Markup Language Specification is one of these standards and is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms.

- history
1 translation for Pronunciation Lexicon Specification (PLS) Version 1.0
日本語

This document defines the syntax for specifying pronunciation lexicons to be used by speech recognition and speech synthesis engines in voice browser applications.

- history

VoiceXML 2.1 specifies a set of features commonly implemented by Voice Extensible Markup Language platforms. This specification is designed to be fully backwards-compatible with VoiceXML 2.0 [VXML2]. This specification describes only the set of additional features.

- history

This document defines the process of Semantic Interpretation for Speech Recognition and the syntax and semantics of semantic interpretation tags that can be added to speech recognition grammars to compute information to return to an application on the basis of rules and tokens that were matched by the speech recognizer. In particular, it defines the syntax and semantics of the contents of Tags in the Speech Recognition Grammar Specification.

Semantic Interpretation may be useful in combination with other specifications, such as the Stochastic Language Models (N-Gram) Specification, but its use with N-grams has not yet been studied.

Although the results of semantic interpretation describe the meaning of a natural language utterance, the current specification does not specifically generate such information in the Natural Language Semantics Markup Language for the Speech Interface Framework. It is believed that semantic interpretation can produce information that can be encoded in the NL Semantics Markup Language, but this is not ensured or enforced.

- history
2 translations for Speech Synthesis Markup Language (SSML) Version 1.0
français
italiano

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of specifications for voice browsers, and provides details of an XML markup language for controlling speech synthesisers.

This document describes an XML markup language for generating synthetic speech via a speech synthesiser. Such synthesisers embody rich knowledge about how to render text, and the role of the markup language is to give authors a standard way to control aspects such as volume, pitch, rate and other properties.

- history
1 translation for Speech Recognition Grammar Specification Version 1.0
français

This document defines syntax for representing grammars for use in speech recognition so that developers can specify the words and patterns of words to be listened for by a speech recognizer. The syntax of the grammar format is presented in two forms, an augmented BNF syntax and an XML syntax. The specification intends to make the two representations directly mappable and allow automatic transformations between the two forms. The W3C Voice Browser Working Group is seeking input on whether the final specification should include both forms or be narrowed to a specific form.

- history
1 translation for Voice Extensible Markup Language (VoiceXML) Version 2.0
français

This document specifies VoiceXML, the Voice Extensible Markup Language. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

Notes

- history

This document describes the DOM Event I/O Processor for SCXML. This event processor allows SCXML state machines to communicate with external entities via DOM Events. For more details on Event I/O Processors, see the SCXML specification.

The category of this specification should be "Voice", but because the original SCXML specification is also included in "Declarative Web Applications", it would make sense to include this Note in that category as well.

- history

This document describes the XPath Data Model for SCXML. This data model allows SCXML state charts to use XML as their data representation, and to manipulate it with XPath. For more details on data models, see the SCXML specification.

The category of this specification should be "Voice", but because the original SCXML specification is also included in "Declarative Web Applications", it would make sense to include this Note in that category as well.

- history

The say-as element in SSML 1.0 is considered one of the most useful elements of the language. However, SSML 1.0 does not define the values of the attributes of this element. This Note provides definitions for these attributes that cover many of the most common use cases for the say-as element.

Working Drafts

- history

VoiceXML 3.0 is a modular XML language for creating interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats, including digitized audio and digitized video. The primary goal of the specification is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

- history

The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of requirement studies for voice browsers, and provides details of the requirements for marking up spoken dialogs.

- history

In 2005 and 2006 the W3C held workshops to understand the ways, if any, in which the design of SSML 1.0 limited its usefulness for authors of applications in Asian, Eastern European, and Middle Eastern languages. In 2006 an SSML subgroup of the W3C Voice Browser Working Group was formed to review this input and develop requirements for changes necessary to support those languages. This document contains those requirements.

Retired specifications