W3C

Voice Browser Working Group Charter

The mission of the Voice Browser Working Group, part of the Voice Browser Activity, is to enable users to speak and listen to Web applications by creating standard languages for developing Web-based speech applications. The Voice Browser Working Group concentrates on languages for capturing and producing speech and managing the dialog between user and computer, while a related Group, the Multimodal Interaction Working Group, concentrates on additional input modes including keyboard and mouse, ink and pen, etc.

Join the Voice Browser Working Group.

End date: 31 May 2012
Confidentiality: Proceedings are Member-only, but the group sends regular summaries of ongoing work to the public mailing list.
Initial Chairs: Jim Larson, Scott McGlashan
Initial Team Contacts (FTE %: 80): Kazuyuki Ashimura, Matt Womer
Usual Meeting Schedule:
Teleconferences: Weekly
  • The group plans to hold separate telephone conferences for the following Task Forces as well as the main group:
    • VoiceXML 3.0
    • SIV (Speaker Identification and Verification functionality of VoiceXML 3.0)
    • SCXML 1.0
    • CCXML
    • SSML 1.0
Face-to-face: as required, up to 3 per year

Scope

Background

The telephone was invented in the 1870s and continues to be a very important means for us to communicate with each other. The Web by comparison is very recent, but has rapidly become a competing communications channel. The convergence of telecommunications and the Web is now bringing the benefits of Web technology to the telephone, enabling Web developers to create applications that can be accessed via any telephone, and allowing people to interact with these applications via speech and telephone keypads. The W3C Speech Interface Framework is a suite of markup specifications aimed at realizing this goal. It covers voice dialogs, speech synthesis, speech recognition, telephony call control for voice browsers and other requirements for interactive voice response applications, including use by people with hearing or speaking impairments.

Some possible applications include:

Under previous charters, going back to 2000, the Voice Browser Working Group has created the W3C Speech Interface Framework suite of specifications, which includes:

In addition to the above, here is a list of documents produced by the Voice Browser Activity

Work to do

VoiceXML 3.0

VoiceXML 3.0 is the next major release of VoiceXML. VoiceXML 3.0 will provide powerful dialog capabilities that can be used to build advanced speech applications, and will provide these capabilities in a form that can be easily and cleanly integrated with other W3C languages. VoiceXML 3.0 will provide enhancements to existing dialog and media control, as well as major new features (e.g. multimedia prompts, VCR controls, speaker identification and verification, modularization, a cleaner separation between data/flow/dialog, and asynchronous external eventing) to facilitate interoperation with external applications and media components. The Group will create multiple profiles of VoiceXML 3.0 that enable subsets of VoiceXML 3.0 to target specific use cases (e.g., a profile building upon VoiceXML 2.1 capability and a media profile with minimal dialog requirements). The Group plans to take VoiceXML 3.0 through to Recommendation status.

State Chart XML (SCXML) 1.0

SCXML 1.0 is a generic XML control language based on Harel State Charts. Although SCXML was designed as a control language for VoiceXML 3.0 and for Multimodal Interaction dialog management, SCXML may also be used for control of other types of applications. The Group plans to take SCXML 1.0 through to Recommendation status.
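As an illustration of the state-chart idea, the following minimal Python sketch loads a small SCXML-style document and follows its transitions. It is a deliberately simplified reading of the language: it handles only flat states with event-triggered transitions and ignores hierarchy, parallel states, conditions and the data model. The state ids, event names and the helper functions load_transitions and run are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

SCXML_NS = "{http://www.w3.org/2005/07/scxml}"

# Hypothetical dialog flow; state ids and event names are illustrative only.
DOC = """
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="greeting">
  <state id="greeting">
    <transition event="caller.spoke" target="collecting"/>
  </state>
  <state id="collecting">
    <transition event="input.ok" target="done"/>
    <transition event="input.nomatch" target="greeting"/>
  </state>
  <final id="done"/>
</scxml>
""".strip()


def load_transitions(text):
    """Return (initial state id, {(state id, event): target id})."""
    root = ET.fromstring(text)
    table = {}
    for state in root.findall(SCXML_NS + "state"):
        for tr in state.findall(SCXML_NS + "transition"):
            table[(state.get("id"), tr.get("event"))] = tr.get("target")
    return root.get("initial"), table


def run(text, events):
    """Feed events in order; events with no matching transition are ignored."""
    state, table = load_transitions(text)
    for event in events:
        state = table.get((state, event), state)
    return state


if __name__ == "__main__":
    print(run(DOC, ["caller.spoke", "input.nomatch", "caller.spoke", "input.ok"]))
```

The control logic lives entirely in the XML document, so the same small interpreter could drive a voice dialog or any other event-driven application, which is the sense in which SCXML is described as generic.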

Voice Browser Call Control (CCXML) 1.0

CCXML 1.0 is an XML language for controlling connections, conferences, and dialogs in a Voice Browser context. The Group plans to take CCXML 1.0 through to Recommendation status. The group may initiate work on the subsequent version of CCXML.

Speech Synthesis Markup Language (SSML) 1.1

SSML 1.1 enhances SSML 1.0 to better support widely spoken East-Asian, Indian and Middle Eastern languages in a manner that improves its usefulness in other languages as well. It also updates SSML 1.0 to be more consistent with PLS, SISR and expected VoiceXML 3.0 functionality. The Group plans to take SSML 1.1 through to Recommendation status. The group may work on minor extensions to 1.1.
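To make the multilingual aspect concrete, here is a small Python sketch that builds an SSML document whose sentences carry their own xml:lang values. It assumes only the speak and s elements, the version attribute, and the SSML namespace; the helper name build_ssml and the sample sentences are hypothetical, and the sketch does not attempt to cover the rest of the language.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(sentences, default_lang="en-US"):
    """Build an SSML string from (language tag, text) pairs.

    Each sentence becomes an <s> element with its own xml:lang value.
    """
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        "{%s}speak" % SSML_NS,
        {"version": "1.1", XML_LANG: default_lang},
    )
    for lang, text in sentences:
        s = ET.SubElement(speak, "{%s}s" % SSML_NS, {XML_LANG: lang})
        s.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml([("en-US", "Welcome."), ("ja", "こんにちは。")]))
```

A synthesis processor that understands the language tags can then choose an appropriate voice or pronunciation for each sentence.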

Maintenance work

The Working Group will be maintaining its existing (or soon-to-be) Recommendations: VoiceXML 2.0, VoiceXML 2.1, SRGS 1.0, SSML 1.1, SISR 1.0, PLS 1.0, SCXML 1.0, and CCXML 1.0. Maintenance takes the form of: responding to questions and requests on the public mailing list, issuing errata as needed and possibly publishing minor updates to the specifications.

Note: The group plans to focus on VoiceXML 3.0 and SCXML 1.0 for the next three years, but may decide to do additional work on our existing recommendations.

Success Criteria

For each document to advance to Proposed Recommendation, the group will typically produce a technical report demonstrating two independent and interoperable implementations of each required feature. The Working Group also anticipates two interoperable implementations of each optional feature, but may relax this criterion for specific optional features.
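The two-implementation criterion can be expressed as a simple check over test results. The sketch below is only an illustration under assumed inputs: the feature names, implementation names, and the function features_below_threshold are hypothetical and do not come from any actual implementation report.

```python
def features_below_threshold(results, features, minimum=2):
    """Return the features that fewer than `minimum` implementations pass.

    `results` maps a feature name to the set of implementations passing it.
    """
    return sorted(f for f in features if len(results.get(f, set())) < minimum)


# Hypothetical data for illustration only.
RESULTS = {
    "prompt-playback": {"impl-A", "impl-B", "impl-C"},
    "dtmf-input": {"impl-A", "impl-B"},
    "speaker-verification": {"impl-C"},
}
REQUIRED = ["prompt-playback", "dtmf-input", "speaker-verification"]

if __name__ == "__main__":
    print(features_below_threshold(RESULTS, REQUIRED))
```

Any feature returned by such a check would need further implementation work, or, if the feature is optional, an explicit decision by the group to relax the criterion.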

Deliverables

The following documents are expected to become W3C Recommendations:

The following documents are either notes or are not expected to advance toward Recommendation:

The following documents may be revised depending upon the interest of working group members:

Milestones

Note: The group will document significant changes from this initial schedule on the group home page.

Specification   Requirements   FPWD        LC          CR          PR               Rec
CCXML 1.0       Completed      Completed   Completed   May 2009    September 2009   November 2009
SSML 1.1        Completed      Completed   Completed   Completed   July 2009        August 2009
VoiceXML 3.0    Completed      Completed   2Q 2010     4Q 2010     2Q 2011          2Q 2011
SCXML 1.0       1Q 2007        Completed   4Q 2010     1Q 2011     3Q 2011          4Q 2011

Dependencies

W3C Groups

The following groups are identified as being related to the work of this group.

Internationalization
The specifications of the VBWG are expected to be usable worldwide and adaptable to a wide variety of languages. An ongoing strong relationship with the I18N groups is essential to achieve this goal.
Multimodal Interaction WG
The MMIWG has a strong link to the VBWG as it is chartered to develop specifications that allow the Web to be used through any modality, not just voice.
Synchronized Multimedia
VoiceXML 3.0 will introduce advanced media controls, involving timing and synchronization specification borrowed from SMIL. Collaboration on selecting which audio/video codecs are normative would be beneficial to both activities.
WAI Protocols and Formats
The VBWG expects that its work will be reviewed by the WAI-PF group, in order to ensure universal accessibility of the produced specifications.
BackplaneXG
The "backplane" framework being developed by the groups belonging to the HCG (HTML, Web Applications, XForms, Compound Document Formats, etc.) needs to be compatible with the VBWG's Data-Presentation-Flow framework, introduced in the design of VoiceXML 3.0.
XML and Semantic Web Activities
Because the specifications developed in the VBWG are all based on XML, the group will follow the work of the XML Activity in order to keep them compatible with the ongoing evolution of XML. Similarly, many specifications in the VBWG express metadata using RDF. Therefore, cooperation with the Semantic Web Best Practices is expected in case questions arise on the use of RDF.
Security
The Speaker Verification and Identification features of VoiceXML 3.0 will benefit from review by the Web Security Activity.
Video in the Web
VoiceXML 3.0 will allow video content to be presented to the user. Collaboration on selecting which audio/video codecs are normative would be beneficial to both activities.

Furthermore, the Voice Browser Working Group expects to follow these W3C Recommendations:

External Groups

Here is a list of external groups with complementary goals to the Voice Browser activity:

ANSI/INCITS M1 and ISO/IEC JTC 1/SC 37
ANSI/INCITS M1 is a Technical Committee of INCITS for Biometrics standards. Its work includes data interchange formats, common file formats, application program interfaces, profiles, and performance testing and reporting. M1 also serves as the U.S. Technical Advisory Group (U.S. TAG) for the international organization ISO/IEC JTC 1/SC 37 on Biometrics. The Voice Browser Working Group plans to keep in touch with INCITS to identify which parts of their specifications should be referenced from within VoiceXML 3.0. The group also plans to identify possible conflicts with their specifications and resolve them by synchronizing our respective standardization efforts.
OASIS BIAS Integration TC
OASIS BIAS Integration TC complements the efforts of ANSI/INCITS M1 to provide the biometrics and security industries with a documented, open framework for deploying and invoking identity assurance capabilities that can be readily accessed as services. The TC defines and describes methods and bindings by which the INCITS BIAS framework can be used within XML-based transactional Web services and service-oriented architectures. The Voice Browser Working Group plans to keep in touch with the TC so that each organization stays informed about the other's work, in order to solicit requirements for VoiceXML 3.0 and identify BIAS specifications that should be referenced from within VoiceXML 3.0. The group also plans to identify possible conflicting specifications and resolve the conflicts by synchronizing our respective standardization efforts.
ETSI
works on DSR codecs, call control, human factors and command vocabularies.
IETF LTRU
prepares an update to the Language Subtag Registry procedures and delivers means to update the current IANA Language Subtag Registry.
IETF SPEECHSC working group or its successor
works on protocols for accessing speech engines. Develops MRCPv2 and any successor.
IETF Media Control
works on protocols and XML languages for media servers.
JCP
works on call control and media control APIs for Java.
VoiceXML Forum

A memorandum of understanding exists between W3C and the VoiceXML Forum which basically states that:

  • The W3C will define dialog markup languages while the Forum concentrates on conformance, education, and marketing.
  • The VoiceXML Forum will coordinate the creation of test suites and conformance evaluation with the W3C.
  • The VoiceXML Forum will provide specification clarification requests to the W3C through the channels provided by the W3C for this purpose.

The VBWG will continue to respect this arrangement and furthermore plans to hold regularly scheduled joint meetings to coordinate conformance issues and other activities.

Participation

To be successful, the Voice Browser Working Group is expected to have 10 or more active participants for its duration. Effective participation in the Voice Browser Working Group is expected to consume one work day per week for each participant, and two days per week for editors. The Voice Browser Working Group will also allocate the necessary resources for building Test Suites for each specification.

In order to make rapid progress, the Voice Browser Working Group consists of several subgroups, each working on a separate document. The Voice Browser Working Group members may participate in one or more subgroups.

Participants are reminded of the Good Standing requirements of the W3C Process.

Experts from appropriate communities may also be invited to join the working group, following the provisions for this in the W3C Process.

Working Group participants are not obligated to participate in every work item; however, the Working Group as a whole is responsible for reviewing and accepting all work items.

For budgeting purposes, the group may hold up to three full working group face-to-face meetings per year if the group believes it to be beneficial. Currently the Working Group anticipates holding a face-to-face meeting in association with the Technical Plenary but has no additional face-to-face meetings planned. The Chair will make Working Group meeting dates and locations available to the group in a timely manner according to the W3C Process. The Chair is also responsible for providing publicly accessible summaries of Working Group face-to-face meetings, which will be announced on www-voice@w3.org.

Communication

This group primarily conducts its work on the Member-only mailing list w3c-voice-wg@w3.org (archive). Certain topics need coordination with external groups. The Chair and the Working Group can agree to discuss these topics on a public mailing list. The archived mailing list www-voice@w3.org is used for public discussion of W3C proposals for Voice Browsers and for public feedback on the group's deliverables.

Information about the group (deliverables, participants, face-to-face meetings, teleconferences, etc.) is available from the Voice Browser Working Group home page.

All proceedings of the Working Group (mail archives, teleconference minutes, face-to-face minutes) will be available to W3C Members. Summaries of face-to-face meetings will be sent to the public list.

Decision Policy

As explained in the Process Document (section 3.3), this group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions, the Chair should record a decision (possibly after a formal vote) and any objections, and move on.

This charter is written in accordance with Section 3.4, Votes of the W3C Process Document and includes no voting procedures beyond what the Process Document requires.

Patent Policy

This Working Group operates under the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis.

For more information about disclosure obligations for this group, please see the W3C Patent Policy Implementation.

About this Charter

This charter for the Voice Browser Working Group has been created according to section 6.2 of the Process Document. In the event of a conflict between this document or the provisions of any charter and the W3C Process, the W3C Process shall take precedence.

Please also see the previous charter for this group.

The most important changes from the previous charter are:

This charter was extended through 31 May 2012 on 6 February 2012.


James A. Larson, Co-chair, Voice Browser Working Group
Kazuyuki Ashimura, Voice Browser Activity Lead

$Date: 2012/02/07 15:58:08 $