XHTML+Voice Profile 1.0

1 Introduction

This section is informative.

The purpose of XHTML modularization [XHTML Modularization] (as expressed in XHTML 1.1 [XHTML 1.1] ) is to serve as the basis for future extended XHTML family document types, and to provide a consistent, forward-looking document type that is cleanly separated from the deprecated, legacy functionality of HTML 4. Thus, the XHTML 1.1 document type is essentially a reformulation of XHTML 1.0 Strict [XHTML 1.0] using XHTML Modules [XHTML Modularization].

The XML Events module [XML Events] gives XML host languages the ability to uniformly integrate event listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces [DOM2 Events]. The result is to provide XHTML-based languages with an event syntax that enables an interoperable way of associating behaviors with document-level markup.

VoiceXML 2.0 [VoiceXML 2.0] and the other XML vocabularies making up the W3C speech interface framework have been designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. In this document, we first modularize VoiceXML 2.0 to prepare it for integration into the XHTML family of languages using the XHTML modularization framework. We then integrate the resulting voice modules along with the XML events module into XHTML by defining an XHTML+Voice profile. This specification describes the VoiceXML modules that are added to XHTML and details the integration issues. The modularization of VoiceXML 2.0 also specifies DOM event types specific to voice interaction for use with the XHTML Events module. Speech dialogs authored in VoiceXML 2.0 can then be treated as event handlers that add voice-interaction specific behaviors to XHTML documents. The language integration supports all of the modules defined in XHTML Modularization, and adds speech interaction functionality to XHTML elements to enable multimodal applications. The document type defined by the XHTML+Voice profile is XHTML Host language document type conformant. A primary goal is to enable the integration of voice interaction into XHTML Basic for use on thin clients, while scaling up to today's desktop browsers.

1.1 Motivation And Applications

This note outlines how a set of mature WWW technologies including XHTML 1.1 [XHTML 1.1], VoiceXML 2.0 [VoiceXML 2.0], Speech Synthesis Markup Language [SSML 1.0], Speech Recognition Grammar Format [SRGF] and XML-Events [XML Events] can be integrated using XHTML modularization [XHTML Modularization] to bring spoken interaction to the WWW. The design leverages open industry APIs like the W3C DOM to create interoperable web content that can be deployed across a variety of end-user devices. Multiple modes of interaction are synchronized and integrated using the DOM2 Events model [DOM2 Events] and exposed to the content author via XML Events.

Today, WWW applications are authored in XHTML with user interaction created via XHTML form elements. W3C is presently working on XForms [XForms], the next generation of web forms that brings the power of XML to WWW application development. The combination of XHTML and voice described in this specification can leverage the semantic richness of web applications created using XForms, while providing a smooth transition for today's web developers wishing to deploy multimodal applications by adding spoken interaction to present-day web content. Integrating the work of the W3C voice browser working group into mainstream XHTML content has the additional benefit that such content can take advantage of future enhancements in the W3C speech interface framework, such as natural language understanding. Thus, we provide a smooth transition path for web developers wishing to deliver increasingly smart user interaction for their WWW applications. At the same time, building on XHTML Basic [XHTML Basic] and XHTML modularization ensures that content developers will be able to deploy their content to a wide variety of end-user clients ranging from mobile phones and small PDAs to desktop browsers.

Using the functionality provided by the voice modules, this profile adds speech interaction functionality to standard user interface controls in XHTML. This provides an easy means of speech-enabling WWW applications by allowing Web developers to add voice interaction to standard WWW content. VoiceXML elements and constructs are included to permit the Web author to easily create spoken interaction for specific parts of a standard WWW application. The integration provides a smooth means for moving from see-only WWW applications to WWW content that supports both visual and spoken interaction. Such combined (multimodal) interaction is crucial for next-generation multimodal devices. By integrating spoken interaction into the present WWW application authoring paradigm, this profile lowers the entry barrier for WWW developers wishing to add voice interaction to the visual WWW.

1.2 Design Rationale

This section provides the design rationale used to decide how we modularize VoiceXML 2.0. The goal is to modularize VoiceXML in a manner that permits the creation of profiles that match different application deployment environments. As an example, PDAs might not wish to include all of the telephony features from VoiceXML 2.0. To reflect the predominantly visual nature of today's WWW, we have chosen to make XHTML the host language; as a consequence, those parts of VoiceXML 2.0 that relate to the VoiceXML document being a stand-alone speech application are dropped from the XHTML+Voice profile.

2 Voice Modules

This section first modularizes VoiceXML 2.0 and then specifies the various voice modules used in the creation of the XHTML+Voice profile.

2.1 Modularization Of VoiceXML 2.0

The files making up the modularization of the VoiceXML 2.0 DTD are available as xhtml+voice-dtd.zip and have been created to ease the process of integrating VoiceXML 2.0 and XHTML. These modules do not change the VoiceXML 2.0 language as specified by the voice browser working group of the W3C. This section gives a high-level overview of each module.

File	Module	Purpose	Elements	XHTML+Voice
voicexml-events-1.mod	Events	Event types dispatched by Voice processor	`catch` `help` `noinput` `nomatch` `error` `throw`	Y
voicexml-exec-1.mod	Executable statements	Statements for use in voice handlers	`assign` `clear` `var` `log` `reprompt`	Y
voicexml-filled-1.mod	Filled	Voice handlers invoked when a slot is filled.	`filled`	Y
voicexml-flow-1.mod	Flow control	Flow control constructs from VoiceXML	`if` `else` `elseif` `return`	Y
voicexml-form-1.mod	Dialogs	Encapsulate voice dialogs	`form` `field` `record` `subdialog` `block` `initial` `option`	Y
voicexml-misc-1.mod	Miscellaneous	Non-local transfers in VoiceXML	`exit` `goto` `link` `script` `submit`	N
voicexml-menu-1.mod	Menus	VoiceXML menus	`menu` `choice` `enumerate`	N
voicexml-object-1.mod	Object	Foreign objects for VoiceXML	`object`	N
voicexml-resource-1.mod	Resources	Specifying voice resources	`param` `property`	Y
voicexml-root-1.mod	Root	VoiceXML stand-alone documents	`vxml` `meta`	N
voicexml-ssml-1.mod	SSML	Speech and audio output	`prompt` `value` `audio` `emphasis` `voice` `break` `prosody` `say-as` `phoneme` `paragraph` `p` `sentence` `s` `mark`	Y
voicexml-telephony-1.mod	Telephony	Telephony control	`transfer` `disconnect`	N
voicexml-grammar-1.mod	SRGF	Speech input constructs from VoiceXML	`grammar` `count` `example` `token` `import` `item` `one-of` `rule` `ruleref`	Y
voicexml-attribs-1.mod	Attributes	Common attributes used in VoiceXML		Y
voicexml-datatypes-1.mod	Datatypes	Common datatypes used in VoiceXML		N
voicexml-framework-1.mod	Framework	Creates modular framework for inclusion of other modules		N
voicexml-notations-1.mod	Notations	Defines XML and SGML notations		N
voicexml-qname-1.mod	QNames	Parameters and entities for qualified names (qnames)		Y
voicexml20-model-1.mod	Document Model	Defines content model for VoiceXML elements		Y

2.2 Speech And Non-speech Audio Output

Module voicexml-ssml-1.mod defines constructs for producing spoken and non-spoken audio output. These constructs are normatively defined in the SSML specification [SSML 1.0]. These constructs are used to author spoken prompts within voice handlers.
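
As an informal illustration, a spoken prompt inside a voice handler might combine several of the elements listed for this module. The sketch below assumes the vxml namespace prefix used in the examples of Appendix B; the audio file name chime.wav is hypothetical.

```xml
<!-- Sketch only: a prompt mixing synthesized speech and recorded audio. -->
<vxml:prompt>
  Welcome to the <vxml:emphasis>hotel picker</vxml:emphasis>.
  <!-- a short pause before the audio clip -->
  <vxml:break/>
  <!-- chime.wav is a hypothetical file; the element content is spoken
       if the audio cannot be played -->
  <vxml:audio src="chime.wav">chime</vxml:audio>
</vxml:prompt>
```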

2.3 Speech Dialogs

Modules voicexml-exec-1.mod, voicexml-filled-1.mod, voicexml-resource-1.mod, voicexml-flow-1.mod, and voicexml-form-1.mod are used to author handlers that implement speech dialogs.

2.4 Speech Grammars

Module voicexml-grammar-1.mod provides constructs for authoring speech grammars. Speech grammars are normatively specified by the speech grammar specification [Speech Grammars].
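
A small grammar of the kind referenced by the examples in Appendix B (for instance city.srgf) might look like the following sketch. It is written in the XML syntax of the W3C Speech Recognition Grammar Specification as later published; the namespace, the version and root attributes, and the list of cities are assumptions, and the draft cited by this note may differ in detail.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a one-rule grammar that accepts a single city name. -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="city">
  <rule id="city" scope="public">
    <one-of>
      <item>Chicago</item>
      <item>Boston</item>
      <item>New York</item>
    </one-of>
  </rule>
</grammar>
```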

2.5 VoiceXML Event Types

Module voicexml-events-1.mod declares the event types defined in VoiceXML 2.0. These event types are used in creating event listeners that respond to speech events.
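
The following sketch suggests how these event types might be caught inside a field. It assumes the vxml prefix and the grammar file name used in Appendix B; the prompt wording is illustrative only.

```xml
<!-- Sketch only: one listener per speech event type within a field. -->
<vxml:field name="field_city">
  <vxml:grammar src="city.srgf" type="application/x-srgf"/>
  <vxml:prompt>Please choose a city.</vxml:prompt>
  <!-- the user said nothing -->
  <vxml:noinput>I did not hear anything. <vxml:reprompt/></vxml:noinput>
  <!-- the utterance did not match the grammar -->
  <vxml:nomatch>I did not understand. <vxml:reprompt/></vxml:nomatch>
  <!-- the user asked for help -->
  <vxml:help>Say the name of a city, for example Chicago.</vxml:help>
  <!-- any error event -->
  <vxml:catch event="error">Sorry, something went wrong.</vxml:catch>
</vxml:field>
```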

2.6 VoiceXML Event Handlers

Modules voicexml-filled-1.mod, voicexml-flow-1.mod, voicexml-exec-1.mod, and voicexml-resource-1.mod declare constructs for use within voice handlers. The semantics of these constructs are as defined in the VoiceXML 2.0 specification.
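
A handler drawing on these modules could be sketched as follows. The form id, the variable name confirmed, and the grammar file yesno.srgf are hypothetical; the elements themselves (var, filled, if, else, assign, clear) are those listed in the module table of Section 2.1.

```xml
<!-- Sketch only: a filled handler using flow control and executable statements. -->
<vxml:form id="voice_confirm">
  <vxml:var name="confirmed" expr="false"/>
  <vxml:field name="answer">
    <!-- yesno.srgf is a hypothetical grammar returning 'yes' or 'no' -->
    <vxml:grammar src="yesno.srgf" type="application/x-srgf"/>
    <vxml:prompt>Shall I submit the form?</vxml:prompt>
    <vxml:filled>
      <vxml:if cond="answer == 'yes'">
        <vxml:assign name="confirmed" expr="true"/>
      <vxml:else/>
        <!-- clearing the field causes it to be prompted for again -->
        <vxml:clear namelist="answer"/>
      </vxml:if>
    </vxml:filled>
  </vxml:field>
</vxml:form>
```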

3 Normative Definition Of Profile XHTML+Voice

This section is normative.

3.1 Document Conformance

A conforming XHTML+Voice document is a document that requires only the facilities described as mandatory in this specification. Such a document must meet all of the following criteria:

It must validate against the XML Schema provided in this document (see Appendix D).
The root element of the document must be html.
The name of the default namespace on the root element must be the XHTML namespace name: http://www.w3.org/1999/xhtml.
If a DOCTYPE declaration is present and includes a public identifier, the DOCTYPE declaration must reference the DTD provided in this document using its Formal Public Identifier. The system identifier may be modified appropriately.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "http://www.w3.org/Voice/Group/2001/xhtml+voice10.dtd">
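
As an informal illustration of the last criterion, a document stored next to a local copy of the DTD could keep the Formal Public Identifier unchanged and point the system identifier at that copy, as the examples in Appendix B do. The relative path shown assumes the DTD sits in the same directory as the document.

```xml
<!-- Sketch only: same public identifier, locally resolved system identifier. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
```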

3.2 User Agent Conformance

The user agent must conform to the "User Agent Conformance" section of the XHTML specification ([XHTML 1.0], section 3.2) and the conformance requirements detailed in the VoiceXML modules ([VoiceXML 2.0]) supported by the integration profile.

The user agent must conform to the following additional user agent rule:

When the user agent claims to support facilities defined within the VoiceXML 2.0 specifications or facilities required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.

3.3 XHTML Namespace Integration

In an XHTML document incorporating the voice functionality defined by the XHTML+Voice profile, the document's default XML namespace is still XHTML. Voice elements are included through an additional VXML namespace declaration:
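
The declaration below is reproduced from the examples in Appendix B as an illustration; the body of the document is omitted.

```xml
<!-- XHTML remains the default namespace; VoiceXML elements use the vxml prefix. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/voicexml20">
  <!-- head and body omitted -->
</html>
```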

The name of the unique identifier for the namespace within the document (in this example, vxml) is left to the discretion of the document author.

3.4 XHTML+Voice Profile

The XHTML functionality in the XHTML+Voice document type is based upon the XHTML modules defined in XHTML Modularization [XHTML Modularization]. The XHTML+Voice profile includes the XHTML modules defined in [XHTML Basic], such as the basic XHTML forms and tables modules. In addition, the XHTML+Voice document type supports the XHTML scripting module and XML Events as defined by the XML Events module [XML Events]. Finally, elements defined in the imported VoiceXML modules provide the ability to speech-enable XHTML constructs, and the VoiceXML event types and handlers allow the XHTML author to associate voice-interaction specific behaviors. The notation, terms and document conventions used here are borrowed from [XHTML 1.1].

The profile includes the following voice modules:

Speech and non-speech audio Output
Speech Dialogs
Speech Grammars
VoiceXML Event Types
VoiceXML Event Handlers

3.5 XHTML+Voice Modules

XHTML 1.1 is extended with voice modules by creating a new content model based on the XHTML 1.1 content model. The modifications include adding VoiceXML 2.0 with its content model, datatypes, and attributes to XHTML. This section specifies the modules needed to extend XHTML 1.1 with XML vocabularies defined as part of the W3C speech interface framework and create the XHTML+Voice profile.

File	Module	Purpose
xhtml+voice-model-1.mod	XHTML+Voice Document Model	Defines content model based on XHTML Basic for elements in XHTML+Voice
xhtml+voice-framework-1.mod	Framework	Includes the necessary modules for creating the XHTML+Voice profile
xhtml+voice-datatypes-1.mod	Datatypes	Imports VoiceXML datatypes into XHTML
xhtml+voice10.dtd	DTD	XHTML+Voice DTD
xhtml+voice.cat	Catalog	Catalog fragment for use with profile XHTML+Voice

3.6 Event types for XHTML+Voice

For a given XML language extended with XML Events, a set of event types must be specified independently of the [XML Events] module. The XML Event types supported by the XHTML+Voice profile include all event types defined for [HTML 4.01] intrinsic events. VoiceXML handler activation is specified by attaching to an XHTML element one of these event types as an XML event, together with an ID reference to the VoiceXML form as the XML event handler. The XHTML+Voice profile also supports VoiceXML 2.0 event types nomatch, noinput, error, and help. An additional event type, filled, is defined to have the same semantics as the VoiceXML element filled. Event filled is generated at the field or form level when a field is set after the prompted input matches the provided grammar.
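
Handler activation as described above can be sketched with a fragment adapted from Example B.2. The ev and vxml prefixes are assumed to be bound as in Appendix B; the form would normally be declared in the head and the input control in the body.

```xml
<!-- Voice handler: a VoiceXML form identified by id. -->
<vxml:form id="voice_city">
  <vxml:field name="field_city">
    <vxml:grammar src="city.srgf" type="application/x-srgf"/>
    <vxml:prompt>Please choose a city.</vxml:prompt>
  </vxml:field>
</vxml:form>

<!-- XHTML control: the onfocus event activates the handler by ID reference. -->
<input name="city" type="text" ev:event="onfocus" ev:handler="#voice_city"/>
```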

Profile XHTML+Voice extends the XHTML script element with XML Events. The script element does not generate any events of its own; hence attribute target is required to specify the element whose XML event is captured. Element script can target any XHTML or VoiceXML element and can specify any HTML 4.01 intrinsic event or VoiceXML event.
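
A script listener of this kind might be written as in the sketch below, which follows the pattern of Examples B.2 and B.4. How the VoiceXML field variable field_city becomes visible to the script, and the exact spelling of the event name (the examples use both filled and vxml:filled), are assumptions here rather than statements of the profile.

```xml
<!-- Sketch only: run a script when the voice_city form reports a filled event. -->
<script ev:target="#voice_city" ev:event="filled">
  // copy the recognized value into the XHTML input named city, as in Example B.2
  city = field_city;
</script>
```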

The following table shows the correspondence between the XHTML+Voice event types and the XHTML or VoiceXML elements that support them:

Elements	Event Type
XHTML body	onload, onunload
Most XHTML elements	onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmouseout, onkeypress, onkeydown, onkeyup
VoiceXML form	nomatch, noinput, error, help, filled
XHTML elements: a, label, input, select, textarea, button	onfocus, onblur
XHTML form	onsubmit, onreset
XHTML elements: input, textarea	onselect
XHTML elements: input, select, textarea	onchange

4 Extending Profile XHTML+Voice

This section is normative.

In the future, profile XHTML+Voice may be extended by other W3C recommendations, or by private extensions. For these extensions, the following rules must be obeyed:

All elements introduced in extensions must have a skip-content attribute if it should be possible that their content is processed by XHTML+Voice user agents.
Private extensions must be introduced by defining a new XML namespace.

Conformant XHTML+Voice user agents should be prepared to handle documents containing extensions that obey these two rules.
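
A private extension obeying both rules could look like the following sketch. The namespace URI, the ext prefix, and the gesture element are hypothetical, and the value false for skip-content (meaning that the content is still processed) is assumed by analogy with the attribute of the same name in SMIL; this profile does not itself define the permitted values.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a hypothetical private extension in its own namespace. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/voicexml20"
      xmlns:ev="http://www.w3.org/2001/xml-events"
      xmlns:ext="http://example.org/2001/voice-ext">
  <head>
    <title>Private Extension Sketch</title>
  </head>
  <body>
    <!-- A user agent that does not know ext:gesture can still process the paragraph. -->
    <ext:gesture skip-content="false">
      <p>This paragraph remains available to XHTML+Voice user agents.</p>
    </ext:gesture>
  </body>
</html>
```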

A Reusable Voice Handlers

This section is informative.

A VoiceXML form, defined here as an event handler, is more practical if it can be placed in a linked document separate from the XHTML as a reusable component. Reusable components allow easier maintenance, and provide default behaviors that can be used as application building-blocks. VoiceXML includes a subdialog construct and its calling convention is close to what is required for a reusable component. The problem is that the caller must know both the subdialog's parameters and the fields included in the ECMAScript object returned to the caller.

It is not within the scope of this profile to attempt to solve the problem of creating reusable dialog components within VoiceXML; this is the domain of the W3C Voice Working Group. Authoring conventions can, however, be suggested which should work in most cases. A VoiceXML handler can be placed in a separate file and linked from within an XHTML+Voice profile document if:

The handler is a subdialog that has a fixed number of parameters and return fields that are named according to a fixed naming convention.
The subdialog is called by a VoiceXML form within the XHTML+Voice profile document. The calling VoiceXML form is the handler activated by an XML event.

Appendix B.5 includes an example of how a subdialog can be reused by following the above authoring conventions.
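
For illustration, the linked subdialog document itself might be sketched as follows. The parameter names (paramSubdialogObj, paramPromptQuestion) and the return field (returnCityOrAirport) follow the naming used by the caller in Example B.5; the grammar file name is hypothetical, and the namespace URI is the one used throughout this note's examples. The content of the file is an assumption, since this note does not include it.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a possible cityorairport.vxml following the naming convention. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/voicexml20">
  <form id="cityorairport">
    <!-- parameters supplied by the caller through param elements -->
    <var name="paramSubdialogObj"/>
    <var name="paramPromptQuestion"/>
    <field name="returnCityOrAirport">
      <!-- cityorairport.srgf is a hypothetical grammar file -->
      <grammar src="cityorairport.srgf" type="application/x-srgf"/>
      <prompt><value expr="paramPromptQuestion"/></prompt>
      <filled>
        <!-- hand the recognized value back to the calling form -->
        <return namelist="returnCityOrAirport"/>
      </filled>
    </field>
  </form>
</vxml>
```

With this layout, the calling form can read the result as cityorairport.returnCityOrAirport, which is how Example B.5 refers to it.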

B Examples

This section is informative.

B.1 Basic Structure Of XHTML+Voice Documents

        
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:vxml="http://www.w3.org/2001/voicexml20" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Skeleton XHTML+Voice Document</title>
<!-- voice handlers -->
    <vxml:form id="sayHello">
      <vxml:block>Hello World</vxml:block>
    </vxml:form>
  </head>
  <body>
    <h1>Skeleton XHTML+Voice Document</h1>
    <p ev:event="onclick" ev:handler="#sayHello">
      This is a sample document that illustrates the markup
      structure of a conformant XHTML+Voice document.
      Notice that the default XML namespace is XHTML and,
      consequently, standard HTML element names do not need
      a namespace prefix.  We can add voice-interaction
      specific elements from the VoiceXML 2.0 namespace
      using prefix <code>vxml</code>.  We can attach event
      handlers using prefix <code>ev</code>.  Clicking
      anywhere on this paragraph results in a welcome
      message being spoken on account of attaching a
      <code>vxml:form</code> handler to this paragraph.
    </p>
  </body>
</html>

B.2 What You See Is What You Can Say

        
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:vxml="http://www.w3.org/2001/voicexml20" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>What You See Is What You Can Say</title>
<!-- first declare the voice handlers. -->
    <vxml:form id="voice_city">
      <vxml:field name="field_city">
        <vxml:grammar src="city.srgf" type="application/x-srgf"/>
        <vxml:prompt id="city_prompt">
          Please choose a city.
        </vxml:prompt>
        <vxml:catch event="help nomatch noinput">
          For example, say Chicago.
        </vxml:catch>
      </vxml:field>
    </vxml:form>
    <vxml:form id="voice_hotel">
      <vxml:field name="field_hotel">
        <vxml:grammar src="hotel.srgf" type="application/x-srgf"/>
        <vxml:prompt id="hotel_prompt">
          Select your hotel
        </vxml:prompt>
        <vxml:catch event="help nomatch noinput">
          For example, say Hilton.
        </vxml:catch>
        <vxml:filled>
          <vxml:prompt>
            You have chosen to stay at the 
            <vxml:value expr="field_hotel"/>.
          </vxml:prompt>
        </vxml:filled>
      </vxml:field>
    </vxml:form>
<!-- done voice handlers. -->
  </head>
  <body>
    <h1>What You See Is What You Can Say</h1>
    <p>This example demonstrates a simple voice-enabled GUI
      hotel picker  that permits the user to provide input
      using traditional GUI input peripherals,
      or speak the same information.
    </p>
    <h2>Hotel Picker</h2>
    <form id="hotel_query" method="post" action="cgi/hotel.pl">
      <p>Select a hotel in a city:</p>
      <input name="city" type="text" ev:event="onfocus" ev:handler="#voice_city"/>
      <input name="hotel" type="text" ev:event="onfocus" ev:handler="#voice_hotel"/>
<!-- Declare xhtml script handlers for setting inputs -->
      <script ev:target="#voice_city" ev:event="vxml:filled">
        city = field_city;
      </script>
      <script ev:target="#voice_hotel" ev:event="vxml:filled">
        hotel = field_hotel;
      </script>
<!-- done xhtml script handlers -->
      <input type="submit" value="Submit"/>
      <input type="reset"/>
    </form>
  </body>
</html>

B.3 Mixed-initiative Conversational Interface

        
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:vxml="http://www.w3.org/2001/voicexml20" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Mixed Initiative Conversational Interface</title>
<!-- first declare the voice handlers. -->
<!-- VXML form supporting a mixed-initiative grammar -->
    <vxml:form id="voice_city_hotel">
      <vxml:grammar src="city_hotel.srgf" type="application/x-srgf"/>
<!-- Mixed initiative form begins with initial prompt -->
      <vxml:initial name="start">
        <vxml:prompt>
          Please choose a city and hotel where you wish to stay.
        </vxml:prompt>
        <vxml:help>
          Please say the name of a city and a hotel to make
          a reservation.
        </vxml:help>
<!-- If user is silent, reprompt once, then try 
               directed prompts. -->
        <vxml:noinput count="1">
          <vxml:reprompt/>
        </vxml:noinput>
        <vxml:noinput count="2">
          <vxml:reprompt/>
          <vxml:assign name="start" expr="true"/>
        </vxml:noinput>
      </vxml:initial>
      <vxml:field name="field_city">
        <vxml:grammar src="city.srgf" type="application/x-srgf"/>
        <vxml:prompt id="city_prompt">
          Please choose a city.
        </vxml:prompt>
        <vxml:catch event="help nomatch noinput">
          For example, say Chicago.
        </vxml:catch>
        <vxml:filled>
<!-- Use assign to set the xhtml input -->
          <vxml:assign name="document.city" expr="field_city"/>
        </vxml:filled>
      </vxml:field>
      <vxml:field name="field_hotel">
        <vxml:grammar src="hotel.srgf" type="application/x-srgf"/>
        <vxml:prompt id="hotel_prompt">
          Select your hotel
        </vxml:prompt>
        <vxml:catch event="help nomatch noinput">
          For example, say Hilton.
        </vxml:catch>
        <vxml:filled>
          <vxml:prompt>
            You have chosen to stay at the
            <vxml:value expr="field_hotel"/>.
          </vxml:prompt>
          <vxml:assign name="document.hotel" expr="field_hotel"/>
        </vxml:filled>
      </vxml:field>
    </vxml:form>
<!-- done voice handlers -->
  </head>
  <body>
    <h1>Mixed-Initiative Conversational Interface</h1>
    <p>In this example, we demonstrate how the earlier example can
       be easily extended to support mixed-initiative dialog.  By 
       activating a grammar capable of recognizing both cities and
       hotel names for the entire application, the user can specify
       both hotel and city in a single utterance.  Alternatively,
       the user can fill one field at a time.
    </p>
    <h2>Hotel Picker</h2>
    <p>This voice-enabled application lets you pick a 
       city and a hotel.
    </p>
    <form id="xhtml_city_hotel" method="post" action="cgi/hotel.pl">
      <p>Select a hotel in a city:</p>
      <input name="city" type="text" ev:event="onfocus" ev:handler="#voice_city_hotel"/>
      <input name="hotel" type="text"/>
      <input type="submit" value="Submit"/>
      <input type="reset"/>
    </form>
  </body>
</html>

B.4 Speech-Enabled Mail Interface

This email message from the W3C voice browser working group archives has been speech-enabled to allow easy browsing of email on hand-held devices.

        
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:vxml="http://www.w3.org/2001/voicexml20" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Speech-enabled Email Browser</title>
    <script language="javascript">
      // define array holding command words -&gt; activate-id map.
      //
      //define function that takes a command word,
      // looks it up in the afore-mentioned map,
      // and activates the link.
      //function activate (command) {
      //...
      //
    </script>
    <script ev:target="#command-and-control" ev:event="filled">
      activate(word.value);
    </script>
    <vxml:form id="command-and-control">
<!-- your word is my command. -->
      <vxml:field name="word">
        <vxml:grammar src="mail.srgf"/>
        <vxml:catch event="help nomatch">
          This mail reader is speech-enabled. You can
          perform available actions via speech input.
        </vxml:catch>
      </vxml:field>
    </vxml:form>
  </head>
  <body ev:event="onload" ev:handler="#command-and-control"><h1>W3C Speech Interaction Framework</h1><strong>From:</strong> T. V. Raman
(<a href="mailto:tvraman@us.ibm.com?Subject=Re:%20W3C%20Speech%20Interface%20Framework"><em>tvraman@us.ibm.com</em></a>)<br/><strong>Date:</strong> Sat, Jan 01 2000 
    <ul class="noindent"><li><strong>Next message:</strong><a id="__next_message" href="0093.html">
 mxd@cisco.com: &quot;Re: [dialog] &lt;record&gt;'s dest attribute&quot;
          </a></li></ul><ul><li><strong>Previous message:</strong><a id="__prev_message" href="0091.html">
 Harish Varanasi: &quot;RE: [ dialog ] &lt;record&gt;'s dest attribute&quot;
          </a></li><li><strong>Messages sorted by:</strong><a id="__sort_by_date" href="index.html#92">
              [ date ]</a><a id="__sort_by_thread" href="thread.html#92">
              [ thread ]</a><a id="__sort_by_subject" href="subject.html#92">
              [ subject ]</a><a href="author.html#92">[ author ]</a></li><li><strong>Other mail archives:</strong><a id="__more_from_this_list" href="../">
            [ this mailing list ]</a><a id="__ohter_w3c_lists" href="../../">
            [ other W3C mailing lists ]</a></li><li><strong>Mail actions:</strong><a id="__reply_to_this_message" href="mailto:w3c-voice-wg@w3.org">
          [ respond to this message ]</a><a id="__mail_new_topic" href="mailto:w3c-voice-wg@w3.org">
     [ mail a new topic ]</a></li></ul><hr noshade="noshade"/><pre>
Message body was here.
    </pre><hr noshade="noshade"/></body>
</html>

B.5 Reusable Voice Subdialogs

A flight query is processed with two reusable voice subdialogs. One subdialog processes both arrival and departure city or airport, the other arrival and departure dates.

        
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+Voice//EN" "xhtml+voice10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:vxml="http://www.w3.org/2001/voicexml20" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Flight Query</title>
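<!-- load each library first, then create its object in a separate script
     element: inline content is ignored when src is present -->
    <script src="cityorairport.es"></script>
    <script>
      var objCityOrAirport = new CityOrAirport();
    </script>
    <script src="dateinfo.es"></script>
    <script>
      var objDateInfo = new DateInfo();
    </script>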
    <vxml:form id="voice_city_from">
      <vxml:subdialog name="cityorairport" src="cityorairport.vxml">
        <vxml:param name="paramSubdialogObj" expr="objCityOrAirport"/>
        <vxml:param name="paramPromptQuestion" expr="'What city or airport are you departing from?'"/>
        <vxml:filled>
          <vxml:prompt>
            You are departing from
            <vxml:value expr="cityorairport.returnCityOrAirport"/>.
          </vxml:prompt>
          <vxml:assign name="document.from" expr="cityorairport.returnCityOrAirport"/>
        </vxml:filled>
      </vxml:subdialog>
    </vxml:form>
    <vxml:form id="voice_city_to">
      <vxml:subdialog name="cityorairport" src="cityorairport.vxml">
        <vxml:param name="paramSubdialogObj" expr="objCityOrAirport"/>
        <vxml:param name="paramPromptQuestion" expr="'At what city or airport are you arriving?'"/>
        <vxml:filled>
          <vxml:prompt>
            You are arriving at
            <vxml:value expr="cityorairport.returnCityOrAirport"/>.
          </vxml:prompt>
          <vxml:assign name="document.to" expr="cityorairport.returnCityOrAirport"/>
        </vxml:filled>
      </vxml:subdialog>
    </vxml:form>
    <vxml:form id="voice_date_from">
      <vxml:subdialog name="dateinfo" src="dateinfo.vxml">
        <vxml:param name="paramSubdialogObj" expr="objDateInfo"/>
        <vxml:param name="paramPromptQuestion" expr="'What day, month, and year are you leaving?'"/>
        <vxml:filled>
          <vxml:prompt>
            You are departing on <vxml:value expr="dateinfo.returnDateInfo"/>.
          </vxml:prompt>
          <vxml:assign name="document.fromDate" expr="dateinfo.returnDateInfo"/>
        </vxml:filled>
      </vxml:subdialog>
    </vxml:form>
    <vxml:form id="voice_date_to">
      <vxml:subdialog name="dateinfo" src="dateinfo.vxml">
        <vxml:param name="paramSubdialogObj" expr="objDateInfo"/>
        <vxml:param name="paramPromptQuestion" expr="'What day, month, and year are you arriving?'"/>
        <vxml:filled>
          <vxml:prompt>
            You are arriving on <vxml:value expr="dateinfo.returnDateInfo"/>.
          </vxml:prompt>
          <vxml:assign name="document.toDate" expr="dateinfo.returnDateInfo"/>
        </vxml:filled>
      </vxml:subdialog>
    </vxml:form>
  </head>
  <body>
    <h1>Multimodal Flight Query</h1>
    <form method="post" action="/servlet/flightServlet">
      <table border="0" summary="Leave and return airport, date, and time">
        <tr>
          <td width="15%">
            <label for="from">Leaving From:</label>
          </td>
          <td colspan="2">
            <input type="text" id="from" size="20" ev:event="onclick" ev:handler="#voice_city_from"/>
          </td>
        </tr>
        <tr>
          <td width="15%">
            <label for="to">Arriving At:</label>
          </td>
          <td colspan="2">
            <input type="text" id="to" size="20" ev:event="onclick" ev:handler="#voice_city_to"/>
          </td>
        </tr>
        <tr>
          <td width="15%">
            <label for="fromDate">Travel Date:</label>
          </td>
          <td width="35%">
            <input type="text" id="fromDate" size="20" ev:event="onclick" ev:handler="#voice_date_from"/>
          </td>
          <td width="50%">
            <div class="c1">
              <label>Time of Day:</label>
              <br/>
              <table width="100%" border="0" summary="leave am or pm">
                <tr>
                  <td align="left">
                    <input type="checkbox" id="departam" value="checkbox"/>
                    <label for="departam">am</label>
                  </td>
                  <td align="left">
                    <input type="checkbox" id="departpm" value="checkbox"/>
                    <label for="departpm">pm</label>
                  </td>
                </tr>
              </table>
            </div>
          </td>
        </tr>
        <tr>
          <td width="15%">
            <label for="toDate">Return Date:</label>
          </td>
          <td width="35%">
            <input type="text" id="toDate" size="20" ev:event="onclick" ev:handler="#voice_date_to"/>
          </td>
          <td width="50%">
            <div class="c1">
              <label>Time of Day:</label>
              <br/>
              <table width="100%" border="0" summary="return am or pm">
                <tr>
                  <td align="left">
                    <input type="checkbox" id="departam2" value="checkbox"/>
                    <label for="departam2">am</label>
                  </td>
                  <td align="left">
                    <input type="checkbox" id="departpm2" value="checkbox"/>
                    <label for="departpm2">pm</label>
                  </td>
                </tr>
              </table>
            </div>
          </td>
        </tr>
      </table>
      <br/>
      <table align="center">
        <tr>
          <td align="center" width="80%">
            <input type="submit" value="Submit"/>
          </td>
          <td>
            <input type="reset"/>
          </td>
        </tr>
      </table>
    </form>
  </body>
</html>

C DTD

This section provides the DTD that formally defines the XHTML+Voice integration profile. This section is normative.

C.1 xhtml+voice10.dtd

The individual modules that make up the DTD for the xhtml+voice10 profile, along with the top-level driver file, are packaged together and available with this note; see xhtml+voice-dtd.zip.

D Schema

This section provides the XML Schema that formally defines the XHTML+Voice profile. This section is normative.

The files defining the XHTML+Voice profile are available as a zip archive (xhtml+voice-schema.zip) with this note.

E References

E.1 Normative References

XForms: XForms 1.0, Micah Dubinko, Josef Dietl, Roland Merrick, Dave Raggett, T. V. Raman, Linda Bucsay Welsh, 2001.
XHTML Basic: XHTML Basic, 19 December 2000, Mark Baker, Masayasu Ishikawa, Shinichi Matsui, Peter Stark, Ted Wugofski, Toshihiko Yamakami.
CSS2: Cascading Style Sheets, level 2 (CSS2) Specification, Bert Bos, Håkon Wium Lie, Chris Lilley, Ian Jacobs, 1998. W3C Recommendation available at: http://www.w3.org/TR/REC-CSS2.
DOM2 Events: Document Object Model (DOM) Level 2 Events Specification, Tom Pixley, 2000. W3C Recommendation available at: http://www.w3.org/TR/DOM-Level-2-Events/.
HTML 4.01: HTML 4.01 Specification, Dave Raggett, Arnaud Le Hors, Ian Jacobs, 1999. W3C Recommendation available at: http://www.w3.org/TR/html4/.
RFC 2396: RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax, Tim Berners-Lee, et al., 1998. Available at: http://www.ietf.org/rfc/rfc2396.txt.
WML1.3: Wireless Application Protocol Wireless Markup Language Specification Version 1.3, Wireless Application Protocol Forum, Ltd., 2000. Available at: http://www1.wapforum.org/tech/documents/WAP-191-WML-20000219-a.pdf.
XML Events: XML Events - An events syntax for XML, Steven Pemberton, T. V. Raman and Shane P. McCarron, 2001. W3C Working Draft available at: http://www.w3.org/TR/xhtml-events.
Speech Grammars: Speech Recognition Grammar Format (Members only), Andrew Hunt and Scott McGlashan, 9th May 2001. Available at: http://www.w3.org/Voice/Group/2001/grammar-spec-20010509.html
SSML 1.0: Speech Synthesis Markup Language Specification, Mark Walker and Andrew Hunt, 8th August 2000. Available at: http://www.w3.org/TR/speech-synthesis
SRGF: Speech Recognition Grammar Specification for the W3C Speech Interface Framework, Andrew Hunt (SpeechWorks International), Scott McGlashan (PipeBeach). Available at: http://www.w3.org/tr/speech-grammar/
VoiceXML 2.0: Voice Extensible Markup Language (VoiceXML), Scott McGlashan et al. Available at: http://www.w3.org/tr/voicexml20
XHTML Modularization: Modularization of XHTML, Murray Altheim, Frank Boumphrey, Sam Dooley, Shane McCarron, Sebastian Schnitzenbaumer, Ted Wugofski. Available at: http://www.w3.org/TR/xhtml-modularization/
XHTML 1.1: XHTML 1.1 - Module-based XHTML, Murray Altheim, Shane McCarron. Available at: http://www.w3.org/TR/xhtml11/
XLink: XML Linking Language (XLink) Version 1.0, Steve DeRose, Eve Maler, David Orchard, 2000. W3C Proposed Recommendation available at: http://www.w3.org/TR/xlink/.
XML 1.0: Extensible Markup Language (XML) 1.0 (Second Edition), Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, 2000. W3C Recommendation available at: http://www.w3.org/TR/REC-xml.
XML Names: Namespaces in XML, Tim Bray, Dave Hollander, Andrew Layman, 1999. W3C Recommendation available at: http://www.w3.org/TR/REC-xml-names.
XPath 1.0: XML Path Language (XPath) Version 1.0, James Clark, Steve DeRose, 1999. W3C Recommendation available at: http://www.w3.org/TR/xpath.
XSchema-1: XML Schema Part 1: Structures, Henry S. Thompson, David Beech, Murray Maloney, Noah Mendelsohn, 2001. W3C Recommendation available at: http://www.w3.org/TR/xmlschema-1/.
XSchema-2: XML Schema Part 2: Datatypes, Paul V. Biron, Ashok Malhotra, 2001. W3C Recommendation available at: http://www.w3.org/TR/xmlschema-2/.
XHTML 1.0: XHTML 1.0: The Extensible HyperText Markup Language - A Reformulation of HTML 4 in XML 1.0, Steven Pemberton, et al., 2000. W3C Recommendation available at: http://www.w3.org/TR/xhtml1.

E.2 Informative References

ECMA 262: ECMA-262: ECMAScript Language Specification, European Computer Manufacturers' Association (ECMA), 1999. Available at ftp://ftp.ecma.ch/ecma-st/Ecma-262.pdf.
RFC 2141: RFC 2141: URN Syntax, R. Moats, 1997. Available at: http://www.ietf.org/rfc/rfc2141.txt.
XSchema-0: XML Schema Part 0: Primer, David C. Fallside, 2001. W3C Recommendation available at: http://www.w3.org/TR/xmlschema-0/.
XSLT: XSL Transformations (XSLT) Version 1.0, James Clark, 1999. W3C Recommendation available at: http://www.w3.org/TR/xslt.