W3C

Voice Extensible Markup Language (VoiceXML) Version 2.0
Candidate Recommendation Disposition of Comments

This version:
January 19, 2004
Editor:
Scott McGlashan, Hewlett-Packard

Abstract

This document details the responses made by the Voice Browser Working Group to issues raised during the Candidate Recommendation review (beginning 28th January 2003 and ending 10th April 2003) of Voice Extensible Markup Language (VoiceXML) Version 2.0. Comments were provided by Voice Browser Working Group members, other W3C Working Groups, and the public via the archived www-voice-request@w3.org mailing list.

Status

This document of the W3C's Voice Browser Working Group describes the disposition of comments as of January 19, 2004 on the Voice Extensible Markup Language (VoiceXML) Version 2.0 Candidate Recommendation. It may be updated, replaced, or rendered obsolete by other W3C documents at any time.

For background on this work, please see the Voice Browser Activity Statement.

1. Introduction

This document describes the disposition of comments in relation to the Voice Extensible Markup Language (VoiceXML) Version 2.0 (http://www.w3.org/TR/2003/CR-voicexml20-20030220/). Each issue is described by the name of the commentator, a description of the issue, and either the resolution or the reason that the issue was not resolved.

The full set of issues raised for the Voice Extensible Markup Language (VoiceXML) Version 2.0 since August 2000, their resolutions, and in most cases the reasoning behind each resolution are available from http://www.w3.org/Voice/Group/2004/voicexml-change-requests.htm [W3C Members Only]. This document provides the analysis of the issues that were submitted and resolved as part of the Candidate Recommendation review.

Notation: Each original comment is tracked by a "Change Request" [CR] designator. Each point within that original comment is identified by a point number. For example, "CR5-1" is the first point in the fifth change request for the specification.

2. Comments

Item    Commentator    Nature    Disposition
CR1-1    Arnaud Vallee    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR2-1    Arnaud Vallee    Technical Error (§2.2)     accepted (no reply)   
CR3-1    Arnaud Vallee    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR4-1    Arnaud Vallee    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR5-1    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-2    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-3    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-4    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-5    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-6    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-7    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-8    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-9    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-10    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-11    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-12    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-13    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-14    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-15    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR5-16    Guillaume Berche     Change to Existing Feature (§2.3)     accepted   
CR6-1    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-2    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-3    Guillaume Berche     Change to Existing Feature (§2.3)     accepted   
CR6-4    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-5    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-6    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-7    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-8    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-9    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-10    Guillaume Berche     Technical Error (§2.2)     accepted   
CR6-11    Guillaume Berche     Technical Error (§2.2)     accepted   
CR6-12    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR6-13    Guillaume Berche     Clarification / Typographical / Editorial (§2.1)     accepted   
CR7-1    Max Froumentin     Clarification / Typographical / Editorial (§2.1)     accepted   
CR8-1    Matt Porter    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR9-1    John Voger    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR10-1    Philippe Le Hegaret     Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-1    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-2    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-3    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-4    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-5    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-6    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-7    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR11-8    C. M. Sperberg-McQueen    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR12-1    Guillaume Berche    Clarification / Typographical / Editorial (§2.1)     accepted   
CR12-2    Guillaume Berche    Clarification / Typographical / Editorial (§2.1)     accepted   
CR12-3    Guillaume Berche    Clarification / Typographical / Editorial (§2.1)     accepted   
CR13-1    Greg FitzPatrick    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR14-1    Guillaume Berche    Clarification / Typographical / Editorial (§2.1)     accepted   
CR14-2    Guillaume Berche    Clarification / Typographical / Editorial (§2.1)     accepted   
CR15-1    Ufuk Kayserilioglu    Clarification / Typographical / Editorial (§2.1)     accepted   
CR16-1    Mark Clark    Clarification / Typographical / Editorial (§2.1)     accepted (no reply)   
CR17-1    Robert Barkan    Clarification / Typographical / Editorial (§2.1)     accepted   
CR18-1    Mark Clark    Change to Existing Feature (§2.3)     accepted (no reply)   
CR19-1    Pavel Cenek     Feature Request (§2.4)     accepted   
CR19-2    Pavel Cenek     Feature Request (§2.4)     accepted   

2.1 Clarifications, Typographical, and Other Editorial

Issue CR1-1

From Arnaud Vallee

I have a question about where the error.badfetch is thrown and caught when a
called document has non existent root document.

Take the following scenario.

The document 1 makes a transition to document 2 whose root document does not
exist.  document 1 and document 2 have error.badfetch handler at the document
level.  Where is the error supposed to be caught?

I think the question could be the same for the following assertion: If a
document's application attribute refers to a document that also has an
application attribute specified, an error.semantic event is thrown.

As I did not get any answer to the message, I am posting my query one more time.
The issue is as follows:

    In a document named doc1.vxml, which is a root document (do not
    specify an application attribute in the vxml tag), we transition to a
    document doc2.vxml.  doc2.vxml refers to a non existing root document
    (i.e., application attribute set to doc2-root-unexisting.vxml).

As the spec says (chap 1.5.2), " If a document refers to a non-existent
application root document, an error.badfetch event is thrown ", an
error.badfetch is thrown in this case.

The question: where is the error thrown, or in other way, where do i put the
error.badfetch handler to catch the error?

I see 2 possibilities:
- in doc1.vxml, which means that if a document refers to a non existing root
document, it is a badfetch to try to get this document.
- in doc2.vxml, which means that current document has to be initialized before
getting and initializing the root document.

I think this is the same issue with the following assertion in chapter 1.5.2:
"If a document's application attribute refers to a document that also has an
application attribute specified, an error.semantic event is thrown. "

except that, in this case, the error.semantic could also be catched in the
first root document.

Analysis:
[Pavel Cenek]
I am not member of WBWG, so my answer is only a guess. I also waited for
an authorized answer and therefore haven't reacted on your first attempt.

> The issue is as follows:
>
> In a document named doc1.vxml, which is a root document (do not specify an
> application attribute in the vxml tag), we transition to a document doc2.vxml.
> doc2.vxml refers to a non existing root document (i.e., application
> attribute set to doc2-root-unexisting.vxml).
>
> As the spec says (chap 1.5.2),
> " If a document refers to a non-existent application root document, an
> error.badfetch event is thrown ",
> an error.badfetch is thrown in this case.
>
> The question: where is the error thrown, or in other way, where do i put
> the error.badfetch handler to catch the error?

The transition is caused by <goto> or <submit>, etc, therefore I would 
apply the rules for these tags (which should be the same for all of 
them). For <goto>, spec says:
"Note that for errors which occur during a dialog or document 
transition, the scope in which errors are handled is platform specific."

> I see 2 possibilities:
> - in doc1.vxml, which means that if a document refers to a non existing root
> document, it is a badfetch to try to get this document.

In my opinion this possibility is more logical.

> - in doc2.vxml, which means that current document has to be initialized
> before getting and initializing the root document.
>
> I think this is the same issue with the following assertion in chapter
> 1.5.2: "If a document's application attribute refers to a document that also
> has an application attribute specified, an error.semantic event is thrown. "

I think it would be valuable to mention the citation above also in the 
chapter one.

Resolution: rejected

The specification allows the error.badfetch event to be thrown in either the referring document or the referred document. To guarantee that the error is caught, catch handlers need to be specified in both documents. This error handling pattern is illustrated in numerous tests in our implementation report.
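
The pattern described in this resolution can be sketched as follows, reusing the file names from the comment (doc1.vxml, doc2.vxml, and the missing doc2-root-unexisting.vxml). The form ids and prompt wording are illustrative assumptions, and which of the two handlers actually runs remains platform-specific.

```xml
<!-- File doc1.vxml: the referring document. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Handles the error if the platform throws it in the referring document. -->
  <catch event="error.badfetch">
    <prompt>Sorry, the next page could not be loaded.</prompt>
    <exit/>
  </catch>
  <form id="start">
    <block>
      <goto next="doc2.vxml"/>
    </block>
  </form>
</vxml>

<!-- File doc2.vxml: the referred document, whose application root is missing. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="doc2-root-unexisting.vxml">
  <!-- Handles the error if the platform throws it in the referred document. -->
  <catch event="error.badfetch">
    <prompt>Sorry, this application is not available.</prompt>
    <exit/>
  </catch>
  <form id="main">
    <block>
      <prompt>Welcome.</prompt>
    </block>
  </form>
</vxml>
```

Placing a handler in both files is what makes the outcome independent of the scope the platform chooses.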

Email Trail:

Issue CR3-1

From Arnaud Vallee

chapter 2.4 of the VoiceXML (24 April 2002)

Attributes of filled are:

mode Either all (the default), or any. If any, this action is executed when
     any of the specified input items is filled by the last user input. If
     all, this action is executed when all of the mentioned input items are
     filled, and at least one has been filled by the last user input. A
     <filled> element in an input item cannot specify a mode.

namelist The input items to trigger on. For a <filled> in a form, namelist
     defaults to the names (explicit and implicit) of the form's input
     items. A <filled> element in an input item cannot specify a
     namelist; the namelist in this case is the input item name. Note that
     control items are not permitted in this list.

As i understand these attributes are not permitted in filled elements which
are child of input item.  But the spec do not say what happens in this case:
- ignore those attributes?
- throw an error (semantic)?

Furthermore, control items are not permitted in namelist. I suppose any
other ECMA variable are not permitted neither. But how a voice browser should
handle that case? Ignore the non-input variable elements or throw an error
(semantic)?

Resolution: accepted with modifications

The specification will be modified so that, when the platform encounters a document in which a <filled> element that is a child of an input item specifies a 'mode' or 'namelist' attribute, it throws an error.badfetch. The specification will also make clear that an error.badfetch is thrown when a document contains a <filled> element whose namelist attribute references a control item variable.
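
As a minimal sketch of the usage this resolution leaves valid, the form below puts a bare <filled> inside an input item and a <filled> with 'mode' and 'namelist' at form level. The form id, field names, and grammar URIs (city.grxml, day.grxml) are hypothetical.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="trip">
    <field name="city">
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city?</prompt>
      <!-- Inside an input item: neither mode nor namelist may be given. -->
      <filled>
        <log>city was filled</log>
      </filled>
    </field>
    <field name="day">
      <grammar src="day.grxml" type="application/srgs+xml"/>
      <prompt>Which day?</prompt>
    </field>
    <!-- At form level: mode and namelist are allowed; namelist names input items only. -->
    <filled mode="all" namelist="city day">
      <log>city and day are both filled</log>
    </filled>
  </form>
</vxml>
```

By contrast, adding mode="any" to the <filled> inside the city field, or listing the name of a control item such as a <block> in the form-level namelist, would fall under the error.badfetch cases described above.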

Email Trail:

Issue CR4-1

From Arnaud Vallee

The bargeintype property is defined as follows:

"speech: The prompt will be stopped as soon as speech or DTMF input is
detected. The prompt is stopped irrespective of whether or not the input
matches a grammar. "

Would this mean that even if no dtmf grammar is active and the user enter a
dtmf, the prompt should be stopped?

Resolution: accepted with modifications

Yes. If bargeintype is speech, the prompt is stopped as soon as speech or DTMF input is detected, whether or not the input matches a grammar. Whether DTMF grammars are active does not affect this. Setting inputmodes to voice should prevent DTMF from barging in on prompts (although some platforms may have difficulty separating in-band DTMF from speech). The specification will be clarified by adding the words "and irrespective of which grammars are active." to the end of the sentence "The prompt is stopped irrespective of whether or not the input matches a grammar" in Table 38.
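
A small sketch of the combination mentioned in this resolution: bargeintype set to speech, with inputmodes restricted to voice so that DTMF keypresses are not expected to interrupt the prompt. The form id, field name, and grammar URI (yesno.grxml) are hypothetical, and the caveat about in-band DTMF still applies.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirm">
    <!-- Any detected input stops the prompt, whether or not it matches a grammar. -->
    <property name="bargeintype" value="speech"/>
    <!-- Restrict input to voice so DTMF should not barge in. -->
    <property name="inputmodes" value="voice"/>
    <field name="answer">
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <prompt bargein="true">Do you want to hear your balance?</prompt>
    </field>
  </form>
</vxml>
```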

Email Trail:

Issue CR5-1

From Guillaume Berche

0- Precise the value of the _dtmf special variable when a grammar element is
specified in a choice element.

As specified in the section "2.2 Menus", paragraph "Choice element": "If a
<grammar> element is specified in <choice>, then the external grammar is
used instead of an automatically generated grammar."

However, in such case it is not clear what value will be assigned in the
_dtmf special variable while executing an enumerate element.

Suggested text modification to "2.2.4 ENUMERATE":

"This specifier may refer to two special variables: _prompt is the choice's
prompt, and _dtmf is the choice's assigned DTMF sequence. **If no DTMF
sequence is assigned to the choice element or if a <grammar> element is
specified in <choice> then the _prompt variable is assigned the ECMAScript
undefined value.**"

Resolution: accepted with modifications

We accept the suggested text but will re-word it more precisely (e.g. '_dtmf' instead of '_prompt').
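
For illustration, the menu below uses both special variables inside <enumerate>. Every choice carries an explicit dtmf attribute, so _dtmf has a value for each one; under the reworded text, a choice with no assigned DTMF sequence or with a nested <grammar> would instead leave _dtmf undefined. The dialog ids and prompt wording are assumptions.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>
      <enumerate>
        For <value expr="_prompt"/>, press <value expr="_dtmf"/>.
      </enumerate>
    </prompt>
    <choice dtmf="1" next="#sports">sports</choice>
    <choice dtmf="2" next="#weather">weather</choice>
  </menu>
  <form id="sports">
    <block><prompt>Here is the sports news.</prompt></block>
  </form>
  <form id="weather">
    <block><prompt>Here is the weather report.</prompt></block>
  </form>
</vxml>
```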

Email Trail:

Issue CR5-2

From Guillaume Berche

1- Precise semantics of id attribute of form and menu

The id attribute is optional according to the schema. However the
specifications do not seem to precise how the interpreter should handle
dialogs without specified id.

Suggested text modification to section "2.1 Forms":

"id         The optional name of the form. If specified, the form can be
referenced within the document or from another document. For instance <form
id="weather">, <goto next="#weather">. **If not specified, an internal name
is generated by the interpreter instead.**"

Suggested text modification to section "2.2 Menus":

"id The optional identifier of the menu. It allows the menu to be the target
of a <goto> or a <submit>. **If not specified, an internal name is generated
by the interpreter instead.**"

Resolution: rejected

If no explicit id is specified, then the developer is not interested in referring to the form or menu element. Whether or not the platform generates an internal name is a vendor-specific issue.

Email Trail:

Issue CR5-3

From Guillaume Berche

2- Precise that <value> should be ignored if the expression resolves to
ECMAScript undefined

There are cases where it is difficult to know whether a variable (such as
special variable as _dtmf) has a non-null value without writing an explicit
if statement. To avoid this, it would be convenient if value elements would
be silently ignored if their expressions resolved into the ECMAScript
undefined value (whereas references to undeclared variables would keep
throwing an error.semantic event).

Suggested text modification to section "4.1.4 <value> Element":

"expr The ECMAScript expression which provides the text to render, or
resolves into a special variable such as _prompt or _dtmf as specified in
section "2.2 Menus" paragraph "Enumerate element". If the expression
resolves into the ECMAScript undefined value, then the value element is
silently ignored. However, if the expression refers to an undeclared
variable, then an error.semantic event is thrown."

Resolution: rejected

As pointed out, the developer can always write explicit code to check the value of variables. The value of providing a 'convenience' interpretation is not clear to us.

Email Trail:

Issue CR5-4

From Guillaume Berche

3- Precise the value of _prompt when an option has no nested CDATA

As specified in "2.3.1.3. Fields Using Option Lists": "The default
assignment is the CDATA content of the <option> element with leading and
trailing white space removed. If this does not exist, then the DTMF sequence
is used instead."

Since the value of the _prompt variable is computed from the CDATA content,
what values is assigned to the _prompt variable when no CDATA content is
available in an option element? If the undefined value is assigned to the
_prompt special variable, would a <value expr="_prompt"> element fail?

Suggested modification: "if no CDATA is available from the <option> or
<choice> element, then the _prompt special variable is assigned the
undefined ECMAScript value."

Resolution: rejected

Having considered various alternatives including your suggestion, the group felt that at this stage in the process it is better to leave the behavior undefined and thereby platform-specific. A later version of VoiceXML may provide a more optimal solution.

Email Trail:

Issue CR5-5

From Guillaume Berche

4- precise the semantics of the value attribute of option elements

Section "2.3.1.3. Fields Using Option Lists" specifies the following: "value
The string to assign to the field's form item variable when a user selects
this option, whether by speech or DTMF. The default assignment is the CDATA
content of the <option> element with leading and trailing white space
removed. If this does not exist, then the DTMF sequence is used instead. "

However, the DTMF sequence is optional according to the schema.
Consequently, it would be useful to precise the behavior if unspecified

Suggested text modification to section "2.3.1.3. Fields Using Option Lists":

"Each <option> element contains PCDATA that is used to generate a speech
grammar. This follows the grammar generation method described for <choice>
in Section 2.2. Attributes may be used to specify a DTMF sequence for each
option and to control the value assigned to the field's form item variable.
Each option should at least define a DTMF sequence through the dtmf
attribute or contain CDATA content specifying the matching speech element,
otherwise an error.badfetch event is thrown."

Resolution: accepted with modifications

We will modify the specification so that, where neither CDATA content nor a DTMF sequence is specified, the default for the value attribute is undefined and the field's form item variable is not filled.
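
The defaulting rules for the value attribute quoted in this issue can be sketched with three options. The form id, field name, and wording are illustrative assumptions.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="dinner">
    <field name="maincourse">
      <prompt>
        Say swordfish or press 1, say roast beef or press 2,
        or press 3 for the special.
      </prompt>
      <!-- value attribute given: maincourse is set to "fish". -->
      <option dtmf="1" value="fish">swordfish</option>
      <!-- No value attribute: the default is the content, "roast beef". -->
      <option dtmf="2">roast beef</option>
      <!-- No value attribute and no content: the default is the DTMF sequence, "3". -->
      <option dtmf="3"/>
      <filled>
        <log>maincourse is <value expr="maincourse"/></log>
      </filled>
    </field>
  </form>
</vxml>
```

An option with neither content nor a dtmf attribute is the case covered by the modification above: its default value is undefined and the field is not filled.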

Email Trail:

Issue CR5-6

From Guillaume Berche

5- Precise the format of the _dtmf special variable.

Section "2.2 Menus", paragraph "Enumerate element" states that "specifier
may refer to two special variables: _prompt is the choice's prompt, and
_dtmf is the choice's assigned DTMF sequence." However it does not precise
how the DTMF sequence is formatted (whether there are white space delimiters
that makes the string suitable for direct inclusion within a speech prompt)

Suggested text modification to section "2.2 Menus", paragraph "Enumerate
element":
"_prompt is the choice's prompt, and _dtmf is the choice's assigned DTMF
sequence formatted as a string holding the DTMF keystrokes separated by
white spaces (making it suitable for inclusion within a speech prompt)"

Resolution: accepted with modifications

The specification will be modified so that the format of _dtmf is a normalized representation of the DTMF sequence (i.e., a single whitespace character between DTMF tokens).

Email Trail:

Issue CR5-7

From Guillaume Berche

6- Precise the semantics of the dtmf attribute of option elements

Suggested modification to section "2.3.1.3. Fields Using Option Lists":

"dtmf    An **optional** DTMF sequence for this option. It is equivalent to
a simple DTMF <grammar> and DTMF properties (Section 6.3.3) apply to
recognition of the sequence. Unlike DTMF grammars, whitespace is optional:
dtmf="123#" is equivalent to dtmf="1 2 3 #". **If unspecified, no DTMF
grammar is associated to this option, meaning that this option can not be
matched using a DTMF**"

Rationale: it would make sense to add an option similar to the menu's dtmf
attribute so that dtmf sequence is automatically generated. Without this
attribute, how would an VXML author prevent the automatic generation of DTMF
grammars that may override other grammars (such as links)?
In addition, we would also need to specify what happens if a specified
option's dtmf attributes overlaps an automatically assigned dtmf. Should
this throw an "error.semantic" event as for choice elements or should we
rather apply the default grammar precedence algorithm to select the matching
element?

Resolution: accepted with modifications

We accept the suggested modification to 2.3.1.3 concerning the description of the dtmf attribute based on an alternative rationale; namely, that this is good clarification independent of the new features you mentioned in your rationale.

Email Trail:

Issue CR5-8

From Guillaume Berche

7- Precise semantics of Clear element.

Section "5.3.3 CLEAR" states that "The <clear> element resets one or more
form items" However, the definition of the namelist attribute adds that
"this [i.e. the namelist] can include variable names other than form items"
Besides, in the case where the namelist includes variable names other than
form items, what is the variable scope in which the variable must be defined
to be cleared?

Since a Clear element is an executable which may be included in a catch
element, which variable scope does it targets? In other words, would the
reset of a non-form item variable target the anonymous, dialog, document or
application-level scope?
[In addition, the Clear element may be invoked outside of the FIA (such as
during the document initialization), in which the notion of active element
is not clear, so relying on the scope of the active element as the scope in
which a variable should be cleared is ambiguous.]

Suggested text modification to Section "5.3.3 CLEAR":
"The <clear> element resets one or more form items, and possibly other
variables which are not form items. For each specified variable name, the
variable is resolved in the closest enclosing scope of the currently active
element as described in section "5.1.3 Referencing Variables". To remove
ambiguity, each variable name in the namelist may be prefixed with a scope
name as described in section "5.1.3 Referencing Variables".

Once a declared variable has been identified as declared in a given scope S,
its value is assigned the ECMAScript undefined value. In addition, if the
variable name corresponds to a form item in scope S, then the form item's
prompt counter and event counters are reset."

Resolution: accepted with modifications

We accept that the clear element should be clarified as your text suggests. However, we will modify the wording so that (a) variable references are resolved relative to the current scope as described in section 5.1.3, and (b) in the case of initialization, variable references are handled the same as for other ECMAScript variables.
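
A minimal sketch of <clear> applied to form items together with an ordinary variable, with names resolved relative to the current scope. The variable note, the field names, the grammar URIs, and the assumption that the yes/no grammar returns the string 'no' are all hypothetical.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- A dialog-scope variable that is not a form item. -->
    <var name="note" expr="'pending'"/>
    <field name="size">
      <grammar src="size.grxml" type="application/srgs+xml"/>
      <prompt>What size would you like?</prompt>
    </field>
    <field name="confirm">
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <prompt>You chose <value expr="size"/>. Is that right?</prompt>
      <filled>
        <!-- Assumes the grammar returns the string 'no' for a negative answer. -->
        <if cond="confirm == 'no'">
          <!-- Resets both form items and sets note to undefined. -->
          <clear namelist="size confirm note"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```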

Email Trail:

Issue CR5-9

From Guillaume Berche

8- Precise that var name attribute does not support scope prefixes

Suggested text modification to section "5.3.1 VAR":
 "name        The name of the variable that will hold the result. **Unlike
the name attribute of assign element, this attribute should not contain dots
(and in particular a scope prefix). The scope in which the variable is
defined is determined from the position in the document at which the var
element is declared.**"

Resolution: accepted with modifications

We accept the suggestion but will modify the text style for consistency with the rest of the document.

Email Trail:

Issue CR5-10

From Guillaume Berche

9- Precise that the assign's name attribute does support scope prefixes

The scope in which a variable is resolved is currently not clear. The
accepted scope prefix in the name attribute is also not clear.

Suggested text modification to section "5.3.2 ASSIGN"

"name The name of the variable being assigned to. As specified in section
"5.1.2 Variable Scopes", the corresponding variable should have been
previously declared otherwise an error.semantic event is thrown. By default,
the scope in which the variable is resolved is the closest enclosing scope
of the currently active element. To remove ambiguity, the variable name may
be prefixed with a scope name as described in section "5.1.3 Referencing
Variables". Note however that the name must refer to a variable and can not
refer to a property of an ECMAScript object or can not be a complex
ECMAScript
expression."

Resolution: accepted with modifications

We accept the suggested text modification but not the final sentence beginning "Note however", since it is permissible to assign to the property of an object; the second example in 5.3.2 makes this clear - <assign name="document.mycost" expr="document.mycost+14"/>.
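
The difference between <var> (no scope prefix; the scope follows from where the element appears) and <assign> (scope prefix allowed) can be sketched as follows. The assignment to document.mycost is taken from the resolution above; the form id and the variable count are assumptions.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared at document level, so it lives in document scope. -->
  <var name="mycost" expr="0"/>
  <form id="billing">
    <!-- Declared inside the form, so it lives in dialog scope. -->
    <var name="count" expr="0"/>
    <block>
      <!-- Unprefixed name: resolved starting from the current scope. -->
      <assign name="count" expr="count + 1"/>
      <!-- Prefixed name: explicitly targets the document-scope variable. -->
      <assign name="document.mycost" expr="document.mycost+14"/>
    </block>
  </form>
</vxml>
```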

Email Trail:

Issue CR5-11

From Guillaume Berche

10- Precise evaluation order of log attributes versus nested text/value, and
constraints on attributes

Suggested modification to section "5.3.13 LOG":

"label An **optional** string which may be used, for example, to indicate
the purpose of the log.
expr An **optional** ECMAscript expression evaluating to a string.
"

"The <log> element may contain any combination of text (CDATA) and <value>
elements. The generated message consists of the concatenation of the
evaluation of the ECMAscript expression followed in their respective order
by the nested text and the string form of the value of the "expr" attribute
of the <value> elements."

Resolution: accepted with modifications

We accept the clarification of 'optional' but not the last paragraph describing the order of evaluation - the order is already specified as document order.
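
As a brief illustration of the optional attributes, the block below logs once using only label and expr and once using only mixed content. The form id, label text, and variable are assumptions.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="checkout">
    <var name="count" expr="3"/>
    <block>
      <!-- label and expr are both optional; here both are given. -->
      <log label="checkout" expr="'items=' + count"/>
      <!-- Content may mix text and <value> elements. -->
      <log>Order contains <value expr="count"/> items.</log>
    </block>
  </form>
</vxml>
```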

Email Trail:

Issue CR5-12

From Guillaume Berche

11- Precise ordering of anonymous grammar generated for dtmfterm

As specified in section "2.3.6. RECORD": "The <record> element contains a
'dtmfterm' attribute as a developer convenience. A 'dtmfterm' attribute with
the value 'true' is equivalent to the definition of a local DTMF grammar
which matches any DTMF input. "

However, it is legal to have nested grammars in a record element. For
instance, a DTMF grammar that matches only the # key. It is not clear which
grammar would match because the precedence is not described.

Suggested text modification to section "2.3.6. RECORD": "The <record>
element contains a 'dtmfterm' attribute as a developer convenience. A
'dtmfterm' attribute with the value 'true' is equivalent to the definition
of a local DTMF grammar which matches any DTMF input. Any nested grammar
element will have precedence over this anonymous local grammar (even though
usefulness of such nested grammar is not clear)."

Resolution: accepted with modifications

We accept the suggested clarification of the 'dtmfterm' attribute, but reject the suggested priority order when both the attribute and local grammars are specified. That is, we maintain that the dtmfterm attribute has priority over local grammars. Developers who want full control can omit the dtmfterm attribute and write their own local grammar.
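
The two approaches contrasted in this resolution might look like the sketch below. The first form relies on dtmfterm; the second is an assumption about how an author could take full control, switching dtmfterm off explicitly and supplying a local DTMF grammar that matches only the pound key. Form ids, prompts, and durations are illustrative.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="memo">
    <!-- dtmfterm="true": any DTMF key ends the recording. -->
    <record name="msg" beep="true" maxtime="60s" dtmfterm="true">
      <prompt>Record your memo after the beep. Press any key when finished.</prompt>
    </record>
  </form>
  <form id="memo_pound_only">
    <!-- Assumption: dtmfterm is disabled and a local grammar accepts only "#". -->
    <record name="msg" beep="true" maxtime="60s" dtmfterm="false">
      <grammar mode="dtmf" version="1.0" root="pound">
        <rule id="pound" scope="public">#</rule>
      </grammar>
      <prompt>Record your memo after the beep. Press pound when finished.</prompt>
    </record>
  </form>
</vxml>
```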

Email Trail:

Issue CR5-13

From Guillaume Berche

12- Precise the semantics of the timeout property for the record element

The specs currently state the following "A timeout interval is defined to
begin immediately after prompt playback (including the 'beep' tone if
defined) and its duration is determined by the 'timeout' property. If the
timeout interval is exceeded before recording begins, then a <noinput> event
is thrown. "

However, how the "recording begins" is not clearly defined. I would assume
that when the platform supports speech recognition during recording, the
recording begins as soon as speech is provided by the remote end. However
the specification is not clear on whether in this case the platform should
remove the silence from the end of the first beep prompt up to the first
recognised speech. It is not clear either whether background noise or music
should trigger beginning of recording. For platforms not supporting speech
recognition during recording I believe this timeout property should be
ignored.

Suggested text modification to section "2.3.6. RECORD":

"A timeout interval is defined to begin immediately after prompt playback
(including the 'beep' tone if defined) and its duration is determined by the
'timeout' property. If the timeout interval is exceeded before recording
begins, then a <noinput> event is thrown. When the platform supports
detection of silence, the recording begins as soon as leading silence
(following the 'beep' tone if defined) completes. Note that whether the
recording would include the leading silence is platform specific. For
platforms not supporting silence detection, this property is ignored and no
<noinput> event is ever raised during a recording."

Resolution: accepted with modifications

We believe that when recording begins is clearly defined: in Section 2.3.6, it states:

"A recording begins at the earliest after the playback of any prompts (including the 'beep' tone if defined). As an optimization, a platform may begin recording when the user starts speaking."

That is, the recording may include initial silence if the platform does not use the optimization (e.g., voice activity detection). With the optimization, the recording can begin with the user's speech. Whether music or other audio triggers voice activity detection is platform-specific. Note that this behavior applies regardless of whether speech recognition is supported: while the recording and recognition processes use the same audio data stream, these processes are independent and therefore their voice activity detection mechanisms may differ.

The timeout interval is clearly defined: "A timeout interval is defined to begin immediately after prompt playback (including the 'beep' tone if defined) and its duration is determined by the 'timeout' property."

The timeout interval has an effect on both recording and recognition (which are logically independent).

For recording, the impact is specified in "If the timeout interval is exceeded before recording begins, then a <noinput> event is thrown." In the case of non-optimized recording, recording always begins after prompt playback, so <noinput> would never be thrown. With optimized recording, however, <noinput> may be thrown if no voice activity is detected before the timeout interval elapses.

For recognition, the situation is more complex. We are modifying the specification (due to implementation report feedback) so that if recognition is supported during recording (this is an optional feature), then only non-local speech grammars are active. If a non-local speech grammar is matched by audio input, then execution is immediately transferred to its enclosing element. This raises the issue of whether a <noinput> or <nomatch> could be thrown by the recognition process. A <noinput> could be generated if the timeout interval has elapsed. A <nomatch> could be generated if the audio triggers recognition but does not match the active grammar. Our belief is that having the recognition process throw these events during recording is undesirable and not what VoiceXML authors expect. Consequently, we are considering clarifying the specification to make it clear that <noinput> and <nomatch> events are never thrown from the recognition process during recording.
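
The recording-side behavior described above can be illustrated with a minimal sketch. The form id, prompt wording, and attribute values are assumptions chosen for illustration and are not taken from the specification. On a platform that starts recording immediately after the beep, the <noinput> handler below would never run; on a platform that waits for voice activity before recording, it may run if the caller stays silent for the whole timeout interval.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="leave_message">
    <!-- The timeout interval starts after the prompt and the beep. -->
    <property name="timeout" value="5s"/>
    <record name="msg" beep="true" maxtime="30s" finalsilence="3s"
            dtmfterm="true" type="audio/x-wav">
      <prompt>Please record your message after the beep.</prompt>
      <!-- Reachable only when the platform delays the start of
           recording until it detects voice activity. -->
      <noinput>
        <prompt>Nothing was heard. Please try again.</prompt>
        <reprompt/>
      </noinput>
    </record>
  </form>
</vxml>
```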

Email Trail:

Issue CR5-14

From Guillaume Berche

13- Precise that maxtime record attribute is mandatory and has no defaults

Suggested text modification to section "2.3.6. RECORD":
"maxtime The maximum duration to record. **This attribute must be specified
as it has no default value. If not specified an error.badfetch event is
thrown.**"

Resolution: rejected

The default value of the maxtime attribute is already specified as platform-dependent (see Table 16).

Email Trail:

Issue CR5-15

From Guillaume Berche

14- Precise that if value is used outside of a prompt element it inherits
default prompt parameters

The prompt element defines that if its attributes are not specified, they
default to values specified by properties. However, for the value element,
the specification does not state how default values are computed.

Suggested text addition to section "4.1.4 <value> Element":
"The manner in which the value attribute is played is controlled by the
surrounding speech synthesis markup in the case the expression resolves to a
string. In the case the expression resolves to a special variable such as
_prompt, then the prompt attributes are inherited from the enclosing element
of the definition of the referenced element.

If no surrounding prompt element nor SSML tag is available, then the default
attributes of a prompt element (such as bargein, timeout or language) are
applied.

Consequently, the two following constructions are equivalent.
<catch event="noinput">
  <value expr="'please retry'"/>
</catch>

<catch event="noinput">
  <prompt>
      <value expr="'please retry'"/>
  </prompt>
</catch>
"

Resolution: accepted with modifications

We accept that clarification is required but not the proposed modification. We will clarify in 4.1.2 that where prompt content is specified without a prompt element, the attributes take the values specified in Table 33.

Email Trail:

Issue CR6-1

From Guillaume Berche

1- Precise that buffered non-matching DTMF are discarded when an ASR grammar
matches.

It is unclear from the specification how the following document should behave:

<form id="form1">
  <field>
     <grammar src="builtin:grammar/boolean"/>
     <grammar src="builtin:dtmf/digits?length=4"/>
  </field>
  <filled>
     <goto next="#form2"/>
  </filled>
</form>

<form id="form2">
  <field>
     <grammar src="builtin:dtmf/digits?length=1"/>
  </field>
  <filled>
     <prompt>thanks for the dtmf</prompt>
  </filled>
  <noinput>
     <prompt>DTMF was discarded</prompt>
  </noinput>
</form>

Suppose the user presses the 1 key, speaks "yes", and then waits for the input
timeout. Should the interpreter play the "thanks for the dtmf" prompt or the
"DTMF was discarded" prompt?

Suggested solution: specify that partially buffered data are flushed in case
of grammar match in another mode.

Resolution: accepted with modifications

We will modify the specification to make it clear that this is a platform-specific issue (i.e. platforms may differ in whether or not they discard buffered non-matching DTMF when an ASR grammar matches).

Email Trail:

Issue CR6-2

From Guillaume Berche

2a- Rationale for not accepting local ruleref in inline SRGS grammars?

Can you please provide rationale for not accepting ruleref elements with
pure fragment URLs? Why would this be rejected in grammars provided inline
in VXML documents? What is the reason driving this restriction and forcing
to use remote grammars for any grammar using private rules?

Resolution: accepted with modifications

This is probably a misunderstanding on both sides. In Section 3.1.1.4, the paragraph beginning "When referencing an external grammar, the value of src attribute ..." describes which values of the src attribute are permitted and which are not (the last paragraph of that section). It makes no statement about inline grammars. In particular, "Local rule reference: a fragment-only URI is not permitted. (See definition in Section 2.2.1 of [SRGS]). A fragment-only URI value for the src attribute causes an error.semantic event." is intended to indicate that a fragment-only URI is not permitted as the value of the src attribute of a VoiceXML <grammar> element. The simplest clarification is to start the last paragraph of this section "**And** the following are the forms of rule reference defined by [SRGS] that are not supported in VoiceXML 2.0. ...". For <ruleref>s in inline grammars, it is possible to refer to rules within the same grammar or in an external grammar. What is not possible is to reference rules within a different inline grammar in a VoiceXML document, since the URI then points at a VoiceXML document, not a grammar document. We believe that this is clearly implied by VoiceXML and SRGS (especially with the clarification above) and that a separate clarification is not required.
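
As an illustration of the permitted case, the following sketch shows an inline grammar whose public root rule refers, through fragment-only URIs, to a private rule defined in the same inline grammar. The field name, rule ids, and vocabulary are hypothetical and serve only to show the shape of such a grammar.

```xml
<field name="code">
  <prompt>Please say two digits.</prompt>
  <grammar mode="voice" xml:lang="en-US" version="1.0" root="pair">
    <!-- Public root rule: each ruleref points at a rule in this same grammar. -->
    <rule id="pair" scope="public">
      <ruleref uri="#digit"/>
      <ruleref uri="#digit"/>
    </rule>
    <!-- Private rule, reachable only from within this grammar. -->
    <rule id="digit">
      <one-of>
        <item>one</item>
        <item>two</item>
        <item>three</item>
      </one-of>
    </rule>
  </grammar>
</field>
```

By contrast, a fragment-only value such as src="#digit" on the VoiceXML <grammar> element itself is the case that the quoted text rules out.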

Email Trail:

Issue CR6-4

From Guillaume Berche

3- Precise that when transitioning to a document (without fragment in the
URI) and the target document has no form, then the interpreter exits

Rationale: it can not be requested that every document have at least a
dialog (because a root application may only define variables or links),
however when transitioning to a document (without specifying a dialog) and
this document has no dialog defined, then the execution stops.

Suggested modification to section "5.3.7 GOTO"

"If the form item, dialog or document to transition to is not valid (i.e.
the form item, dialog or document does not exist), an error.badfetch must be
thrown. Note that for errors which occur during a dialog or document
transition, the scope in which errors are handled is platform specific. For
errors which occur during form item transition, the event is handled in the
dialog scope. If the document to transition has no dialog defined (and no
specific dialog was specified), then the execution stops."

Resolution: rejected

We believe it is already precise: a document to transition to without dialog is not valid, so an error.badfetch is thrown as already stated in 5.3.7.

Email Trail:

Issue CR6-5

From Guillaume Berche

4- Precise Prompt selection algorithm when the Prompt element appears as
executable content.

It does not seem clear from the examples provided in section "4.1.6 Prompt
Selection" whether the "prompt tapering" mechanism is supposed to be
applied when a prompt element appears as executable content.

For instance in the following case:

<field ...>
<help>
   <prompt count="1"> prompt 1 </prompt>
   <prompt count="3"> prompt 2 </prompt>
   <goto next="#form2"/>
   <prompt count="4"> prompt 3 </prompt>

</help>
</field>

Which prompt should be heard when the prompt counter of the current form
item (the field in this case) is 4? Applying the algorithm described in
section "4.1.6 Prompt Selection" would result in having the "prompt 3"
speech text to be heard, however it would be very confusing from the VXML
author point of view because it would be expected that after the goto
element no more executable content would be executed as specified in
Appendix C in the definition of the "execute" term.

Suggested modification to section "4.1.6 Prompt Selection":

"Each input item, <initial>, and menu has an internal prompt counter that is
reset to one each time the form or menu is entered. Whenever the system uses
a prompt, its associated prompt counter is incremented. This is the
mechanism supporting tapered prompts within form item elements. **When a
prompt element is specified as executable content (e.g. inside a catch or
filled element) then its count element is ignored and all prompts contained
in this element are queued in document order.**"

Resolution: rejected

As stated in 5.3.5 the count attribute on prompts in executable content is meaningless.
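
Authors who want tapered behavior inside event handlers can instead rely on the count attribute of the handler element itself, which selects a handler according to how many times the event has occurred. The following is a minimal sketch under that assumption; the field name, grammar file, prompt wording, and the #operator target are hypothetical.

```xml
<field name="city">
  <!-- Tapering of the field's own prompts uses the prompt counter. -->
  <prompt count="1">Which city?</prompt>
  <prompt count="2">Please say the name of a city.</prompt>
  <grammar src="cities.grxml" type="application/srgs+xml"/>

  <!-- Tapering of help messages uses the event count on the handler. -->
  <help count="1">
    <prompt>Say the name of the city you want.</prompt>
  </help>
  <help count="3">
    <prompt>Transferring you to an operator.</prompt>
    <goto next="#operator"/>
  </help>
</field>
```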

Email Trail:

Issue CR6-6

From Guillaume Berche

5- Precise the value of name$.inputmode when a transfer is not interrupted
by user input

Suggested modification to "Table 22: <transfer> Shadow Variables"

"name$.inputmode     The input mode of the terminating command (dtmf or
voice) or **undefined if the transfer was not interrupted by a grammar
match**"

Resolution: accepted

We will apply the suggested modification.

Email Trail:

Issue CR6-7

From Guillaume Berche

6- Correct typo in example of Section "4.1.3 Audio Prompting"

The extension of the file should rather be .vxml to not introduce confusion.
"<goto next="./make_bid.html"/>"

Resolution: accepted

We will correct the typo.

Email Trail:

Issue CR6-8

From Guillaume Berche

7- Precise that alternate audio is recursive:

According to the schema, the following vxml fragment is legal

<prompt>
  <audio src="http://www.dummy.org/main.wav" >
    <audio src="http://www.dummy.org/alternate1.wav" >
        <audio src="http://www.dummy.org/alternate2.wav"/>
    </audio>
  </audio>
</prompt>

Can you please confirm my understanding of the specification: I understand
that if both main.wav and alternate1.wav can not be played, but
alternate2.wav can be played, then alternate2.wav will be played and no
error will be thrown.

Resolution: accepted

Your understanding is correct. No modifications will be made to the text since we believe this is sufficiently clear already.

Email Trail:

Issue CR6-9

From Guillaume Berche

8- Precise behavior of submit if undeclared/unvalid variables are references
in submit's namelist attributes

The specifications section "5.3.8 SUBMIT" states the following

"The list of variables to submit. By default, all the named input item
variables are submitted. If a namelist is supplied, it may contain
individual variable references which are submitted with the same
qualification used in the namelist. Declared VoiceXML and ECMAScript
variables can be referenced."

It does not specify the expected behavior in case an undeclared variable or
an invalid variable name is referenced in the namelist attribute.

Suggested modification to section "5.3.8 SUBMIT":
"namelist          The list of variables to submit. By default, all the
named input item variables are submitted. If a namelist is supplied, it may
contain individual variable references which are submitted with the same
qualification used in the namelist. Declared VoiceXML and ECMAScript
variables can be referenced. **If an undeclared or invalid variable name is
referenced then an "error.semantic" event is thrown**"

Resolution: accepted with modifications

We will modify the specification to clarify that an error.semantic is thrown when an undeclared variable is referenced, including reference within the namelist of a submit element (as well as exit, return, and subdialog elements).
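
A minimal sketch of a namelist that references only declared variables is given below. The form id, variable names, grammar file, and submit URI are assumptions made for illustration. If the namelist named a variable that had never been declared, the form-level handler for error.semantic would be the place where the resulting event could be caught.

```xml
<form id="order">
  <!-- Declared explicitly with <var>. -->
  <var name="customer_id" expr="'unknown'"/>

  <!-- Declared implicitly as the field's form item variable. -->
  <field name="quantity">
    <prompt>How many would you like?</prompt>
    <grammar src="quantity.grxml" type="application/srgs+xml"/>
  </field>

  <block>
    <submit next="http://www.example.com/order"
            namelist="customer_id quantity"/>
  </block>

  <catch event="error.semantic">
    <prompt>Sorry, the order could not be sent.</prompt>
    <exit/>
  </catch>
</form>
```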

Email Trail:

Issue CR6-12

From Guillaume Berche

11- Typo in section "2.3.6. RECORD"

The second sentence of the extract below seems incomplete, I don't get the
impact of the timeout interval on having a record variable unfilled.

"If no audio is collected during execution of <record>, then the record
variable remains unfilled (note). This can occur, for example, when DTMF or
speech input is received during prompt playback or the timeout interval (if
the developer wants input during prompt playback to initiate recording, then
prompts should be placed in an immediately preceding <field> with a zero
timeout). "

Resolution: accepted

We will modify the text so that the second sentence reads "This can occur, for example, when DTMF or speech input is received during prompt playback or *before* the timeout interval *expires* ..."

Email Trail:

Issue CR6-13

From Guillaume Berche

12- Typo in "Last Call Disposition of Comments"

The table in section "2. Comments" has an invalid "disposition" content: all
items are marked as accepted whereas this is not the case.

Resolution: accepted

No action since this document will be replaced by a CR disposition of comments document.

Email Trail:

Issue CR7-1

From Max Froumentin

I would like to object that all the examples in VoiceXML2 come with an
XML declaration and a schemaLocation attribute. It makes the language
appear unnecessarily complex. The Hello World example would be much
simpler as:

<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>

schemaLocation bothers me more than by just making the examples hard to 
read. It suggests that the declaration is mandatory (which the XMLSchema
refutes), or even that the use of the schema is.

Resolution: rejected

It is good practise to provide the XML declaration (even though it is not mandatory). Providing the schemaLocation allows documents to be validated automatically by various tools, although as you correctly point out neither the attribute nor schema are mandatory.
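
For comparison with the shortened form proposed in the comment, a document carrying both the XML declaration and the schemaLocation attribute looks roughly as follows. The schema URI is shown as it is commonly written and should be checked against the specification before use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml
                          http://www.w3.org/TR/voicexml20/vxml.xsd"
      version="2.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
```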

Email Trail:

Issue CR8-1

From Matt Porter

this has to do with Guillaume's question...
let me elaborate on an issue with <record> that i dont understand.  Given this dialog...
 
<?xml version="1.0" encoding="UTF-8"?> 
<vxml version="2.0">
<form>
   <record  name="msg" beep="true" maxtime="10s" finalsilence="4000ms" dtmfterm="true" type="audio/x-wav">
       <prompt timeout="5s">Record a message after the beep.</prompt>
       <noinput>
            I didn't hear anything, please try again.
       </noinput>
      </record>
</form>
</vxml>

 
what if the user does not say anything ( no audio is collected because of
silence detection or whatever ), but terminates the recording with a DTMF.  it
seems to me the "termchar" shadow variable should hold the key they pressed,
and the "noinput" event would still be thrown...is this correct?
 
The <record> section seems to need more clarification....

Resolution: rejected with modifications

If dtmfterm is set to true, recording is terminated when any DTMF key is pressed ("Any DTMF keypress matching an active grammar terminates recording"), but if no audio has been collected, then the record variable is not filled ("If no audio is collected during execution of <record>, then the record variable remains unfilled.") and consequently no shadow variables are assigned. The FIA then applies as normal without a noinput event being thrown; in your example, the prompt would be played again and another attempt at recording initiated. This is analogous to the situation with complex grammar results which do not assign any values to form input item variables: no noinput event is thrown and the FIA applies as normal. Finally, note that there may be information available in these situations via application.lastresult$ as described in 5.1.5. We will modify the specification to make it clearer that information may be available via application.lastresult$ in these situations.

Email Trail:

Issue CR9-1

From John Voger

Under section 3.1.1.3 Grammar Weight.



The last paragraph contains


..... real speech and textual data on a paricular platform."


Please replace "paricular" with "particular"

Resolution: accepted

We will correct the typo.

Email Trail:

Issue CR10-1

From Philippe Le Hegaret

[ECMASCRIPT] 
        " Standard ECMA-262 ECMAScript Language Specification  ",
        Standard ECMA-262, December 1999.
        See http://www.ecma.ch/ecma1/STAND/ECMA-262.htm
        
should read


[ECMASCRIPT] 
        " Standard ECMA-262 ECMAScript Language Specification  ",
        Standard ECMA-262, December 1999.
        See
        http://www.ecma-international.org/publications/standards/ECMA-262.HTM

Resolution: accepted

We will update the reference.

Email Trail:

Issue CR11-1

From C. M. Sperberg-McQueen

1. Several complex type definitions in vxml.xsd have <choice> model
groups that contain a single particle consisting of a reference to a
group. For example:

   <xsd:complexType name="basic.event.handler" mixed="true">
     <xsd:choice minOccurs="0" maxOccurs="unbounded">
       <xsd:group ref="executable.content" />
     </xsd:choice>
     <xsd:attributeGroup ref="EventHandler.attribs" />
   </xsd:complexType>

Since the particle in the group executable.content is also a <choice>,
this content model becomes

   <xsd:choice minOccurs="0" maxOccurs="unbounded">
     <xsd:choice>
       <xsd:group ref="audio"/>
       <xsd:element ref="assign"/>
       <xsd:element ref="clear"/>
       ... ...
     </xsd:choice>
   </xsd:choice>

The outer <choice> is clearly redundant.  The complex type definition
can be simplified to:

   <xsd:complexType name="basic.event.handler" mixed="true">
     <xsd:group ref="executable.content" minOccurs="0" 
maxOccurs="unbounded" />
     <xsd:attributeGroup ref="EventHandler.attribs" />
   </xsd:complexType>

We think such a simplification makes the schema easier to follow and
we recommend the change.

Resolution: accepted

Change applied.

Email Trail:

Issue CR11-2

From C. M. Sperberg-McQueen

2. Some contents may usefully be constrained more tightly than the
schema now constrains them. For example, the <if> element is declared
as:

   <xsd:element name="if">
     <xsd:complexType mixed="true">
       <xsd:choice minOccurs="0" maxOccurs="unbounded">
         <xsd:group ref="executable.content" />
         <xsd:element ref="elseif" />
         <xsd:element ref="else" />
       </xsd:choice>
       <xsd:attributeGroup ref="If.attribs" />
     </xsd:complexType>
   </xsd:element>

Since there is no order or occurrence constraint, instances such as the
following are all valid, which seems too flexible.

   <if>
     ...
     <else/>
     ...
     <else/>
     ...
     <elseif/>
     ...
   </if>

The content can be changed to the following to ensure that all
<elseif> elements occur before <else> and that there is no more than
one <else> element:

   <xsd:element name="if">
     <xsd:complexType mixed="true">
       <xsd:sequence>
         <xsd:group ref="executable.content" minOccurs="0"
                    maxOccurs="unbounded" />
         <xsd:sequence minOccurs="0" maxOccurs="unbounded">
           <xsd:element ref="elseif" />
           <xsd:group ref="executable.content" minOccurs="0"
                      maxOccurs="unbounded" />
         </xsd:sequence>
         <xsd:sequence minOccurs="0" maxOccurs="1">
           <xsd:element ref="else" />
           <xsd:group ref="executable.content" minOccurs="0"
                      maxOccurs="unbounded" />
         </xsd:sequence>
       </xsd:sequence>
       <xsd:attributeGroup ref="If.attribs" />
     </xsd:complexType>
   </xsd:element>

(In passing, we note that on general principles, we believe the
language would be easier to describe and use if the 'elseif' and
'else' elements (and a 'then' element) were not empty elements
followed by appropriate executable content, but non-empty elements
which contained the appropriate executable content.  We recognize
that this may not be a feasible change at this stage in the life of
VoiceXML.)

Resolution: accepted

Change applied. We will look into changing the if-then-else structure in a future version of the language.

Email Trail:

Issue CR11-3

From C. M. Sperberg-McQueen

3. The element "output" in vxml.xsd is declared as abstract, and not
used or referenced anywhere else. The declaration may be removed.

Resolution: accepted

Element removed.

Email Trail:

Issue CR11-4

From C. M. Sperberg-McQueen

4. The VariableName.datatype in vxml-datatypes.xsd has a pattern:

   <xsd:pattern value="['$'\c]+" />

The character '$' in the range doesn't need the quotation mark, and as
written the value will accept single quotation marks where a dollar
sign or \c is expected.  We suspect this is not intended.

Resolution: accepted

Change applied.
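
One way the corrected pattern could be written is sketched below, with the quotation marks dropped from the character class so that only a dollar sign or a name character is accepted. The base type shown is an assumption made to keep the fragment complete, and the xsd prefix is assumed to be bound to the XML Schema namespace; neither is quoted from vxml-datatypes.xsd.

```xml
<xsd:simpleType name="VariableName.datatype">
  <!-- Base type is assumed for illustration only. -->
  <xsd:restriction base="xsd:token">
    <!-- '$' needs no quoting inside a character class; \c matches any XML name character. -->
    <xsd:pattern value="[$\c]+"/>
  </xsd:restriction>
</xsd:simpleType>
```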

Email Trail:

Issue CR11-5

From C. M. Sperberg-McQueen

5. The ContentType.datatype in vxml-datatypes.xsd is defined as a list
of string. Since string may contain whitespaces, the definition should
perhaps be changed to a list of token; this is less subject to
misunderstanding by readers of the schema.

Resolution: accepted

Change applied.
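
A sketch of what a list-of-token definition might look like is given below. It illustrates the suggestion in the comment rather than reproducing the text of vxml-datatypes.xsd, and it assumes the xsd prefix is bound to the XML Schema namespace.

```xml
<xsd:simpleType name="ContentType.datatype">
  <!-- Each list item is a whitespace-normalized token, so the
       space-separated list is unambiguous to schema readers. -->
  <xsd:list itemType="xsd:token"/>
</xsd:simpleType>
```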

Email Trail:

Issue CR11-6

From C. M. Sperberg-McQueen

6. According to the comments in the annotations,
VariableNames.datatype, RestrictedVariableNames.datatype, and
EventNames.datatype are lists of atomic VariableName.datatype,
RestrictedVariableName.datatype and EventName.datatype
respectively. We believe they should be defined as such rather than as
NMTOKENS or other types:

   <xsd:simpleType name="RestrictedVariableNames.datatype">
     <xsd:annotation>
       <xsd:documentation>space separated list of restricted
         variable names </xsd:documentation>
     </xsd:annotation>
     <xsd:list itemType="RestrictedVariableName.datatype"/>
   </xsd:simpleType>

   <xsd:simpleType name="VariableNames.datatype">
     <xsd:annotation>
       <xsd:documentation>space separated list of variable names
         including shadow variables</xsd:documentation>
     </xsd:annotation>
     <xsd:list itemType="VariableName.datatype"/>
   </xsd:simpleType>

   <xsd:simpleType name="EventNames.datatype">
     <xsd:annotation>
       <xsd:documentation>space separated list of
         EventName.datatype</xsd:documentation>
     </xsd:annotation>
     <xsd:list itemType="EventName.datatype"/>
   </xsd:simpleType>

Resolution: accepted

Change applied.

Email Trail:

Issue CR11-7

From C. M. Sperberg-McQueen

7. Some suggestions for simple type Repeat-prob.datatype in
grammar-core.xsd:

a. The base type might better be made decimal instead of float. It
should be noted that decimal is not a subtype of float and their
mappings from the lexical space to the value space are different. For
example, '1.1' may be rounded to some float value different from
exactly 1.1. Such behavior is not expected in decimal.

b. The maxInclusive value is 1.0, while the patterns allow any
positive values less than 10. They should be made consistent.

c. The pattern ([0-9]+)? should probably be replaced with the
equivalent pattern [0-9]*.

Resolution: accepted

Changes applied.

Email Trail:

Issue CR11-8

From C. M. Sperberg-McQueen

8. The commented-out pattern constraint in
RestrictedVariableName.datatype in vxml.xsd needs to be removed or
fixed.

Resolution: accepted

Change applied.

Email Trail:

Issue CR12-1

From Guillaume Berche

1- precise behavior when only activated grammars are disabled by "inputmodes"
property

In the following example, what is the expected behavior? Should an
error.semantic be thrown as would if no grammar was activated as described in
section "3.1.4 Activation of Grammars"? Should the grammars considered rather
as activated but would not match as described in section "6.3.6 Miscellaneous
Properties" (inputmodes property) ", and thus lead to a nomatch event to be
thrown?


Section "3.1.4 Activation of Grammars" states that "If no grammars are active
when an input is expected, the platform must throw an error.semantic event".

Section "6.3.6 Miscellaneous Properties" states that "For instance, voice-only
grammars may be active when the inputmode is restricted to DTMF. Those
grammars would not be matched, however, because the voice input modality is
not active. "

<menu>
         <prompt>
       Choose wind speed and after temperature then finaly ask for leave choice test.
         </prompt>
     <choice next="#exacte_rain"> rain humidity </choice>
     <choice next="#approx_wind"> wind speed </choice>
     <choice next="#approx_weat">temperature celcius</choice>
     <choice next="#exacte_leave">Leave choice test </choice> </menu>

Suggested modification to Section "6.3.6 Miscellaneous Properties" (inputmodes
definition) "[..] For instance, voice-only grammars may be active when the
inputmode is restricted to DTMF. Those grammars would not be matched, however,
because the voice input modality is not active. If among all grammars active
none can be matched because their associated input modality is not enabled,
then a nomatch event is thrown."

Resolution: rejected

Your question is not very clear but given