Copyright © 2004 W3C ® (MIT , ERCIM , Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This document details the responses made by the Voice Browser Working Group to issues raised during the Candidate Recommendation (beginning 28th January 2003 and ending 10th April 2003) review of Voice Extensible Markup Language (VoiceXML) Version 2.0 . Comments were provided by Voice Browser Working Group members, other W3C Working Groups, and the public via the www-voice-request@w3.org (archive) mailing list.
This document of the W3C's Voice Browser Working Group describes the disposition of comments as of January 19, 2004 on the Voice Extensible Markup Language (VoiceXML) Version 2.0 Candidate Recommendation. It may be updated, replaced or rendered obsolete by other W3C documents at any time.
For background on this work, please see the Voice Browser Activity Statement.
This document describes the disposition of comments in relation to the Voice Extensible Markup Language (VoiceXML) Version 2.0 (http://www.w3.org/TR/2003/CR-voicexml20-20030220/). Each issue is described by the name of the commentator, a description of the issue, and either the resolution or the reason that the issue was not resolved.
The full set of Issues raised for the Voice Extensible Markup Language (VoiceXML) Version 2.0 since August 2000, their resolution and in most cases the reasoning behind the resolution are available from http://www.w3.org/Voice/Group/2004/voicexml-change-requests.htm [W3C Members Only]. This document provides the analysis of the issues that were submitted and resolved as part of the Last Call Review.
Notation: Each original comment is tracked by a "Change Request" [CR] designator. Each point within that original comment is identified by a point number. For example, "CR5-1" is the first point in the fifth change request for the specification.
| Item | Commentator | Nature | Disposition |
|---|---|---|---|
| CR1-1 | Arnaud Vallee | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR2-1 | Arnaud Vallee | Technical Error (§2.2) | accepted (no reply) |
| CR3-1 | Arnaud Vallee | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR4-1 | Arnaud Vallee | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR5-1 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-2 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-3 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-4 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-5 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-6 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-7 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-8 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-9 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-10 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-11 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-12 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-13 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-14 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-15 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR5-16 | Guillaume Berche | Change to Existing Feature (§2.3) | accepted |
| CR6-1 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-2 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-3 | Guillaume Berche | Change to Existing Feature (§2.3) | accepted |
| CR6-4 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-5 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-6 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-7 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-8 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-9 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-10 | Guillaume Berche | Technical Error (§2.2) | accepted |
| CR6-11 | Guillaume Berche | Technical Error (§2.2) | accepted |
| CR6-12 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR6-13 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR7-1 | Max Froumentin | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR8-1 | Matt Porter | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR9-1 | John Voger | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR10-1 | Philippe Le Hegaret | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-1 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-2 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-3 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-4 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-5 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-6 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-7 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR11-8 | C. M. Sperberg-McQueen | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR12-1 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR12-2 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR12-3 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR13-1 | Greg FitzPatrick | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR14-1 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR14-2 | Guillaume Berche | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR15-1 | Ufuk Kayserilioglu | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR16-1 | Mark Clark | Clarification / Typographical / Editorial (§2.1) | accepted (no reply) |
| CR17-1 | Robert Barkan | Clarification / Typographical / Editorial (§2.1) | accepted |
| CR18-1 | Mark Clark | Change to Existing Feature (§2.3) | accepted (no reply) |
| CR19-1 | Pavel Cenek | Feature Request (§2.4) | accepted |
| CR19-2 | Pavel Cenek | Feature Request (§2.4) | accepted |
From Arnaud Vallee
I have a question about where the error.badfetch is thrown and caught when a
called document has non existent root document.
Take the following scenario.
The document 1 makes a transition to document 2 whose root document does not
exist. document 1 and document 2 have error.badfetch handler at the document
level. Where is the error supposed to be caught?
I think the question could be the same for the following assertion: If a
document's application attribute refers to a document that also has an
application attribute specified, an error.semantic event is thrown.
As i did not get any answer to the message, i post my query one more time.
The issue is as follows:
In a document named doc1.vxml, which is a root document (do not
specify an application attribute in the vxml tag), we transition to a
document doc2.vxml. doc2.vxml refers to a non existing root document
(i.e., application attribute set to doc2-root-unexisting.vxml).
As the spec says (chap 1.5.2), " If a document refers to a non-existent
application root document, an error.badfetch event is thrown ", an
error.badfetch is thrown in this case.
The question: where is the error thrown, or in other way, where do i put the
error.badfetch handler to catch the error?
I see 2 possibilities:
- in doc1.vxml, which means that if a document refers to a non existing root
document, it is a badfecth to try to get this document.
- in doc2.vxml, which means that current document has to be initialized before
getting and initializing the root document.
I think this is the same issue with the following assertion in chapter 1.5.2:
"If a document's application attribute refers to a document that also has an
application attribute specified, an error.semantic event is thrown. "
except that, in this case, the error.semantic could also be catched in the
first root document.
Analysis:
[Pavel Cenek] I am not member of WBWG, so my answer is only a guess. I also waited for an authorized answer and therefore haven't reacted on your first attempt.
> The issue is as follows:
> In a document named doc1.vxml, which is a root document (do not specify an application attribute in the vxml tag), we transition to a document doc2.vxml.
> doc2.vxml refers to a non existing root document (i.e., application attribute set to doc2-root-unexisting.vxml).
>
> As the spec says (chap 1.5.2),
> " If a document refers to a non-existent application root document, an error.badfetch event is thrown ",
> an error.badfetch is thrown in this case.
>
> The question: where is the error thrown, or in other way, where do i put
> the error.badfetch handler to catch the error?
The transition is caused by <goto> or <submit>, etc, therefore I would apply the rules for these tags (which should be the same for all of them). For <goto>, spec says: "Note that for errors which occur during a dialog or document transition, the scope in which errors are handled is platform specific."
> I see 2 possibilities:
> - in doc1.vxml, which means that if a document refers to a non existing root document, it is a badfecth to try to get this document.
In my opinion this possibility is more logical.
> - in doc2.vxml, which means that current document has to be initialized before getting and initializing the root document.
> I think this is the same issue with the following assertion in chapter 1.5.2:
> "If a document's application attribute refers to a document that also has an application attribute specified, an error.semantic event is thrown. "
I think it would be valuable to mention the citation above also in the chapter one.
Resolution: rejected
The specification allows the error.badfetch event to be thrown in either the referring document or the referred document. To guarantee that the error is caught, catch handlers need to be specified in both documents. This error handling pattern is illustrated in numerous tests in our implementation report.
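As an illustration of this pattern, the following minimal sketch places an error.badfetch handler in both documents of the scenario described in the comment. The file names are those used by the commentator; the form ids and prompt wording are assumptions made for this example only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- doc1.vxml: the referring document (itself a root document) -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Catches the error if the platform throws it in the referring document -->
  <catch event="error.badfetch">
    <prompt>Sorry, the next page could not be loaded.</prompt>
    <exit/>
  </catch>
  <form id="start">
    <block>
      <goto next="doc2.vxml"/>
    </block>
  </form>
</vxml>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- doc2.vxml: the referred document; its application root does not exist -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="doc2-root-unexisting.vxml">
  <!-- Catches the error if the platform throws it in the referred document -->
  <catch event="error.badfetch">
    <prompt>Sorry, the application could not be loaded.</prompt>
    <exit/>
  </catch>
  <form id="main">
    <block>Welcome.</block>
  </form>
</vxml>
```

Which of the two handlers actually runs is platform specific, which is why both are present.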
Email Trail:
From Arnaud Vallee
chapter 2.4 of the VoiceXML (24 April 2002)
Attributes of filled are:
mode Either all (the default), or any. If any, this action is executed when
any of the specified input items is filled by the last user input. If
all, this action is executed when all of the mentioned input items are
filled, and at least one has been filled by the last user input. A
<filled> element in an input item cannot specify a mode.
namelist The input items to trigger on. For a <filled> in a form, namelist
defaults to the names (explicit and implicit) of the form's input
items. A <filled> element in an input item cannot specify a
namelist; the namelist in this case is the input item name. Note that
control items are not permitted in this list.
As i understand these attributes are not permitted in filled elements which
are child of input item. But the spec do not say what happens in this case:
- ignore those attributes?
- throw an error (semantic)?
Furthermore, control items are not permitted in namelist. I suppose any
other ECMA variable are not permitted neither. But how a voice browser should
handle that case? Ignore the non-input variable elements or throw an error
(semantic)?
Resolution: accepted with modifications
The specification will be modified so that upon encountering a document containing a <filled> element specifying either a 'mode' or 'namelist' attribute as a child of an input item, then an error.badfetch is thrown by the platform. In addition, the specification will also make clear that an error.badfetch is thrown when the document contains a <filled> element with a namelist attribute referencing a control item variable.
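The distinction can be sketched as follows: a form-level <filled> may carry 'mode' and 'namelist', while a <filled> inside an input item carries neither. The grammar file name city.grxml, the field names, and the prompts are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="trip">
    <field name="origin">
      <prompt>Where are you leaving from?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <!-- Inside an input item: no mode, no namelist; triggers on "origin" -->
      <filled>
        <prompt>Leaving from <value expr="origin"/>.</prompt>
      </filled>
    </field>
    <field name="destination">
      <prompt>Where are you going to?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <!-- At form level: mode and namelist are allowed; namelist names input items only -->
    <filled mode="all" namelist="origin destination">
      <if cond="origin == destination">
        <prompt>Origin and destination must differ.</prompt>
        <clear namelist="destination"/>
      </if>
    </filled>
  </form>
</vxml>
```

Moving the mode or namelist attribute onto the <filled> inside the "origin" field, or listing a <block> or <initial> name in the namelist, is the kind of document the resolution says should produce error.badfetch.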
Email Trail:
From Arnaud Vallee
The bargeintype property is defined as follows: "speech: The prompt will be stopped as soon as speech or DTMF input is detected. The prompt is stopped irrespective of whether or not the input matches a grammar. " Would this mean that even if no dtmf grammar is active and the user enter a dtmf, the prompt should be stopped?
Resolution: accepted with modifications
Yes. If bargeintype is speech, then the prompt will be stopped as soon as speech or DTMF input is detected, regardless of whether the input matches a grammar. Whether or not DTMF grammars are active does not affect this. Setting the inputmodes property to voice should prevent DTMF from barging in on the prompts (although some platforms may have difficulty separating in-band DTMF from speech). The specification will be clarified as follows: addition of the words "and irrespective of which grammars are active." to the end of the sentence "The prompt is stopped irrespective of whether or not the input matches a grammar" from Table 38.
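A minimal sketch of the two properties discussed in this resolution is given below. The form id, field name, and prompt text are assumptions for illustration; only the property names and values come from the discussion above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="balance">
    <!-- Any detected input stops the prompt, whether or not it matches a grammar -->
    <property name="bargeintype" value="speech"/>
    <!-- Voice-only input: DTMF should not barge in, platform permitting -->
    <property name="inputmodes" value="voice"/>
    <field name="answer" type="boolean">
      <prompt bargein="true">Do you want to hear your balance?</prompt>
    </field>
  </form>
</vxml>
```

Removing the inputmodes property restores the default behavior described in the resolution, where a DTMF keypress stops the prompt even if no DTMF grammar is active.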
Email Trail:
From Guillaume Berche
0- Precise the value of the _dtmf special variable when a grammar element is specified in a choice element. As specified in the section "2.2 Menus", paragraph "Choice element": "If a <grammar> element is specified in <choice>, then the external grammar is used instead of an automatically generated grammar." However, in such case it is not clear what value will be assigned in the _dtmf special variable while executing an enumerate element. Suggested text modification to "2.2.4 ENUMERATE": "This specifier may refer to two special variables: _prompt is the choice's prompt, and _dtmf is the choice's assigned DTMF sequence. **If no DTMF sequence is assigned to the choice element or if a <grammar> element is specified in <choice> then the _prompt variable is assigned the ECMAScript undefined value.**"
Resolution: accepted with modifications
We accept the suggested text but will re-word it more precisely (e.g. '_dtmf' instead of '_prompt').
Email Trail:
From Guillaume Berche
1- Precise semantics of id attribute of form and menu The id attribute is optional according to the schema. However the specifications do not seem to precise how the interpreter should handle dialogs without specified id. Suggested text modification to section "2.1 Forms": "id The optional name of the form. If specified, the form can be referenced within the document or from another document. For instance <form id="weather">, <goto next="#weather">. **If not specified, an internal name is generated by the interpreter instead.**" Suggested text modification to section "2.2 Menus": "id The optional identifier of the menu. It allows the menu to be the target of a <goto> or a <submit>. **If not specified, an internal name is generated by the interpreter instead.**"
Resolution: rejected
If no explicit id is specified, then the developer is not interested in referring to the form or menu element. Whether or not the platform generates an internal name is a vendor-specific issue.
Email Trail:
From Guillaume Berche
2- Precise that <value> should be ignored if the expression resolves to ECMAScript undefined There are cases where it is difficult to know whether a variable (such as special variable as _dtmf) has a non-null value without writing an explicit if statement. To avoid this, it would be convenient if value elements would be silently ignored if their expressions resolved into the ECMAScript undefined value (whereas references to undeclared variables would keep throwing an error.semantic event). Suggested text modification to section "4.1.4 <value> Element": "expr The ECMAScript expression which provides the text to render, or resolves into a special variable such as _prompt or _dtmf as specified in section "2.2 Menus" paragraph "Enumerate element". If the expression resolves into the ECMAScript undefined value, then the value element is silently ignored. However, if the expression refers to an undeclared variable, then an error.semantic event is thrown."
Resolution: rejected
As pointed out, the developer can always write explicit code to check the value of variables. The value of providing a 'convenience' interpretation is not clear to us.
Email Trail:
From Guillaume Berche
3- Precise the value of _prompt when an option has no nested CDATA As specified in "2.3.1.3. Fields Using Option Lists": "The default assignment is the CDATA content of the <option> element with leading and trailing white space removed. If this does not exist, then the DTMF sequence is used instead." Since the value of the _prompt variable is computed from the CDATA content, what values is assigned to the _prompt variable when no CDATA content is available in an option element? If the undefined value is assigned to the _prompt special variable, would a <value expr="_prompt"> element fail? Suggested modification: "if no CDATA is available from the <option> or <choice> element, then the _prompt special variable is assigned the undefined ECMAScript value."
Resolution: rejected
Having considered various alternatives including your suggestion, the group felt that at this stage in the process it is better to leave the behavior undefined and thereby platform-specific. A later version of VoiceXML may provide a more optimal solution.
Email Trail:
From Guillaume Berche
4- precise the semantics of the value attribute of option elements Section "2.3.1.3. Fields Using Option Lists" specifies the following: "value The string to assign to the field's form item variable when a user selects this option, whether by speech or DTMF. The default assignment is the CDATA content of the <option> element with leading and trailing white space removed. If this does not exist, then the DTMF sequence is used instead. " However, the DTMF sequence is optional according to the schema. Consequently, it would be useful to precise the behavior if unspecified Suggested text modification to section "2.3.1.3. Fields Using Option Lists": "Each <option> element contains PCDATA that is used to generate a speech grammar. This follows the grammar generation method described for <choice> in Section 2.2. Attributes may be used to specify a DTMF sequence for each option and to control the value assigned to the field's form item variable. Each option should at least define a DTMF sequence through the dtmf attribute or contain CDATA content specifying the matching speech element, otherwise an error.badfetch event is thrown."
Resolution: accepted with modifications
We will modify the specification so that in the situation where neither CDATA content nor a dtmf sequence is specified, then the default for the value attribute is undefined and the form field item is not filled.
Email Trail:
From Guillaume Berche
5- Precise the format of the _dtmf special variable. Section "2.2 Menus", paragraph "Enumerate element" states that "specifier may refer to two special variables: _prompt is the choice's prompt, and _dtmf is the choice's assigned DTMF sequence." However it does not precise how the DTMF sequence is formatted (whether there are white space delimiters that makes the string suitable for direct inclusion within a speech prompt) Suggested text modification to section "2.2 Menus", paragraph "Enumerate element": "_prompt is the choice's prompt, and _dtmf is the choice's assigned DTMF sequence formatted as a string holding the DTMF keystrokes separated by white spaces (making it suitable for inclusion within a speech prompt)"
Resolution: accepted with modifications
The specification will be modified so that the format of _dtmf is a normalized representation of the dtmf sequence (i.e. single whitespace between DTMF tokens).
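To make the normalization concrete, here is a small sketch of a menu whose <enumerate> template reads both special variables. The menu and form ids, the choice texts, and the DTMF sequences are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>
      <enumerate>
        For <value expr="_prompt"/>, press <value expr="_dtmf"/>.
      </enumerate>
    </prompt>
    <!-- Written without whitespace; _dtmf would be expected to read "1 2" -->
    <choice dtmf="12" next="#sports">sports</choice>
    <!-- Already in normalized form; _dtmf would be expected to read "3 4" -->
    <choice dtmf="3 4" next="#weather">weather</choice>
  </menu>
  <form id="sports"><block>Sports news.</block></form>
  <form id="weather"><block>Weather report.</block></form>
</vxml>
```

With a single space between DTMF tokens, the value can be spoken directly in a prompt as separate keys rather than as one number.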
Email Trail:
From Guillaume Berche
6- Precise the semantics of the dtmf attribute of option elements Suggested modification to section "2.3.1.3. Fields Using Option Lists": "dtmf An **optional** DTMF sequence for this option. It is equivalent to a simple DTMF <grammar> and DTMF properties (Section 6.3.3) apply to recognition of the sequence. Unlike DTMF grammars, whitespace is optional: dtmf="123#" is equivalent to dtmf="1 2 3 #". **If unspecified, no DTMF grammar is associated to this option, meaning that this option can not be matched using a DTMF**" Rationale: it would make sense to add an option similar to the menu's dtmf attribute so that dtmf sequence is automatically generated. Without this attribute, how would an VXML author prevent the automatic generation of DTMF grammars that may override other grammars (such as links)? In addition, we would also need to specify what happens if a specified option's dtmf attributes overlaps an automatically assigned dtmf. Should this throw an "error.semantic" event as for choice elements or should we rather apply the default grammar precedence algorithm to select the matching element?
Resolution: accepted with modifications
We accept the suggested modification to 2.3.1.3 concerning the description of the dtmf attribute based on an alternative rationale; namely, that this is good clarification independent of the new features you mentioned in your rationale.
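The clarified description of the dtmf attribute can be illustrated with the sketch below. The field name, option texts, and values are assumptions made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="dinner">
    <field name="maincourse">
      <prompt>Please select an entree. <enumerate/></prompt>
      <!-- Selectable by saying "swordfish" or by pressing 1 -->
      <option dtmf="1" value="fish">swordfish</option>
      <!-- No dtmf attribute: this option can only be matched by speech -->
      <option value="beef">roast beef</option>
      <!-- Whitespace is optional: dtmf="23" is equivalent to dtmf="2 3" -->
      <option dtmf="23" value="chicken">chicken</option>
      <filled>
        <prompt>You chose <value expr="maincourse"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```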
Email Trail:
From Guillaume Berche
7- Precise semantics of Clear element. Section "5.3.3 CLEAR" states that "The <clear> element resets one or more form items" However, the definition of the namelist attribute adds that "this [i.e. the namelist] can include variable names other than form items" Besides, in the case where the namelist includes variable names other than form items, what is the variable scope in which the variable must be defined to be cleared? Since a Clear element is an executable which may be included in a catch element, which variable scope does it targets? In other words, would the reset of a non-form item variable target the anonymous, dialog, document or application-level scope? [In addition, the Clear element may be invoked outside of the FIA (such as during the document initialization), in which the notion of active element is not clear, so relying on the scope of the active element as the scope in which a variable should be cleared is ambiguous.] Suggested text modification to Section "5.3.3 CLEAR": "The <clear> element resets one or more form items, and possibly other variables which are not form items. For each specified variable name, the variable is resolved in the closest enclosing scope of the currently active element as described in section "5.1.3 Referencing Variables". To remove ambiguity, each variable name in the namelist may be prefixed with a scope name as described in section "5.1.3 Referencing Variables". Once a declared variable has been identified as declared in a given scope S, its value is assigned the ECMAScript undefined value. In addition, if the variable name corresponds to a form item in scope S, then the form item's prompt counter and event counters are reset."
Resolution: accepted with modifications
We accept that the clear element should be clarified as your text suggests. However, we will modify the wording so that (a) variable references are resolved relative to the current scope as described in section 5.1.3, and (b) in the case of initialization, variable references are handled the same as for other ECMAScript variables.
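A hedged sketch of what this clarification implies is shown below: a <clear> whose namelist mixes a form item with a variable from another scope. The variable and field names are hypothetical, and the use of a scope prefix inside namelist assumes the clarified wording permits it, as the commentator's suggested text proposes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-scope variable that is not a form item -->
  <var name="lastpin" expr="''"/>
  <form id="login">
    <field name="pin" type="digits">
      <prompt>Please enter your PIN.</prompt>
      <nomatch count="3">
        <!-- "pin" resolves to the form item (value set to undefined, prompt and
             event counters reset); "document.lastpin" is resolved by its scope
             prefix and is only set to undefined -->
        <clear namelist="pin document.lastpin"/>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```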
Email Trail:
From Guillaume Berche
8- Precise that var name attribute does not support scope prefixes Suggested text modification to section "5.3.1 VAR": "name The name of the variable that will hold the result. **Unlike the name attribute of assign element, this attribute should not contain dots (and in particular a scope prefix). The scope in which the variable is defined is determined from the position in the document at which the var element is declared.**"
Resolution: accepted with modifications
We accept the suggestion but will modify the text style for consistency with the rest of the document.
Email Trail:
From Guillaume Berche
9- Precise that the assign's name attribute does support scope prefixes The scope in which a variable is resolved is currently not clear. The accepted scope prefix in the name attribute is also not clear. Suggested text modification to section "5.3.2 ASSIGN" "name The name of the variable being assigned to. As specified in section "5.1.2 Variable Scopes", the corresponding variable should have been previously declared otherwise an error.semantic event is thrown. By default, the scope in which the variable is resolved is the closest enclosing scope of the currently active element. To remove ambiguity, the variable name may be prefixed with a scope name as described in section "5.1.3 Referencing Variables". Note however that the name must refer to a variable and can not refer to a property of an ECMAScript object or can not be a complex ECMAScript expression."
Resolution: accepted with modifications
We accept the suggested text modification but not the final line beginning "Note however" since it is permissible to assign to the property of an object; the second example in 5.3.2 makes this clear - <assign name="document.mycost" expr="document.mycost+14"/>.
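The contrast between the two name attributes (CR5-9 and CR5-10) can be sketched as follows. The <assign> line is the one quoted in the resolution; the surrounding document, the variable "item", and its property are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- <var>: plain name, no dots or scope prefix; scope follows from position
       in the document (here, document scope) -->
  <var name="mycost" expr="0"/>
  <form id="order">
    <block>
      <!-- <assign>: a scope prefix may be used to remove ambiguity -->
      <assign name="document.mycost" expr="document.mycost+14"/>
      <!-- Assigning to a property of an ECMAScript object is also permitted -->
      <var name="item" expr="new Object()"/>
      <assign name="item.price" expr="14"/>
    </block>
  </form>
</vxml>
```

Writing <var name="document.mycost"/> instead would be the case the clarified text for 5.3.1 is meant to rule out.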
Email Trail:
From Guillaume Berche
10- Precise evaluation order of log attributes versus nested text/value, and constraints on attributes Suggested modification to section "5.3.13 LOG": "label An **optional** string which may be used, for example, to indicate the purpose of the log. expr An **optional** ECMAscript expression evaluating to a string. " "The <log> element may contain any combination of text (CDATA) and <value> elements. The generated message consists of the concatenation of the evaluation of the ECMAscript expression followed in their respective order by the nested text and the string form of the value of the "expr" attribute of the <value> elements."
Resolution: accepted with modifications
We accept the clarification of 'optional' but not the last paragraph describing the order of evaluation - the order is already specified as document order.
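For reference, a small sketch of a <log> element using both optional attributes together with mixed content is given below. The variable names and label are hypothetical, and how the label and expr values appear in the log output is platform specific.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="total" expr="42"/>
  <form id="checkout">
    <block>
      <var name="count" expr="3"/>
      <!-- label and expr are both optional; the text and <value> content
           are evaluated in document order -->
      <log label="checkout" expr="'total=' + total">
        items: <value expr="count"/>
      </log>
    </block>
  </form>
</vxml>
```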
Email Trail:
From Guillaume Berche
11- Precise ordering of anonymous grammar generated for dtmfterm As specified in section "2.3.6. RECORD": "The <record> element contains a 'dtmfterm' attribute as a developer convenience. A 'dtmfterm' attribute with the value 'true' is equivalent to the definition of a local DTMF grammar which matches any DTMF input. " However, it is legal to have nested grammars in a record element. For instance, a DTMF grammar that matches only the # key. It is not clear which grammar would match because the precedence is not described. Suggested text modification to section "2.3.6. RECORD": "The <record> element contains a 'dtmfterm' attribute as a developer convenience. A 'dtmfterm' attribute with the value 'true' is equivalent to the definition of a local DTMF grammar which matches any DTMF input. Any nested grammar element will have precedence over this anonymous local grammar (even though usefulness of such nested grammar is not clear)."
Resolution: accepted with modifications
We accept the suggested clarification of the 'dtmfterm' attribute, but reject the suggested priority order when both the attribute and local grammars are specified. That is, we maintain that the dtmfterm attribute has priority over local grammars. Developers who want full control can omit the dtmfterm attribute and write their own local grammar.
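The "full control" option mentioned in this resolution might look like the sketch below, where only the # key terminates the recording. The form and rule ids and the prompt are hypothetical. Note that dtmfterm is set to false explicitly here, on the understanding that the attribute defaults to true when it is left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="message">
    <record name="msg" beep="true" maxtime="30s" dtmfterm="false">
      <!-- Hand-written local DTMF grammar: only the # key matches -->
      <grammar mode="dtmf" version="1.0" root="hash"
               type="application/srgs+xml">
        <rule id="hash" scope="public">#</rule>
      </grammar>
      <prompt>Record your message after the beep, then press pound.</prompt>
    </record>
  </form>
</vxml>
```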
Email Trail:
From Guillaume Berche
12- Precise the semantics of the timeout property for the record element The specs currently state the following "A timeout interval is defined to begin immediately after prompt playback (including the 'beep' tone if defined) and its duration is determined by the 'timeout' property. If the timeout interval is exceeded before recording begins, then a <noinput> event is thrown. " However, how the "recording begins" is not clearly defined. I would assume that when the platform supports speech recognition during recording, the recording begins as soon as speech is provided by the remote end. However the specification is not clear on whether in this case the platform should remove the silence from the end of the first beep prompt up to the first recognised speech. It is not clear either whether background noise or music should trigger beginning of recording. For platforms not supporting speech recognition during recording I believe this timeout property should be ignored. Suggested text modification to section "2.3.6. RECORD": "A timeout interval is defined to begin immediately after prompt playback (including the 'beep' tone if defined) and its duration is determined by the 'timeout' property. If the timeout interval is exceeded before recording begins, then a <noinput> event is thrown. When the platform supports detection of silence, the recording begins as soon as leading silence (following the 'beep' tone if defined) completes. Note that whether the recording would include the leading silence is platform specific. For platforms not supporting silence detection, this property is ignored and no <noinput> event is ever raised during a recording."
Resolution: accepted with modifications
We believe that when recording begins is clearly defined: in Section 2.3.6, it states:
"A recording begins at the earliest after the playback of any prompts (including the 'beep' tone if defined). As an optimization, a platform may begin recording when the user starts speaking."
i.e. the recording may include initial silence, etc. if the platform does not use the optimization (e.g. voice activity detection). With the optimization, the recording can begin with the user's speech. Whether music or other audio triggers voice activity detection is platform-specific. Note that this behavior applies independent of whether speech recognition is supported (while the recording and recognition processes use the same audio data stream, these processes are independent and therefore their voice activity detection mechanisms may be different).
The timeout interval is clearly defined: "A timeout interval is defined to begin immediately after prompt playback (including the 'beep' tone if defined) and its duration is determined by the 'timeout' property."
The timeout interval has an effect on both recording and recognition (which are logically independent).
For recording, the impact is specified in "If the timeout interval is exceeded before recording begins, then a <noinput> event is thrown." In the case of non-optimized recording, recording always begins after prompt playback, so <noinput> would never be thrown. With optimized recording, however, <noinput> may be thrown if no voice activity is detected before timeout interval elapses.
For recognition, the situation is more complex. We are modifying the specification (due to implementation report feedback) so that if recognition is supported during recording (this is an optional feature), then only non-local speech grammars are active. If a non-local speech grammar is matched by audio input, then execution is immediately transferred to its enclosing element. This raises the issue of whether a <noinput> or <nomatch> could be thrown by the recognition process. A <noinput> could be generated if the timeout interval has elapsed. A <nomatch> could be generated if the audio triggers recognition but does not match the active grammar. Our belief is that throwing these events by the recognition process during recording is undesirable and not what VoiceXML authors expect. Consequently, we are considering clarifying the specification to make it clear that <noinput> and <nomatch> events are never thrown from the recognition process during recording.
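The recording side of this discussion can be sketched with the following document. The form id, prompt wording, and attribute values are assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="voicemail">
    <!-- Timeout interval starts after prompt playback, including the beep -->
    <property name="timeout" value="5s"/>
    <record name="greeting" beep="true" maxtime="10s"
            finalsilence="4000ms" type="audio/x-wav">
      <prompt>At the tone, please say your greeting.</prompt>
      <!-- Reached only on platforms that start recording on voice activity
           and detect none before the timeout interval elapses -->
      <noinput>
        I didn't hear anything, please try again.
      </noinput>
    </record>
    <block>
      <prompt>Your greeting is <audio expr="greeting"/>.</prompt>
    </block>
  </form>
</vxml>
```

On a platform without the optimization, recording starts right after the beep, so the <noinput> handler above would not be expected to run.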
Email Trail:
From Guillaume Berche
13- Precise that maxtime record attribute is mandatory and has no defaults Suggested text modification to section "2.3.6. RECORD": "maxtime The maximum duration to record. **This attribute must be specified as it has no default value. If not specified an error.badfetch event is thrown.**"
Resolution: rejected
The default value of the maxtime attribute is already specified as platform-dependent (see Table 16).
Email Trail:
From Guillaume Berche
14- Precise that if value is used outside of a prompt element it inherits
default prompt parameters
The prompt element defines that if its attributes are not specified, they
default to values specified by properties. However, for the value element,
the specification do not precise how default values are computed.
Suggested text addition to section "4.1.4 <value> Element":
"The manner in which the value attribute is played is controlled by the
surrounding speech synthesis markup in the case the expression resolves to a
string. In the case the expression resolves to a special variable such as
_prompt, then the prompt attributes are inherited from the enclosing element
of the definition of the referenced element.
If no surrounding prompt element nor SSML tag is available, then the default
attributes of a prompt element (such as bargein, timeout or language) are
applied.
Consequently, the two following constructions are equivalent.
<catch event="noinput">
<value expr="'please retry'"/>
</catch>
<catch event="noinput">
<prompt>
<value expr="'please retry'"/>
</prompt>
</catch>
"
Resolution: accepted with modifications
We accept that clarification is required but not the proposed modification. We will clarify in 4.1.2 that for cases where prompt content is specified without prompt element then attributes are defined as specified in table 33.
Email Trail:
From Guillaume Berche
1- Precise that buffered non-matching DTMF are discarded when an ASR grammar
matches.
It is unclear in the specifications whether the following document
<form id="form1">
<field>
<grammar src="builtin:grammar/boolean"/>
<grammar src="builtin:dtmf/digits?length=4"/>
</field>
<filled>
<goto next="#form2"/>
</filled>
</form>
<form id="form2">
<field>
<grammar src="builtin:dtmf/digits?length=1"/>
</field>
<filled>
<prompt>thanks for the dtmf</prompt>
</filled>
<noinput>
<prompt>DTMF was discarded</prompt>
</noinput>
</form>
By pressing the 1 key and speaking "yes" and waiting for the input timeout.
Should the interpreter play the "thanks for the dtmf" prompt or the "DTMF
was discarded" prompt?
Suggested solution: specify that partially buffered data are flushed in case
of grammar match in another mode.
Resolution: accepted with modifications
We will modify the specification to make it clear that this is a platform-specific issue (i.e. platforms may differ in whether or not they discard buffered non-matching DTMF when an ASR grammar matches).
Email Trail:
From Guillaume Berche
2a- Rationale for not accepting local ruleref in inline SRGS grammars? Can you please provide rationale for not accepting ruleref elements with pure fragment URLs? Why would this be rejected in grammars provided inline in VXML documents? What is the reason driving this restriction and forcing to use remote grammars for any grammar using private rules?
Resolution: accepted with modifications
This is probably a misunderstanding on both sides. In section 3.1.1.4, the paragraph beginning "When referencing an external grammar, the value of src attribute ...", describes which values for the src attribute are permitted and which are not (the last paragraph of this section). It makes no statement about inline grammars. In particular, "Local rule reference: a fragment-only URI is not permitted. (See definition in Section 2.2.1 of [SRGS]). A fragment-only URI value for the src attribute causes an error.semantic event." is intended to indicate that it is not permitted to have a fragment-only URI value for the src attribute in a VoiceXML <grammar> element. The simplest clarification is to start the last paragraph of this section "**And** the following are the forms of rule reference defined by [SRGS] that are not supported in VoiceXML 2.0. ...". For <ruleref>s in inline grammars, it is possible to refer to rules within the same grammar, or in an external grammar. What is not possible is to reference rules within a different inline grammar in a VoiceXML document, since the URI is then pointing at a VoiceXML document, not a grammar document. We believe that this is clearly implied by VoiceXML and SRGS (especially with the clarification above) and that a separate clarification is not required.
Email Trail:
From Guillaume Berche
3- Precise that when transitionning to a document (without fragment in the URI) and the transitionned document has no form, then the interpreter exits Rationale: it can not be requested that every document have at least a dialog (because a root application may only define variables or links), however when transitionning to a document (without specifying a dialog) and this document has no dialog defined, then the execution stops. Suggested modification to section "5.3.7 GOTO" "If the form item, dialog or document to transition to is not valid (i.e. the form item, dialog or document does not exist), an error.badfetch must be thrown. Note that for errors which occur during a dialog or document transition, the scope in which errors are handled is platform specific. For errors which occur during form item transition, the event is handled in the dialog scope. If the document to transition has no dialog defined (and no specific dialog was specified), then the execution stops."
Resolution: rejected
We believe it is already precise: a document to transition to without dialog is not valid, so an error.badfetch is thrown as already stated in 5.3.7.
Email Trail:
From Guillaume Berche
4- Precise Prompt selection algorithm when the Prompt element appears as executable content. It does not seem clear from the examples provided in section "4.1.6 Prompt Selection" whether the "prompt tappering" mechanism is supposed to be applied when a prompt element appears as executable content. For instance in the following case: <field ...> <help> <prompt count="1"> prompt 1 </prompt> <prompt count="3"> prompt 2 </prompt> <goto next="#form2"/> <prompt count="4"> prompt 3 </prompt> </help> </field> Which prompt should be heard when the prompt counter of the current form item (the field in this same) is 4? Applying the algorithm described in section "4.1.6 Prompt Selection" would result in having the "prompt 3" speech text to be heard, however it would be very confusing from the VXML author point of view because it would be expected that after the goto element no more executable content would be executed as specified in Appendix C in the definition of the "execute" term. Suggested modification to section "4.1.6 Prompt Selection": "Each input item, <initial>, and menu has an internal prompt counter that is reset to one each time the form or menu is entered. Whenever the system uses a prompt, its associated prompt counter is incremented. This is the mechanism supporting tapered prompts within form item elements. **When a prompt element is specified as executable content (e.g. inside a catch or filled element) then its count element is ignored and all prompts contained in this element as queued in document order)**"
Resolution: rejected
As stated in 5.3.5 the count attribute on prompts in executable content is meaningless.
Email Trail:
From Guillaume Berche
5- Precise the value of name$.inputmode when a transfer is not interrupted by user input Suggested modification to "Table 22: <transfer> Shadow Variables" "name$.inputmode The input mode of the terminating command (dtmf or voice) or **undefined if the transfer was not interrupted by a grammar match**"
Resolution: accepted
We will apply the suggested modification.
Email Trail:
From Guillaume Berche
6- Correct typo in example of Section "4.1.3 Audio Prompting" The extension of the file should rather be .vxml to not introduce confusion. "<goto next="./make_bid.html"/>"
Resolution: accepted
We will correct the typo.
Email Trail:
From Guillaume Berche
7- Precise that alternate audio is recursive:
According to the schema, the following vxml fragment is legal
<prompt>
<audio src="http://www.dummy.org/main.wav" >
<audio src="http://www.dummy.org/alternate1.wav" >
<audio src="http://www.dummy.org/alternate2.wav"/>
</audio>
</audio>
</prompt>
Can you please confirm my understanding of the specification: I understand
that if both main.wav and alternate1.wav can not be played, but
alternate2.wav can be played, then alternate2.wav will be played and no
error will be thrown.
Resolution: accepted
Your understanding is correct. No modifications will be made to the text since we believe this is sufficiently clear already.
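The confirmed behaviour can be sketched as a recursive search for the first playable source. This is a simplified illustration only: `first_playable` and the `can_play` callback are hypothetical names, and the sketch looks only at nested <audio> children, ignoring any other alternate content (text or SSML) an <audio> element may contain.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

PROMPT = """<prompt>
  <audio src="http://www.dummy.org/main.wav">
    <audio src="http://www.dummy.org/alternate1.wav">
      <audio src="http://www.dummy.org/alternate2.wav"/>
    </audio>
  </audio>
</prompt>"""


def first_playable(audio: ET.Element,
                   can_play: Callable[[str], bool]) -> Optional[str]:
    """Return the src of the first playable <audio>, outermost first."""
    src = audio.get("src")
    if src is not None and can_play(src):
        return src
    # This source cannot be played: fall back to the nested alternate audio.
    for child in audio:
        if child.tag == "audio":
            found = first_playable(child, can_play)
            if found is not None:
                return found
    return None


outer = ET.fromstring(PROMPT).find("audio")
```

With a `can_play` callback that accepts only alternate2.wav, the function walks past main.wav and alternate1.wav and returns the innermost source without signalling an error.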
Email Trail:
From Guillaume Berche
8- Precise behavior of submit if undeclared/unvalid variables are references in submit's namelist attributes The specifications section "5.3.8 SUBMIT" states the following "The list of variables to submit. By default, all the named input item variables are submitted. If a namelist is supplied, it may contain individual variable references which are submitted with the same qualification used in the namelist. Declared VoiceXML and ECMAScript variables can be referenced." It does not specify the expected behavior in case an undeclared variable or an invalid variable name is referenced in the namelist attribute. Suggested modification to section "5.3.8 SUBMIT": "namelist The list of variables to submit. By default, all the named input item variables are submitted. If a namelist is supplied, it may contain individual variable references which are submitted with the same qualification used in the namelist. Declared VoiceXML and ECMAScript variables can be referenced. **If an undeclared or invalid variable name is referenced then an "error.semantic" event is thrown**"
Resolution: accepted with modifications
We will modify the specification to clarify that an error.semantic is thrown when an undeclared variable is referenced, including reference within the namelist of a submit element (as well as exit, return, and subdialog elements).
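A minimal sketch of the clarified rule, assuming a hypothetical interpreter that keeps the declared variable names in a set. `SemanticError` stands in for the error.semantic event, and `check_namelist` is an illustrative name, not part of any VoiceXML API.

```python
from typing import Iterable, List


class SemanticError(Exception):
    """Stand-in for the VoiceXML error.semantic event."""


def check_namelist(namelist: str, declared: Iterable[str]) -> List[str]:
    """Return the names to submit, or raise if any name is undeclared."""
    declared_set = set(declared)
    names = namelist.split()  # namelist is a space-separated list
    for name in names:
        if name not in declared_set:
            raise SemanticError("undeclared variable in namelist: " + name)
    return names
```

The same check would apply to the namelist of the exit, return, and subdialog elements mentioned in the resolution.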
Email Trail:
From Guillaume Berche
11- Typo in section "2.3.6. RECORD" The second sentence of the extract below seems incomplete, I don't get the impact of the timeout interval on having a record variable unfilled. "If no audio is collected during execution of <record>, then the record variable remains unfilled (note). This can occur, for example, when DTMF or speech input is received during prompt playback or the timeout interval (if the developer wants input during prompt playback to initiate recording, then prompts should be placed in an immediately preceding <field> with a zero timeout). "
Resolution: accepted
We will modify the text so that the second sentence reads "This can occur, for example, when DTMF or speech input is received during prompt playback or *before* the timeout interval *expires* ..."
Email Trail:
From Guillaume Berche
12- Typo in "Last Call Disposition of Comments" The table in section "2. Comments" has an invalid "disposition" content: all items are marked as accepted whereas this is not the case.
Resolution: accepted
No action since this document will be replaced by a CR disposition of comments document.
Email Trail:
From Max Froumentin
I would like to object that all the examples in VoiceXML2 come with an
XML declaration and a schemaLocation attribute. It makes the language
appear unneccesarily complex. The Hello World example would be much
simpler as:
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
<form>
<block>Hello World!</block>
</form>
</vxml>
schemaLocation bothers me more than by just making the examples hard to
read. It suggests that the declaration is mandatory (which the XMLSchema
refutes), or even that the use of the schema is.
Resolution: rejected
It is good practice to provide the XML declaration (even though it is not mandatory). Providing the schemaLocation allows documents to be validated automatically by various tools, although as you correctly point out neither the attribute nor the schema is mandatory.
Email Trail:
From Matt Porter
this has to do with Guillaume's question...
let me elaborate on an issue with <record> that i dont understand. Given this dialog...
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0">
<form>
<record name="msg" beep="true" maxtime="10s" finalsilence="4000ms" dtmfterm="true" type="audio/x-wav">
<prompt timeout="5s">Record a message after the beep.</prompt>
<noinput>
I didn't hear anything, please try again.
</noinput>
</record>
</form>
</vxml>
what if the user does not say anything ( no audio is collected because of
silence detection or whatever ), but terminates the recording with a DTMF. it
seems to me the "termchar" shadow variable should hold the key they pressed,
and the "noinput" event would still be thrown...is this correct?
The <record> section seems to need more clarification....
Resolution: rejected with modifications
If dtmfterm is set to true, recording is terminated when any DTMF key is pressed ("Any DTMF keypress matching an active grammar terminates recording") but if no audio has been collected, then the record variable is not filled ("If no audio is collected during execution of <record>, then the record variable remains unfilled.") and consequently no shadow variables are assigned. The FIA then applies as normal without a noinput event being thrown; in your example, the prompt would be read again and another attempt at recording initiated. This is analogous to the situation with complex grammar results which don't assign any values to form input item variables: no noinput event is thrown and the FIA applies as normal. Finally, note that there may be information available in these situations via application.lastresult$ as described in 5.1.5. We will modify the specification to make it clearer that information may be available via application.lastresult$ in these situations.
Email Trail:
From John Voger
Under section 3.1.1.3 Grammar Weight. The last paragraph contains ..... real speech and textual data on a paricular platform." Please replace "paricular" with "particular"
Resolution: accepted
We will correct the typo.
Email Trail:
From Philippe Le Hegaret
[ECMASCRIPT]
" Standard ECMA-262 ECMAScript Language Specification ",
Standard ECMA-262, December 1999.
See http://www.ecma.ch/ecma1/STAND/ECMA-262.htm
should read
[ECMASCRIPT]
" Standard ECMA-262 ECMAScript Language Specification ",
Standard ECMA-262, December 1999.
See
http://www.ecma-international.org/publications/standards/ECMA-262.HTM
Resolution: accepted
We will update the reference.
Email Trail:
1. Several complex type definitions in vxml.xsd have <choice> model
groups that contain a single particle consisting of a reference to a
group. For example:
<xsd:complexType name="basic.event.handler" mixed="true">
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:group ref="executable.content" />
</xsd:choice>
<xsd:attributeGroup ref="EventHandler.attribs" />
</xsd:complexType>
Since the particle in the group executable.content is also a <choice>,
this content model becomes
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:choice>
<xsd:group ref="audio"/>
<xsd:element ref="assign"/>
<xsd:element ref="clear"/>
... ...
</xsd:choice>
</xsd:choice>
The outer <choice> is clearly redundant. The complex type definition
can be simplified to:
<xsd:complexType name="basic.event.handler" mixed="true">
<xsd:group ref="executable.content" minOccurs="0"
maxOccurs="unbounded" />
<xsd:attributeGroup ref="EventHandler.attribs" />
</xsd:complexType>
We think such a simplification makes the schema easier to follow and
we recommend the change.
Resolution: accepted
Change applied.
Email Trail:
2. Some contents may usefully be constrained more tightly than the
schema now constrains them. For example, the <if> element is declared
as:
<xsd:element name="if">
<xsd:complexType mixed="true">
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:group ref="executable.content" />
<xsd:element ref="elseif" />
<xsd:element ref="else" />
</xsd:choice>
<xsd:attributeGroup ref="If.attribs" />
</xsd:complexType>
</xsd:element>
Since there is no order or occurence constraint, instances such as the
following are all valid, which seems too flexible.
<if>
...
<else/>
...
<else/>
...
<elseif/>
...
</if>
The content can be changed to the following to ensure that all
<elseif> elements occur before <else> and that there is no more than
one <else> element:
<xsd:element name="if">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:group ref="executable.content" minOccurs="0"
maxOccurs="unbounded" />
<xsd:sequence minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="elseif" />
<xsd:group ref="executable.content" minOccurs="0"
maxOccurs="unbounded" />
</xsd:sequence>
<xsd:sequence minOccurs="0" maxOccurs="1">
<xsd:element ref="else" />
<xsd:group ref="executable.content" minOccurs="0"
maxOccurs="unbounded" />
</xsd:sequence>
</xsd:sequence>
<xsd:attributeGroup ref="If.attribs" />
</xsd:complexType>
</xsd:element>
(In passing, we note that on general principles, we believe the
language would be easier to describe and use if the 'elseif' and
'else' elements (and a 'then' element) were not empty elements
followed by appropriate executable content, but non-empty elements
which contained the appropriate executable content. We recognize
that this may not be a feasible change at this stage in the life of
VoiceXML.)
Resolution: accepted
Change applied. We will look into changing the if-then-else structure in a future version of the language.
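The ordering that the tightened content model enforces can be checked with a short sketch. The one-letter encoding of child elements and the function name are assumptions made for illustration; the regular expression mirrors the proposed sequence of executable content, then zero or more <elseif> groups, then at most one <else> group.

```python
import re

# Child sequence encoded one letter per child element:
# x = executable content, i = <elseif/>, e = <else/>
IF_CONTENT = re.compile(r"x*(ix*)*(ex*)?")


def valid_if_children(children):
    """Return True if the child element names satisfy the tightened model."""
    code = {"elseif": "i", "else": "e"}
    encoded = "".join(code.get(name, "x") for name in children)
    return IF_CONTENT.fullmatch(encoded) is not None
```

A sequence such as else, else, elseif, which the original <choice> model accepted, is rejected by this check.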
Email Trail:
3. The element "output" in vxml.xsd is declared as abstract, and not used or referenced anywhere else. The declaration may be removed.
Resolution: accepted
Element removed.
Email Trail:
4. The VariableName.datatype in vxml-datatypes.xsd has a pattern: <xsd:pattern value="['$'\c]+" /> The character '$' in the range doesn't need the quotation mark, and as written the value will accept single quotation marks where a dollar sign or \c is expected. We suspect this is not intended.
Resolution: accepted
Change applied.
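The effect of the stray quotation marks can be reproduced with an ordinary regular expression engine. This sketch makes two assumptions: Python's `re` module has no equivalent of the XML Schema `\c` escape, so `\w.:\-` is used as a rough stand-in, and the corrected pattern is taken to be the same character class with the quotation marks removed, which the resolution does not spell out.

```python
import re

# Rough, assumed approximation of the XML Schema \c (name character) escape.
NAME_CHARS = r"\w.:\-"

# Pattern as written in the schema: the quotes are members of the class.
AS_WRITTEN = re.compile(r"['$'" + NAME_CHARS + r"]+")

# Assumed corrected pattern: a dollar sign or a name character only.
CORRECTED = re.compile(r"[$" + NAME_CHARS + r"]+")


def accepts(pattern, value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None
```

Both patterns accept a shadow variable name such as name$.inputmode, but only the pattern as written also accepts a value containing a single quotation mark.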
Email Trail:
5. The ContentType.datatype in vxml-datatypes.xsd is defined as a list of string. Since string may contain whitespaces, the definition should perhaps be changed to a list of token; this is less subject to misunderstanding by readers of the schema.
Resolution: accepted
Change applied.
Email Trail:
6. According to the comments in the annotations,
VariableNames.datatype, RestrictedVariableNames.datatype, and
EventNames.datatype are lists of atomic VariableName.datatype,
RestrictedVariableName.datatype and EventName.datatype
respectively. We believe they should be defined as such rather than as
NMTOKENS or other types:
<xsd:simpleType name="RestrictedVariableNames.datatype">
<xsd:annotation>
<xsd:documentation>space separated list of restricted
variable names </xsd:documentation>
</xsd:annotation>
<xsd:list itemType="RestrictedVariableName.datatype"/>
</xsd:simpleType>
<xsd:simpleType name="VariableNames.datatype">
<xsd:annotation>
<xsd:documentation>space separated list of variable names
including shadow variables</xsd:documentation>
</xsd:annotation>
<xsd:list itemType="VariableName.datatype"/>
</xsd:simpleType>
<xsd:simpleType name="EventNames.datatype">
<xsd:annotation>
<xsd:documentation>space separated list of
EventName.datatype</xsd:documentation>
</xsd:annotation>
<xsd:list itemType="EventName.datatype"/>
</xsd:simpleType>
Resolution: accepted
Change applied.
Email Trail:
7. Some suggestions for simple type Repeat-prob.datatype in grammar-core.xsd: a. The base type might better be made decimal instead of float. It should be noted that decimal is not a subtype of float and their mappings from the lexical space to the value space are different. For example, '1.1' may be rounded to some float value different from exactly 1.1. Such behavior is not expected in decimal. b. The maxInclusive value is 1.0, while the patterns allow any positive values less than 10. They should be made consistent. c. The pattern ([0-9]+)? should probably be replaced with the equivalent pattern [0-9]*.
Resolution: accepted
Changes applied.
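Points (a) and (c) of the comment can be illustrated outside a schema processor. The sketch assumes that xsd:float behaves like an IEEE 754 single-precision number, emulated here with `struct`, and that xsd:decimal behaves like Python's `Decimal`; the helper names are hypothetical.

```python
import re
import struct
from decimal import Decimal


def as_float32(lexical):
    """Round a lexical value to single precision, as xsd:float would."""
    return struct.unpack("f", struct.pack("f", float(lexical)))[0]


def float_is_exact(lexical):
    """True if the single-precision value equals the decimal value exactly."""
    return Decimal(as_float32(lexical)) == Decimal(lexical)


def same_language(samples):
    """True if ([0-9]+)? and [0-9]* agree on every sample string."""
    optional_group = re.compile(r"([0-9]+)?")
    star = re.compile(r"[0-9]*")
    return all(
        (optional_group.fullmatch(s) is None) == (star.fullmatch(s) is None)
        for s in samples
    )
```

A value such as 0.5 has an exact binary representation and survives the float mapping unchanged, while 1.1 does not, which is the rounding behaviour the comment warns about.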
Email Trail:
8. The commented-out pattern constraint in RestrictedVariableName.datatype in vxml.xsd needs to be removed or fixed.
Resolution: accepted
Change applied.
Email Trail:
From Guillaume Berche
1- precise behavior when only activated grammars are disabled by "inputmodes"
property
In the following example, what is the expected behavior? Should an
error.semantic be thrown as would if no grammar was activated as described in
section "3.1.4 Activation of Grammars"? Should the grammars considered rather
as activated but would not match as described in section "6.3.6 Miscellaneous
Properties" (inputmodes property) ", and thus lead to a nomatch event to be
thrown?
Section "3.1.4 Activation of Grammars" states that "If no grammars are active
when an input is expected, the platform must throw an error.semantic event".
Section "6.3.6 Miscellaneous Properties" states that "For instance, voice-only
grammars may be active when the inputmode is restricted to DTMF. Those
grammars would not be matched, however, because the voice input modality is
not active. "
<menu>
<prompt>
Choose wind speed and after temperature then finaly ask for leave choice test.
</prompt>
<choice next="#exacte_rain"> rain humidity </choice>
<choice next="#approx_wind"> wind speed </choice>
<choice next="#approx_weat">temperature celcius</choice>
<choice next="#exacte_leave">Leave choice test </choice> </menu>
Suggested modification to Section "6.3.6 Miscellaneous Properties" (inputmodes
definition) "[..] For instance, voice-only grammars may be active when the
inputmode is restricted to DTMF. Those grammars would not be matched, however,
because the voice input modality is not active. If among all grammars active
none can be matched because their associated input modality is not enabled,
then a nomatch event is thrown."
Resolution: rejected
Your question is not very clear but given