Copyright © 2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This document details the responses made by the Voice Browser Working Group to issues raised during the Candidate Recommendation (beginning 13 June 2005 and ending 05 September 2006) review of Voice Extensible Markup Language (VoiceXML) Version 2.1. Comments were provided by Voice Browser Working Group members, other W3C Working Groups, and the public via the www-voice-request@w3.org (archive) mailing list.
This document of the W3C's Voice Browser Working Group describes the disposition of comments as of 5th September 2006 on the Candidate Recommendation of Voice Extensible Markup Language (VoiceXML) Version 2.1. It may be updated, replaced or rendered obsolete by other W3C documents at any time.
For background on this work, please see the Voice Browser Activity Statement.
This document describes the disposition of comments in relation to Voice Extensible Markup Language (VoiceXML) Version 2.1 (http://www.w3.org/TR/2005/CR-voicexml21-20050613/). Each issue is described by the name of the commentator, a description of the issue, and either the resolution or the reason that the issue was not resolved.
The full set of issues raised for the Voice Extensible Markup Language (VoiceXML) Version 2.1 since 13th June 2005, their resolution and in most cases the reasoning behind the resolution are available from http://www.w3.org/Voice/Group/2005/voicexml21-cr.html [W3C Members Only]. This document provides the analysis of the issues that were submitted and resolved as part of the Candidate Recommendation Review. It includes issues that were submitted outside the official review period, up to 21st October 2005.
Notation: Each original comment is tracked by a "(Change) Request" [R] designator. Each point within that original comment is identified by a point number. For example, "R5-1" is the first point in the fifth change request for the specification.
Item | Commentator | Nature | Disposition |
---|---|---|---|
R111-1 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-2 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-3 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-4 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-5 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-6 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-7 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-8 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-9 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-10 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-11 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-12 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-13 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-14 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-15 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-16 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted (no reply) |
R111-17 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-18 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-19 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Accepted |
R111-20 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-21 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R111-22 | Bjoern Hoehrmann | Clarification / Typo / Editorial | Rejected |
R113 | David Scarratt | Clarification / Typo / Editorial | Accepted (no reply) |
R114-1 | Andrew Hunt | Clarification / Typo / Editorial | Unknown |
R114-2 | Andrew Hunt | Clarification / Typo / Editorial | Unknown |
R114-3 | Andrew Hunt | Clarification / Typo / Editorial | Unknown |
R114-4 | Andrew Hunt | Clarification / Typo / Editorial | Unknown |
R115 | Andrew Hunt | Change to Existing Feature | Unknown |
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ appendix C.1 states "The DTD subset must not be used to override any parameter entities in the DTD." It's not clear why this is required; it seems acceptable to modify e.g. ATTLIST declarations and other parts of the DTD, and it is not required that VoiceXML 2.1 documents are valid XML documents. It's difficult to verify whether this requirement is satisfied, so I would suggest to drop it.
Resolution: Accepted (w/modifications)
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ appendix C.1 states
[...] A conforming VoiceXML 2.1 document is a well-formed [XML] document that requires only the facilities described as mandatory in this specification and in [VXML2]. Such a document must meet all of the following criteria: [...]
It's not clear to me how a document can require facilities, what it means for a feature to be described as "mandatory" and the reference to VoiceXML 2.0 seems to imply that only the intersection of mandatory VoiceXML 2.0 and VoiceXML 2.1 facilities can be used, which seems to make little sense.
It is further not clear whether the statement above is the complete definition of conforming VoiceXML 2.1 documents, or whether the list that follows it specified additional requirements (or whether the list of requirements is equivalent to the statement).
Please change the document such that VoiceXML 2.1 document conformance is well-defined.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ appendix C.3 states "When a Conforming VoiceXML 2.1 Processor encounters a non-Conforming VoiceXML 2.0 or 2.1 document, its behavior is undefined." This seems to contradict other sections, e.g. section 5 states for the <data> element
Exactly one of "src" or "srcexpr" must be specified; otherwise, an error.badfetch event is thrown.
Assuming that a document that violates this requirement is not conforming, the document does define error processing. Please remove this contradiction.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ has an appendix E.2 "Other References"; it should be "Informative References" for consistency with other Technical Reports like VoiceXML 2.0.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ section 5 is unclear what happens if the src attribute value of the data element includes a fragment identifier; please specify processing in this case.
Resolution: Rejected
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ encourages use of non-standard media types like audio/x-wav and audio/x-alaw-basic. It is inappropriate for W3C specifications to encourage use of such types. I see three ways to resolve this: the requirement to support these types is removed, more suitable media types are identified and required instead of the non-standard ones, or W3C registers the media types for use in the specification.
Resolution: Deferred
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ requires support for the multipart/form-data media type, but fails to include a normative reference to the definition of the type. Please add normative reference to RFC 2388 and refer to the reference from the relevant section(s).
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ states in section 5
If the name attribute is present, and the returned document is XML, the VoiceXML interpreter must expose the retrieved content via a read-only subset of the DOM as specified in Appendix D.
It's not clear how implementations must determine whether a document is XML. For example, if the document is an HTTP resource labeled with a media type, which media types would indicate that the content is "XML"? It seems a text/plain document would not be "XML", but an application/xhtml+xml document would; please change the document such that it is clear which media types must and must not be considered "XML" by VoiceXML 2.1 implementations.
The document also notes
If the media type of the retrieved content is "text/xml" but the content is not well-formed XML, the interpreter throws error.badfetch.
It's not clear whether this also applies if the media type is e.g. application/xml, please change the document such that this is clarified.
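The question raised in this comment can be made concrete with a small sketch. The function below is hypothetical and is not the rule adopted by the Working Group; it assumes a simplified form of the RFC 3023 convention, under which text/xml, application/xml, and any type with a "+xml" suffix are treated as XML.

```javascript
// Hypothetical helper, not part of VoiceXML 2.1: classifies a Content-Type
// value as XML using a simplified reading of the RFC 3023 conventions.
function isXmlMediaType(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const type = contentType.split(";")[0].trim().toLowerCase();
  return type === "text/xml" ||
         type === "application/xml" ||
         type.endsWith("+xml");
}

console.log(isXmlMediaType("text/plain"));                             // false
console.log(isXmlMediaType("application/xml"));                        // true
console.log(isXmlMediaType("application/xhtml+xml; charset=utf-8"));   // true
```

Under a rule of this kind, the well-formedness check quoted above for "text/xml" would apply equally to application/xml and to "+xml" types, which is the ambiguity the commentator asks to have resolved.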
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ is unclear about which media type to use for VoiceXML 2.1 content. VoiceXML 2.0 notes that application/voicexml+xml might be registered, but this is not clear from VoiceXML 2.1; or what the definition of the type would be. Please change VoiceXML 2.1 to clearly indicate appropriate media types and which of the types must be supported by implementations.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ does not conform to http://www.w3.org/TR/2005/REC-charmod-20050215/ the W3C Character Model Recommendation. For example, VoiceXML 2.1 implementations are not required to conform to the character model even though that is required for charmod conformance. I think all new W3C Technical Reports should comply with the Character Model Recommendation, please change VoiceXML 2.1 accordingly.
Resolution: Accepted (w/modifications)
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ notes that "The handling of a single application that mixes VoiceXML 2.0 and VoiceXML 2.1 functionality is platform-specific." So far it seems that VoiceXML 2.1 is a superset of VoiceXML 2.0, so it's not really clear how one could mix functionality. Please either remove this note or change the document such that it is clear what kind of functionality mix should be avoided by authors so they can avoid implementation-defined behavior.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ defines a DOM Level 2 Core subset; it is unclear why DOM Level 2 has been chosen here, it seems DOM Level 3 Core should be referenced instead even if the feature set remains the same. Please change the draft to either refer to DOM 3 instead, or such that it is clear why DOM Level 2 is referenced.
Resolution: Rejected
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ has a normative reference to SSML that refers to a December 2002 Working Draft in the prose and links to December 2003 Candidate Recommendation. It seems http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/ should be referenced instead, please change the document accordingly.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ is unclear about which ECMAScript media types must be supported by VoiceXML 2.1 implementations that support external scripts delivered using a MIME-like protocol like HTTP. The only appropriate media type for such content is application/ecmascript; as this is a new media type and existing ECMAScript implementations do not interoperate well in this regard, other W3C Technical Reports require support for this type. I think VoiceXML 2.1 should do the same and require support for the application/ecmascript media type as defined in
http://www.ietf.org/internet-drafts/draft-hoehrmann-script-types-03.txt
Resolution: Rejected
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ inherits the <script> element defined in VoiceXML 2.0. The definition of that element includes a "charset" attribute which is defined as
The character encoding of the script designated by src. UTF-8 and UTF-16 encodings of ISO/IEC 10646 must be supported (as in [XML]) and other encodings, as defined in the [IANA], may be supported. The default value is UTF-8.
It's unclear when implementations would use the value of the attribute to decode external scripts. For example, it seems that implementations must ignore the attribute when the script is transported via a MIME-like mechanism like HTTP and the encoding is specified in the charset parameter in the Content-Type field. Please change VoiceXML such that VoiceXML 2.1 implementations process script content in a manner consistent with
http://www.ietf.org/internet-drafts/draft-hoehrmann-script-types-03.txt
and other applicable specifications.
Resolution: Accepted (w/modifications)
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ notes that
[...] Interpreters that support both VoiceXML 2.0 and VoiceXML 2.1 must support the ability to transition from an application of one version to an application of another version. [...]
This implies that VoiceXML 2.1 implementations are not required to support VoiceXML 2.0 documents, it's however not clear why. Please either change the document such that 2.1 implementations must also support 2.0 documents or such that it is explained why VoiceXML authors might find implementations that only support VoiceXML 2.1.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ notes that 'It is recommended that the <vxml> element also include "xmlns:xsi" and "xsi:schemaLocation" attributes'; the document does that only for one example and I think "SHOULD" is much too strong here, I would suggest to turn that into a MAY or remove it; failing that, the examples in the document should be changed to comply with the requirement.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ defines the RFC 2119 keywords in the SotD section. These should be in a normative part of the document, e.g. in a "Terminology" section as in XML 1.0.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ uses namespace-unaware DOM methods like getAttribute() and getElementsByTagName() in some of the examples; the namespace-aware methods should be used instead as http://www.w3.org/TR/DOM-Level-3-Core/core.html#Namespaces-Considerations notes.
Resolution: Accepted
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ uses the .js file name extension in some of the examples; the extension is typically associated with JavaScript media types; as VoiceXML uses ECMAScript, the examples should use the .es extension that is typically associated with ECMAScript scripts as noted in
http://www.ietf.org/internet-drafts/draft-hoehrmann-script-types-03.txt
Resolution: Rejected
Email Trail:
From Bjoern Hoehrmann:
http://www.w3.org/TR/2005/CR-voicexml21-20050613/ defines a "read-only" subset of DOM Level 2 Core. The subset however is not defined to be read-only, for example, Attr.value is not read-only. Please change the document such that it is clear what implementations must do when setting such properties is attempted.
It seems they would raise a NO_MODIFICATION_ALLOWED_ERR exception, if that's the case it's unclear whether raising the exception is mandatory, even for implementations that support all of DOM Level 2 Core.
Resolution: Rejected
Email Trail:
From Bjoern Hoehrmann:
The organization of http://www.w3.org/TR/2005/CR-voicexml21-20050613/ as a set of extensions to VoiceXML 2.0 is problematic. Readers who are not familiar with VoiceXML 2.0 would have to consult two specifications to understand VoiceXML 2.1 and more experienced readers would have to know for each feature when it was introduced e.g. to point other people to the normative definition of an element.
It's also difficult to review the document, it is for example not always clear which VoiceXML 2.0 requirements also apply to 2.1 documents and implementations, and reviewers will likely fail to catch problems that are inherited from VoiceXML 2.0; for example, VoiceXML 2.0 references RFC 1521 and RFC 2396 normatively (both of which are obsolete now) but VoiceXML 2.1 does not seem to take that into account. The VoiceXML 2.0 errata is also empty.
This organization of a specification for a new version of a W3C technology also caused some concern for other drafts, for example, SMIL 2.1 and SVG 1.2 were originally organized in a similar way like VoiceXML 2.1 but the Working Groups reconsidered this approach and future versions of these documents will be complete specifications.
I think VoiceXML 2.1 should be a complete specification rather than a set of extensions.
Resolution: Deferred
Email Trail:
From David Scarratt:
The type 'RestrictedVariableName.datatype' in vxml-datatypes.xsd changed from "xsd:NMTOKEN" to "([a-zA-Z]|[a-zA-Z$][a-zA-Z0-9_$]*[a-zA-Z0-9_])". It seems intended to implement some of the additional constraints listed in the accompanying 'xsd:documentation' section; but since the VoiceXML 2.0 specification indicates that "the variable naming convention is as in ECMAScript", which allows Unicode letters in names, while the new pattern apparently restricts variable names to a subset of ASCII, it breaks backwards compatibility.
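The compatibility concern can be illustrated with a short sketch. It transcribes the pattern quoted above into an ECMAScript regular expression; the anchors are added because an XML Schema pattern must match the whole value, and the helper name and sample identifiers are hypothetical.

```javascript
// The pattern quoted in the comment, anchored to mimic XML Schema's
// whole-value matching. The helper name is illustrative only.
const restrictedPattern = /^(?:[a-zA-Z]|[a-zA-Z$][a-zA-Z0-9_$]*[a-zA-Z0-9_])$/;

function matchesRestrictedPattern(name) {
  return restrictedPattern.test(name);
}

console.log(matchesRestrictedPattern("count"));   // true
console.log(matchesRestrictedPattern("$total"));  // true
// A legal ECMAScript identifier that uses non-ASCII letters is rejected:
console.log(matchesRestrictedPattern("größe"));   // false
```

A document that validated against the earlier xsd:NMTOKEN type with a name such as the last one would fail validation against the new pattern, which is the regression the commentator describes.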
Resolution: Accepted
Email Trail:
From Andrew Hunt:
Section 6 is titled "Concatenating Prompts Dynamically Using <foreach>". Whilst the use of <foreach> for prompt generation is clearly one important use case, the ability to use <foreach> as executable content (with or without prompt content) provides more general programmatic use. Suggestion: a general title such as "Iteration with <foreach>" (or similar) to capture the broader utility of this new element.
Resolution: Accepted
Email Trail:
From Andrew Hunt:
The table in Section 6 states that "array" is "An ECMAScript expression that must evaluate to an array; otherwise, an error.semantic event is thrown."
There are several ways to achieve array-like behaviour in ECMAScript and thus clarity in the spec will avoid some likely interop issues. Perhaps the intent is that the object must evaluate to an "Array" instance or merely that the object be able to dereference array[0], array[1] etc.
Suggestion: The specification should explicitly state the precise ECMAScript implementation (e.g. "Array") and ideally provide a canonical means of determining whether an object is an Array object. Note that typeof for an Array is "object" and thus it is necessary to look at the constructor or prototype to verify Array-ness.
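The distinction drawn in this comment can be sketched as follows. The helper name is hypothetical and the check shown is only one possible test, not the one chosen by the Working Group; note also that instanceof can give a different answer for arrays created in another global context.

```javascript
// Hypothetical helper: tests for a genuine Array instance rather than
// any object that merely supports array[0], array[1], ...
function isArrayInstance(value) {
  return value instanceof Array;
}

const realArray = ["a", "b"];
const arrayLike = { 0: "a", 1: "b", length: 2 };  // dereferences like an array

console.log(typeof realArray);                 // "object"
console.log(typeof arrayLike);                 // "object"
console.log(isArrayInstance(realArray));       // true
console.log(isArrayInstance(arrayLike));       // false
console.log(realArray.constructor === Array);  // true
```

Both values answer arrayLike[0] and realArray[0] identically, so a specification that says only "must evaluate to an array" leaves open whether the second object should raise error.semantic.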
Resolution: Accepted
Email Trail:
From Andrew Hunt:
The table in Section 6 states that "item" is "The variable that stores each array item upon each iteration of the loop." The specification is mute on whether "store" is an "assignment by reference" where possible rather than "assignment by copy".
It should be noted that VoiceXML 2.0 is likewise mute on the same issue for <var> and <assign>.
However, the Implementation Report includes an example that requires "assignment by reference" in order to execute in the way implied by the test vector.
Suggestion: assignment by reference is fine but this behavior should be explicitly specified rather than implicit through an example in the IR.
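The difference between the two readings is observable whenever the array holds objects. The sketch below models one pass of a loop under each reading; both functions and the "visited" property are hypothetical and stand in for whatever the executable content inside <foreach> does to the item variable.

```javascript
// Hypothetical model of "assignment by reference": item is the same
// object as array[i], so changes made through item reach the array.
function visitByReference(array) {
  for (let i = 0; i < array.length; i++) {
    const item = array[i];
    item.visited = true;
  }
  return array;
}

// Hypothetical model of "assignment by copy": item is a shallow copy,
// so changes made through item are lost after each iteration.
function visitByCopy(array) {
  for (let i = 0; i < array.length; i++) {
    const item = Object.assign({}, array[i]);
    item.visited = true;
  }
  return array;
}

const byReference = visitByReference([{ name: "a" }, { name: "b" }]);
const byCopy = visitByCopy([{ name: "a" }, { name: "b" }]);

console.log(byReference[0].visited);  // true
console.log(byCopy[0].visited);       // undefined
```

For arrays of primitive values the two readings are indistinguishable, which may be why the question only surfaced through a test in the Implementation Report.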
Resolution: Accepted
Email Trail:
From Andrew Hunt:
The examples imply that array is assumed to have values set for indices of "0", "1", "2" and so on. The specification is mute on other legitimate ECMAScript array content. Are the following arrays semantic errors or acceptable but with undefined behavior?
Note from ECMA-262:
15.4 Array Objects
Array objects give special treatment to a certain class of property names. A property name P (in the form of a string value) is an _array index_ if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^{32}-1. (88)
ECMAScript provides for:
- Sparse arrays: e.g. array with indices 0,1,2,10 (note that length is 11 for this example, not 4)
- Non-zero index start: e.g. array with indices 3,4,5
- Array with no content
ECMAScript provides array-like behaviour for:
- Negative indices: e.g. array with indices -2,-1,0,1,2
- Non-integer indices: e.g. array with indices 0,1,2,3,"string",3.14
- Object indices: e.g. array with indices 0,1,2,3,ObjectX
Suggestion: document what we believe to be the intended spirit of the specification.
1. <foreach> iterates over content values for "array" starting with entry integer "0" and incrementing until an undefined entry is found.
2. If the array does not contain an entry with index "0" no content will be executed and no error is raised
3. All values that are not sequential integers starting from "0" will be ignored without error
Note that because length is the max-index+1, iterating for (i=0; i<length; i++) will encounter undefined values in sparse arrays.
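The three suggested rules and the closing note can be compared in a short sketch. Both functions are hypothetical models of an iteration strategy, not text from the specification; the first follows the suggested rules, the second follows the length-based loop mentioned in the note.

```javascript
// Hypothetical model of the suggested rules: start at index 0 and stop
// at the first undefined entry; everything else is ignored without error.
function iterateUntilUndefined(array) {
  const visited = [];
  for (let i = 0; array[i] !== undefined; i++) {
    visited.push(array[i]);
  }
  return visited;
}

// Hypothetical model of a length-based loop, which also visits the holes.
function iterateByLength(array) {
  const visited = [];
  for (let i = 0; i < array.length; i++) {
    visited.push(array[i]);
  }
  return visited;
}

const sparse = ["a", "b", "c"];
sparse[10] = "k";          // indices 0,1,2,10
const offset = [];
offset[3] = "x";           // indices 3,4: no entry at index 0
offset[4] = "y";

console.log(sparse.length);                       // 11
console.log(iterateUntilUndefined(sparse));       // [ 'a', 'b', 'c' ]
console.log(iterateByLength(sparse).length);      // 11
console.log(iterateUntilUndefined(offset));       // []
```

One consequence of the first strategy is that an entry explicitly set to undefined also ends the iteration, so the entries after it are never visited.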
Resolution: Accepted
Email Trail:
None.
From Andrew Hunt:
Around 14 months ago there was an issue regarding <foreach> raised by Teemu Tingander including an alternate proposal. Ref: http://lists.w3.org/Archives/Public/www-voice/2005AprJun/0064
We agree with the concern raised by Teemu that the current normative Schema allows arbitrary executable content within a <foreach> contained within a <prompt> and that this creates some content that is potentially bizarre. e.g. how should one treat a <reprompt> buried within a <foreach> within <prompt> within a <menu> - likewise a <disconnect>, <exit>, <return>...?
That said, we recognize that there is value in allowing <foreach> within <prompt>. We propose continued support for <foreach> within <prompt> but with the following strict limitation. We believe that this limitation (a) permits the flexible prompting functionality implied by the introduction of <foreach>, (b) does not restrict the full programmatic use of <foreach> when used as executable content (i.e. not within a <prompt> element) and (c) eliminates the side-effects of the current Schema.
1. As per CR, maintain current <foreach> functionality when used as executable content including when <foreach> contains prompt and/or executable content
2. Permit <prompt> element to contain <foreach> elements with the constraint that...
3. A <foreach> element within <prompt> may contain only normal prompt content (strictly, "allowed within sentence" content) and may not contain executable content. The allowed elements are CDATA, <break>, <value>, <emphasis>, <mark>, <phoneme>, <prosody>, <say-as>, <sub>, <voice>, <sentence/s>, <paragraph/p>
4. A <foreach> element within a <prompt> cannot contain <prompt> elements or executable content.
We believe that this approach is Schema-enforceable, maintains the spirit of the <foreach> functionality, and will improve consistency of implementation of the standard.
[2] <foreach> and implicit prompts
VoiceXML 2.0 Rec (4.1.2) allows implicit <prompts> where (a) there is no need to specify a prompt attribute (like bargein), and (b) The prompt consists entirely of PCDATA, <audio> and <value> elements.
Perhaps it is worth stating explicitly in 2.1 that <foreach> as <prompt> content requires explicit <prompt> markup. For example,
<block>
  <foreach item="i" array="myarray">
    <audio expr="i.wav"/>
  </foreach>
</block>
is OK, because <block> can contain <foreach> as executable content and the <audio> can be interpreted as <prompt><audio></prompt>, but
<field>
  <foreach item="i" array="myarray">
    <audio expr="i.wav"/>
  </foreach>
</field>
is not OK, because <field> can't contain executable content and <foreach> is not one of the elements that implies <prompt>.
Resolution: Accepted
Email Trail:
None.