Bugzilla – Bug 14820
Implementation-defined features checklist
Last modified: 2012-01-06 23:45:29 UTC
In the (normative) appendix E of Part 1 listing implementation-defined features, the first entry reads:
For the datatypes defined by [XML Schema: Datatypes] which depend on [XML 1.1] or [XML Namespaces 1.1], it is
One of the changes in XML 1.1 was the introduction of Unicode NEL into the S production, so that whitespace is not in fact the same in XML 1.1; this was one of the reasons cited by vendors for not adopting 1.1.
The other significant remaining differences between 1.1 and 1.0 5e are
(1) handling of the version declaration;
(2) C0 control characters allowed in 1.1 and not in 1.0 (but must be escaped).
(character zero is still not allowed, however)
(3) C1 control characters must be escaped (to try to detect character encoding
The purpose of XML 1.0 5e was to bring the spec up to date with Unicode and to fix an error related to the version declaration, so 5e did not affect whitespace or C0/C1 control characters.
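As a rough illustration of points (2) and (3), the character classes can be sketched as follows. The ranges are transcribed from the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1; the function names are hypothetical and not part of either spec.

```python
def is_char_xml10(cp: int) -> bool:
    """XML 1.0 Char: only TAB, LF and CR are allowed below #x20."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_char_xml11(cp: int) -> bool:
    """XML 1.1 Char: everything from #x1 up, so character zero stays excluded."""
    return (0x1 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_restricted_char_xml11(cp: int) -> bool:
    """XML 1.1 RestrictedChar: may appear only as a character reference."""
    return (0x1 <= cp <= 0x8
            or 0xB <= cp <= 0xC
            or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84
            or 0x86 <= cp <= 0x9F)
```

Under this sketch, #x1 is rejected by the 1.0 check but accepted (escaped only) by the 1.1 checks, while #x80 is a plain character in 1.0 and a restricted one in 1.1.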
Hope this helps.
>One of the changes in XML 1.1 was the introduction of Unicode NEL into the S production, so that whitespace is not in fact the same in XML 1.1
This is a surprisingly common belief even among experts. Perhaps it was true in some draft of XML 1.1. In the actual Rec, NEL is a line ending character but it is not a whitespace character (it does not appear in the S production). The set of line ending characters has changed, but XSD does not reference these. The set of whitespace characters has not changed.
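The distinction can be written down as plain sets. This is only a sketch: the constant names are made up for illustration, and the "line ending" sets simply collect the characters that take part in end-of-line handling in each version of XML.

```python
# Characters matched by the S production; the same in XML 1.0 and XML 1.1.
S_CHARS = frozenset({0x20, 0x9, 0xD, 0xA})

# Characters involved in end-of-line handling (translated to #xA on input).
LINE_END_CHARS_XML10 = frozenset({0xD, 0xA})
LINE_END_CHARS_XML11 = frozenset({0xD, 0xA, 0x85, 0x2028})

NEL = 0x85  # Unicode NEXT LINE

# NEL is a line ending character in 1.1, but never a member of S.
nel_is_whitespace = NEL in S_CHARS
nel_is_line_end_11 = NEL in LINE_END_CHARS_XML11
nel_is_line_end_10 = NEL in LINE_END_CHARS_XML10
```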
>The other significant remaining differences between 1.1 and 1.0 5e are...
I don't believe any of these affect XSD, except perhaps the definition of xs:string:
Well, I see what you mean. It's comparable to the difference between lexical and value space in XSD: in XML 1.1, where the production S is used, a Unicode NEL (#x85) can appear, whereas it would be an error in a 1.0 document. This is not done editorially by a direct change to the S production, but by a change to the end-of-line handling. But it is still a change to the whitespace rules, and it indirectly affects "S".
The difference starts to matter if other languages (e.g. XPath, XQuery) try to layer on top of XML to use "S" without including the line ending rules. However, I don't _think_ it matters for XSD, because an XML 1.1 parser would have been used to consume the input, and would have normalised NEL to #xA already.
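A minimal sketch of that layering issue, assuming the XML 1.1 end-of-line rules (#xD #xA, #xD #x85, #x85, #x2028 and a lone #xD all become #xA). The helper names are hypothetical; a real XML 1.1 parser performs this step internally before anything reaches a validator.

```python
import re

# XML 1.1 line-end sequences; two-character forms are listed first so they win.
XML11_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")

# The S production: one or more of space, TAB, CR, LF.
S_ONLY = re.compile(r"[\x20\x09\x0D\x0A]+")


def normalize_xml11_line_ends(text: str) -> str:
    """Translate every XML 1.1 line-end sequence to a single #xA."""
    return XML11_LINE_END.sub("\n", text)


def matches_s(text: str) -> bool:
    """True if the whole string matches S, with no line-end handling applied."""
    return S_ONLY.fullmatch(text) is not None


raw = " \x85 "  # space, NEL, space
print(matches_s(raw))                             # S applied to raw input
print(matches_s(normalize_xml11_line_ends(raw)))  # S applied after normalization
```

A language that reuses S on raw input rejects the NEL, whereas anything that sits behind an XML 1.1 parser only ever sees the normalized #xA.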
So I wouldn't want to leave unchallenged the assertion that there's no change in whitespace rules between 1.0 and 1.1.
For xs:string, it might be worth adding a note, I agree.
(1) accept MK's change to the wording (possibly modulo xsd:string).
(from the original comment)
XSD 1.1 depends on other specifications for the definitions of some data types
such as xs:NCName and the definition of whitespace. Although all current
editions of [XML 1.0], [XML 1.1], [Namespaces in XML 1.0], and [XML Namespaces
1.1] have identical definitions of these constructs, future editions could
introduce changes. Implementations may support either the definitions in the
current specifications, or definitions in future specifications that supersede
them, or both.
[this wording leaves out a crucial bit concerning xs:string.]
(2) Add a Note to say that the mapping from the XML character sequence to the infoset which serves as input to the validator is, strictly speaking, not constrained by this spec, so either the 1.0 mapping or the 1.1 mapping can be used by a conforming processor.
A wording proposal adapted from MKay's original suggestion, with what I hope are minor changes to account for the difference in string, is at
The wording proposal mentioned in comment 5 was adopted by the XML Schema WG at its telcon today, and the change has been integrated into the status-quo version of the specification.
Michael, if you would formally close the issue, I'd be grateful. Thanks.