This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
http://www.w3.org/International/2005/02/xpath-review.html Comment ID: 4
I don't understand this comment (or at any rate, the example given). If the schema requires an element to be empty, then it can't contain a non-breaking space. The spec (both the language book and the data model) spells out the difference between string value and typed value very clearly. The concept might be difficult to grasp, but it's hard to see how adding more words of explanation will help. Michael Kay (personal response)
(In reply to comment #0)
> http://www.w3.org/International/2005/02/xpath-review.html Comment ID: 4

More words of explanation would not help, but an example similar to the one given might help. You could add that as a note.
(In reply to comment #2)
> (In reply to comment #0)
> > http://www.w3.org/International/2005/02/xpath-review.html Comment ID: 4
>
> more words of explanation would not help, but an example similar to the one
> given might help. You could add that as a note.

I think that there is a misunderstanding in the original comment, and it is a misunderstanding that involves both schema validation and our spec. Let me explain, and I think it will be clear why adding an example won't help.

Here's the original comment:

> The value comparison relies on atomization of the values; if these
> are nodes, the atomized value is returned as a typed value. You
> should make clear that this is quite different from the comparison
> of string values. This difference might be important for some i18n
> applications. Consider the following example:
>
> <myEl1>bla<myEl2>Š</myEl2></myEl1>
>
> if there is a schema which declares the type of myEl2 as empty,
> Š would not be part of the PSVI and the result of
>
> $myDoc/myEl1 eq "bla"
>
> would be true, otherwise it would be false.

Actually, it is not true that this comparison can ever return true, regardless of the schema (if any). There are two possibilities:

1. The element has a mixed type, is not schema validated, or schema validation was attempted but this element did not validate. In this case, the comparison can be made, the typed value is the string value of the node as an instance of xdt:untypedAtomic, and the typed value contains the Š character, so the comparison returns false. This is the case that you get with the schema you describe, since validation would fail (in XML Schema, a non-breaking space is not allowed in an empty element).

2. The element has a non-mixed type and schema validation succeeded. In this case, atomization fails, and an error is raised.

The reason an example doesn't really help is that most people either aren't thinking about how to sneak in non-breaking space in a way that might cause this problem, or else they have tried it with a schema validator and learned that it breaks validation. The example you ask for would primarily help those who have thought about this example but never determined what XML Schema would do with it.
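The first of the two cases described in this comment can be approximated outside an XPath engine. The following is only a minimal Python sketch, not an XPath implementation: it assumes that no schema validation took place, so the typed value of myEl1 is taken to be its string value (the concatenation of all descendant text nodes), and it uses the Š character exactly as it appears in the quoted example. The helper name string_value is hypothetical.

```python
import xml.etree.ElementTree as ET

# The instance from the original comment; no schema is applied here.
DOC = "<myEl1>bla<myEl2>\u0160</myEl2></myEl1>"


def string_value(element):
    """Concatenate all descendant text nodes, in document order.

    Assumption of this sketch: for an unvalidated element, this is also
    what atomization yields (as xdt:untypedAtomic). Schema-typed
    elements are not modelled at all.
    """
    return "".join(element.itertext())


root = ET.fromstring(DOC)
value = string_value(root)

# The text inside myEl2 is part of the value, so comparing it with
# "bla" cannot succeed.
print(value == "bla")
```

The second case (element-only content that validates) is not modelled here, since it ends in an atomization error rather than a comparison result.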
(In reply to comment #3)

Let's use a different document instance:

<person name="Dr.   No"/>

and the query:

string($myDoc/person/@name) eq "Dr. No"

and a schema which defines the type of @name as a simple type with a white space facet; see http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace

If the value of the facet is "collapse", the result of the query will be true. If there is no schema, it will be false.

Please correct me if this is a wrong example; I'm making it because white space is an important issue in i18n-related processing, and it is important to make the influence of typing on whitespace handling clear.
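The effect of the "collapse" facet on this example can be illustrated with a small Python sketch. It is not a schema validator: the function collapse is a hypothetical helper that applies the normalization the whiteSpace facet describes (tab, line feed and carriage return replaced by spaces, runs of spaces squeezed to one, leading and trailing spaces removed), and the parsed attribute value stands in for the unvalidated case.

```python
import re
import xml.etree.ElementTree as ET


def collapse(value):
    """Approximate XML Schema whiteSpace="collapse" normalization."""
    # "replace" step: tab, line feed and carriage return become spaces.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # "collapse" step: squeeze runs of spaces, then trim both ends.
    return re.sub(r" +", " ", replaced).strip(" ")


person = ET.fromstring('<person name="Dr.   No"/>')
raw = person.get("name")

print(raw == "Dr. No")            # no schema: the run of spaces is kept
print(collapse(raw) == "Dr. No")  # with collapse: a single space remains
```

Note that the facet only touches the characters #x9, #xA, #xD and #x20, so a non-breaking space would pass through this function unchanged.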
Yes, your example is now correct, and illustrates the point that you will indeed get different answers when testing the typed value of a node and when testing its string value. There are many other similar cases, for example if a node has type NMTOKENS, and has the string value "red green blue", then string(.)="red" will be false, while data(.)="red" will be true. I think you will find that XPath 2.0 tutorials, training courses, and textbooks spend some time explaining these concepts. The specification, of course, is not designed to be a tutorial. Michael Kay (personal response).
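The NMTOKENS case mentioned above can be sketched in Python as well. This is a rough model under stated assumptions: the typed value of an NMTOKENS node is represented as a list of whitespace-separated tokens, and general_eq is a hypothetical helper imitating the existential semantics of the XPath "=" operator (true if any item in the sequence equals the literal).

```python
string_val = "red green blue"

# Assumption: the node validates as NMTOKENS, so its typed value is a
# sequence of three tokens obtained by splitting on whitespace.
typed_val = string_val.split()


def general_eq(sequence, literal):
    """Imitate XPath's existential "=": true if any item equals literal."""
    return any(item == literal for item in sequence)


print(general_eq([string_val], "red"))  # models string(.) = "red"
print(general_eq(typed_val, "red"))     # models data(.) = "red"
```

The string value is a single item that does not equal "red", whereas the typed value contains "red" as one of its items, which is why the two tests disagree.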
(In reply to comment #5)

Of course the spec is not a tutorial, but there are many "notes" in specs. To my understanding, one purpose of the notes is to clarify concepts which are well defined even without the note. The clarification might be helpful to reach a wider audience, not only of users, but also of implementers of XML-aware i18n tools. But you are right, that is not the main purpose of the spec, so I'll leave it to you / the WG how you will deal with that comment.

Felix Sasaki (personal remark)