This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 1336 - please make value comparison clearer
Summary: please make value comparison clearer
Status: CLOSED INVALID
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XPath 2.0
Version: Last Call drafts
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Don Chamberlin
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2005-05-11 03:11 UTC by Felix Sasaki
Modified: 2005-09-29 10:49 UTC

Description Felix Sasaki 2005-05-11 03:11:40 UTC
http://www.w3.org/International/2005/02/xpath-review.html Comment ID: 4
Comment 1 Michael Kay 2005-05-11 08:13:46 UTC
I don't understand this comment (or at any rate, the example given). If the
schema requires an element to be empty, then it can't contain a non-breaking space.

The spec (both the language book and the data model) spells out the difference
between string value and typed value very clearly. The concept might be
difficult to grasp, but it's hard to see how adding more words of explanation
will help.

Michael Kay (personal response)
Comment 2 Felix Sasaki 2005-05-18 18:48:22 UTC
(In reply to comment #0)
> http://www.w3.org/International/2005/02/xpath-review.html Comment ID: 4

More words of explanation would not help, but an example similar to the one
given might help. You could add that as a note.
Comment 3 Jonathan Robie 2005-05-19 17:30:27 UTC
(In reply to comment #2)

I think that there is a misunderstanding in the original comment, and it is a
misunderstanding that involves both schema validation and our spec. Let me
explain, and I think it will be clear why adding an example won't help.

Here's the original comment:

> The value comparison relies on atomization of the values; if these
> are nodes, the atomized value is returned as a typed value. You
> should make clear that this is quite different from the comparison
> of string values. This difference might be important for some i18n
> applications. Consider the following example:
>
>   <myEl1>bla<myEl2>&#x160;</myEl2></myEl1>
>
> if there is a schema which declares the type of myEl2 as empty,
> &#x160; would not be part of the PSVI and the result of
>
> $myDoc/myEl1 eq "bla"
>
> would be true, otherwise it would be false.

Actually, this comparison can never return true, regardless of the
schema (if any). There are two possibilities:

1. The element has a mixed type, is not schema validated, or schema
validation was attempted but this element did not validate. In this
case, the comparison can be made, the typed value is the string value
of the node as an instance of xdt:untypedAtomic, and the typed value
contains the &#x160; character, so the comparison returns false. This
is the case that you get with the schema you describe, since
validation would fail (in XML Schema, a non-breaking space is not
allowed in an empty element).

2. The element has a non-mixed type and schema validation
succeeded. In this case, atomization fails, and an error is raised.

The reason an example doesn't really help is that most people either aren't
thinking about how to sneak in non-breaking space in a way that might cause this
problem, or else they have tried it with a schema validator and learned that it
breaks validation. The example you ask for would primarily help those who have
thought about this example but never determined what XML Schema would do with it.

Comment 4 Felix Sasaki 2005-05-20 11:06:39 UTC
(In reply to comment #3)

Let's use a different document instance:

<person name="Dr.&#x20;&#x20;&#x20;No"/>

and the query:

string($myDoc/person/@name) eq "Dr. No"

and a schema which defines the type of @name as a simple type with a white space
facet, see
http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace

If the value of the facet is "collapse", the result of the query will be true.
If there is no schema, it will be false. Please correct me if this example is
wrong; I'm making it because white space is an important issue in i18n-related
processing, and it is important to make the influence of typing on whitespace
handling clear.
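
The effect described in this example can be modelled outside XPath. The
following is a minimal Python sketch, not an XPath or XML Schema processor: the
helper name collapse is hypothetical, and it only imitates what the whiteSpace
facet value "collapse" does to a lexical value (replace tab, line feed and
carriage return with a space, merge runs of spaces, trim both ends).

```python
import re


def collapse(value: str) -> str:
    """Imitate the XML Schema whiteSpace facet value "collapse" (hypothetical helper)."""
    # Step 1 ("replace"): tab, line feed and carriage return become spaces.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # Step 2 ("collapse"): merge runs of spaces, then trim leading/trailing spaces.
    return re.sub(r" +", " ", replaced).strip(" ")


# The attribute value as written in the instance: "Dr." + three spaces + "No".
raw = "Dr.\x20\x20\x20No"

# No schema: the value is compared as written.
without_schema = raw == "Dr. No"

# Schema type with whiteSpace="collapse": the normalized value is compared.
with_collapse = collapse(raw) == "Dr. No"

print(without_schema)  # False
print(with_collapse)   # True
```

Without the normalization step the comparison against "Dr. No" fails; with it,
the comparison succeeds, which is the difference the example is meant to show.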
Comment 5 Michael Kay 2005-05-20 13:50:46 UTC
Yes, your example is now correct, and illustrates the point that you will indeed
get different answers when testing the typed value of a node and when testing
its string value. There are many other similar cases, for example if a node has
type NMTOKENS, and has the string value "red green blue", then string(.)="red"
will be false, while data(.)="red" will be true. 
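
The NMTOKENS case can be modelled in the same spirit. This is a minimal Python
sketch under stated assumptions, not an XPath implementation: the names
split_nmtokens and general_eq are hypothetical, the list returned by
split_nmtokens stands in for the typed value of the node, and general_eq
imitates the "=" operator, which is true if any pair of items from the two
operands is equal.

```python
import re
from typing import List


def split_nmtokens(string_value: str) -> List[str]:
    """Stand-in for the typed value of an NMTOKENS node: one item per token."""
    return [tok for tok in re.split(r"[ \t\n\r]+", string_value) if tok]


def general_eq(left: List[str], right: List[str]) -> bool:
    """Imitate the general comparison "=": true if any pair of items is equal."""
    return any(l == r for l in left for r in right)


string_value = "red green blue"

# string(.) = "red": a single string is compared with "red".
string_test = general_eq([string_value], ["red"])

# data(.) = "red": each token of the typed value is compared with "red".
typed_test = general_eq(split_nmtokens(string_value), ["red"])

print(string_test)  # False
print(typed_test)   # True
```

The string value is one item, so it never equals "red"; the typed value is a
sequence of three items, one of which does.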

I think you will find that XPath 2.0 tutorials, training courses, and textbooks
spend some time explaining these concepts. The specification, of course, is not
designed to be a tutorial.

Michael Kay (personal response).

Comment 6 Felix Sasaki 2005-05-20 16:05:20 UTC
(In reply to comment #5)

Of course the spec is not a tutorial, but there are many "notes" in specs. To my
understanding one purpose of the notes is to clarify concepts which are well
defined even without the note. The clarification might be helpful to reach a
wider audience, not only of users, but also of implementers of XML-aware i18n
tools. But you are right, that is not the main purpose of the spec, so I'll
leave it to you / the WG how you will deal with that comment.

Felix Sasaki (personal remark)