This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3224 - Inequality
Summary: Inequality
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: 1.1 only
Hardware: PC Windows XP
Importance: P4 minor
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: cluster: equality
Keywords: resolved
Depends on:
Blocks:
 
Reported: 2006-05-09 09:43 UTC by Michael Kay
Modified: 2008-02-29 20:29 UTC
0 users

Description Michael Kay 2006-05-09 09:43:20 UTC
QT approved comment:

In the second paragraph of 2.2, what does the operator "<>" mean? (Not
equal? Not comparable?) Is it the same as the "/=" operator used in the
second paragraph of the subsequent Note? Reading on, I see that the
operators are defined later. A forward reference to the definition would be
useful.

In 2.2.2 and 2.2.3, the term "unequal" is used, but I can't find a
definition. In particular, it's not clear whether non-comparable pairs of
values are considered unequal.
Comment 1 Dave Peterson 2006-05-09 18:04:44 UTC
(In reply to comment #0)

> In the second paragraph of 2.2,

I believe you mean 2.2.2.

>                  what does the operator "<>" mean? (Not
> equal? Not comparable?) Is it the same as the "/=" operator used in the
> second paragraph of the subsequent Note?

They are not quite the same, but in this case '<> NaN' should be replaced by
'not equal to itself'.

> Reading on, I see that the
> operators are defined later. A forward reference to the definition would be
> useful.

I'm not sure it's necessary for a note.  There is nothing special about the use
of '=' for equality; the final paragraph just confirms that the symbol means
what it usually means.

> In 2.2.2 and 2.2.3, the term "unequal" is used, but I can't find a
> definition. In particular, it's not clear whether non-comparable pairs of
> values are considered unequal.

In common usage, "unequal" means "not equal".  Do you really think this needs
elaboration?  I suppose the word could be eliminated everywhere, but is
that worth the effort?

Given that, "[Definition:]  Two values that are neither equal, less-than, nor
greater-than are incomparable" surely implies that incomparable values are
not equal, hence unequal.
Comment 2 Michael Kay 2006-05-09 19:17:09 UTC
>> In the second paragraph of 2.2,
>I believe you mean 2.2.2.

All my numbering, I'm afraid, was based on section numbers in the "diff" version of the document. It didn't occur to me that these might be different from the "top copy".

>> what does the operator "<>" mean? (Not
>> equal? Not comparable?) Is it the same as the "/=" operator used in the
>> second paragraph of the subsequent Note?

>They are not quite the same, but in this case '<> NaN' should be replaced by
>'not equal to itself'.

The real point here is editorial, I think. The operators /= (not equal) and <> (incomparable) are defined in 2.2.3, but used in 2.2.2 without explanation.

>In common usage, "unequal" means "not equal".  Do you really think this needs
>elaboration?

There are two possible meanings here: "comparable and not equal", or simply "not equal". It's not obvious to me from the context which meaning is intended. One can probably work out from the context which of these two meanings is intended, but in a section where two different operators /= and <> are introduced, it's scary to use a word "unequal" that isn't explicitly bound to either of them.

Perhaps I'm conditioned by XPath 2.0, where A=B has three possible outcomes: equal (true), not equal (false), and incomparable (error). But many of your readers will also be conditioned by XPath.

Comment 3 Dave Peterson 2006-09-18 01:45:20 UTC
(In reply to comment #2)
> >> In the second paragraph of 2.2,
> >I believe you mean 2.2.2.
> 
> All my numbering, I'm afraid, was based on section numbers in the "diff"
> version of the document. It didn't occur to me that these might be different
> from the "top copy".

Well, it's 2.2.2 in the diff version.

> >> what does the operator "<>" mean? (Not
> >> equal? Not comparable?) Is it the same as the "/=" operator used in the
> >> second paragraph of the subsequent Note?
> 
> >They are not quite the same, but in this case '<> NaN' should be replaced by
> >'not equal to itself'.
> 
> The real point here is editorial, I think. The operators /= (not equal) and <>
> (incomparable) are defined in 2.2.3, but used in 2.2.2 without explanation.

Well, "<>" does not belong in 2.2.2, and as I said, it needs to be removed.

> >In common usage, "unequal" means "not equal".  Do you really think this needs
> >elaboration?
> 
> There are two possible meanings here: "comparable and not equal", or simply
> "not equal". It's not obvious to me from the context which meaning is intended.

Having eliminated the spurious use of '<>' in this section, "comparability" shouldn't arise.

> One can probably work out from the context which of these two meanings is
> intended, but in a section where two different operators /= and <> are
> introduced, it's scary to use a word "unequal" that isn't explicitly bound to
> either of them.

Well, with '<>' gone from the section on equality and explained in the section on order, I hope we no longer have a problem.

> Perhaps I'm conditioned by XPath 2.0, where A=B has three possible outcomes:
> equal (true), not equal (false), and incomparable (error). But many of your
> readers will also be conditioned by XPath.

Having pointed out that '≠' (not equal) is just the negation of equality, I trust it's clear that equality is a binary relation.  Similarly I hope that the explanation of '<>' in 2.2.3 makes it clear that in no case do we expect a third "error" outcome.
Comment 4 C. M. Sperberg-McQueen 2006-09-22 13:14:49 UTC
Thank you for the comment; it's so easy to overlook these things that outside
eyes are extremely useful.

The best thing I can think of, to resolve this, is to replace paragraph
2 of section 2.2.2, which currently reads:

    On the other hand, equality need not cover the entire value space
    of the datatype (though it usually does). In particular, NaN <>
    NaN in the precisionDecimal, float, and double datatypes.

with: 

    On the other hand, equality need not cover the entire value space
    of the datatype (though it usually does). In particular, NaN is
    not equal to NaN in the precisionDecimal, float, and double
    datatypes.
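
As an illustration only (not part of the proposed wording): the same behaviour can be observed in any language whose floating-point type follows IEEE 754. The sketch below uses Python's built-in float as a stand-in for the XSD float and double datatypes; that correspondence is an assumption made for the example, and precisionDecimal is not modelled.

```python
import math

nan = float("nan")

# Equality is not reflexive at NaN: the value is not equal to itself.
print(nan == nan)            # False
print(nan != nan)            # True

# NaN is not ordered with respect to itself either.
print(nan < nan, nan > nan)  # False False

# A dedicated predicate is therefore needed to detect NaN.
print(math.isnan(nan))       # True
```

Every other float value in this sketch is equal to itself, which is the sense in which equality "usually" covers the whole value space.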

I have sketched out various alternative wordings that say explicitly
that NaN is not comparable to NaN, as well as being unequal, but 
have not been satisfied with them:  they have the same flaw you originally
pointed out (namely, they rely on a technical term defined later) and
I was not able to find wording that did not invite the question "and
why are you telling me this?"  

Of course, the reason for telling the reader that NaN is not comparable
to NaN is that one reader asked about it.  But re-reading your comments,
I wonder if talking about incomparability here would actually give the
correct insight:  incomparability, as the XSD 1.1 spec uses the term,
does not entail, or correspond to, an error in the comparison -- it's just
a relation on values, like other relations.  It is difficult, if not
impossible, to construct an example of comparisons across primitive
datatypes in a schema, and if one did succeed the spec does not prescribe
an error over the form of the comparison, only the result 'false'.

So with some regret I am proposing to leave the second part of your
comment unaddressed, although I believe on rereading the relevant parts of
the spec that the substantive question you raise is in fact answered by
the definition of comparability in section 2.2.3.

N.B. the Working Group has not yet taken action on this proposal.
Comment 5 Dave Peterson 2007-04-09 15:27:28 UTC
(In reply to comment #4)

> N.B. the Working Group has not yet taken action on this proposal.

The WG asked for a Note explaining the situation.  The following Note will be presented to the WG:

   Note:  This specification explicitly defines equality and "strict"
   order (less-than and greater-than) and by extension defines inequality,
   "weak order" (less-than-or-equal and greater than or equal), and
   incomparability (neither less-than, greater-than, nor equal).  It
   is up to the using application to decide if any attempt to compare
   any values will cause an error; as far as this specification's
   definitions are concerned, no errors result from comparisons--only
   true or false results.
Comment 6 David Ezell 2007-04-20 15:33:53 UTC
Dave P. suggested the following change from MSM's original suggestion:

On the other hand, equality need not cover the entire value space
of the datatype (though it usually does). In particular, NaN is
not equal to itself in the precisionDecimal, float, and double
datatypes.
Comment 7 David Ezell 2007-04-20 15:38:11 UTC
The editors have suggested the following resolution.

Introduce the following wording in section 2.2.2:

On the other hand, equality need not cover the entire value space
of the datatype (though it usually does). In particular, NaN is
not equal to itself in the precisionDecimal, float, and double
datatypes.

And put the following wording at the end of section 3:

Note:  This specification explicitly defines equality and "strict"
order (less-than and greater-than) and by extension defines inequality,
"weak order" (less-than-or-equal and greater than or equal), and
incomparability (neither less-than, greater-than, nor equal).  It
is up to the using application to decide if any attempt to compare
any values will cause an error; as far as this specification's
definitions are concerned, no errors result from comparisons--only
true or false results.
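
A minimal sketch of the division of labour the Note describes, using Python floats as an assumed stand-in for an XSD datatype. The relation `incomparable` only ever returns true or false; the function `checked_less_than` is a hypothetical application-level policy that chooses to treat incomparability as an error. Neither name comes from the specification.

```python
import math


def incomparable(a: float, b: float) -> bool:
    """Relation from the Note: neither equal, less-than, nor greater-than."""
    return not (a == b or a < b or a > b)


def checked_less_than(a: float, b: float) -> bool:
    """Hypothetical application policy: refuse to order incomparable values."""
    if incomparable(a, b):
        raise ValueError(f"cannot order {a!r} and {b!r}")
    return a < b


# At the level of the relations, a comparison simply yields True or False.
print(math.nan < 1.0)               # False
print(incomparable(math.nan, 1.0))  # True

# The application layer, not the relation, turns incomparability into an error.
try:
    checked_less_than(math.nan, 1.0)
except ValueError as exc:
    print("application-level error:", exc)
```

A different application could just as well return False in the last case; the sketch only shows that the choice lives outside the relations themselves.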
Comment 8 Michael Kay 2007-04-20 16:57:40 UTC
>no errors result from comparisons--only true or false results.

This might be true in a rather strict sense, but it seems a misleading statement. If you have a restriction of xs:integer, and define the maxInclusive value as "2007-04-20", you get an error. You can argue that this isn't an error that "results from a comparison" - rather, that the system refused to do a comparison; but in the ordinary sense of the word the error has occurred because 23 and 2007-04-20 cannot be compared (that is, they are incomparable).

In fact, you seem to be using "incomparable" to describe a quite different relationship: a relationship where two values *can* be compared, and the outcome of the comparison is a value akin to "unknown".

Furthermore, people are unlikely to read the phrase "as far as this specification is concerned" as excluding comparisons done within an assertion, where the XPath comparison semantics apply - and these can certainly result in errors.
Comment 9 Dave Peterson 2007-04-20 18:18:44 UTC
(In reply to comment #8)
> >no errors result from comparisons--only true or false results.
> 
> This might be true in a rather strict sense, but it seems a misleading
> statement. If you have a restriction of xs:integer, and define the maxInclusive
> value as "2007-04-20", you get an error. You can argue that this isn't an error
> that "results from a comparison" - rather, that the system refused to do a
> comparison; but in the ordinary sense of the word the error has occurred
> because 23 and 2007-04-20 cannot be compared (that is, they are incomparable).

Not that they cannot be compared.  In fact, they can be compared, and the answers are that 23 <> 2007-04-20 is true, and 23 = 2007-04-20, 23 < 2007-04-20, and 23 > 2007-04-20 are all three false.  Of the derived comparisons, 23 != 2007-04-20 is true, and 23 <= 2007-04-20 and 23 >= 2007-04-20 are false.

But when you try to create a {min/max}{In/Ex}clusive facet for a derived-from-integer datatype, it is illegal to use a non-integer value as its {value}.

> In fact, you seem to be using "incomparable" to describe a quite different
> relationship: a relationship where two values *can* be compared,

If I understand what you mean by "*can* be compared", any two values "can be compared".  Often the comparison returns false.  We are, I believe, quite clear that "incomparable" means simply "neither equal, nor less-than, nor greater-than".  If any of those relations returns true for a given pair, then the incomparable relation returns false; if all three return false, then incomparable returns true.  How can we make this clearer?

>                                                                  and the
> outcome of the comparison is a value akin to "unknown".

Surely not "unknown".  As illustrated above, for any two arguments to the relation, the value (true or false) returned by (i.e., the "outcome of") the comparison is certainly ascertainable, at least if the two arguments are members of (any) datatype value spaces.  I fail to see how you could describe the return value as "unknown" if you know what the two argument values are.

> Furthermore, people are unlikely to read the phrase "as far as this
> specification is concerned" as excluding comparisons done within an assertion,
> where the XPath comparison semantics apply - and these can certainly result in
> errors.

Now that is, at the very least, an interesting point.  I'm not the expert on Schema's use of XPath.  Someone else will have to tell me whether or not an XPath comparison that results in an error, as opposed to the comparison being false, can be legitimately used in Schema.  I don't pretend to know.  If that can arise, then we need a Note acknowledging it.  Can you open a new bug stating that problem and providing an example?
Comment 10 Dave Peterson 2007-04-20 18:24:46 UTC
(In reply to comment #9)

> Not that they cannot be compared.  In fact, they can be compared, and the
> answers are that 23 <> 2007-04-20 is true, and 23 = 2007-04-20, 23 <
> 2007-04-20, and 23 > 2007-04-20 are all three false.  Of the derived
> comparisons, 23 != 2007-04-20 is true, and 23 <= 2007-04-20 and 23 >=
> 2007-04-20 are false.

As I think about it, in fact "<>" is just as much a derived relation as are "!=", "<=", and ">=".  The other three are in some sense primitive and these four (and the negations of "<" and ">", which we do not suggest a notation for, and do not use) are defined in terms of the first three.  Admittedly other choices of three primitives from the nine are possible and the other six defined therefrom; this just happens to be a common choice and the one we've used to explain our use of all.
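
The derivation can be sketched in a few lines of Python. This is a toy model, not the specification's definition: it assumes that values of different Python types stand for values of different primitive datatypes, and that such values are never equal, less-than, or greater-than one another. All function names are invented for the example.

```python
from datetime import date


def same_primitive(a, b) -> bool:
    # Assumption of this sketch: Python type stands in for primitive datatype.
    return type(a) is type(b)


# The three relations taken as primitive.
def eq(a, b) -> bool:
    return same_primitive(a, b) and a == b


def lt(a, b) -> bool:
    return same_primitive(a, b) and a < b


def gt(a, b) -> bool:
    return same_primitive(a, b) and a > b


# The four derived relations, each defined only in terms of the primitives.
def ne(a, b) -> bool:
    return not eq(a, b)


def le(a, b) -> bool:
    return lt(a, b) or eq(a, b)


def ge(a, b) -> bool:
    return gt(a, b) or eq(a, b)


def incomparable(a, b) -> bool:
    return not (eq(a, b) or lt(a, b) or gt(a, b))


a, b = 23, date(2007, 4, 20)
for name, rel in [("=", eq), ("<", lt), (">", gt),
                  ("!=", ne), ("<=", le), (">=", ge), ("<>", incomparable)]:
    print(f"23 {name} 2007-04-20 is {rel(a, b)}")
```

Under these assumptions the loop reports true for "!=" and "<>" and false for the other five relations, matching the answers given in comment 9, and no comparison raises an error.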
Comment 11 David Ezell 2007-08-24 21:05:02 UTC
On the telcon:

Noah Mendelsohn expressed: I think we say two values that are incomparable test as neither greater, nor less than, nor equal. Whether an attempt to compare such incomparable values causes an error in some particular system that >uses< datatypes is up to the specification for that system. 

The working group agreed that the editors be allowed to introduce a note to this effect.  The WG will see this item on a consent agenda in the future (no further scheduled discussion.)
Comment 12 Dave Peterson 2007-09-25 02:10:02 UTC
Deleting keyword "decided"; it's not decided until the consent agenda containing the note to be added is approved.
Comment 13 C. M. Sperberg-McQueen 2008-02-27 15:26:15 UTC
A wording proposal intended to resolve this issue among others was sent to
the WG 27 February 2008:
http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.omni200802.xml
(member-only link).
Comment 14 C. M. Sperberg-McQueen 2008-02-29 19:40:20 UTC
At its telcon today, the XML Schema WG adopted the wording proposal at 
http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.omni200802.xml
(member-only link), and believes this issue now to be resolved.  

Since the originator is a WG member and was present on the call, it seems
likely that you agree that the issue is successfully resolved.  Still, 
please so indicate by changing the status of the bug report to CLOSED.
Or reopen it if you discover a snag.  Silence will be taken to mean consent.