This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 5290 - [XQuery] Unclear meaning of "collation" in order-by clause
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 1.0
Version: Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Don Chamberlin
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2007-11-27 18:38 UTC by Don Chamberlin
Modified: 2008-09-03 19:10 UTC

Description Don Chamberlin 2007-11-27 18:38:48 UTC
In XQuery Section 3.8.3 (Order By and Return clauses), the "greater-than" relationship between two ordering keys V and W, when a collation C is specified, is defined by fn:compare(V, W, C). The document does not specify that this definition applies only if the keys are strings. Consider the following case:

order by $salary collation "fr-ca" 

If the type of $salary is xs:decimal, an error will result because fn:compare() requires strings for its first two parameters, and xs:decimal is not promotable to xs:string.

If this is really the behavior we want, we should add a note that says this explicitly and designates the error code (XPST0017, No Such Function). If we think this is a confusing and unhelpful error code, we should invent a new code for this case. Alternatively, we could say that the collation applies only if the ordering keys are valid operands of fn:compare (strings or promotable to strings); otherwise the keys are compared using gt and the collation clause is ignored.
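
The two behaviors described above can be modeled roughly in Python. This is only a sketch under stated assumptions, not spec text: `fn_compare` and `greater_than_lenient` are hypothetical names, and `str.casefold` stands in for a real collation such as "fr-ca".

```python
from decimal import Decimal

def fn_compare(v, w, collation_key):
    """Hypothetical stand-in for fn:compare(V, W, C); accepts strings only."""
    if not (isinstance(v, str) and isinstance(w, str)):
        # Models the error that a non-string ordering key would trigger.
        raise TypeError("fn:compare requires string operands")
    a, b = collation_key(v), collation_key(w)
    return (a > b) - (a < b)  # -1, 0, or 1

def greater_than_lenient(w, v, collation_key=None):
    """Alternative behavior: use the collation only when both keys are strings."""
    if collation_key is not None and isinstance(v, str) and isinstance(w, str):
        return fn_compare(v, w, collation_key) < 0
    # Otherwise ignore the collation and fall back to plain "gt".
    return w > v

collation = str.casefold  # assumption: a case-insensitive stand-in collation
salaries_ordered = greater_than_lenient(Decimal("10"), Decimal("9"), collation)
```

With the strict reading, `fn_compare(Decimal("10"), Decimal("9"), collation)` raises; with the lenient reading, the decimal keys are simply compared numerically.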
Comment 1 Michael Kay 2007-11-27 22:12:22 UTC
It's worth pointing out that in functions like distinct-values, min, and max, the collation is ignored if not required. I think that should be the behaviour for order-by as well. (Indeed, I assumed it was; and making the change in this direction would definitely involve less disruption for users.)
Comment 2 Don Chamberlin 2008-01-08 19:49:29 UTC
The working group discussed this bug report on 8 Jan 2008. Two alternatives were discussed:

(1) Close the bug with no changes. In this case, specifying a collation for a non-string ordering-key is an error. It was observed that this behavior is inconsistent with max, min, distinct-values, and possibly a future group-by.

(2) Change XQuery Section 3.8.3 as follows:
When two orderspec values V and W are compared to determine their relative order in the ordering sequence, the "greater-than" relationship is defined as follows:
When the orderspec specifies "empty least", the following rules are applied in order:
(a) If V is an empty sequence and W is not an empty sequence, then W "greater-than" V is true.
(b) If V is NaN and W is neither NaN nor an empty sequence, then W "greater-than" V is true.
(c) If a specific collation C is specified, and V and W are both of type xs:string or are convertible to xs:string by subtype substitution and/or type promotion, and fn:compare(V, W, C) is less than zero, then W "greater-than" V is true.
(d) If W gt V is true, then W "greater-than" V is true.
(e) If none of the above rules apply, then W "greater-than" V is false.
Analogous changes apply when the orderspec specifies "empty greatest".

If adopted, these changes could be made effective immediately by an erratum to XQuery 1.0, or could be introduced in XQuery 1.1 as a relaxation of an error condition.

This bug report remains open pending discussion of these alternatives.
--Don Chamberlin (for the Query Working Group)
Comment 3 Don Chamberlin 2008-01-26 22:14:31 UTC
On 23 Jan 2008, the Query Working Group considered this bug report and decided to accept alternative (2) in Comment #2. This will have the effect that a specified collation is ignored if the ordering key is not convertible to the type xs:string. This change will be published in an erratum to XQuery 1.0.
--Don Chamberlin (for the Query Working Group)
Comment 4 Don Chamberlin 2008-09-03 19:10:23 UTC
Mike Kay has pointed out that the text in Comment #2 does not accurately reflect the intent of the working group that ordering is based on fn:compare() when a collation is specified and the ordering keys are convertible to strings. As stated in Comment #2, if fn:compare() returns a non-negative result (Rule c), the algorithm falls through to Rule d and applies the "gt" operator. This is a mistake.
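
The fall-through can be reproduced with a small Python model of rules (c) through (e) as worded in Comment #2. This is an illustrative sketch only: the function names are hypothetical, `str.casefold` stands in for a case-insensitive collation, Python's codepoint string ordering stands in for "gt" under a codepoint default collation, and the empty-sequence and NaN rules are omitted for brevity.

```python
def fn_compare(v, w, collation_key):
    """Hypothetical stand-in for fn:compare(V, W, C): returns -1, 0, or 1."""
    a, b = collation_key(v), collation_key(w)
    return (a > b) - (a < b)

def greater_than_comment2(w, v, collation_key=None):
    """Rules (c)-(e) from Comment #2, for non-empty, non-NaN keys."""
    both_strings = isinstance(v, str) and isinstance(w, str)
    # Rule (c): can only conclude "true"; any other outcome falls through.
    if collation_key is not None and both_strings:
        if fn_compare(v, w, collation_key) < 0:
            return True
    # Rule (d): plain "gt", which ignores the collation entirely.
    if w > v:
        return True
    # Rule (e): nothing applied.
    return False

collation = str.casefold  # assumption: case-insensitive stand-in collation
a_after_B = greater_than_comment2("a", "B", collation)
B_after_a = greater_than_comment2("B", "a", collation)
```

Under this model both `a_after_B` and `B_after_a` come out true: the collation places "B" after "a" via rule (c), while the codepoint comparison in rule (d) places "a" after "B", so the relation is not a consistent ordering.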

The corrected rules are as follows:

When the orderspec specifies empty least, the following rules are applied in order:
(1) If V is an empty sequence and W is not an empty sequence, then W greater-than V is true.
(2) If V is NaN and W is neither NaN nor an empty sequence, then W greater-than V is true.
(3) If a specific collation C is specified, and V and W are both of type xs:string or are convertible to xs:string by subtype substitution and/or type promotion, then:
If fn:compare(V, W, C) is less than zero, then W greater-than V is true; otherwise W greater-than V is false.
(4) If none of the above rules apply, then:
If W gt V is true, then W greater-than V is true; otherwise W greater-than V is false.

An analogous set of rules applies when the orderspec specifies empty greatest.
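
As a rough illustration, the corrected "empty least" rules can be modeled in Python as follows. Everything here is an assumption made for the sketch: the empty sequence is represented by `None`, NaN by a float NaN, "convertible to xs:string" by a plain `isinstance(..., str)` check, the collation by a key function such as `str.casefold`, and the names `greater_than` and `order_by` are hypothetical.

```python
import math
from functools import cmp_to_key

def _is_empty(x):
    return x is None  # assumption: None models the empty sequence

def _is_nan(x):
    return isinstance(x, float) and math.isnan(x)

def fn_compare(v, w, collation_key):
    """Hypothetical stand-in for fn:compare(V, W, C): returns -1, 0, or 1."""
    a, b = collation_key(v), collation_key(w)
    return (a > b) - (a < b)

def greater_than(w, v, collation_key=None):
    """Is W "greater-than" V under the corrected 'empty least' rules?"""
    if _is_empty(v) and not _is_empty(w):                    # rule (1)
        return True
    if _is_nan(v) and not _is_nan(w) and not _is_empty(w):   # rule (2)
        return True
    if collation_key is not None and isinstance(v, str) and isinstance(w, str):
        return fn_compare(v, w, collation_key) < 0           # rule (3): decides outright
    if _is_empty(v) or _is_empty(w):
        return False  # "gt" involving an empty sequence is never true
    return w > v                                             # rule (4)

def order_by(keys, collation_key=None):
    """Sort ascending using the greater-than relation above (stable)."""
    def cmp(a, b):
        if greater_than(a, b, collation_key):
            return 1
        if greater_than(b, a, collation_key):
            return -1
        return 0
    return sorted(keys, key=cmp_to_key(cmp))

strings_sorted = order_by(["b", "A", "a", "B"], str.casefold)
numbers_sorted = order_by([3.0, None, float("nan"), 1.0], str.casefold)
```

In this model the collation decides the order of string keys on its own, and it is ignored for the numeric keys, which sort with the empty sequence first and NaN next.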

Since this change simply corrects text that did not accurately reflect the decision of the working group, I am leaving this bug report in "Closed" status.

Don Chamberlin