This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Bug #791 (member-only) against the XSLT test suite points out that case-order on xsl:sort is not clearly described. The sum total of the description is: "The case-order attribute indicates whether the desired collation should sort upper-case letters before lower-case or vice versa. The effective value of the attribute must be either lower-first (indicating that lower-case letters precede upper-case letters in the collating sequence) or upper-first (indicating that upper-case letters precede lower-case)." As the bug report against the test suite points out, this could be read as indicating that case-order="lower-first" is supposed to mean that lower-case "z" precedes upper-case "A" in the collating sequence. I do not think this is the intent. The intent was stated (not especially well) by example in the XSLT 1.0 specification: For example, if lang="en", then A a B b are sorted with case-order="upper-first" and a A b B are sorted with case-order="lower-first".
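The two readings can be contrasted with a minimal Python sketch. The sort keys below are illustrative assumptions, not anything defined by the XSLT specification; they only show how the outcomes differ for single letters.

```python
letters = ["B", "a", "z", "b", "A"]

# Misreading of lower-first: every lower-case letter precedes every
# upper-case letter, so "z" sorts before "A".
misread_lower_first = sorted(letters, key=lambda c: (c.isupper(), c.lower()))

# Intended reading: the letter decides first, case only breaks ties.
intended_lower_first = sorted(letters, key=lambda c: (c.lower(), c.isupper()))
intended_upper_first = sorted(letters, key=lambda c: (c.lower(), c.islower()))

print(misread_lower_first)   # ['a', 'b', 'z', 'A', 'B']
print(intended_lower_first)  # ['a', 'A', 'b', 'B', 'z']
print(intended_upper_first)  # ['A', 'a', 'B', 'b', 'z']
```

The intended upper-first result reproduces the "A a B b" ordering from the XSLT 1.0 example, and the intended lower-first result reproduces "a A b B".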
Note that there are additional considerations that need to be taken into account when formulating a clear specification for the behaviour, such as: 1) Which case mappings are to be used? My guess is that the default Unicode legacy mappings would be appropriate (strings don't change length). In which case German S SHARP (I think that's the Unicode name) won't have any cased version. 2) What happens to title-cased letters?
In response to comment #1, I don't think we need to go into that level of detail. We describe case-order (like lang) as requesting use of a collation with certain characteristics, and we can describe this in terms of a property, for example that for every string S, compare(lower-case(S), upper-case(S), $coll) < 1, without prescribing every detail of the collation's behaviour. An example of a collation that has this property is one that sorts pole, Pole, polish, Polish.
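As a rough illustration of testing that property, here is a Python sketch. The names lower_first_key, compare and has_lower_first_property are hypothetical, and the toy collation is only an assumption that happens to sort pole, Pole, polish, Polish; a real collation would be much richer.

```python
from functools import cmp_to_key

def lower_first_key(s):
    # Toy collation: compare case-insensitively first, then break ties
    # position by position with lower-case (False) before upper-case (True).
    return (s.lower(), tuple(ch.isupper() for ch in s))

def compare(a, b):
    # Three-way comparison returning -1, 0 or 1, like XPath compare().
    ka, kb = lower_first_key(a), lower_first_key(b)
    return (ka > kb) - (ka < kb)

def codepoint_compare(a, b):
    # Plain code-point comparison, for contrast.
    return (a > b) - (a < b)

def has_lower_first_property(cmp, samples):
    # The property from the comment: compare(lower(S), upper(S)) < 1.
    return all(cmp(s.lower(), s.upper()) < 1 for s in samples)

words = ["Polish", "pole", "polish", "Pole"]
sorted_words = sorted(words, key=cmp_to_key(compare))
print(sorted_words)                                        # ['pole', 'Pole', 'polish', 'Polish']
print(has_lower_first_property(compare, words))            # True
print(has_lower_first_property(codepoint_compare, words))  # False
```

The code-point comparison fails the check because upper-case ASCII letters have lower code points than lower-case ones, so "POLE" sorts before "pole".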
In action A-2008-02-07-005 I was asked to produce text to fix this. The current text is: The case-order attribute indicates whether the desired collation should sort upper-case letters before lower-case or vice versa. The effective value of the attribute must be either lower-first (indicating that lower-case letters precede upper-case letters in the collating sequence) or upper-first (indicating that upper-case letters precede lower-case). Proposal: Add after the existing text. "When lower-first is requested, the returned collation SHOULD have the property that for any string S, lower-case(S) collates before upper-case(S); when upper-first is requested, the returned collation SHOULD have the property that for any string S, upper-case(S) collates before lower-case(S). When case of letters is a tertiary characteristic, as in the Unicode Collation Algorithm, choosing upper-first will have the effect that, for example, StAndrew collates after Stand but before Standrew."
The proposed text still seems to suggest (to me) that, for upper first, Z should collate before a and, for lower first, z should collate before A.
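This objection can be made concrete with a small Python sketch. The collation below is a hypothetical one, constructed only to show that the property "lower-case(S) collates before upper-case(S)" can hold while z still collates before A.

```python
def all_lower_first_key(s):
    # Hypothetical collation: at each position, any lower-case letter
    # sorts before any upper-case letter, whatever the letter is.
    return tuple((ch.isupper(), ch.lower()) for ch in s)

def compare(a, b):
    # Three-way comparison returning -1, 0 or 1.
    ka, kb = all_lower_first_key(a), all_lower_first_key(b)
    return (ka > kb) - (ka < kb)

samples = ["a", "z", "Stand", "StAndrew"]

# The property holds for every sample...
property_holds = all(compare(s.lower(), s.upper()) < 0 for s in samples)

# ...and yet lower-case "z" collates before upper-case "A".
z_before_A = compare("z", "A") < 0

print(property_holds, z_before_A)  # True True
```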
I propose to change the text: "The case-order attribute indicates whether the desired collation should sort upper-case letters before lower-case or vice versa. The effective value of the attribute must be either lower-first (indicating that lower-case letters precede upper-case letters in the collating sequence) or upper-first (indicating that upper-case letters precede lower-case)." to: "The case-order attribute indicates whether the desired collation should sort upper-case variants of a letter before their lower-case variants or vice versa. The effective value of the attribute must be either lower-first (indicating that the lower-case variant of a letter precedes the upper-case variant of the same letter in the collating sequence) or upper-first (indicating that the upper-case variant of a letter precedes the lower-case variant). If the letter has an additional title-case variant, then that should be treated as if it were an upper-case variant with respect to the lower-case variant."
In action A2008-03-13-003 I was asked to try again. Proposal: Add after the existing text. "When lower-first is requested, the returned collation SHOULD have the property that when two strings differ only in the case of one or more characters, then a string in which the first differing character is lower-case should precede a string in which the corresponding character is title-case, which should in turn precede a string in which the corresponding character is upper-case. When upper-first is requested, the returned collation SHOULD have the property that when two strings differ only in the case of one or more characters, then a string in which the first differing character is upper-case should precede a string in which the corresponding character is title-case, which should in turn precede a string in which the corresponding character is lower-case." For example, if lower-first is requested, then a sorted sequence might be "MacAndrew, macintosh, macIntosh, Macintosh, MacIntosh, macintoshes, Macintoshes, McIntosh". If upper-first is requested, the same sequence would sort as "MacAndrew, MacIntosh, Macintosh, macIntosh, macintosh, MacIntoshes, macintoshes, McIntosh"
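The rule can be approximated with a short Python sketch. It assumes a simplified collation (a case-folded primary comparison, then per-character case ranks), not the Unicode Collation Algorithm, and the names case_rank and sort_key are hypothetical.

```python
import unicodedata

def case_rank(ch, case_order):
    """Rank one character by case: 0 sorts first, 2 sorts last."""
    category = unicodedata.category(ch)
    if category == "Lt":   # title-case always sits between the other two
        return 1
    if category == "Ll":   # lower-case
        return 0 if case_order == "lower-first" else 2
    if category == "Lu":   # upper-case
        return 2 if case_order == "lower-first" else 0
    return 1               # uncased characters: any constant works here

def sort_key(s, case_order):
    # Primary: case-insensitive comparison. Tie-break: the first character
    # whose case differs decides, because tuples compare left to right.
    return (s.casefold(), tuple(case_rank(ch, case_order) for ch in s))

names = ["McIntosh", "Macintoshes", "macintosh", "MacIntosh",
         "MacAndrew", "macintoshes", "Macintosh", "macIntosh"]

lower_first = sorted(names, key=lambda s: sort_key(s, "lower-first"))
upper_first = sorted(names, key=lambda s: sort_key(s, "upper-first"))

print(lower_first)
# ['MacAndrew', 'macintosh', 'macIntosh', 'Macintosh', 'MacIntosh',
#  'macintoshes', 'Macintoshes', 'McIntosh']
print(upper_first)
# ['MacAndrew', 'MacIntosh', 'Macintosh', 'macIntosh', 'macintosh',
#  'Macintoshes', 'macintoshes', 'McIntosh']
```

Because the sketch sorts the same input list both times, the upper-first output contains "Macintoshes" where the example in the comment is written "MacIntoshes".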
It was agreed on 10 Jul 2008 to use the text in comment #6 but reinstating the XSLT 1.0 example (A a B b) for additional clarity.
Erratum E26 has been drafted. Colin, as the person who effectively raised this problem (as a bug against the test suite), I would be grateful if you would close it if you are satisfied.
Closed as requested.