<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>1314</bug_id>
          
          <creation_ts>2005-05-09 22:01:54 +0000</creation_ts>
          <short_desc>fn:distinct-values should not accept incomparable types</short_desc>
          <delta_ts>2005-09-29 10:46:52 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Functions and Operators 1.0</component>
          <version>Last Call drafts</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows XP</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>WONTFIX</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Don Chamberlin">chamberl</reporter>
          <assigned_to name="Ashok Malhotra">ashok.malhotra</assigned_to>
          
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>3559</commentid>
    <comment_count>0</comment_count>
    <who name="Don Chamberlin">chamberl</who>
    <bug_when>2005-05-09 22:01:55 +0000</bug_when>
    <thetext>The fn:distinct-values function (Section 15.1.6) eliminates duplicates from an 
atomized sequence, based on comparing values by the &quot;eq&quot; operator. However, it 
says &quot;Values that cannot be compared, i.e. the eq operator is not defined for 
their types, are considered to be distinct.&quot; This is problematic for the 
following reasons:

(1) If incomparable values were actually compared by the &quot;eq&quot; operator, an 
error would result (for example, 7 eq &quot;7&quot; raises error XPTY0004).

(2) An &quot;order by&quot; clause also raises an error (XPTY0004) if it encounters 
incomparable sort keys.

(3) The aggregation functions fn:avg, fn:min, fn:max, and fn:sum also raise an 
error (FORG0006) if they encounter incomparable values.

(4) Implementations of fn:distinct-values based on sorting or hashing are not 
possible under the current definition, because such implementations cannot 
handle heterogeneous input sequences.

In summary, the current specification of fn:distinct-values is inconsistent 
with the rest of the language and difficult to implement efficiently. The 
definition of fn:distinct-values should be made consistent with other functions 
and operators by raising an error if incomparable values are encountered. This 
will allow &quot;order by&quot; and fn:distinct-values to share a common efficient 
implementation.

Proposal: In the definition of fn:distinct-values, replace the second sentence 
with the following: &quot;If the input sequence contains any two values for which 
the eq operator is not defined, a type error is raised [err:FORG0006].&quot; Also 
add an example: fn:distinct-values((1, 2.3, &quot;Hello&quot;)) raises err:FORG0006.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>3562</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2005-05-09 22:32:07 +0000</bug_when>
    <thetext>I actually attempted to implement the previous specification of distinct-values,
when non-comparable values were considered an error, and I found it very
difficult to achieve; I found the current specification much easier to implement
(my implementation is based on hashing using a simple hash function based on
both the value and the type label). So let&apos;s base the argument on what&apos;s right
for users, not on implementation factors, which are likely to vary from one
implementor to another.

From a usability point of view, XML Schema supports union types, and the typed
value of a collection of nodes can therefore contain a mixture of different
atomic types. It seems to me a most unfriendly and unnecessary restriction to
tell users that they can&apos;t invoke distinct-values() on a collection whose schema
definition is a union type. 

Note also that although sorting in XQuery disallows mixed types, sorting and
grouping in XSLT do not, so the consistency argument works both ways.

Michael Kay</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>3215</commentid>
    <comment_count>2</comment_count>
    <who name="Ashok Malhotra">ashok.malhotra</who>
    <bug_when>2005-05-18 21:26:26 +0000</bug_when>
    <thetext>This was discussed during the joint WG meeting on 5/17/2005 and there was no
consensus to make this change.

Ashok Malhotra</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>