<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>8651</bug_id>
          
          <creation_ts>2010-01-05 14:54:22 +0000</creation_ts>
          <short_desc>[SER] What does it mean to compare without consideration of case?</short_desc>
          <delta_ts>2010-06-29 13:54:42 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Serialization 1.0</component>
          <version>Recommendation</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows XP</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/#HTML_MARKUP</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Henry Zongaro">zongaro</reporter>
          <assigned_to name="Henry Zongaro">zongaro</assigned_to>
          
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>30533</commentid>
    <comment_count>0</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-01-05 14:54:22 +0000</bug_when>
    <thetext>In two places, the Serialization recommendation indicates that comparisons of strings of characters should be performed without regard to case.  In section 7.1,[1] the second paragraph following the numbered list begins, &quot;The HTML output method MUST recognize the names of HTML elements regardless of case.&quot;  In section 6.1.13, we have &quot;making the comparison without consideration of casing and leading/trailing spaces&quot; [2]

Two errata have also been issued that use a similar formulation.  Erratum SE.E5 [3] added the phrase &quot;making the comparison without consideration of case and leading or trailing spaces&quot; to section 7.4.13.  The yet-to-be-published erratum SE.E14 [4] adds the phrase &quot;if the value of the attribute node actually is equal to the name of the attribute without regard to case&quot; to section 7.2.

[1] http://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/#HTML_MARKUP
[2] http://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/#XHTML_INCLUDE-CONTENT-TYPE
[3] http://www.w3.org/XML/2007/qt-errata/xslt-xquery-serialization-errata.html#E5
[4] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7829</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>30536</commentid>
    <comment_count>1</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-01-05 15:25:23 +0000</bug_when>
    <thetext>I propose the following resolution to this problem:

In section 1.1,[5] add the definition:

. [Definition:  Where this specification indicates that two strings are to be &lt;b&gt;compared without regard to case&lt;/b&gt;, the serializer &lt;rfc2119&gt;MUST&lt;/rfc2119&gt; translate any characters in the range #x41 (LATIN CAPITAL LETTER A) to #x5A (LATIN CAPITAL LETTER Z), inclusive, to the corresponding lower-case letters in the range #x61 (LATIN SMALL LETTER A) to #x7A (LATIN SMALL LETTER Z) only for the purposes of making the comparison.  The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.]

In section 7.1,[1] change &quot;regardless of case&quot; to &quot;&lt;termref def=&quot;caseless-compare&quot;&gt;making the comparison without regard to case&lt;/termref&gt;&quot;.

In section 6.1.13,[2] change &quot;making the comparison without consideration of
casing and leading/trailing spaces&quot; to &quot;&lt;termref def=&quot;caseless-compare&quot;&gt;making the comparison without regard to case&lt;/termref&gt;, after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison.&quot;

In section 7.4.13 as modified by erratum SE.E5,[3] change &quot;making the comparison without consideration of case and leading or trailing spaces&quot; to &quot;&lt;termref def=&quot;caseless-compare&quot;&gt;making the comparison without regard to case&lt;/termref&gt;, after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison.&quot;

In section 7.2 as modified by erratum SE.E14,[4] change &quot;is equal to the name of
the attribute without regard to case&quot; to &quot;is equal to the name of the attribute, &lt;termref def=&quot;caseless-compare&quot;&gt;making the comparison without regard to case&lt;/termref&gt;.&quot;
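
As a rough illustration only (not part of the proposed erratum text), the comparison defined above, together with the strip-then-compare step used for the http-equiv value, might be sketched in Python as follows.  The function names are hypothetical, and I am assuming that "spaces" means #x20 characters only.

```python
# Hypothetical helper names; a sketch of the proposed definition, not normative text.

# Map #x41..#x5A (A..Z) to #x61..#x7A (a..z); every other character is left alone.
ASCII_LOWER = {cp: cp + 0x20 for cp in range(0x41, 0x5B)}


def fold_ascii(s):
    """Lower-case only the ASCII capital letters, solely for comparison."""
    return s.translate(ASCII_LOWER)


def equal_without_regard_to_case(a, b):
    """True if the folded strings have the same length and equal code points."""
    x, y = fold_ascii(a), fold_ascii(b)
    return len(x) == len(y) and all(ord(p) == ord(q) for p, q in zip(x, y))


def attribute_value_matches(value, expected):
    """Strip leading and trailing spaces (assumed: #x20 only), then compare."""
    return equal_without_regard_to_case(value.strip(' '), expected)
```

With these helpers, attribute_value_matches('  Content-Type ', 'content-type') succeeds, while the original attribute value is left unchanged for serialization.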

[5] http://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/#terminology</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>30678</commentid>
    <comment_count>2</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-01-06 16:49:48 +0000</bug_when>
    <thetext>It&apos;s probably clear from comment #1, but I believe the intent was that the case of a character is ignored only if the character is in the ASCII range.  So, for instance, #x131 (LATIN SMALL LETTER DOTLESS I) would ordinarily be treated as equal to #x49 (LATIN CAPITAL LETTER I) in a caseless string comparison, but an element named &amp;#305; should not be recognized as an HTML I element under the rules of section 7.1.

The most recent public draft of HTML 5.0 [6] defines the term &quot;ASCII case-insensitive&quot; to mean the same thing as the term &quot;compared without regard to case&quot; that I&apos;ve proposed.  That draft uses that term in defining Boolean attributes, in defining the permitted values of enumerated attributes (including http-equiv), and defines HTML tag names to use characters only in the ASCII range - all the places noted by this bug report.  There&apos;s no reason to believe that HTML 5.0 has placed additional constraints in these areas rather than simply clarified the rules.

[6] http://www.w3.org/TR/2009/WD-html5-20090825/infrastructure.html#case-sensitivity-and-string-comparison</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>31021</commentid>
    <comment_count>3</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-01-13 19:08:08 +0000</bug_when>
    <thetext>At the joint call of the XQuery and XSL Working Groups on 2010-01-12, the working groups adopted the proposal in comment #1.[7]  As not many XSL WG members were present, I will bring this back to the XSL Working Group for ratification.

[7] http://lists.w3.org/Archives/Member/w3c-xsl-query/2010Jan/0055.html (Member-only link to minutes of joint teleconference)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>33649</commentid>
    <comment_count>4</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-03-17 14:39:31 +0000</bug_when>
    <thetext>Bug was marked resolved/fixed by an unknown intruder.  Reopening.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>35951</commentid>
    <comment_count>5</comment_count>
    <who name="Henry Zongaro">zongaro</who>
    <bug_when>2010-06-03 21:12:22 +0000</bug_when>
    <thetext>At its teleconference of 3 June 2010,[8] the XSL Working Group ratified the
decision to adopt the proposal made in comment #1.  This will be Serialization
erratum SE.E17.

[8] http://lists.w3.org/Archives/Member/w3c-xsl-wg/2010Jun/0011.html
(Member-only link)</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>