<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>8606</bug_id>
          
          <creation_ts>2010-01-03 19:25:34 +0000</creation_ts>
          <short_desc>ambiguous ampersand does not include character references</short_desc>
          <delta_ts>2010-10-04 14:28:14 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML WG</product>
          <component>pre-LC1 HTML5 spec (editor: Ian Hickson)</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>NEEDSINFO</resolution>
          
          
          <bug_file_loc>http://dev.w3.org/html5/spec/Overview.html#character-references</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Don Brutzman">brutzman</reporter>
          <assigned_to name="Ian &apos;Hixie&apos; Hickson">ian</assigned_to>
          <cc>brutzman</cc>
    
    <cc>ian</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
          
          <qa_contact name="HTML WG Bugzilla archive list">public-html-bugzilla</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>30338</commentid>
    <comment_count>0</comment_count>
    <who name="Don Brutzman">brutzman</who>
    <bug_when>2010-01-03 19:25:34 +0000</bug_when>
    <thetext>9.1.4 Character references

Draft document sayeth:

        &quot;An ambiguous ampersand is a U+0026 AMPERSAND character (&amp;)
        that is followed by some text other than a space character,
        a U+003C LESS-THAN SIGN character (&lt;), or another
        U+0026 AMPERSAND character (&amp;).&quot;

probably should insert 2nd line as follows

        An ambiguous ampersand is a U+0026 AMPERSAND character (&amp;)
        that is not a valid character reference, and</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>30658</commentid>
    <comment_count>1</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2010-01-06 12:03:04 +0000</bug_when>
    <thetext>EDITOR&apos;S RESPONSE: This is an Editor&apos;s Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: An ambiguous ampersand is text. A character reference is not text. Therefore an ambiguous ampersand can never be a character reference.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>31218</commentid>
    <comment_count>2</comment_count>
    <who name="Don Brutzman">brutzman</who>
    <bug_when>2010-01-26 07:56:10 +0000</bug_when>
    <thetext>wellll, i guess if you strictly parse the linked definition of
&lt;a href=&quot;#syntax-text&quot; title=&quot;syntax-text&quot;&gt;text&lt;/a&gt;,
then the complete character reference only comprises a single
&lt;a href=&quot;#syntax-text&quot; title=&quot;syntax-text&quot;&gt;text&lt;/a&gt;
character.  nevertheless the individual characters that follow that initial ampersand within a character reference would otherwise be considered plain text if they weren&apos;t in that context.

due to overloaded terminology, this logic can get convoluted and doesn&apos;t seem immediately obvious to a reader trying to understand the definition.

re-reading the definition for ambiguous ampersand, it still seems to me to include the characters making up a character reference.

inserting the phrase &quot;that is not a valid character reference&quot; explicitly disambiguates such a possible misperception and reinforces the sense of the definition.  thus i again suggest inserting that phrase.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>31930</commentid>
    <comment_count>3</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2010-02-14 02:55:29 +0000</bug_when>
    <thetext>EDITOR&apos;S RESPONSE: This is an Editor&apos;s Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: I am completely at a loss as to what comment 2 is trying to say.

The concept of ambiguous ampersands is used to restrict what values &quot;text&quot; can have. Its purpose is to make it non-conforming to have an ampersand followed by something that would, when parsed, be confused for a character reference. As such, the only characters that are allowed after &amp; are space characters, &quot;&lt;&quot; characters, and other &quot;&amp;&quot; characters. All other characters, including all the characters that would form a character reference, are not allowed, and thus a &amp; followed by any such character (e.g. &quot;a&quot; or &quot;#&quot;) is an ambiguous ampersand.

If we were to _exclude_ characters that formed character references, then this would completely fail to achieve the stated goal. If &quot;&amp;&quot; followed by &quot;gt;&quot; was _not_ an ambiguous ampersand, then there&apos;d be no way to distinguish the text consisting of the four characters &quot;&amp;&quot;, &quot;g&quot;, &quot;t&quot;, &quot;;&quot; from a single character reference &quot;&amp;gt;&quot;, and yet both would be legal.
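
As a rough illustration only (not spec text), the rule described above could be sketched in Python as below. The function name, the set of space characters, and the treatment of an ampersand at the very end of the input are all assumptions made for this sketch. The input is assumed to be a run of &quot;text&quot; in which genuine character references have already been set aside. The ampersand and less-than sign are written as Unicode escapes.

```python
# Sketch only: flags ampersands that the draft definition quoted in this bug
# would call ambiguous. All names here are hypothetical, not from the spec.
AMP = "\u0026"  # U+0026 AMPERSAND
LT = "\u003C"   # U+003C LESS-THAN SIGN
# Assumption: the space characters are tab, LF, FF, CR, and space.
SPACE_CHARS = {"\t", "\n", "\f", "\r", " "}
ALLOWED_AFTER_AMP = SPACE_CHARS | {LT, AMP}


def ambiguous_ampersand_positions(text):
    """Return the indexes of ampersands followed by a disallowed character."""
    positions = []
    for i, ch in enumerate(text):
        if ch != AMP:
            continue
        nxt = text[i + 1:i + 2]  # empty string when the ampersand is last
        # Assumption: an ampersand with nothing after it is not ambiguous.
        if nxt and nxt not in ALLOWED_AFTER_AMP:
            positions.append(i)
    return positions
```

Under these assumptions, an ampersand followed by &quot;gt;&quot; is reported as ambiguous, while an ampersand followed by a space, a less-than sign, or another ampersand is not.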

This is why ambiguous ampersands are defined as they are.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>