<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>6020</bug_id>
          
          <creation_ts>2008-09-02 18:37:19 +0000</creation_ts>
          <short_desc>Validator incorrectly uses strings as elements</short_desc>
          <delta_ts>2008-09-04 17:49:56 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>Validator</product>
          <component>check</component>
          <version>HEAD</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>INVALID</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Kevin Hunter">hunteke</reporter>
          <assigned_to name="This bug has no owner yet - up for the taking">dave.null</assigned_to>
          <cc>karns.17</cc>
          <cc>ot</cc>
          
          <qa_contact name="qa-dev tracking">www-validator-cvs</qa_contact>

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>21742</commentid>
    <comment_count>0</comment_count>
      <attachid>566</attachid>
    <who name="Kevin Hunter">hunteke</who>
    <bug_when>2008-09-02 18:37:19 +0000</bug_when>
    <thetext>Created attachment 566
HTML 4.01 Strict compliant file

Basically, the parser (0.8.3, I think) is interpreting text inside a JavaScript string as tags.

This is best highlighted by the two attachments I&apos;ll add.  The first attachment is HTML 4.01 Strict compliant.

The second attachment is a unified diff that will break the parser.  The parser will assume that the strings on lines 9 and 10 begin elements, but this is incorrect.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21743</commentid>
    <comment_count>1</comment_count>
      <attachid>567</attachid>
    <who name="Kevin Hunter">hunteke</who>
    <bug_when>2008-09-02 18:39:24 +0000</bug_when>
    <thetext>Created attachment 567
patch against previous attachment to highlight parser bug

$ patch -o html_borked.html &lt; html_diff.diff

Basically, the parser will think that lines 9 and 10 begin a &apos;&lt;script&gt;&apos; tag, and also end a &apos;&lt;/scr&gt;&apos; tag.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21785</commentid>
    <comment_count>2</comment_count>
    <who name="Olivier Thereaux">ot</who>
    <bug_when>2008-09-04 15:21:07 +0000</bug_when>
    <thetext>(In reply to comment #0)
&gt; Basically, the parser (0.8.3 I think) is interpreting text inside of a
&gt; Javascript string as tags.

Which it should, per the specification.

See e.g. http://htmlhelp.com/tools/validator/problems.html#script

</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21787</commentid>
    <comment_count>3</comment_count>
    <who name="Kevin Hunter">hunteke</who>
    <bug_when>2008-09-04 15:46:19 +0000</bug_when>
    <thetext>Doh!  And there&apos;s even a link to an FAQ about script sections in the output.  Thanks.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21788</commentid>
    <comment_count>4</comment_count>
      <attachid>575</attachid>
    <who name="Jason">karns.17</who>
    <bug_when>2008-09-04 16:51:53 +0000</bug_when>
    <thetext>Created attachment 575
Example HTML</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21789</commentid>
    <comment_count>5</comment_count>
    <who name="Jason">karns.17</who>
    <bug_when>2008-09-04 16:57:59 +0000</bug_when>
    <thetext>Yes, however, it should not be semantically parsing HTML content in script
tags. In other words, as long as the content of the script tag is valid XML,
should it not be valid? For instance, in my attachment I receive two errors.
The first states that I have an invalid value for my &apos;id&apos; attribute, and the second
that the element &apos;li&apos; does not belong there.
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21790</commentid>
    <comment_count>6</comment_count>
    <who name="Olivier Thereaux">ot</who>
    <bug_when>2008-09-04 17:22:20 +0000</bug_when>
    <thetext>(In reply to comment #5)
&gt; Yes, however, it should not be semantically parsing HTML content in script
&gt; tags. 

Like it or not, that is what the specifications for (X)HTML say. e.g.
http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2.1
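
A minimal sketch of that rule for the HTML 4 case in comment #0, in Python. The names ETAGO, script_content_end and escape_etago are hypothetical and not part of the validator. It only assumes, per the appendix above, that CDATA script content ends at the first end-tag open delimiter, and that putting a backslash before the slash avoids this inside JavaScript strings.

```python
# Hypothetical helpers, not validator code. The less-than sign is written
# as the escape \x3c so the sample can sit inside XML text unescaped.
ETAGO = "\x3c/"  # the two characters that open an end tag


def script_content_end(content: str) -> int:
    """Return the index of the first ETAGO in the content, or -1 if none."""
    return content.find(ETAGO)


def escape_etago(content: str) -> str:
    """Insert a backslash before the slash of every ETAGO (B.3.2.1 workaround)."""
    return content.replace(ETAGO, "\x3c\\/")


# A document.write call shaped like the ones in the reported diff.
line = "document.write('\x3cscript src=\"a.js\">\x3c/scr', 'ipt>');"

print(script_content_end(line))                # position where the content is cut off
print(script_content_end(escape_etago(line)))  # -1: no ETAGO left after escaping
```

The document.write lines in attachment 567 contain the same two-character sequence, so splitting the word "script" across two strings does not help; only the sequence that opens the end tag matters.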

</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21791</commentid>
    <comment_count>7</comment_count>
      <attachid>576</attachid>
    <who name="Jason">karns.17</who>
    <bug_when>2008-09-04 17:30:13 +0000</bug_when>
    <thetext>Created attachment 576
XHTML with no ETAGO in script</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21792</commentid>
    <comment_count>8</comment_count>
    <who name="Jason">karns.17</who>
    <bug_when>2008-09-04 17:33:51 +0000</bug_when>
    <thetext>I&apos;ve added an attachment that has HTML content nested inside the script tag. Per the spec at http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2.1 there is no ETAGO (&quot;&lt;/...&quot;) which terminates the script tag.  Still, I am getting two validation errors, which are incorrect per the spec.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21793</commentid>
    <comment_count>9</comment_count>
    <who name="Olivier Thereaux">ot</who>
    <bug_when>2008-09-04 17:49:56 +0000</bug_when>
    <thetext>(In reply to comment #8)
&gt; I&apos;ve added an attachment that has html content nested inside the script tag.
&gt; Per the spec at http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2.1 there
&gt; is no ETAGO (&quot;&lt;/...&quot;) which terminates the script tag.  Still I am getting two
&gt; validation errors which are incorrect per the spec.

I&apos;m afraid not. Your example has markup (inside a &lt;script&gt;, but the point is, it does NOT matter to an HTML parser) including an id starting with a $ sign. That&apos;s not valid.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>566</attachid>
            <date>2008-09-02 18:37:19 +0000</date>
            <delta_ts>2008-09-02 18:37:19 +0000</delta_ts>
            <desc>HTML 4.01 Strict compliant file</desc>
            <filename>html_compliant.html</filename>
            <type>text/html</type>
            <size>305</size>
            <attacher name="Kevin Hunter">hunteke</attacher>
            
              <data encoding="base64">PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMDEvL0VOIiAiaHR0cDov
L3d3dy53My5vcmcvVFIvaHRtbDQvc3RyaWN0LmR0ZCI+CjxodG1sPgo8aGVhZD4KCTxtZXRhIGh0
dHAtZXF1aXY9J2NvbnRlbnQtdHlwZScgY29udGVudD0ndGVzdC9odG1sOyBjaGFyc2V0PVVURi04
Jz4KCgk8dGl0bGU+VGVzdDwvdGl0bGU+Cgk8c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCI+
Cgk8L3NjcmlwdD4KPC9oZWFkPgo8Ym9keT4KPHA+TG9yZW0gSXBzdW0geWFkYSB5YWRhIHlhZGEu
PC9wPgo8L2JvZHk+CjwvaHRtbD4=
</data>

          </attachment>
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>567</attachid>
            <date>2008-09-02 18:39:24 +0000</date>
            <delta_ts>2008-09-02 18:39:24 +0000</delta_ts>
            <desc>patch against previous attachment to highlight parser bug</desc>
            <filename>html_diff.diff</filename>
            <type>text/plain</type>
            <size>458</size>
            <attacher name="Kevin Hunter">hunteke</attacher>
            
              <data encoding="base64">LS0tIGh0bWxfY29tcGxpYW50Lmh0bWwJMjAwOC0wOS0wMiAxMzo1OTowOC4yMjU2NzAwNjggLTA0
MDAKKysrIGh0bWxfbm9fd29yay5odG1sCTIwMDgtMDktMDIgMTM6NTk6MjguNDIyNjY5NjYzIC0w
NDAwCkBAIC01LDYgKzUsMTAgQEAKIAogCTx0aXRsZT5UZXN0PC90aXRsZT4KIAk8c2NyaXB0IHR5
cGU9InRleHQvamF2YXNjcmlwdCI+CisJKGZ1bmN0aW9uKCkgeworCQlkb2N1bWVudC53cml0ZSgn
PHNjcmlwdCB0eXBlPSJ0ZXh0L2phdmFzY3JpcHQiIHNyYz0iJywgcCwgIi8iLCByb290X2xpYiwg
Jy90ZXN0LTEuanMiPjwvc2NyJywgJ2lwdD4nKTsKKwkJZG9jdW1lbnQud3JpdGUoJzxzY3JpcHQg
dHlwZT0idGV4dC9qYXZhc2NyaXB0IiBzcmM9IicsIHAsICIvIiwgcm9vdF9saWIsICcvdGVzdC0y
LmpzIj48L3NjcicsICdpcHQ+Jyk7CisJfSkoKTsKIAk8L3NjcmlwdD4KIDwvaGVhZD4KIDxib2R5
Pgo=
</data>

          </attachment>
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>575</attachid>
            <date>2008-09-04 16:51:53 +0000</date>
            <delta_ts>2008-09-04 16:51:53 +0000</delta_ts>
            <desc>Example HTML</desc>
            <filename>script.html</filename>
            <type>text/html</type>
            <size>446</size>
            <attacher name="Jason">karns.17</attacher>
            
              <data encoding="base64">PCFET0NUWVBFIGh0bWwgUFVCTElDICItLy9XM0MvL0RURCBYSFRNTCAxLjAgU3RyaWN0Ly9FTiIg
Imh0dHA6Ly93d3cudzMub3JnL1RSL3hodG1sMS9EVEQveGh0bWwxLXN0cmljdC5kdGQiPg0KPGh0
bWwgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGh0bWwiPg0KCTxoZWFkPg0KCQk8bWV0
YSBodHRwLWVxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRleHQvaHRtbDsgY2hhcnNldD11
dGYtOCIgLz4NCgkJPHRpdGxlPlVudGl0bGVkIERvY3VtZW50PC90aXRsZT4NCiAgICAgICAgPHNj
cmlwdCBpZD0idGVtcGxhdGVfbmFycmF0aXZlIiB0eXBlPSJ0ZXh0L2h0bWwiPg0KICAgICAgICAg
ICAgPGxpIGlkPSIkez1zZWN0aW9ufS1uYXJyLXEkez1xdWVzdGlvbn0iPjwvbGk+DQoJICAgIDwv
c2NyaXB0Pg0KCTwvaGVhZD4NCgk8Ym9keT4NCgk8L2JvZHk+DQo8L2h0bWw+DQo=
</data>

          </attachment>
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>576</attachid>
            <date>2008-09-04 17:30:13 +0000</date>
            <delta_ts>2008-09-04 17:30:13 +0000</delta_ts>
            <desc>XHTML with no ETAGO in script</desc>
            <filename>script.html</filename>
            <type>text/html</type>
            <size>443</size>
            <attacher name="Jason">karns.17</attacher>
            
              <data encoding="base64">PCFET0NUWVBFIGh0bWwgUFVCTElDICItLy9XM0MvL0RURCBYSFRNTCAxLjAgU3RyaWN0Ly9FTiIg
Imh0dHA6Ly93d3cudzMub3JnL1RSL3hodG1sMS9EVEQveGh0bWwxLXN0cmljdC5kdGQiPg0KPGh0
bWwgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGh0bWwiPg0KCTxoZWFkPg0KCQk8bWV0
YSBodHRwLWVxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRleHQvaHRtbDsgY2hhcnNldD11
dGYtOCIgLz4NCgkJPHRpdGxlPlVudGl0bGVkIERvY3VtZW50PC90aXRsZT4NCiAgICAgICAgPHNj
cmlwdCBpZD0idGVtcGxhdGVfbmFycmF0aXZlIiB0eXBlPSJ0ZXh0L2h0bWwiPg0KICAgICAgICAg
ICAgPGxpIGlkPSIkez1zZWN0aW9ufS1uYXJyLXEkez1xdWVzdGlvbn0iIC8+DQoJICAgIDwvc2Ny
aXB0Pg0KCTwvaGVhZD4NCgk8Ym9keT4NCgk8L2JvZHk+DQo8L2h0bWw+DQo=
</data>

          </attachment>
      

    </bug>

</bugzilla>