This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 8852 - HTML4 validator doesn't accept <![CDATA[ <tag></tag> ]]> inside the <script> element
Status: RESOLVED INVALID
Alias: None
Product: Validator
Classification: Unclassified
Component: Parser
Version: HEAD
Hardware: All All
Importance: P1 major
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-01 04:48 UTC by Leif Halvard Silli
Modified: 2010-03-21 15:03 UTC


Description Leif Halvard Silli 2010-02-01 04:48:28 UTC
The HTML4 validator doesn't accept the following <script> example as valid HTML4 (while Validator.nu accepts it as valid HTML5 ...):

<script type="text/javascript"><![CDATA[
    document.write("<aa><bb></bb></aa>");
]]></script>

In comparison, the following *does* validate as HTML4 (while Validator.nu *doesn't* accept it as HTML5):

<p><![CDATA[
<aa><bb></bb></aa>
]]></p>

Both code examples should be accepted as valid HTML4!

According to the spec ( http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.5 ), HTML4 does include support for CDATA marked sections. Marked CDATA sections permit authors to include markup characters without escaping them: <![CDATA[  <tag>marked section</tag>  ]]>.

If one wants to create polyglot JavaScript scripts for use in both HTML4 and XHTML documents, it is crucial that the validator gives correct information about the validity of <![CDATA[ ... ]]>, since <![CDATA[ ... ]]> sections are needed in order to embed scripts in XHTML. The current validator bug makes it seem unnecessarily difficult to create "polyglot scripts".

Some background to explain a possible counter argument:

Section 18.2.4 of HTML4 (http://www.w3.org/TR/html4/interact/scripts.html#h-18.2.4) gives a scripting example where the end tag "</b>" inside the script element has been escaped with the backslash character: "<\/b>". This is done so that the code doesn't break the SGML parsing rules, under which the first occurrence of "</" inside the script element is treated as ending the element's content.
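
A minimal JavaScript sketch of the escaping that section 18.2.4 illustrates. The helper name escapeEndTagOpen is hypothetical and comes from no specification; it assumes every "</" in the input sits inside a string literal, where "\/" evaluates to a plain "/".

```javascript
// Hypothetical helper (an assumption, not from the HTML4 spec): rewrite
// JavaScript source text so the sequence "</" never appears in it literally.
// Only safe when every "</" is inside a string literal, where the escape
// "\/" evaluates to a plain "/".
function escapeEndTagOpen(source) {
  return source.split("</").join("<\\/");
}

// Source text as an author might write it (contains "</" twice).
const unsafe = 'document.write("<aa><bb></bb></aa>");';
const safe = escapeEndTagOpen(unsafe);

// The rewritten source no longer contains "</".
console.log(safe);

// In a string literal "\/" is just "/", so the runtime value is unchanged.
console.log("<\/b>" === "</b>");
```

The rewritten source contains "<\/bb>" and "<\/aa>" instead of "</bb>" and "</aa>", while the string passed to document.write at run time stays the same.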

The code example in section 18.2.4 is supposed to exemplify what is said in the preceding text:

]]
HTML documents are constrained to conform to the HTML DTD both before and after processing any SCRIPT elements.
[[


However, nowhere is it stated that one cannot use the <![CDATA[ ... ]]> construct in HTML4 documents! In XHTML it is customary to precede the start and end delimiters of the CDATA section with a JavaScript comment marker ("//"), so that the code is acceptable both as XHTML and as JavaScript. For example:

<script type="text/javascript">//<![CDATA[
    document.write("<aa><bb></bb></aa>");
//]]></script>

And this should clearly not be flagged as invalid HTML4 either! (Side note: the script doesn't appear to run in Internet Explorer (version 6 at least), nor in WebKit, if the first content of the script element is "<![CDATA[ ", so there are several reasons to do this escaping.)
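
The commented-out CDATA pattern above can be generated mechanically. The following is only a sketch: the function name wrapInCommentedCdata is hypothetical, and the check for "]]>" reflects the fact that a CDATA section ends at the first "]]>" it contains. It says nothing about whether the result validates as HTML4.

```javascript
// Hypothetical helper (an assumption, not part of any spec or library):
// wrap a script body in CDATA delimiters hidden behind JavaScript line
// comments, reproducing the pattern from the example above.
function wrapInCommentedCdata(scriptBody) {
  // A CDATA section ends at the first "]]>", so the body must not contain it.
  if (scriptBody.includes("]]>")) {
    throw new Error('script body must not contain "]]>"');
  }
  return "//<![CDATA[\n" + scriptBody + "\n//]]>";
}

const wrapped = wrapInCommentedCdata('    document.write("<aa><bb></bb></aa>");');
console.log(wrapped);
```

The returned text is what would go between the <script> start and end tags.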
Comment 1 Leif Halvard Silli 2010-03-21 15:03:18 UTC
I realize that this bug, after all, was invalid. Sorry.