<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>13604</bug_id>
          
          <creation_ts>2011-08-03 12:52:36 +0000</creation_ts>
          <short_desc>CDATA sections are not allowed except in foreign content</short_desc>
          <delta_ts>2013-11-02 11:12:45 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML WG</product>
          <component>LC1 HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://dev.w3.org/html5/html-polyglot/html-polyglot.html#named-entity-references</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>23593</dependson>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Philippe Le Hegaret">plh</reporter>
          <assigned_to name="Leif Halvard Silli">xn--mlform-iua</assigned_to>
          <cc>davidc</cc>
    
    <cc>eliotgra</cc>
    
    <cc>hsivonen</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>xn--mlform-iua</cc>
          
          <qa_contact name="HTML WG Bugzilla archive list">public-html-bugzilla</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>52156</commentid>
    <comment_count>0</comment_count>
    <who name="Philippe Le Hegaret">plh</who>
    <bug_when>2011-08-03 12:52:36 +0000</bug_when>
    <thetext>The HTML5 spec is clear that &quot;CDATA sections can only be used in foreign content (MathML or SVG).&quot; [1].

However, the polyglot spec is mostly silent on them. Tidy generates CDATA sections for inline style and script when it outputs XHTML. It would be good to state explicitly that those CDATA markers must not be used.

[1] http://www.w3.org/TR/html5/syntax.html#cdata-sections</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>52232</commentid>
    <comment_count>1</comment_count>
    <who name="David Carlisle">davidc</who>
    <bug_when>2011-08-03 20:20:24 +0000</bug_when>
    <thetext>(In reply to comment #0)
&gt; The HTML5 spec is clear that &quot;CDATA sections can only be used in foreign
&gt; content (MathML or SVG).&quot; [1].
&gt; 
&gt; However, the polyglot spec is mostly silent on those. Tidy generates CDATA
&gt; sections for inline style and script when it outputs XHTML. It would be good to
&gt; make it that those CDATA markers must not be used.
&gt; 
&gt; [1] http://www.w3.org/TR/html5/syntax.html#cdata-sections

The //&lt;![CDATA[ markup doesn&apos;t generate a CDATA section if used in an HTML script element (as &lt; doesn&apos;t start markup in that context), so that usage doesn&apos;t contradict the fact that HTML doesn&apos;t allow CDATA sections except in foreign content.

Like using line breaks in SVG attributes, this usage will cause a difference between the HTML and XML DOMs, but the difference is largely cosmetic: it is just whether the first line of the script, which contains a JavaScript comment, has an empty comment (XML parsing) or a comment with the characters &lt;![CDATA[ (HTML parsing).

David</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>52238</commentid>
    <comment_count>2</comment_count>
    <who name="Philippe Le Hegaret">plh</who>
    <bug_when>2011-08-03 20:34:35 +0000</bug_when>
    <thetext>I believe you&apos;re correct, but it would be nice to clarify section 9.2 In-line Script and Style then. At the moment, it says that polyglot must use safe content; it doesn&apos;t mention anything at all about CDATA sections.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>52245</commentid>
    <comment_count>3</comment_count>
    <who name="David Carlisle">davidc</who>
    <bug_when>2011-08-03 20:43:52 +0000</bug_when>
    <thetext>(In reply to comment #2)
&gt; I believe you&apos;re correct, but it would nice to clarify section 9.2 In-line
&gt; Script and Style then. At the moment, it says that polyglot must use safe
&gt; content, it doesn&apos;t mention anything at all about CDATA sections.

That&apos;s consistent with the aim, as currently expressed, that polyglot markup produces identical DOMs. For many purposes that is a stricter requirement than necessary.

There&apos;s nothing wrong with using the 
//&lt;![CDATA[
idiom, but unless the polyglot spec weakens its aims to a &quot;compatible&quot; DOM, for some definition of compatible, it is right to say that the script should not contain a &lt; (so it can&apos;t contain &lt;![CDATA[, and there is no need to say anything further about CDATA sections).

Specifying the requirements for identical DOMs is probably the right thing for the spec to do; the conditions under which non-identical DOMs are OK are probably harder to specify in any generic way.

David</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>52255</commentid>
    <comment_count>4</comment_count>
    <who name="Philippe Le Hegaret">plh</who>
    <bug_when>2011-08-03 21:17:27 +0000</bug_when>
    <thetext>(In reply to comment #3)
&gt; there&apos;s nothing wrong with using the 
&gt; //&lt;!CDATA[
&gt; idiom, but unless the polyglot spec weakens it&apos;s aims to &quot;compatible&quot; DOM for
&gt; some definition of compatible then it is right to say that the script should
&gt; not contain a &lt; (so it can&apos;t contain &lt;![CDATA, so there is no need to say
&gt; anything further about CDATA sections.).

Well, I still believe that being explicit would help the authors out there. There is nothing wrong with using CDATA in XHTML but, in polyglot markup, CDATA sections shouldn&apos;t be used since they won&apos;t produce identical DOMs.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>53180</commentid>
    <comment_count>5</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2011-08-04 05:07:22 +0000</bug_when>
    <thetext>mass-move component to LC1</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>53210</commentid>
    <comment_count>6</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2011-08-04 05:07:41 +0000</bug_when>
    <thetext>mass-move component to LC1</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>54156</commentid>
    <comment_count>7</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2011-08-04 07:47:30 +0000</bug_when>
    <thetext>(In reply to comment #4)
&gt; Well, I still believe that being explicit would help the authors out there.
&gt; There is nothing wrong with using CDATA in XHTML but, in Polyglot, those
&gt; shouldn&apos;t be used since it won&apos;t produce identical doms.

If you desugar the DOMs, they are equivalent, though. The problem with talking about &quot;the DOM&quot; as a shorthand for the document tree is that the DOM has some domain modeling errors--particularly exposing the CDATA syntactic sugar in the data model.

Note that there&apos;s an ongoing attempt to remove this domain modeling error from the Web DOM: https://bugzilla.mozilla.org/show_bug.cgi?id=660660</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>54232</commentid>
    <comment_count>8</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2011-08-05 13:26:24 +0000</bug_when>
    <thetext>For clarity, my comment was about text in SVG or MathML parents. Polyglot docs clearly can&apos;t use CDATA sections with HTML parents.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>54233</commentid>
    <comment_count>9</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2011-08-05 13:27:55 +0000</bug_when>
    <thetext>In fact, comment 7 is irrelevant on this bug report.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>95597</commentid>
    <comment_count>10</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2013-10-31 00:36:44 +0000</bug_when>
    <thetext>(In reply to Philippe Le Hegaret from comment #0)
&gt; The HTML5 spec is clear that &quot;CDATA sections can only be used in foreign
&gt; content (MathML or SVG).&quot; [1].
&gt; 
&gt; However, the polyglot spec is mostly silent on those. Tidy generates CDATA
&gt; sections for inline style and script when it outputs XHTML. It would be good
&gt; to make it that those CDATA markers must not be used.
&gt; 
&gt; [1] http://www.w3.org/TR/html5/syntax.html#cdata-sections

Since this bug was opened, something has changed: Polyglot Markup now has a section on raw text elements which first discusses &apos;safe text&apos; and then &apos;safe CDATA&apos;:

http://www.w3.org/TR/html-polyglot/#raw-text-elements

3.6.2 Raw text elements (script and style)
    3.6.2.1 The safe text content option
    3.6.2.2 The safe CDATA option

Thus it is possible that this bug is already fixed, in principle. However, bug 23593 might lead to some changes for the &lt;script&gt; element.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>95729</commentid>
    <comment_count>11</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2013-11-02 11:12:11 +0000</bug_when>
    <thetext>EDITOR&apos;S RESPONSE: This is an Editor&apos;s Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:


   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: Referred to polyglot’s rules for safe CDATA.
Rationale: Concurred with the bug filer.

Checked in:

http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.14
http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.14

Comment: As noted, the spec already defines safe CDATA. In addition, I described some of the differences between CDATA in script/style and CDATA in foreign content (including the link that Philippe included in comment #0) in polyglot’s section on when to use (named) entities.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>95730</commentid>
    <comment_count>12</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2013-11-02 11:12:45 +0000</bug_when>
    <thetext>(In reply to Leif Halvard Silli from comment #11)

&gt; Checked in:
&gt; 
&gt; http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.14
&gt; http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.14

Meant:

http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.13
http://dev.w3.org/cvsweb/html5/html-polyglot/html-polyglot.html?rev=1.14</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>