<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>11904</bug_id>
          
          <creation_ts>2011-01-28 11:50:51 +0000</creation_ts>
          <short_desc>&lt;plaintext&gt; and &lt;xmp&gt; in Polyglot Markup</short_desc>
          <delta_ts>2011-08-04 05:07:40 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML WG</product>
          <component>LC1 HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html#elements-that-cannot-contain-special-characters</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>major</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Leif Halvard Silli">xn--mlform-iua</reporter>
          <assigned_to name="Eliot Graff">eliotgra</assigned_to>
          <cc>eliotgra</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>shadow2531</cc>
    
    <cc>xn--mlform-iua</cc>
          
          <qa_contact name="HTML WG Bugzilla archive list">public-html-bugzilla</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>44826</commentid>
    <comment_count>0</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2011-01-28 11:50:51 +0000</bug_when>
    <thetext>The draft text on plaintext and xmp should be deleted:

 ]] Due to the conflict between parsing rules between HTML and XML, polyglot markup uses the following elements only if they do not contain angled brackets (&quot;&lt;&quot; or &quot;&gt;&quot;) or ampersands (&quot;&amp;&quot;).[[

ISSUES:

(1) plaintext/xmp are forbidden in HTML5 - so how do they belong in this draft? (Needs separate bug too.)

   According to Henri Sivonen, the Polyglot spec should only describe a subset of XML 1.0 and HTML5. But which subset? Is it the valid subset? Or the valid and well-formed subset? Or perhaps the DOM-equal subset? Or the valid and well-formed DOM-equal subset? Example: When you say that polyglot markup *requires* &lt;colgroup/&gt;, then we are outside both validity and well-formedness - then we are in &quot;equality&quot; land. And the same goes for &lt;xmp&gt; and &lt;plaintext&gt; - the emphasis, as long as you discuss them at all, is on equality, and not on validity or well-formedness.

This question requires a separate bug. But I want to mention it here anyhow. In my view, Polyglot Markup should describe the HTML5-valid (and perhaps also XML 1.0-valid), XML 1.0-well-formed, DOM-equal subset of HTML5. For that reason, plaintext and xmp do not belong in Polyglot Markup, as they are not permitted in HTML5.

(2) For &lt;plaintext&gt;, can conflicting parsing rules ever be avoided? No!

   PLAINTEXT EXAMPLE:  &lt;plaintext&gt;&lt;/plaintext&gt;

An HTML parser will display the characters &quot;&lt;/plaintext&gt;&quot; to the user. Thus it seems to me that if parsing rules are the justification, then &lt;plaintext&gt; must not be used in polyglot documents, as it is not possible to use it in polyglots without running into problems/differences due to conflicting parsing rules. (Exception: &lt;iframe&gt;&lt;plaintext/&gt;&lt;/iframe&gt;. But then we should also say that, for example, &quot;&lt;p/&gt;&lt;p&gt;&lt;/p&gt;&quot; should be permitted, as it is the same issue: &quot;&lt;p/&gt;&quot; works fine, as long as it is empty and a new block element follows immediately after. Plus, both are outside the syntax that HTML5 permits.)
 
(3) For &lt;xmp&gt;, can conflicting parsing rules ever be avoided? Only as long as the author avoids any child element and NCRs. Thus, practically speaking, no! 

   XMP example: &lt;xmp&gt;&lt;p&gt;&amp;#229;&lt;/p&gt;&lt;/xmp&gt;

An HTML parser will render the content of xmp literally, as code. This is impossible to replicate in XML, unless one uses &lt;![CDATA[ ]]&gt;. However, if one places a &lt;![CDATA[ ]]&gt; inside, then the HTML parser will render those characters literally as well.

As for what the specification draft says: Normally one would not say that the XMP example &quot;contains&quot; &quot;&lt;&quot;, &quot;&gt;&quot; or &quot;&amp;&quot;. Instead, it contains a &lt;p&gt; element and an NCR. And it is, if anything, child elements and NCRs that need to be forbidden inside an xmp element that occurs in a polyglot document.

(4) No need to escape the *characters* &lt;&gt;&amp;. (Needs separate bug too.)

From XML&apos;s point of view, there isn&apos;t anything special about &quot;&lt;&quot;, &quot;&gt;&quot; and &quot;&amp;&quot; inside xmp and plaintext: In all XML documents, &quot;&lt;&quot; and &quot;&amp;&quot; must - in general - always be escaped. Thus they cannot occur unescaped, whether inside xmp/plaintext or anywhere else. And, as long as they are escaped, &quot;&gt;&quot; does not constitute a problem, as far as I can see. Thus, nothing special needs to be said about &quot;&lt;&quot;, &quot;&gt;&quot; or &quot;&amp;&quot; inside xmp/plaintext. Instead, it needs to be said that xmp cannot contain elements or NCRs - see (3) above.

CONCLUSION: Delete the entire section. Or, alternatively, say that &lt;plaintext&gt; MUST NOT be used but that &lt;xmp&gt; can be used provided that it has no children and no NCRs.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>45426</commentid>
    <comment_count>1</comment_count>
    <who name="Eliot Graff">eliotgra</who>
    <bug_when>2011-02-12 00:44:29 +0000</bug_when>
    <thetext>In the Editor&apos;s Draft of 11 February 2011, I have deleted section 6.5.2 about &lt;plaintext&gt; and &lt;xmp&gt; in Polyglot Markup, as they are, indeed, deprecated in HTML5.

Thank you so very much for catching this.

Eliot</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>45449</commentid>
    <comment_count>2</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2011-02-13 17:58:03 +0000</bug_when>
    <thetext>Fine. Satisfied. I believe I should then close this bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>53175</commentid>
    <comment_count>3</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2011-08-04 05:07:20 +0000</bug_when>
    <thetext>mass-move component to LC1</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>53208</commentid>
    <comment_count>4</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2011-08-04 05:07:40 +0000</bug_when>
    <thetext>mass-move component to LC1</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>