This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The draft text on plaintext and xmp should be deleted:

]] Due to the conflict between parsing rules between HTML and XML, polyglot markup uses the following elements only if they do not contain angled brackets ("<" or ">") or ampersands ("&").[[

ISSUES:

(1) plaintext/xmp are forbidden in HTML5, so how do they belong in this draft? (Needs separate bug too.) According to Henri Sivonen, the Polyglot spec should only describe a subset of XML 1.0 and HTML5. But which subset? The valid subset? The valid and well-formed subset? The DOM-equal subset? Or the valid, well-formed and DOM-equal subset? Example: when you say that polyglot markup *requires* <colgroup/>, we are outside both validity and well-formedness; we are in "equality" land. The same goes for <xmp> and <plaintext>: the emphasis, as long as you discuss them at all, is on equality, not on validity or well-formedness. This question requires a separate bug, but I want to mention it here anyhow. In my view, Polyglot Markup should describe the HTML5-valid (and perhaps also XML 1.0-valid), XML 1.0-well-formed, DOM-equal subset of HTML5. For that reason, plaintext and xmp do not belong in Polyglot Markup, as they are not permitted in HTML5.

(2) For <plaintext>, can conflicting parsing rules ever be avoided? No! PLAINTEXT EXAMPLE: <plaintext></plaintext> An HTML parser will display the characters "</plaintext>" to the user. Thus it seems to me that if parsing rules are the justification, then <plaintext> must not be used in polyglot documents, as it cannot be used in polyglots without running into differences caused by the conflicting parsing rules. (Exception: <iframe><plaintext/></iframe>. But then we should also say that, for example, "<p/><p></p>" should be permitted, as it is the same issue: "<p/>" works fine as long as it is empty and a new block element follows immediately after. Plus, we are then outside the syntax that HTML5 permits.)
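The XML half of the PLAINTEXT EXAMPLE in (2) can be checked with a short sketch. Python and its standard `xml.etree.ElementTree` module are illustrative choices made here, and the wrapping `<body>` root is an assumption added only so that the fragment parses as a complete document. The sketch shows what an XML parser sees; it does not run an HTML parser, whose behaviour is as described in the report.

```python
import xml.etree.ElementTree as ET

# The markup from the PLAINTEXT EXAMPLE, wrapped in a root element
# (an assumption for illustration) so that it is a complete XML document.
source = "<body><plaintext></plaintext></body>"

root = ET.fromstring(source)
plaintext = root[0]

# XML view: an ordinary element with no text and no children.
# An HTML parser, by contrast, never leaves plaintext mode, so the
# characters "</plaintext>" would become text content instead.
xml_view = (plaintext.tag, plaintext.text, len(plaintext))
print(xml_view)
```

In XML the end tag closes the element as usual, which is exactly the difference from HTML that makes the element unusable in a polyglot document.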
(3) For <xmp>, can conflicting parsing rules ever be avoided? Only as long as the author avoids any child elements and NCRs. Thus, practically speaking, no! XMP EXAMPLE: <xmp><p>å</p></xmp> An HTML parser will render the content of xmp literally, as code. This is impossible to replicate in XML unless one uses <![CDATA[ ]]>. However, if one places a <![CDATA[ ]]> inside, then the HTML parser will render those characters literally as well. As for what the specification draft says: normally one would not say that the XMP example "contains" "<", ">" or "&". Instead, it contains a <p> element and an NCR. And it is, ultimately, child elements and NCRs that need to be forbidden inside an xmp element that occurs in a polyglot document.

(4) No need to escape the *characters* <>&. (Needs separate bug too.) From XML's point of view, there is nothing special about "<", ">" and "&" inside xmp and plaintext: in all XML documents, "<" and "&" must, in general, always be escaped. Thus they cannot occur unescaped, whether inside xmp/plaintext or anywhere else. And as long as they are escaped, ">" does not constitute a problem, as far as I can see. Thus, nothing special needs to be said about "<", ">" or "&" inside xmp/plaintext. Instead, it needs to be said that xmp cannot contain elements or NCRs; see (3) above.

CONCLUSION: Delete the entire section. Or, alternatively, say that <plaintext> MUST NOT be used but that <xmp> can be used provided that it has no children and no NCRs.
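The points in (3) and the fallback rule in the CONCLUSION can be sketched as follows. This is a minimal illustration, not a validator: `check_raw_text_elements` is a hypothetical helper, the regex approach ignores nested or self-closing xmp, and the XMP example is written with the NCR `&#229;`, which is an assumption about how the character was originally encoded. It relies on the observation in (4) that, in well-formed XML, a raw "<" inside xmp can only start markup and a raw "&" can only start a reference.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: captures the raw content of <xmp ...>...</xmp>.
XMP_CONTENT = re.compile(r"<xmp\b[^>]*>(.*?)</xmp\s*>", re.DOTALL)


def check_raw_text_elements(source):
    """Return problems under the rule proposed in the CONCLUSION."""
    problems = []
    if re.search(r"<plaintext\b", source):
        problems.append("plaintext must not be used")
    for match in XMP_CONTENT.finditer(source):
        content = match.group(1)
        if "<" in content:
            # Child element, comment or CDATA section.
            problems.append("xmp contains markup")
        if "&" in content:
            # Numeric character reference or entity reference.
            problems.append("xmp contains a reference")
    return problems


# The XMP example, with the character written as an NCR (assumed form).
example = "<xmp><p>&#229;</p></xmp>"

# XML view: xmp has a <p> child whose text is the resolved character.
xmp = ET.fromstring(example)
xml_view = (xmp[0].tag, xmp[0].text)

# XML view of a CDATA section: the markers vanish and the text remains.
cdata_text = ET.fromstring("<xmp><![CDATA[<p>]]></xmp>").text

print(xml_view, cdata_text)
print(check_raw_text_elements(example))
print(check_raw_text_elements("<xmp>no markup here</xmp>"))
```

Checking the raw source rather than the parsed tree is deliberate: once an XML parser has resolved an NCR, the tree no longer records that a reference was ever there.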
In the Editor's Draft of 11 February 2011, I have deleted section 6.5.2 about <plaintext> and <xmp> in Polyglot Markup, as they are, indeed, deprecated in HTML5. Thank you so very much for catching this. Eliot
Fine. Satisfied. I believe I should then close this bug.
mass-move component to LC1