This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Consider the following stylesheet.

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
      <xsl:output method="html" />
      <xsl:template match="/">
        <html>
          <body>
            <my:p title="&lt;" xmlns:my="http://example.org">xml island</my:p>
            <p title="&lt;">not an xml island</p>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

The first paragraph of section 7.1 of Serialization says, "An element whose expanded QName has a non-null namespace URI MUST be output as XML. This is known as an XML Island."[1] The third item in the numbered list then says, "the generic rules for the HTML output method that apply to all elements and attributes, for example the rules for escaping special characters in the text and the rules for indentation, MUST be used also for namespaced elements and attributes." Then section 7.2 says, "The HTML output method MUST NOT escape "<" characters occurring in attribute values."[2]

So, should the serialized result be

    <html>
      <body>
        <my:p title="&lt;" xmlns:my="http://example.org">xml island</my:p>
        <p title="<">not an xml island</p>
      </body>
    </html>

or

    <html>
      <body>
        <my:p title="<" xmlns:my="http://example.org">xml island</my:p>
        <p title="<">not an xml island</p>
      </body>
    </html>

The first requirement (that my:p be serialized "as XML") leads me to expect that "<" will be escaped, yielding the first result; the second and third requirements lead me to expect that "<" will not be escaped, yielding the second result.

[1] http://www.w3.org/TR/xslt-xquery-serialization/#HTML_MARKUP
[2] http://www.w3.org/TR/xslt-xquery-serialization/#HTML_ATTRIBS
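The two candidate results differ only in which attribute-escaping rule is applied to my:p. The Python sketch below is a simplified illustration and not part of the Serialization specification: the function names are hypothetical, the namespace declaration is treated as an ordinary attribute, and the HTML rule models only the sentence quoted from section 7.2.

```python
def escape_attr_xml(value: str) -> str:
    """Escape an attribute value as an XML serializer would (simplified)."""
    return (value.replace("&", "&amp;")
                 .replace("<", "&lt;")
                 .replace('"', "&quot;"))


def escape_attr_html(value: str) -> str:
    """Simplified HTML rule: as above, but "<" is left unescaped (section 7.2 as quoted)."""
    return value.replace("&", "&amp;").replace('"', "&quot;")


def start_tag(name: str, attrs: dict, escape) -> str:
    """Render a start tag, escaping every attribute value with the given rule."""
    rendered = "".join(f' {key}="{escape(val)}"' for key, val in attrs.items())
    return f"<{name}{rendered}>"


# Assumption: xmlns:my is handled like any other attribute for this illustration.
attrs = {"title": "<", "xmlns:my": "http://example.org"}

first_result = start_tag("my:p", attrs, escape_attr_xml)    # reading 1: "as XML"
second_result = start_tag("my:p", attrs, escape_attr_html)  # reading 2: generic HTML rules

print(first_result)
print(second_result)
```

Under the first reading the title attribute is written as "&lt;"; under the second it is written as a literal "<". The plain p element gets the HTML rule in both readings.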
The only interpretation I can think of is that the sentences, "An element whose expanded QName has a non-null namespace URI MUST be output as XML. This is known as an XML Island," were intended to mean that such elements are never recognized as HTML elements, even if their local names happen to be the same as those of HTML elements, and that the second result is the correct serialized result.
I wrote "the second result" in comment #1 where I meant "the first result."
The best that I've been able to determine is that the sentences "An element whose expanded QName has a non-null namespace URI MUST be output as XML. This is known as an XML Island," were meant to be a statement of intent, and that the rules that follow describe the actual rules in detail. The behaviour of the implementations that I've tested seems to reflect that. I propose changing those sentences to read "An element whose expanded QName has a non-null namespace URI might be serialized differently from an element that is in no namespace. An element that has a non-null namespace is known as an XML Island."
Liam points out that an XML island is the piece of the serialized HTML result that is formatted as XML - the element node itself is not an XML island. My proposed rewording in comment #3 implied the latter. Taking that into account, I propose changing those sentences to read, "An element whose expanded QName has a non-null namespace URI might be serialized differently from an element that is in no namespace. The portion of the serialized document representing the result of serializing such an element is known as an XML Island."
At the joint telecon of the XSLT and XQuery Working Groups,[3] the proposals of comment #4 were accepted, with suitable editorial reworking to eliminate the use of the word "might". This will be erratum SE.E20. The editor's final revised wording is to change the sentences in question to read, "As is described in detail below, the HTML output method will not output an element differently from the XML output method unless the expanded QName of the element has a null namespace URI. [Definition] The portion of the serialized document representing the result of serializing an element whose expanded QName does not have a null namespace URI is known as an XML Island."

[3] http://lists.w3.org/Archives/Member/w3c-xsl-query/2011Sep/0250.html (Member-only link)
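Read literally, the first sentence of the revised wording makes the namespace URI the only switch between XML-style and HTML-style output. A minimal Python sketch of that literal reading follows. The helper name attr_escaper_for is hypothetical, the detailed rules of section 7.1 that the sentence defers to are not modelled, and the HTML branch covers only the "<" rule quoted from section 7.2.

```python
from typing import Callable, Optional
from xml.sax.saxutils import escape


def attr_escaper_for(namespace_uri: Optional[str]) -> Callable[[str], str]:
    """Pick an attribute-escaping rule from the element's namespace URI (hypothetical helper)."""
    if namespace_uri:
        # Non-null namespace: part of an XML island, so use XML escaping.
        # saxutils.escape handles &, < and >; the quote mapping is added here.
        return lambda value: escape(value, {'"': "&quot;"})
    # Null namespace: simplified HTML rule, which leaves "<" unescaped.
    return lambda value: value.replace("&", "&amp;").replace('"', "&quot;")


island_title = attr_escaper_for("http://example.org")("<")  # my:p from the stylesheet
plain_title = attr_escaper_for(None)("<")                   # the no-namespace p element

print(island_title, plain_title)
```

On this reading, the title attribute of my:p would be escaped while that of the plain p element would not.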