This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 5444 - [SER] Non-XHTML elements with XHTML output method
Status: CLOSED INVALID
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Serialization 1.0
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Henry Zongaro
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2008-01-31 15:42 UTC by Tim Mills
Modified: 2009-06-16 15:25 UTC

Attachments
Test case for legacy browser handling of br and unknown namespaced tags (302 bytes, text/html)
2009-02-05 19:12 UTC, C. M. Sperberg-McQueen

Description Tim Mills 2008-01-31 15:42:08 UTC
The description of the HTML output method describes how elements not defined in HTML (but in the null namespace) and elements in non-null namespaces should be handled.  There doesn't appear to be a corresponding description for handling how elements not defined in XHTML but in the XHTML namespace and elements in non-XHTML namespaces should be treated.

Specifically, how should they be treated with regard to indentation, and how should empty elements be rendered (<foo:empty/>, <foo:empty /> or <foo:empty></foo:empty>)?
Comment 1 Michael Kay 2008-01-31 16:04:11 UTC
It seems to me to be covered by this sentence near the start of section 6:

The serialization of the instance of the data model follows the same rules as for the XML output method, with the general exceptions noted below 
Comment 2 Tim Mills 2008-01-31 16:26:13 UTC
That sentence finishes:

"...and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. "

Should I be reading 6.1.3 (XHTML Output Method: the indent Parameter) as a full description of how indentation is to be handled for this output method, or does it have to be read in conjunction with "5.1.3 XML Output Method: the indent Parameter"?  For example, if xml:space is encountered on an XHTML tag, should it be respected?

"Given an XHTML element whose content model is EMPTY, the serializer MUST ... MUST include a space before the trailing />"

It seems strange to mandate writing <br /> but not to forbid <foo:br/>, although I admit to having not tried this in an old HTML user agent!

Comment 3 Henry Zongaro 2008-03-31 13:48:36 UTC
I believe the intent was that the descriptions of the effects of the serialization parameters on the xhtml output method should be as defined in the various subsections of section 6 of the recommendation.  In many cases, the effect of the parameter is identical to that of the xml output method, and in those cases the recommendation refers back to the definition provided for the xml output method.  In the case of the indent parameter, there is no reference back to the definition of the effect of the indent parameter on the xml output method.  I take that as a signal that the effect of the indent parameter on the xml output method has no bearing.

Regarding the serialization of an element like foo:br, the second paragraph of section 6 says, "It is not an error if the instance of the data model is invalid XHTML."  The recommendation allows for the possibility that a user might be mixing xhtml with ordinary xml for their own purposes, perhaps sending it through some post-processing phase.  An HTML user agent is unlikely to do anything useful with that in terms of presentation, so the recommendation leaves it entirely up to the serializer how it should be serialized, so long as it meets the requirements of the recommendation.

This is my personal response, not that of the XSL or XQuery working groups.
Comment 4 C. M. Sperberg-McQueen 2009-02-05 16:59:14 UTC
If the goal of the XHTML output method is to try to help make
the output displayable with minimal pain in legacy HTML
browsers (or current HTML browsers which insist on going 
into quirks mode whether you want them to or not), then
I think the natural extension of that goal to mixed-namespace
documents and to documents with unknown elements in the
XHTML namespace is to try to make them behave as well
as possible for legacy browsers.

If so, then the original poster has a point w.r.t. empty-element
format, and it would probably be wise to mandate either
a start-end tag pair or a blank before the "/>" closing 
delimiter.

It's true, as Henry points out in comment 3, that a legacy
HTML browser is unlikely to be doing anything very helpful
or exciting with namespaced material not in the HTML
namespace.  But the rule "ignore all tags you do not 
understand", while not a complete solution, has stood
browsers, users, and those wishing to introduce new markup
in good stead.  And no matter what, I think inducing the
old browsers to ignore the tags completely is going to be
more useful than inducing them to display "/>" or ">"
in the text.
Comment 5 Henry Zongaro 2009-02-05 17:52:58 UTC
At its teleconference of 2009-02-05, the XSL WG considered this problem and decided that the Serialization recommendation was clear as it stood, and agreed the bug should be closed without change.

However, the WG expressed sympathy for the points that Michael Sperberg-McQueen raised in comment 4.  The working group requested that he submit a request for enhancement for a future version of the Serialization recommendation.

XQuery WG consideration of the bug is still pending.
Comment 6 C. M. Sperberg-McQueen 2009-02-05 19:03:58 UTC
A little further investigation shows that the view I espoused in
comment #4 was ill-informed.  I do think the goal of making
legacy browsers ignore namespaced elements is worth trying to
achieve.  But as far as I can tell, both current and legacy
browsers achieve the "ignore tags you don't understand" goal
under the current rules; it is (as far as I can tell) not
necessary to make special provision forbidding serialization of
unknown empty elements as <foo/> or <foo:bar/>; both forms are
treated just like <foo /> and <foo></foo>, i.e. both are
successfully ignored.

I had thought that the form <foo/> would cause legacy browsers,
or legacy mode in some current browsers, to treat / or /> or > as
content characters; this appears not to be the case.  Tests on
current-ish versions of Safari, Firefox, and Opera, and also on
Netscape Navigator 4.79 and IE 5.5 (the oldest browsers anyone I
could find on the spur of the moment had access to) show no / or
> anywhere in the display, for any of <br>, <br/>, <br />,
<br></br>.

If I now understand correctly, the blank in <br /> is required
for legacy browsers not because otherwise / or /> gets displayed,
but because <br/> has no effect (i.e. the element is ignored,
presumably because the legacy parser thinks it has found an
element named "br/").
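
The hypothesis in the paragraph above can be illustrated with a small
sketch.  It is not a model of any real browser: it assumes a naive
legacy parser that reads the element name as everything up to the first
whitespace or the closing ">", and the names legacy_tag_name,
is_recognised and KNOWN_ELEMENTS are hypothetical.

```python
# Hypothetical sketch of a naive legacy start-tag scanner.
# Assumption: the element name is everything between "<" and the first
# whitespace (or the closing ">"), so a trailing "/" sticks to the name.

KNOWN_ELEMENTS = {"br"}  # illustrative subset only


def legacy_tag_name(tag: str) -> str:
    """Return the element name such a parser would read from a start tag."""
    inner = tag[1:-1]        # drop the leading "<" and the trailing ">"
    parts = inner.split()    # split on whitespace
    return parts[0].lower() if parts else ""


def is_recognised(tag: str) -> bool:
    """True if the scanned name is one the parser knows; unknown tags are ignored."""
    return legacy_tag_name(tag) in KNOWN_ELEMENTS
```

Under these assumptions <br> and <br /> are both read as "br", while
<br/> is read as the unknown name "br/" and <foo:bar/> as "foo:bar/",
so the latter two are ignored without any "/" or ">" reaching the
display.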

During the WG call this morning I had agreed to file a request
for enhancement suggesting that HTML and XHTML modes provide
special rules for empty unknown elements.  That no longer seems
necessary or sensible.

Comment 7 C. M. Sperberg-McQueen 2009-02-05 19:12:05 UTC
Created attachment 625
Test case for legacy browser handling of br and unknown namespaced tags

Since this is the sort of topic that can evoke mildly obsessive curiosity
about how various versions of old browsers treat different variations in
the source, and since others who consult this bug record may have access
to a wider range of legacy browsers than the group of friends I accosted just
now, I am attaching the simple test case I constructed to test the handling
of <br>, <br/>, <br />, <br></br>, <foo:bar>, <foo:bar/>, <foo:bar />,
and <foo:bar></foo:bar>.  This may at least allow some of my fellow
obsessives to satisfy their curiosity on the matter with less up-front
investment of time spent constructing a test case.
Comment 8 Henry Zongaro 2009-03-09 14:13:34 UTC
At the face-to-face meeting of 2009-02-23 to 2009-02-25, the XQuery and XSL working groups reviewed this bug report, and concurred with the decision of the XSL WG recorded in comment 5 - that the Serialization recommendation was clear that elements in some namespace other than the XHTML namespace should be handled in a way that is consistent with the requirements of the XML output method, and that no change was necessary.

It was noted that the serialization recommendation does not prohibit an empty element in some non-XHTML namespace from being serialized as an empty element tag with a space before the "/>", but neither does it require it.

Tim, I've returned this bug report.  If you agree with the resolution, may I ask you to close the report?
Comment 9 Henry Zongaro 2009-06-16 15:25:54 UTC
We never heard back from Tim.  I will assume it's acceptable to close this bug report.
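
For readers who want the resolution in concrete form, the following is a
minimal sketch of the empty-element behaviour discussed in comments 2 and
8, not an implementation taken from any serializer.  The function name
serialize_empty, the XHTML_EMPTY set (the XHTML 1.0 Strict elements with
an EMPTY content model, listed from memory) and the choice of a start/end
tag pair for the remaining XHTML-namespace cases are assumptions of the
sketch.  Attributes and namespace declarations are omitted.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumption: XHTML 1.0 Strict elements whose content model is EMPTY.
XHTML_EMPTY = {"area", "base", "br", "col", "hr", "img",
               "input", "link", "meta", "param"}


def serialize_empty(namespace: str, qname: str) -> str:
    """Serialize an element with no children under the xhtml output method."""
    local_name = qname.split(":")[-1]
    if namespace == XHTML_NS and local_name in XHTML_EMPTY:
        # Quoted in comment 2: the serializer MUST include a space
        # before the trailing "/>".
        return f"<{qname} />"
    if namespace == XHTML_NS:
        # Other (including unknown) elements in the XHTML namespace:
        # a start/end tag pair is this sketch's own choice.
        return f"<{qname}></{qname}>"
    # Non-XHTML namespace: the rules of the xml output method apply, so
    # <foo:empty/>, <foo:empty /> and <foo:empty></foo:empty> are all
    # permitted (comment 8).  This sketch picks the shortest form.
    return f"<{qname}/>"
```

For example, serialize_empty(XHTML_NS, "br") gives <br />, while
serialize_empty("urn:example:foo", "foo:empty") gives <foo:empty/>; a
serializer that emitted <foo:empty /> for the second case would be
equally conformant.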