This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 18460 - Need to violate XSLT spec to correctly produce <br>
Status: NEW
Alias: None
Product: WHATWG
Classification: Unclassified
Component: Unwelcome
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Michael[tm] Smith
QA Contact: sideshowbarker+unwelcome
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-01 16:26 UTC by Alexey Proskuryakov
Modified: 2013-05-31 12:45 UTC

Description Alexey Proskuryakov 2012-08-01 16:26:29 UTC
XSLT spec says: "The html output method should not output an element differently from the xml output method unless the expanded-name of the element has a null namespace URI"

Since all HTML elements are in XHTML namespace, this means that <br> must be output as <br></br>, which is of course not what we want in html output.
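
To make the rule quoted above concrete, here is a toy Python model of it. This is only a sketch, not a real XSLT serializer: the function name and the set of element names are hypothetical, the set is a small illustrative subset, and the XML branch ignores namespace declarations. An actual xml output method may also choose <br/> rather than <br></br>; the long form is used here because it is the one this report describes.

```python
# Toy model of the XSLT 1.0 "html" output method's namespace rule.
# Hypothetical names; not taken from any real XSLT processor.

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative subset of elements the html output method treats as empty.
HTML_EMPTY = {"br", "hr", "img", "input", "link", "meta"}


def serialize_empty_element(namespace, local_name):
    """Serialize an element that has no attributes and no children."""
    if namespace is None and local_name in HTML_EMPTY:
        # Null namespace URI: HTML rules apply, so no end tag is written.
        return f"<{local_name}>"
    # Any non-null namespace (including XHTML) falls back to XML rules.
    # Namespace declarations are omitted to keep the sketch short.
    return f"<{local_name}></{local_name}>"


print(serialize_empty_element(None, "br"))
print(serialize_empty_element(XHTML_NS, "br"))
```

The second call is the problematic case: the element is an HTML br in every practical sense, but its namespace keeps the HTML rules from applying.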

There is already a related intentional violation of XSLT in HTML, but my understanding is that it only makes things worse in this regard.

References:
https://bugs.webkit.org/show_bug.cgi?id=76707
https://bugzilla.gnome.org/show_bug.cgi?id=651925
http://www.w3.org/TR/xslt/#section-HTML-Output-Method
http://www.w3.org/TR/html5/interactions-with-xpath-and-xslt.html#interactions-with-xpath-and-xslt
Comment 1 Ian 'Hixie' Hickson 2012-09-18 23:56:41 UTC
Not just <br>, presumably; any void element.
Comment 2 Peter Ryan 2012-09-25 15:41:20 UTC
Would a pragmatic approach be to apply this part of the XSLT spec[1]:

"The html output method should not output an end-tag for empty elements. For HTML 4.0, the empty elements are area, base, basefont, br, col, frame, hr, img, input, isindex, link, meta and param. For example, an element written as <br/> or <br></br> in the stylesheet should be output as <br>."

...but also apply it to (X)HTML namespaces (or indeed, any namespace?).

For example, an element with a non-null namespace, with a local-name of "br" must be output as XML (per the spec), but there's nothing preventing the above rule also being applied, resulting in the output as <br/>.

[1] http://www.w3.org/TR/xslt#section-HTML-Output-Method
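
A rough Python sketch of this proposal, as I read it, follows. The function name is hypothetical and the logic is only one interpretation of the comment: elements in a non-null namespace are still written as XML, but the "no end-tag for empty elements" rule is applied by local name, so they come out in the self-closing form. Namespace declarations are left out for brevity.

```python
# Sketch of the proposal: keep XML output for namespaced elements, but
# never write an end tag for the HTML 4.0 empty elements.
# Hypothetical names; not taken from any real XSLT processor.

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The HTML 4.0 empty elements listed in the XSLT spec text quoted above.
HTML4_EMPTY = {"area", "base", "basefont", "br", "col", "frame", "hr",
               "img", "input", "isindex", "link", "meta", "param"}


def serialize_empty_element_proposed(namespace, local_name):
    """Serialize an element that has no attributes and no children."""
    if local_name in HTML4_EMPTY:
        if namespace is None:
            # Existing html output method behaviour.
            return f"<{local_name}>"
        # Proposed: still XML syntax, but without a separate end tag.
        return f"<{local_name}/>"
    return f"<{local_name}></{local_name}>"


print(serialize_empty_element_proposed(None, "br"))
print(serialize_empty_element_proposed(XHTML_NS, "br"))
```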
Comment 3 Simon Pieters 2012-09-26 07:04:04 UTC
The XSLT "html" output method is inappropriate for today's HTML for more reasons than void elements. If you're going to fix it, why not fix it properly?

Also see http://about.validator.nu/apidoc/nu/validator/htmlparser/tools/XSLT4HTML5.html
Comment 4 Henri Sivonen 2012-09-26 08:12:09 UTC
IIRC, the XSLT WG was considering either fixing the html output method or creating a new HTML5-aware output method.
Comment 5 Ian 'Hixie' Hickson 2012-12-02 03:49:19 UTC
So what's the requested fix to the HTML spec here?
Comment 6 Simon Pieters 2012-12-03 11:03:17 UTC
Some things to consider:

Void elements.
SVG and MathML and namespaced attributes.
RCDATA elements.
RAWTEXT elements.
CDATA sections?
script elements.
Nice to have: Ability to output the short doctype without using "emit these characters here" feature (which doesn't work in impls that transform the DOM directly instead of serialize then parse).
Namespace declarations.
Elements in no namespace (I think should end up in the HTML namespace?)
<pre>\n (and textarea and listing)
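
To illustrate why several of these items matter to a serializer, here is a minimal Python sketch covering three of them: void elements, RAWTEXT/script elements, and RCDATA elements. It is an assumption-laden toy, not the HTML serialization algorithm. The function name is hypothetical, the element sets are small illustrative subsets rather than the full lists, and attributes, namespaces, doctypes and the <pre> newline case are not handled.

```python
from html import escape

# Illustrative subsets only; the real lists in the HTML spec are longer.
VOID = {"br", "hr", "img", "input", "link", "meta"}
RAWTEXT = {"style"}
RCDATA = {"textarea", "title"}


def serialize_html_element(name, text=""):
    """Serialize an attribute-less element containing at most one text node."""
    if name in VOID:
        # Void elements never get an end tag in text/html.
        return f"<{name}>"
    if name == "script" or name in RAWTEXT:
        # Text is written verbatim; escaping it would change its meaning
        # once parsed as text/html.
        body = text
    else:
        # RCDATA and ordinary elements: escape markup-significant characters.
        body = escape(text, quote=False)
    return f"<{name}>{body}</{name}>"


print(serialize_html_element("br"))
print(serialize_html_element("script", "if (a < b) go();"))
print(serialize_html_element("title", "a < b"))
```

An XML serializer would treat all three cases the same way, which is the mismatch the list above is pointing at.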
Comment 7 Ian 'Hixie' Hickson 2012-12-07 23:14:45 UTC
Are these things to consider for changes to XSLT, or HTML?
Comment 8 Ian 'Hixie' Hickson 2012-12-31 05:16:04 UTC
I'm at a loss as to what is being requested here.
Comment 9 Ian 'Hixie' Hickson 2013-03-07 22:19:48 UTC
Upon further discussion, I think the best course of action here would be for the HTML spec to remove its monkeypatching of XSLT, and let the XSLT spec just be updated to work with modern text/html, if there is still interest in doing that.
Comment 10 Henri Sivonen 2013-03-25 14:24:36 UTC
The thing is that there does not appear to be interest in speccing revisions to XSLT 1.0 and there does not appear to be browser interest in implementing later versions of XSLT. Hence, the need to spec a delta from XSLT 1.0 somewhere.
Comment 11 Ian 'Hixie' Hickson 2013-03-26 18:24:32 UTC
Anyone interested in writing a spec that defines that?
Comment 12 Ian 'Hixie' Hickson 2013-05-04 00:15:11 UTC
I don't have the XSLT domain knowledge for this, so I don't know how to proceed.

The HTML spec has some normative and some non-normative comments about how to fix XSLT, but I don't really understand them. I'm happy to add more, or make the non-normative text normative if it can be made comprehensive, but I don't think it's worth my time to learn XSLT to do this, since XSLT doesn't seem to be the future.

I'm moving this to the "Unwelcome" component; please feel free to reassign it to me if you have text for me to write (or very specific instructions about what the text should say; I don't mind writing it if someone can tell me what it should say and work with me to get it to a good state.)
Comment 13 Ian 'Hixie' Hickson 2013-05-30 22:31:44 UTC
See bug 17976.
Comment 14 Jirka Kosek 2013-05-31 12:45:10 UTC
I think this bug can actually be closed. The problem reported for the serialization of <br> can be easily solved:

a) In XSLT 1.0, use the HTML output method and write stylesheets so that HTML elements are not in any namespace.

b) In XSLT 2.0, either use a) or use the XHTML output method and write stylesheets that emit HTML elements in the http://www.w3.org/1999/xhtml namespace.

c) In XSLT 3.0, there is a new serialization method which should be fully "HTML5-aware": http://www.w3.org/TR/xslt-xquery-serialization-30/#html-output

If anything is missing or wrong in the XSLT 3.0 HTML serialization, please report it to the address given in the document; the XSLT WG would be happy to fix it.

Jirka
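
The first two options above differ mainly in the output method and in the namespace of the literal result elements. The Python sketch below embeds one hypothetical stylesheet for each and uses the standard library XML parser to show which namespace each literal result element ends up in. The stylesheets and helper names are illustrative assumptions; the script only parses the stylesheets and does not run a transformation.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# XSLT 1.0, html output method, result elements in no namespace.
STYLESHEET_A = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html><body><p>first line<br/>second line</p></body></html>
  </xsl:template>
</xsl:stylesheet>
"""

# XSLT 2.0, xhtml output method, result elements in the XHTML namespace
# via the default namespace declaration on the stylesheet element.
STYLESHEET_B = """\
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output method="xhtml"/>
  <xsl:template match="/">
    <html><body><p>first line<br/>second line</p></body></html>
  </xsl:template>
</xsl:stylesheet>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


def literal_result_elements(stylesheet_text):
    """Return (namespace, local name) for every non-XSLT element, in document order."""
    root = ET.fromstring(stylesheet_text)
    pairs = [split_tag(element.tag) for element in root.iter()]
    return [(ns, local) for ns, local in pairs if ns != XSL_NS]


for label, sheet in (("a", STYLESHEET_A), ("b", STYLESHEET_B)):
    print(label, literal_result_elements(sheet))
```

In the first stylesheet every literal result element, including br, has no namespace, which is the condition the XSLT 1.0 html output method requires before it applies its HTML-specific rules.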