14734 – [XSLT 2.0] Literal result elements should not be laxly validated

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED WONTFIX

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XSLT 2.0
Version:	Recommendation
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael Kay
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2011-11-08 21:35 UTC by Michael Kay
Modified:	2012-05-16 20:39 UTC
CC List:	2 users

Description Michael Kay 2011-11-08 21:35:26 UTC

In the XSLT 2.0 schema for stylesheets, literal result elements are validated against a wildcard with processContents="lax". Consider this scenario:

<xsl:template match="/*">
  <book>
    <xsl:if test="@isbn">
      <xsl:attribute name="color" select="'green'"/>
    </xsl:if>
  </book>
</xsl:template>

If there is a global element declaration for 'book', it is very likely that it will not permit an xsl:if element as a child. Performing lax validation of the book element would therefore report a validity error. I believe therefore that literal result elements should be processed using a processContents="skip" wildcard.
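
As a rough illustration of the difference between the two settings, the wildcard in question might look like the fragment below. This is a simplified sketch, not the actual text of the XSLT 2.0 schema for stylesheets; the group name lre-sketch is hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical group: stands in for the part of the content model
       that admits literal result elements. -->
  <xs:group name="lre-sketch">
    <xs:choice>
      <!-- Current behaviour: if the schema in use has a global declaration
           for the matched element (e.g. 'book'), the element is validated
           against that declaration; otherwise its children are laxly
           assessed in turn. -->
      <xs:any namespace="##other" processContents="lax"/>
      <xs:any namespace="##local" processContents="lax"/>

      <!-- Proposed alternative: processContents="skip" on the same
           wildcards, which would assess neither the matched element
           nor any of its descendants. -->
    </xs:choice>
  </xs:group>

</xs:schema>
```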

(Discovered as a consequence of research into a separate question raised off-list by Norm Walsh)

Comment 1 Michael Kay 2011-11-09 11:24:26 UTC

Of course, using processContents="skip" would have the disadvantage that the xsl:if element (and any other descendants of the LRE) would not be validated. What we really need is a different processContents value that says "skip elements that match the wildcard but process their children as follows: use strict validation if they are in namespace XSLT but recurse otherwise". Not expressible, of course. Interestingly, it's the inverse of the problem we had in XSD of defining validation for xs:documentation sections where we want to validate any XHTML content but not validate any XSD content.

Comment 2 Jirka Kosek 2012-04-04 12:57:30 UTC

I have done more investigation and testing.

Mike's conclusions are 100% correct, and we cannot improve the situation if we want to rely only on XML Schema.

The best we can do is to keep the schema as it is and add a note to the spec saying something like:

"During validation only this schema should be used. Supplying schemas for namespaces of literal result elements will likely produce false error messages because of W3C XML Schema limitations."

If we could use NVDL, the situation would be much better: we could strip elements in non-XSLT namespaces before validation and then use the existing XSLT 2.0 schema.

If the WG thinks it is worthwhile, we can produce a WG Note showing how to use NVDL for this, or how to modify existing schemas to work with the XSLT 2.0 schema even when processContents="lax" is used (by putting wildcards into the schemas for literal result elements).
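
A minimal sketch of what such an NVDL script might look like is given below. It is only an illustration of the stripping idea, not a tested script: the schema location xslt20.xsd is a hypothetical file name, and it assumes that the anyNamespace rule also covers literal result elements in no namespace, which should be checked against the NVDL processor in use.

```xml
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">

  <!-- Elements in the XSLT namespace are validated against the
       existing XSLT 2.0 schema (hypothetical local file name). -->
  <namespace ns="http://www.w3.org/1999/XSL/Transform">
    <validate schema="xslt20.xsd"/>
  </namespace>

  <!-- Anything else (the literal result elements) is unwrapped:
       the element itself is dropped and its XSLT children are
       attached to the enclosing XSLT section before validation. -->
  <anyNamespace>
    <unwrap/>
  </anyNamespace>

</rules>
```

With rules along these lines, the xsl:if inside book in the example above would be validated as if it were a direct child of xsl:template.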

Comment 3 Sandy Gao 2012-04-05 02:11:50 UTC

> In the XSLT 2.0 schema for stylesheets
> ...
> If there is a global element declaration for 'book' ...

The "schema for stylesheets" clearly would not contain a declaration for "book".

I suppose this can happen if the stylesheet is validated against a schema that imports the XSLT schema? I'm not sure that's a useful thing to do (you can't expect that your "book" in the stylesheet will contain what it normally contains), especially given that the seemingly desired handling here is to "skip" it (i.e. anything that's not in the XSLT schema will be ignored).

If we only consider the case where the stylesheet is validated against the vanilla XSLT schema, then "lax" seems to work fine.

> If we could use NVDL, situation will be much better, we can strip elements in
> non-XSLT namespaces before validation and we can then use existing XSLT 2.0
> schema.

I'm not sure how that differs from validating the stylesheet as is against the vanilla XSLT schema (with "lax"). Hm... there is (at least) one difference. If there is an xsi:type or xsi:nil attribute on an LRE, then that will be taken into account. Is this a significant enough case to worry about?

The one case where I think some of this could be useful is (as MikeK hinted in his email) for syntax-directed editors, where it may be desirable to have both the user schema and the XSLT schema in scope, so that help can be provided for both xsl: elements and LREs. But for this case, "skip"/stripping aren't the right answer for the LREs.

So I think a sensible editor can use the user+XSLT schema for things like content assistance, and leave validation to the vanilla XSLT schema.

Comment 4 Michael Kay 2012-04-05 08:08:05 UTC

>The "schema for stylesheets" clearly would not contain a declaration for "book".

This problem originally came from Norm Walsh and I imagine the context was XProc.

One of the factors might be that Saxon really only allows you to use one schema at a time; this forces you to create a schema containing the union of all the schemas that you really want. Unfortunately, lax validation is one of the things that can change its effect if your schema contains random extraneous stuff that you're not really interested in for a particular validation episode. (Another, in XSD 1.1, is notQName="##defined".) Perhaps some of the problems would be avoided if Saxon allowed you to have multiple schemas around at the same time.
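
To illustrate the XSD 1.1 case, here is a small sketch of a wildcard using notQName="##defined". The element name container is hypothetical. The wildcard matches only elements that have no global declaration in the schema, so merging unrelated declarations into the same schema changes which elements it accepts.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element used only for illustration (XSD 1.1). -->
  <xs:element name="container">
    <xs:complexType>
      <xs:sequence>
        <!-- Accepts any element that is NOT globally declared in the
             schema. If a declaration for 'book' is added to the schema,
             a 'book' child is no longer matched by this wildcard. -->
        <xs:any notQName="##defined" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```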

Comment 5 Michael Kay 2012-05-16 20:39:35 UTC

There has been considerable discussion on this issue both here and in the XSL WG mailing list. The consensus seems to be that there is nothing we can do given XSD as it exists today. A good point was made by Michael Sperberg-McQueen to the effect that literal result elements in a stylesheet are a form of tag abuse: the content model for a <para> element in a stylesheet is not the same as for a <para> element in a normal document, and therefore it is to be expected that when the schema used for validating stylesheets includes the declaration of a <para> element as it appears in a normal document, validation should fail. Nevertheless, lax seems a better option than skip, because it causes validation to recurse through the literal result element to inner XSLT instructions.

I'm therefore closing this (unilaterally, but taking the consensus reached by extensive discussion into account) as WONTFIX.