This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
I've tagged this as XPath for administrative convenience, but it's actually an issue that affects XSLT and XQuery rather than XPath itself.

Both XSLT and XQuery define the process of validation in terms of a number of steps, starting with serializing the XDM and then reparsing it to construct an Infoset, which is then validated as described in the XML Schema specification. If this process is followed literally, the base URI of the nodes in the validated document is undefined. This is unfortunate, since it makes any relative URI references in the document unusable.

I suggest we add a rule that the base URI of the document node of the Infoset that is constructed must be the same as the base URI of the XDM node being validated.

As a future enhancement, there would seem to be a case for allowing the required base URI of a document node to be specified explicitly as an option on a document node constructor (document{} in XQuery, xsl:document in XSLT). It's easy to define syntax for this in XSLT of course, much harder in XQuery.
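To make the problem concrete, here is a minimal Python sketch of the literal serialize-and-reparse reading and of the suggested rule. It is an illustration under stated assumptions, not how any XSLT or XQuery processor is implemented: the `Doc` class, the function names and the example.com URI are all hypothetical, and `xml.etree.ElementTree` merely stands in for a serializer and parser.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


@dataclass
class Doc:
    """Hypothetical stand-in for an XDM document node (not a real XDM API)."""
    root: ET.Element
    base_uri: Optional[str]  # None models an undefined base URI


def serialize_and_reparse(doc: Doc) -> Doc:
    """Follow the steps literally: serialize, then reparse from a string."""
    text = ET.tostring(doc.root, encoding="unicode")
    # A string has no location, so the reparsed document has no base URI.
    return Doc(root=ET.fromstring(text), base_uri=None)


def serialize_and_reparse_keeping_base(doc: Doc) -> Doc:
    """Same round trip, plus the suggested rule: copy the base URI across."""
    reparsed = serialize_and_reparse(doc)
    return Doc(root=reparsed.root, base_uri=doc.base_uri)


def resolve(doc: Doc, relative: str) -> Optional[str]:
    """Resolve a relative reference, or return None if no base URI is known."""
    if doc.base_uri is None:
        return None
    return urljoin(doc.base_uri, relative)


# Assumed example document and location, for illustration only.
source = Doc(ET.fromstring('<book><img href="cover.png"/></book>'),
             base_uri="http://example.com/books/one.xml")
href = source.root.find("img").get("href")

literal = serialize_and_reparse(source)
fixed = serialize_and_reparse_keeping_base(source)
```

With the literal round trip, `resolve(literal, href)` has nothing to resolve against, whereas `resolve(fixed, href)` resolves relative to the original document location.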
I see that section 19.2.1.3 of XSLT 2.0[1] defines validation in terms of serialization, so I think this is a valid bug against XSLT. However, section 3.13 of XQuery[2] seems to skip serialization, and uses the mapping of XDM to Infoset directly. I think the base URI will be preserved for XQuery through the entire trip from XDM to Infoset, to PSVI augmentations and back to XDM again. My apologies if I've missed something.

[1] http://www.w3.org/TR/xslt20/#validation-process
[2] http://www.w3.org/TR/xquery/#id-validate
It's true that XQuery doesn't mention serialization and reparsing directly, but it refers to XDM, presumably section 4, which does. Although rereading this brief section 4, I see that it actually claims to describe two mappings, only one of which uses serialization and reparsing. I imagine that the other is the one described in Appendix K, though that isn't referenced explicitly (and it would be difficult to do so, since it is non-normative). But the mapping in Appendix K does retain the base URI of an element node.
Interesting. I had missed section 4 of XDM, titled "Infoset Mapping" and fixed my sight on sections 6.1.5, 6.2.5, 6.3.5, 6.4.5, 6.5.5, 6.6.5 and 6.7.5 of XDM, each of which is also titled "Infoset Mapping". I took the reference in XQuery 1.0 to be a direct reference to those sections of XDM (summarized non-normatively in appendix K), rather than a reference to section 4 of XDM.
In XQuery, this is quite possibly not broken. I started here:

http://www.w3.org/TR/xquery-11/#id-validate

The first step is to convert the operand node to an Information Set using the rules found in the Data Model. For elements, this rule preserves the Base URI:

<snip from="http://www.w3.org/TR/xpath-datamodel/#const-infoset-element">
Element Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property. Note that the base URI property is always an absolute URI (if an absolute URI can be computed) though it may contain Unicode characters that are not allowed in URIs. These characters, if they occur, are present in the base-uri property and will have to be encoded and escaped by the application to obtain a URI suitable for retrieval, if retrieval is required.
</snip>

After that, the resulting Infoset is validated as per XML Schema. This does not lose the Base URIs from the Infoset. So I don't think there is a real problem here. Did I get that wrong?

Jonathan
(In reply to comment #4)
> <snip from="http://www.w3.org/TR/xpath-datamodel/#const-infoset-element">
> Element Node properties are derived from the infoset as follows:
>
> base-uri
>
> The value of the [base URI] property. Note that the base URI property is
> always an absolute URI (if an absolute URI can be computed) though it may
> contain Unicode characters that are not allowed in URIs. These characters, if
> they occur, are present in the base-uri property and will have to be encoded
> and escaped by the application to obtain a URI suitable for retrieval, if
> retrieval is required.
>

Oh bother, wrong snip. What I meant was this:

<snip from="http://www.w3.org/TR/xpath-datamodel/#infoset-mapping-element">
[base URI]
The value of dm:base-uri.
</snip>

But the upshot is the same, the URI is there in the mapped Infoset Elements, and it is preserved when these elements are validated.

Jonathan
(In reply to comment #3)
> Interesting. I had missed section 4 of XDM, titled "Infoset Mapping" and fixed
> my sight on sections 6.1.5, 6.2.5, 6.3.5, 6.4.5, 6.5.5, 6.6.5 and 6.7.5 of XDM,
> each of which is also titled "Infoset Mapping". I took the reference in XQuery
> 1.0 to be a direct reference to those sections of XDM (summarized
> non-normatively in appendix K), rather than a reference to section 4 of XDM.

Henry,

First off, apologies for not reading your remarks before I responded to Mike Kay.

I take Section 4 of XDM to be a forward reference to sections 6.1.5, 6.2.5, 6.3.5, 6.4.5, 6.5.5, 6.6.5 and 6.7.5. Perhaps it should be more explicit about this, but I think the information is there, and I agree with you that the Base URI is preserved.

Jonathan
Reclassified as XSLT because the only outstanding parts of the problem are in XSLT.
I propose adding after the numbered list of steps in XSLT 2.0 section 19.2.1.3 the paragraph:

<add>
The above process must be done in such a way that the base URI property of every node in the resulting XDM tree is the same as the base URI property of the corresponding node in the input tree.

Note: As an alternative to steps 1 and 2, the XDM tree may be converted to an Infoset directly, using the mapping rules given for each kind of node in [Data Model](Section 6).
</add>
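The "every node ... corresponding node" requirement in the proposed text can be sketched as a parallel walk over the input and result trees. This is only an illustration: `Node` is a hypothetical, heavily simplified model of an XDM element tree, the URIs are invented, and it assumes that nodes correspond by position, which the proposed paragraph does not itself spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical simplified XDM node: a name, a base URI, and children."""
    name: str
    base_uri: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def copy_base_uris(source: Node, result: Node) -> None:
    """Copy base_uri from each source node to the corresponding result node.

    Assumes validation kept the tree shape, so nodes correspond by position.
    """
    if source.name != result.name or len(source.children) != len(result.children):
        raise ValueError("trees do not correspond node for node")
    result.base_uri = source.base_uri
    for src_child, res_child in zip(source.children, result.children):
        copy_base_uris(src_child, res_child)


# Input tree: the chapter has its own base URI (as xml:base could cause).
inp = Node("doc", "http://example.com/a/doc.xml",
           [Node("chapter", "http://example.com/b/ch1.xml")])
# Result tree after a round trip that lost every base URI.
out = Node("doc", None, [Node("chapter", None)])
copy_base_uris(inp, out)
```

Copying per node rather than only at the document node matters in this sketch because a descendant may carry a base URI different from that of its document.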
The proposal in comment #8 was accepted at the WG telcon on 12 Nov 2009 and will appear as erratum XT.E38 (drafted).