This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3486 - [XQuery] relative base URI
Summary: [XQuery] relative base URI
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XPath 2.0
Version: Candidate Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Don Chamberlin
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-07-19 15:51 UTC by David Carlisle
Modified: 2006-09-27 09:54 UTC

Description David Carlisle 2006-07-19 15:51:30 UTC
XQuery defines the term "Base URI" to be an absolute URI, both in the definition of the static context

http://www.w3.org/TR/xquery/#dt-base-uri

and in the glossary

http://www.w3.org/TR/xquery/#GLdt-base-uri

However no error condition or code is described if a non-absolute URI is given as the URILiteral in a Base URI Declaration.

Several other parts of XQuery appear to assume that the base URI is absolute,
e.g. the doc() function

http://www.w3.org/TR/xpath-functions/#func-doc

says ... it is resolved relative to the value of the base URI property from the static context. The resulting absolute URI ...

In the analogous situation in XSLT which uses xml:base to specify a non-default base URI, it is legal to specify a relative URI but it's resolved against the
existing (absolute) URI to ensure that the base URI in the static context is always absolute (or an error is raised if an absolute URI can not be determined)

This affects several XQuery Test Suite tests, including at least base-URI-11 12,13,15,17,18,19,20,21,23,24.

-11, for example, expects fn:static-base-uri() to return the relative URI "abc123",
yet fn:static-base-uri is defined to return a Base URI, and Base URIs are defined to be absolute.


I would propose that XQuery follow XSLT here and allow the declaration to specify a relative URI but ensure that the base URI in the static context is always absolute.  (Apart from anything else, the resolution of a relative URI against a non-absolute base URI is, as far as I can see, not defined anywhere; the URI RFCs only define the case of an absolute base.)

This would mean that the output of base-URI-11.xq as currently written would be highly system dependent.

the output would be the string
..../abc123
where .... is the initial base URI of the Query text (which may typically be a file:// URI reflecting the location of the query on the filesystem)
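
To illustrate the system dependence described above, here is a minimal sketch in Python, using the standard library's urljoin as a stand-in for RFC-style reference resolution. The two query locations and the helper name resolved_static_base are hypothetical, chosen only for illustration; they do not come from the test suite.

```python
from urllib.parse import urljoin

def resolved_static_base(query_location: str, uri_literal: str) -> str:
    """Resolve a declared URILiteral against the location of the query text."""
    return urljoin(query_location, uri_literal)

# Two hypothetical places where the same test query might be installed.
locations = (
    "file:///C:/xqts/Queries/base-URI-11.xq",
    "http://example.org/tests/base-URI-11.xq",
)

for location in locations:
    # The literal "abc123" is the one used by base-URI-11.
    print(resolved_static_base(location, "abc123"))
```

Under these assumptions the same declaration yields a different absolute URI on each system, which is why a fixed expected result would no longer be portable.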

David
Comment 1 David Carlisle 2006-07-19 15:56:05 UTC
Oh I should add that my statement "no error condition or code is described" makes some assumptions about the outcome of bug #3485.
Comment 2 Michael Kay 2006-07-19 17:13:47 UTC
See also bug #3415.

I believe that the base URI property of a node must always be an absolute URI (RFC 2396 sort-of-implies this when it says: <quote>The term "relative URI" implies that there exists some absolute "base URI"...</quote>).

However, the value of xml:base can be a relative URI. Section 4.3 of XML Base explains how it is resolved.

XQuery ought to say (a) that in the case where the query is a resource retrieved using a URI, that URI is the "default base URI" of the query; (b) if the prolog declares a base URI and it is relative, then it is resolved against the default base URI. If there isn't a default base URI then it should be an error.

Michael Kay (personal response) 
Comment 3 Michael Kay 2006-08-16 15:42:35 UTC
Here's a detailed proposal (requested by action A-305-08).

Part 1: the static base URI
===========================

In 4.5, delete "overriding any implementation-defined default".

Add at the end of the section: "In the terminology of RFC 3986 (section 5.1), the URILiteral of the base URI declaration is considered to be a "base URI embedded in content". If no base URI declaration is present, the base URI in the static context should be established according to the principles outlined in RFC 3986 section 5.1: that is, it should default first to the base URI of the encapsulating entity, then to the URI used to retrieve the entity, and finally to an implementation-defined default. If the URILiteral in the base URI declaration is a relative URI, then it is made absolute by resolving it with respect to this same hierarchy: for example if the URILiteral is "../data/", and the query is contained in a file whose URI is "file:///C:/temp/queries/query.xq", then the absolute base URI should be taken as "file:///C:/temp/data/".

It is not intrinsically an error if this process fails to establish an absolute base URI; however, the static base URI in the static context is then undefined, and any attempt to use its value may result in an error."
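
The fallback hierarchy in Part 1 can be sketched roughly as follows. This is an illustrative Python model, not text from any specification: the function and parameter names are assumptions, "has a scheme" is used as a simplified test for "absolute", and None stands for an undefined base URI.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

def is_absolute(uri: Optional[str]) -> bool:
    """Simplification: treat a URI as absolute when it carries a scheme."""
    return bool(uri) and bool(urlsplit(uri).scheme)

def establish_static_base_uri(
    declared: Optional[str],       # URILiteral of the base URI declaration, if any
    encapsulating: Optional[str],  # base URI of the encapsulating entity
    retrieval: Optional[str],      # URI used to retrieve the query
    impl_default: Optional[str],   # implementation-defined default
) -> Optional[str]:
    """Return the absolute static base URI, or None if it stays undefined."""
    # First absolute URI in the hierarchy: encapsulating entity,
    # then retrieval URI, then implementation-defined default.
    candidates = (encapsulating, retrieval, impl_default)
    fallback = next((u for u in candidates if is_absolute(u)), None)
    if declared is None:
        return fallback
    if is_absolute(declared):
        return declared
    if fallback is None:
        return None  # not an error in itself; the property is just undefined
    return urljoin(fallback, declared)

# The example from the proposal: "../data/" declared in query.xq.
print(establish_static_base_uri(
    "../data/", None, "file:///C:/temp/queries/query.xq", None))
```

With the proposal's own example this prints file:///C:/temp/data/, and with no usable fallback the result is None rather than an immediate failure.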

Part 2: xml:base
================

In 3.7.1.3 5.b.i, change "the value of the constructed node's attribute named xml:base, if this attribute exists;" to "if the constructed node has an attribute named xml:base, then the value of this attribute, resolved if it is relative against the base URI in the static context".

In 3.7.3.1, 2nd list, 4.b.i, make the same change.

(END)
Comment 4 David Carlisle 2006-08-16 16:21:19 UTC
Thanks for the reply. A couple of points:

the base uri declaration text presumably implies a resolution of bug #3485 that says it's not an error to use a non-absolute URI literal.

The resolution of xml:base is against the static context. I now realise that
the XQuery spec is not very clear about how xml:base attributes are supposed to work.

In 3.7.1 (direct element constructors)
there are two cases, xml:base="...." direct attribute constructors, and
attribute {concat('x','ml:base')}{...} computed attribute constructors.


The base-uri is specified in terms of "the constructed node's attribute named xml:base" in item 5.b. Perhaps this just applies to direct attribute constructors, as attributes coming from computed attribute constructors in the content are not attached to the element until step 5.d.

However, in 3.7.3.1 (computed element constructors) there is no possibility of direct attribute syntax, just computed (or copied) attributes in the content sequence. The same text appears there too: again base-uri is assigned in step b, but attributes are not attached until step d.



So I suggest that in both cases base-uri assignment is moved after attribute assignment, and that it is highlighted more strongly that resolution is against the base URI in the static context rather than the (dynamic property)
base URI of the element to which this element is being attached.

In the case of direct attribute constructors xml:base="..." there would be the possibility (similar to the namespace case) that this also affects the base URI in the static context within the scope of the element constructor. I'm not sure
that (today) I have a strong feeling whether or not it should affect the static context in this way, but it would be helpful if the spec said explicitly whether it did or did not.

David
 



To be more explicit, what's the expected answer to this:


declare base-uri "http://a/b/c";

element a {
  attribute xml:base {"../x"},
  element b {
    attribute xml:base {"y"}
  }
}/b/base-uri()
Comment 5 Michael Kay 2006-08-16 17:29:42 UTC
Firstly, there's no implication anywhere in the spec that xml:base in a direct element constructor should affect the static context of the query. I agree that because XQuery looks a bit like XML, some people might assume that it behaves like XML, so it wouldn't do any harm to make this explicit. But I think it's right that it shouldn't affect the static base URI.

I agree that in both 3.7.1.3 and 3.7.3.1 it would be tidier if computing the base URI of the new element were done after computing its attributes property, that is, move rule (b) to somewhere after rule (d).

I don't think any other change is needed: my proposed wording makes it clear, I think, how a relative base URI is resolved.
Comment 6 Michael Rys 2006-08-16 18:40:20 UTC
None of the xml attributes affect the static context during construction.

Best regards
Michael
Comment 7 David Carlisle 2006-08-16 20:07:27 UTC
(In reply to comment #6)
> None of the xml attributes affect the static context during construction.

Yes, I think that's (probably:-) the right design, and anyway I don't suggest changing it, but I do think that in this case it wouldn't harm to say it explicitly in the case of xml:base.
<a xml:base="http://a/b">
  <b xml:base="c"/>
</a>

is legal syntax both as XML and XQuery, but if the nodes are generated by an
XML parser that supports xml base, then the base uri of <b/> is
http://a/c but if the nodes are generated by an XQuery processor then
(according to comment #3) the base uri of <b/> will be the absolute URI resulting from resolving "c" against the base URI of the static context of the Query.
This is justifiable, but I think it wouldn't harm if the spec explicitly gave such an example, as it may surprise people used to the way xml base works in XML (and in XSLT2).

So am I right that comment #3 implies that the answer to the question in comment #4 is

http://a/b/y

(I just want confirmation that I understand the proposal before agreeing to it:-)
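
The two readings being compared for the query in comment #4 can be worked through with a short Python sketch. It uses the standard library's urljoin as a stand-in for URI resolution; the variable names are hypothetical, and the sketch only computes the two candidate values without claiming which one a conforming processor must return.

```python
from urllib.parse import urljoin

STATIC_BASE = "http://a/b/c"  # from: declare base-uri "http://a/b/c";

# Reading of comment #3: the xml:base of <b> is resolved directly
# against the base URI in the static context.
b_static = urljoin(STATIC_BASE, "y")

# XML Base style nesting: "../x" on <a> is resolved first, and the
# xml:base of <b> is then resolved against that result.
a_nested = urljoin(STATIC_BASE, "../x")
b_nested = urljoin(a_nested, "y")

print(b_static)  # resolution against the static context
print(b_nested)  # resolution against the enclosing element
```

The first value is http://a/b/y, matching the answer suggested above, while nested resolution in the XML Base style gives http://a/y.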


David
Comment 8 Michael Kay 2006-08-17 11:22:49 UTC
Considering this query (Q1):

<a xml:base="http://a/b">
  <b xml:base="c"/>
</a>

I was actually under the impression that it behaved the same as this (Q2):

<a xml:base="http://a/b">
  {<b xml:base="c"/>}
</a>

but a close reading of the text shows that it doesn't. In particular, looking at section 3.7.1.3, Q1 is covered by rule 1.d, whereas Q2 is covered by rule 1.e, and in particular rule 1.e.ii.E. The effect of rule 1.e.ii.E is that the final base URI of element <b> in Q2 is http://a/c, which I think is the correct result. I believe this should be the result for Q1 also, and propose to fix this by adding a 3rd part to my proposal:

Part 3: base-uri in an enclosed direct element constructor 
==========================================================

Add to rule 1.d of 3.7.1.3 the sentence: "The base-uri property is set to be the same as that of its new parent, unless it (the child node) has an xml:base attribute, in which case its base-uri property is set to the value of that attribute, resolved (if it is relative) against the base-uri property of the new parent node."

(There's scope for editorial improvement here).
Comment 9 Michael Kay 2006-08-17 12:27:34 UTC
Unfortunately the proposal in comment #8 doesn't do the right thing for

<a xml:base="http://a/b">
  <b xml:base="c">
    <c/>
  </b>
</a>

in that the c element ends up with a different base URI from the b element. Here's another attempt:

Part 3: base-uri in an enclosed direct element constructor 
==========================================================

Add to rule 1.d of 3.7.1.3 the sentence: "The base-uri property of the resulting node, and of each of its descendants, is set to be the same as that of its new parent, unless it (the child node) has an xml:base attribute, in which case its base-uri property is set to the value of that attribute, resolved (if it is relative) against the base-uri property of its new parent node."
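
One way to read the revised rule is as a top-down pass over the constructed tree. The Python sketch below models that reading under stated assumptions: the Element class, the assign_base_uri function and the starting static base URI are all hypothetical, and urljoin stands in for URI resolution.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urljoin

@dataclass
class Element:
    """Toy element node: a name, an optional xml:base value, and children."""
    name: str
    xml_base: Optional[str] = None
    children: List["Element"] = field(default_factory=list)
    base_uri: Optional[str] = None

def assign_base_uri(node: Element, parent_base: str) -> None:
    """Inherit the parent's base URI unless the node has its own xml:base."""
    if node.xml_base is not None:
        # Resolve xml:base (if relative) against the new parent's base URI.
        node.base_uri = urljoin(parent_base, node.xml_base)
    else:
        node.base_uri = parent_base
    for child in node.children:
        assign_base_uri(child, node.base_uri)

# The three-level example from this comment.
c = Element("c")
b = Element("b", xml_base="c", children=[c])
a = Element("a", xml_base="http://a/b", children=[b])
assign_base_uri(a, "http://example.org/query.xq")  # hypothetical static base URI

for node in (a, b, c):
    print(node.name, node.base_uri)
```

In this model the b and c elements both end up with http://a/c, which is the behaviour the revised wording is aiming for.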
Comment 10 Frans Englich 2006-09-04 18:42:44 UTC
Just mentioning that I've also independently run into the problem of relative xml:base attributes in direct and computed constructors, which the proposed solution "Part 2: xml:base" in comment #3 solves in my opinion as well.

(The XQTS is missing tests for these scenarios: relative xml:base attributes in "top" constructors. Tests for the differences compared to XML/XSL-T, as David describes, are needed as well.)


Frans
Comment 11 Don Chamberlin 2006-09-12 23:56:05 UTC
David,
In a meeting on Aug. 25, 2006, the Query Working Group decided to resolve this issue by accepting the proposal of Michael Kay, documented in Comment #3 (Parts 1 and 2) and Comment #9 (Part 3). These changes will appear in the next version of the XQuery specification. If you are satisfied with this resolution, please change the status of this Bugzilla entry to "Closed".
Regards,
Don Chamberlin (for the Query Working Group)
Comment 12 Frans Englich 2006-09-13 10:12:20 UTC
But a relative base URI is only an error when the base URI declaration is relative. If the base URI is initialized (by the environment) to be relative, it's accepted, and that makes the "The resulting absolute URI Reference" claim for fn:doc (and any other assumption of an absolute base URI) not hold.

Wouldn't it be easier if it just simply was an error if the base URI is relative? It would cover all cases, as I see it. If not, the specs must be edited to not assume the base URI is absolute, as I see it.
Comment 13 David Carlisle 2006-09-13 12:30:48 UTC
(In reply to comment #12)
> But a relative base URI is only an error when the base URI declaration is
> relative. 

I'm not sure what you mean by that. The base-uri in the static context should always be absolute (it should be an error, or undefined, if the system defines it otherwise). However, even if the base URI is absolute, specifying a relative URI (in any context) may generate an error if the base URI is not hierarchical: data:,foo, for example, is an absolute URI that ought to generate an error
if a relative URI is used pretty much anywhere.

> If the base URI is initialized (by the environment) to be relative,
> it's accepted 

that I think should be an error. As defined in the specs, the base uri property is an _absolute_ uri (see the references in comment #0). The clarification here is to allow the uriliteral to be relative, not to allow the underlying property to be relative.

> and that make the "The resulting absolute URI Reference" claim
> for fn:doc(and any other assumption on absolute base URI) not hold.

As I note above it may be that an attempt to form an absolute uri from a relative uri reference will fail even if the base uri is absolute.
comment #3 says that this "is an error" (although it doesn't assign a code to it)

> 
> Wouldn't it be easier if it just simply was an error if the base URI is
> relative? 
I think that it has to be the case that it is an error if the base uri is relative, but that can only happen if it is set from the external environment.
A relative uri literal should be allowed (and is allowed by the WG's proposed resolution of this entry) but would never result in a relative URI being the base uri in the static context.

> It would cover all cases, as I see it. If not, the specs must be
> edited to not assume the base URI is absolute, as I see it.
> 
No, the specs should enforce that the base uri is absolute, but allow relative uri literals.
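
The position argued in this comment (relative literals are fine, but the base they are resolved against must be absolute and hierarchical) can be sketched as a strict resolver. This is an illustrative Python model only: BaseUriError and strict_resolve are hypothetical names, no error code from the specifications is implied, and the hierarchy test is a simplification.

```python
from urllib.parse import urljoin, urlsplit

class BaseUriError(ValueError):
    """Hypothetical error type; the thread assigns no error code here."""

def strict_resolve(base: str, reference: str) -> str:
    """Resolve reference against base, rejecting unusable base URIs."""
    if urlsplit(reference).scheme:
        return reference  # already absolute, no base needed
    parts = urlsplit(base)
    if not parts.scheme:
        raise BaseUriError(f"base URI is not absolute: {base!r}")
    # Simplified test for a hierarchical URI: an authority or a rooted path.
    if not (parts.netloc or parts.path.startswith("/")):
        raise BaseUriError(f"base URI is not hierarchical: {base!r}")
    return urljoin(base, reference)

print(strict_resolve("http://a/b/c", "y"))

for bad_base in ("abc123", "data:,foo"):
    try:
        strict_resolve(bad_base, "y")
    except BaseUriError as err:
        print("rejected:", err)
```

Note that Python's own urljoin would quietly return the reference unchanged for a data: base, so the explicit checks are what make this sketch fail loudly.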

David
Comment 14 Frans Englich 2006-09-25 16:20:24 UTC
Yes, David and I seem to be in agreement on the current state (comments #12 and #13). Summarized:

   It is currently possible to make the static base
   URI relative by setting it so from the external
   environment. This invalidates all assumptions (references in #0)
   that the base URI is absolute.

Therefore, there are two possible approaches for this: 1) Either re-write all relevant sections to not assume the base URI is absolute; or 2) Make it an error if the base URI is not absolute.

I think the latter is the correct solution. Proposed fix:

In 2.1.1 Static Context (XPath20/XQuery10) change:

"[Definition: Base URI. This is an absolute URI, used when necessary in the resolution of relative URIs (for example, by the fn:resolve-uri function.)] The URI value is whitespace normalized according to the rules for the xs:anyURI type in [XML Schema]."

by appending a sentence, such that it becomes:

"[Definition: Base URI. This is an absolute URI, used when necessary in the resolution of relative URIs (for example, by the fn:resolve-uri function.)] The URI value is whitespace normalized according to the rules for the xs:anyURI type in [XML Schema]. It is a static error if the URI is not absolute[new XPST* error code]."

This can affect XSL-T, I'm not the one to tell.
Comment 15 Frans Englich 2006-09-25 16:49:48 UTC
I think it's clearer with "It is a static error if the URI is relative [new XPST* error code]." ('relative' instead of 'not absolute').
Comment 16 Michael Kay 2006-09-25 17:24:35 UTC
>This can affect XSL-T, I'm not the one to tell.

In XSLT the static base URI is the base URI of a node in the stylesheet document, so if the base URI of a node is always absolute then we don't have a problem.

Michael Kay
Comment 17 Don Chamberlin 2006-09-26 17:12:53 UTC
On Sept. 26, 2006, the Query and XSLT working groups considered this issue again. 

We believe that Section 2.1.1, which defines the static context, is not the correct place to introduce error codes. This section currently defines many static context items, but has no error codes. By definition, the "base URI" item in the static context is an absolute URI. It cannot be anything else. The static context does not specify an error code that applies if "statically known documents" has the value 47, or if "ordering mode" has the value "Penguin". These values simply do not correspond to the definitions of their items.

The correct place to introduce error codes is in connection with language constructs that might cause an error. In the case of Base URI, this language construct is the Base URI Declaration, described in XQuery Section 4.5. This section describes in detail how the "base URI" property of the static context is determined. Here is a quote from this section (in an internal working draft that has not yet been published):

"In the terminology of [RFC3986] Section 5.1, the URILiteral of the base URI declaration is considered to be a 'base URI embedded in content'. If no base URI declaration is present, the base URI in the static context is established according to the principles outlined in [RFC3986] Section 5.1: that is, it defaults first to the base URI of the encapsulating entity, then to the URI used to retrieve the entity, and finally to an implementation-defined default. If the URILiteral in the base URI declaration is a relative URI, then it is made absolute by resolving it with respect to this same hierarchy. For example, if the URILiteral in the base URI declaration is ../data/, and the query is contained in a file whose URI is file:///C:/temp/queries/query.xq, then the base URI in the static context is file:///C:/temp/data/. It is not intrinsically an error if this process fails to establish an absolute base URI; however, the base URI in the static context is then undefined, and any attempt to use its value may result in an error [err:XPST0001]."

The working groups believe that this paragraph defines the error behavior associated with a relative base URI, in the correct context, with a specific error code.
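
The "undefined until used" behaviour in the quoted paragraph could be modelled along these lines. This is a hedged Python sketch, not an account of any real processor: XQueryError and StaticContext are hypothetical names, and where the quoted text says an attempt to use the value "may" result in an error, this model always raises.

```python
from typing import Optional

class XQueryError(Exception):
    """Hypothetical exception carrying an XQuery error code."""
    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code

class StaticContext:
    """Toy static context holding only the base URI property."""
    def __init__(self, base_uri: Optional[str]) -> None:
        # None models a base URI that could not be established.
        self._base_uri = base_uri

    @property
    def base_uri(self) -> str:
        # Building the context never fails; only using the value can.
        if self._base_uri is None:
            raise XQueryError(
                "XPST0001", "base URI in the static context is undefined")
        return self._base_uri

ctx = StaticContext(None)  # no error yet
try:
    ctx.base_uri
except XQueryError as err:
    print(err.code)
```

The design point is that constructing the context with no base URI succeeds, and the error surfaces only where a language construct actually needs the value.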

It may be noted that this paragraph appears only in the XQuery specification and not in the XPath specification. This is because it is the responsibility of the host language (XQuery in this case) to specify how the base URI property of the static context is determined. For the XPath specification, it is sufficient to state in the definition of the base URI property that it is an absolute URI.

Frans, if you agree with the working groups' analysis of this bug, will you please change its status to "Closed"?

Regards,
Don Chamberlin (for the Query and XSLT working groups)
Comment 18 David Carlisle 2006-09-27 08:43:12 UTC
Don, the quoted passage looks good to me, thanks.

David
Comment 19 Frans Englich 2006-09-27 09:54:47 UTC
Yes, I as well think this nails the last hole. Thanks for the thorough rationale, it is surely appreciated when not being able to participate in the calls.

(I was aware that the Context sections specify no errors, and the part of my proposal that did introduce one bothered me. But context properties are "global", and when the base URI is set from the external environment, it's not via the base URI declaration; that's how the error ended up there when I wrote it.)