This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
H.2 of the data model says (for the dm:base-uri accessor on element nodes):

"Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns the empty sequence."

I think this makes it unclear whether we are modelling base URI as if it were a property physically stored with every node, or as something that is computed on demand. Most of the time we treat it as if it were a stored property, whose value is always an absolute URI. However, we should be sympathetic to implementations that actually compute it on demand, since there is otherwise a great deal of redundancy.

I'll ignore the apparent distinction between a non-existent property and an empty property. The question is, should we allow this inheritance algorithm to be invoked at retrieval time, given that all the well-known ways of constructing an element node actually compute the value of the base-uri property at the time the element is created? For example, when we construct from an Infoset, we take the base-uri property of an element from the [base URI] property of the element information item, and the Infoset says that this property is computed according to the rules in XML Base.

I think that the rule in XDM that makes the accessor walk up the tree is wrong. Real implementations might behave this way, but we should model it as if base URI is a stored property, with any inheritance being done at node construction time.

On the same subject, I believe that the rules, certainly for construction of XDM from an Infoset, have the consequences that (a) the base-uri property, if it exists, is always an absolute URI, and (b) the computation of the base-uri property from the actual value of the xml:base attribute includes percent-escaping of special characters. It would be useful to make this clear, since it otherwise requires a complex paper-chase to discover this information.
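To make the two readings concrete, here is a minimal Python sketch. It is not taken from any spec or implementation: the names OnDemandElement, StoredElement and construct are hypothetical, and urljoin is used only as a stand-in for XML Base resolution. OnDemandElement follows the accessor text quoted from H.2 and walks up the tree at retrieval time; StoredElement assumes inheritance was done once, when the node was constructed.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin


@dataclass
class OnDemandElement:
    """Reading 1: inheritance happens when the accessor is called."""
    name: str
    base_uri_property: Optional[str] = None  # may be absent (None) or empty
    parent: Optional["OnDemandElement"] = None

    def base_uri(self) -> Optional[str]:
        # Own property if it exists and is not empty, else the parent's
        # dm:base-uri, else the empty sequence (modelled here as None).
        if self.base_uri_property:
            return self.base_uri_property
        if self.parent is not None:
            return self.parent.base_uri()
        return None


@dataclass
class StoredElement:
    """Reading 2: the property is fixed at construction time."""
    name: str
    base_uri_property: Optional[str]
    parent: Optional["StoredElement"] = None

    def base_uri(self) -> Optional[str]:
        # No tree walk: just return the stored value.
        return self.base_uri_property


def construct(name, parent=None, xml_base=None, document_uri=None):
    """Hypothetical constructor that resolves the base URI once, up front."""
    inherited = parent.base_uri_property if parent is not None else document_uri
    if xml_base is None:
        resolved = inherited
    elif inherited is None:
        resolved = xml_base  # nothing to resolve against
    else:
        resolved = urljoin(inherited, xml_base)  # stand-in for XML Base rules
    return StoredElement(name, resolved, parent)
```

Under the second reading, an element with no xml:base of its own still carries a stored value copied from its parent, so the accessor never needs the "otherwise, if the element has a parent" branch.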
Concerning (b) in the last paragraph of comment #1, I now see that the Infoset states, under the unnumbered heading "Base URIs" in section 1, that the value of the property "... does not reflect any URI escaping that may be required for retrieval of the resource". So perhaps my theory that base URIs must be percent-escaped is wrong after all. Either way, it would be good to have clarification.
Proposal for resolving bug #3415

Change the description of dm:base-uri in 6.1.2, 6.2.2, and 6.5.2 to read:

dm:base-uri
Returns the value of the base-uri property.

N.B. leave the description of dm:base-uri unchanged in 6.3.2, 6.6.2, and 6.7.2.

Change the description of base-uri in 6.1.3, 6.2.3, and 6.5.3 to read:

base-uri
The value of the [base URI] property. Note that the base URI property is always an absolute URI (if an absolute URI can be computed) though it may contain Unicode characters that are not allowed in URIs. Any such characters must be encoded and escaped to obtain a URI suitable for retrieval, if retrieval is required.

By my reading of XML Base and the Infoset spec, the [base URI] property is always absolute, except for the case where no absolute URI can be computed.
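As an illustration of the last sentence of the proposed base-uri text, the sketch below keeps the stored property untouched and only produces an escaped form when a URI is needed for retrieval. The function name uri_for_retrieval and the example URI are hypothetical, and the set of characters left unescaped is an assumption made for the sketch (URI delimiters, unreserved characters, and "%" so that existing escapes are not doubled), not wording from XML Base or the Infoset.

```python
from urllib.parse import quote

# Characters left as-is: URI delimiters, sub-delimiters and "%".
# Letters, digits and "-._~" are never escaped by quote().
_KEEP = "/:?#[]@!$&'()*+,;=%"


def uri_for_retrieval(base_uri: str) -> str:
    """Return an escaped copy of base_uri; the argument is not modified.

    Disallowed characters are UTF-8 encoded and percent-escaped.
    """
    return quote(base_uri, safe=_KEEP)


stored = "http://example.org/caf\u00e9/doc.xml"  # value of the base-uri property
escaped = uri_for_retrieval(stored)              # only computed if retrieval is required
```

The point of keeping the two values separate is that dm:base-uri would return `stored`, with the non-ASCII character intact, while `escaped` exists only at the moment a resource is fetched.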
Note to self: tweak this to make it clear that the escaping *isn't* performed.