This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 7239 - [XQTS] New test for typed value of potentially namespace-sensitive XDM nodes when copied with different in-scope namespaces.
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Data Model 3.0
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Norman Walsh
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2009-08-07 10:56 UTC by Oliver Hallam
Modified: 2010-02-18 18:05 UTC

Description Oliver Hallam 2009-08-07 10:56:11 UTC
Our XDM node representation consisted of storing the (static) schema type of a node (from validation, or copying) and the string value for simple typed elements.  The typed value of the node is computed from the string value with the schema type.  

I am not sure if it was an intention of the specifications to allow this representation.  The specification makes no claims that this should be equivalent, but for example the XQST0086 error seems to prevent the other cases where this representation would not be safe.  

In order to illustrate the problem, I propose that this case should be added to XQTS as a test.

First off we need to have a schema:

<xs:schema targetNamespace="http://www.xqsharp.com/test/namespace-sensitive"
           xmlns="http://www.xqsharp.com/test/namespace-sensitive"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:simpleType name="myString">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:simpleType name="union">
    <xs:union memberTypes="xs:QName myString" />
  </xs:simpleType>

  <xs:simpleType name="list">
    <xs:list itemType="union" />
  </xs:simpleType>

  <xs:element name="root" type="list" />
  
</xs:schema>


Next we need a source document validated against this schema:

<root xmlns="http://www.xqsharp.com/test/namespace-sensitive"
      xmlns:foo="http://www.example.org/foo/">
  foo:test bar:test
</root>


Finally we need the following query, which imports the document as $input-context:

declare construction preserve;
declare copy-namespaces no-preserve,inherit;
import schema namespace ns="http://www.xqsharp.com/test/namespace-sensitive";

declare variable $input-context external;

let $node := <e xmlns:bar="http://www.example.org/bar">{$input-context/ns:root}</e>
return data($node/ns:root)[2] instance of ns:myString


$node from the query looks something like this:

<e xmlns:bar="http://www.example.org/bar">
  <root xmlns:foo="http://www.example.org/foo/">
    foo:test bar:test
  </root>
</e>

Note that the schema type of the root element of the source document is a list of unions of QNames and myStrings, and the typed value is a QName followed by a myString.  The typed value of the root node should have been copied, and so should also be a QName followed by a myString.  Thus the query should return true.  

However, if the node representation stores just the schema type (a list of QNames and myStrings) and the value ("foo:test bar:test") then the typed value is computed as two QNames and the query returns false.
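
The difference can be sketched in a few lines of Python. This is only an illustrative model, not an XDM implementation: the function names are made up, and the QName test is deliberately simplified (a token counts as an xs:QName only if it has a prefix that is bound in the supplied namespace map; unprefixed names are not modeled).

```python
# Simplified model: derive a typed value from a string value, the
# list-of-union type (xs:QName | myString), and the in-scope namespaces.

def validate_union(token, in_scope):
    """Try xs:QName first, then fall back to myString (union member order)."""
    prefix, sep, local = token.partition(":")
    if sep and local and prefix in in_scope:
        return ("xs:QName", (in_scope[prefix], local))
    return ("myString", token)

def typed_value(string_value, in_scope):
    """Validate each whitespace-separated item of the list type."""
    return [validate_union(tok, in_scope) for tok in string_value.split()]

string_value = "foo:test bar:test"
source_ns = {"foo": "http://www.example.org/foo/"}
# After the copy, the node also inherits the bar binding from <e>.
copied_ns = dict(source_ns, bar="http://www.example.org/bar")

original = typed_value(string_value, source_ns)
rederived = typed_value(string_value, copied_ns)

print([kind for kind, _ in original])   # ['xs:QName', 'myString']
print([kind for kind, _ in rederived])  # ['xs:QName', 'xs:QName']
```

The string value and the type annotation are identical in both calls; only the namespace map differs, and that alone changes the second item from a myString to a QName.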
Comment 1 Michael Kay 2009-08-09 13:51:51 UTC
>Our XDM node representation consisted of storing the (static) schema type of a node (from validation, or copying) and the string value for simple typed elements.  The typed value of the node is computed from the string value with the schema type.  I am not sure if it was an intention of the specifications to allow this representation.

Yes it's definitely intended that that should be a possible implementation. It's the approach Saxon uses (though Saxon 9.2 by default caches the typed value as well, computing it lazily on first use).

>the XQST0086 error

You mean XQTY0086?

Isn't your suggested query illustrating exactly the situation that XQTY0086 describes? It seems to me this should be an error under all implementations.
Comment 2 Oliver Hallam 2009-08-10 09:40:54 UTC
I did indeed mean XQTY0086.

You are indeed right in saying that the error is raised in this case.

I also meant to have the following copy-namespaces declaration in the example:

declare copy-namespaces preserve,inherit;


Since copy-namespaces mode is preserve, and I am copying an element, the XQTY0086 error is not raised according to the rules defined in XQ.E1, and my description above stands as written.

If the example is changed to copy an element with type "union" whose value is "bar:test" then this example becomes even more problematic.  The type of the value of the node in the source document is myString, and so is certainly not namespace sensitive.  However in the context of the result document its value is a QName if only the schema type is preserved.  XQTY0086 as it is currently defined certainly does not apply to namespace-insensitive values.


I cannot see a simple way to rewrite the conditions for XQTY0086 without being too restrictive (that is, allowing only copy-namespaces mode preserve,no-inherit for nodes whose schema types can contain namespace-sensitive values when construction mode is preserve).


Should this bug be raised against the spec instead, or does this example indicate a flaw in this document representation?
Comment 3 Michael Kay 2009-09-22 20:11:21 UTC
(ACTION A-407-04)

Please ignore my previous comments. I don't think I studied the example carefully enough. I now see what you are getting at.

The thinking behind XQTY0086 was to ensure that when you copy namespace-sensitive content, you are obliged to retain all the in-scope namespaces, to ensure that the typed value remains valid. But your example demonstrates that we don't prevent the node acquiring new in-scope namespaces as a result of the copy, and although the new namespaces cannot make the value invalid, they can cause revalidation to produce a different typed value. Exactly the same problems apply to the equivalent XSLT error XTTE0950.
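
To make the "acquiring new in-scope namespaces" point concrete, here is a rough Python sketch of how the in-scope namespaces of a copied element depend on the copy-namespaces mode. It reflects a simplified reading of the XQuery rules (preserve keeps all of the node's bindings, no-preserve keeps only those used in its element and attribute names, inherit adds the new parent's bindings without overriding the node's own). The function name and parameters are invented for illustration.

```python
def in_scope_after_copy(node_ns, used_prefixes, parent_ns, preserve, inherit):
    """Return the prefix -> URI bindings of a copied element (simplified)."""
    if preserve:
        kept = dict(node_ns)
    else:
        # no-preserve: keep only bindings used in element/attribute names
        kept = {p: uri for p, uri in node_ns.items() if p in used_prefixes}
    if inherit:
        # the new parent's bindings are added; the node's own win on conflict
        return {**parent_ns, **kept}
    return kept

root_ns = {"": "http://www.xqsharp.com/test/namespace-sensitive",
           "foo": "http://www.example.org/foo/"}
e_ns = {"bar": "http://www.example.org/bar"}
used = {""}  # <root> itself uses only the default namespace

print(sorted(in_scope_after_copy(root_ns, used, e_ns, preserve=True, inherit=True)))
# ['', 'bar', 'foo']
print(sorted(in_scope_after_copy(root_ns, used, e_ns, preserve=False, inherit=True)))
# ['', 'bar']
print(sorted(in_scope_after_copy(root_ns, used, e_ns, preserve=True, inherit=False)))
# ['', 'foo']
```

With preserve,inherit nothing is lost, so the concern behind XQTY0086 is satisfied, but the copy still gains the bar binding. Only preserve,no-inherit leaves the namespace context exactly as it was in the source.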

So, yes, I think it's a spec problem. I don't think it's realistic to prevent the fragment acquiring new in-scope namespaces on a copy, so I think we have to come up with a rule that explains what happens.

We currently say in Data Model 3.3.1.3 "However, implementations are allowed some flexibility in how these properties are stored. An implementation may choose to store the string-value only and derive the typed-value from it, or to store the typed-value only and derive the string-value from it, or to store both the string-value and the typed-value."

We already know of cases where there are observable differences between products based on which of these strategies they adopt. Immediately after the quoted paragraph we give the example 

<offset xsi:type="xs:integer">0030</offset> 

and point out that the string value of this element may be returned as either "30" or "0030". 
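
As a minimal sketch of why the two storage strategies are observably different for this element (the class names are hypothetical, and Python's int stands in for xs:integer):

```python
class StringBacked:
    """Stores the string-value only; the typed-value is derived on demand."""
    def __init__(self, lexical):
        self._string = lexical

    def string_value(self):
        return self._string          # original lexical form survives

    def typed_value(self):
        return int(self._string)


class TypedBacked:
    """Stores the typed-value only; the string-value is derived on demand."""
    def __init__(self, lexical):
        self._value = int(lexical)

    def string_value(self):
        return str(self._value)      # canonical form, leading zeros are gone

    def typed_value(self):
        return self._value


a, b = StringBacked("0030"), TypedBacked("0030")
print(a.typed_value() == b.typed_value())   # True
print(a.string_value(), b.string_value())   # 0030 30
```

Both representations agree on the typed value and differ only in the string value, which is the difference the DM text already tolerates.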

We also have a rule in the DM spec that says the relationship between the string value and the typed value must be "consistent with schema validation": "The relationship between the type-name, typed-value, and string-value of an element|attribute node is consistent with XML Schema validation". This implies that the typed value must not be different from the value you would get if you revalidated. However, this rule fails to take into account the fact that the relationship between string value and typed value is also a function of the namespace context.

I think there are two options.

OPTION A. We respect the principle "implementations are allowed some flexibility in how these properties are stored", and recognize that this has the consequence that in cases where validation is namespace-sensitive, this may weaken the principle that the typed value is always the same as you would get by revalidating the string value against the type-name (also leading to differences between products in edge cases, though less severe than the differences we already tolerate).

OPTION B. We respect the principle "the relationship between the string value and the typed value is consistent with schema validation", with the consequence that where the typed value is namespace sensitive, implementations may need to recompute the typed value when data is copied and the namespace context changes.
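
The two behaviours at stake can be sketched as follows. This is an illustrative model only: the Element class, both copy functions, and the simplified validation rule (a prefixed token is an xs:QName if its prefix is bound, otherwise a myString) are assumptions, not anything taken from the specifications.

```python
from dataclasses import dataclass

def validate(string_value, in_scope):
    """Simplified list-of-union validation, returning the item type names."""
    kinds = []
    for token in string_value.split():
        prefix, sep, _ = token.partition(":")
        kinds.append("xs:QName" if sep and prefix in in_scope else "myString")
    return kinds

@dataclass
class Element:
    string_value: str
    in_scope: dict
    typed_value: list  # item types as computed when the node was validated

def copy_keeping_typed_value(node, new_in_scope):
    # Carry the stored typed value across the copy unchanged.
    return Element(node.string_value, new_in_scope, list(node.typed_value))

def copy_revalidating(node, new_in_scope):
    # Recompute the typed value in the namespace context of the copy.
    return Element(node.string_value, new_in_scope,
                   validate(node.string_value, new_in_scope))

source_ns = {"foo": "http://www.example.org/foo/"}
copied_ns = dict(source_ns, bar="http://www.example.org/bar")
source = Element("foo:test bar:test", source_ns,
                 validate("foo:test bar:test", source_ns))

print(copy_keeping_typed_value(source, copied_ns).typed_value)
# ['xs:QName', 'myString']
print(copy_revalidating(source, copied_ns).typed_value)
# ['xs:QName', 'xs:QName']
```

Under option A either result would be acceptable, depending on what the implementation stores. Under option B every implementation would have to behave like copy_revalidating whenever the namespace context changes.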

Because this case is likely to be so rare in practice, my recommendation is to choose option A, which probably means that no implementation needs to change to accommodate this case. I think we can achieve this by adding a description of this case to section 3.3.1.3 in DM, and perhaps a caveat where we describe the typed value of element and attribute nodes as being "consistent with schema validation".
Comment 4 Michael Kay 2009-10-30 14:34:52 UTC
Reclassify this as a bug against the Data Model specification.
Comment 5 Norman Walsh 2010-02-18 17:02:12 UTC
At the 18 Feb 2010 telcon we resolved to adopt option A.