Bug 19504 - [QT3TS] instanceof139
[QT3TS] instanceof139
Status: CLOSED FIXED
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XPath 3.0
Working drafts
PC Windows NT
: P2 normal
: ---
Assigned To: Jonathan Robie
Mailing list for public feedback on specs from XSL and XML Query WGs
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-10-12 10:32 UTC by Tim Mills
Modified: 2012-12-21 07:58 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tim Mills 2012-10-12 10:32:01 UTC
The test instanceof139 expects false, but I think it should expect true.

   <test-case name="instanceof139" covers="sequence-type-relationships type-derivation-with-unions">
       <description>Instance of with function types and union types. Note: there is no
                    subtype relationship between union(A,B,C) and union(A,B).</description>
       <created by="Michael Kay" on="2012-10-08"/>
       <environment ref="union-List-defined-Types"/>
       <dependency type="spec" value="XQ30+"/>
       <dependency type="feature" value="higherOrderFunctions"/>
       <test><![CDATA[
         import schema namespace s="http://www.w3.org/XQueryTest/unionListDefined";
         declare function local:f($a as s:approximateDate) as xs:boolean { true() };
         local:f#1 instance of function(s:dateOrDateTime) as xs:boolean
       ]]></test>
       <result>
         <assert-false/>
       </result>
   </test-case>

In short, local:f#1 always returns boolean (thus the return typed match) and can accept a date or dateTime (the union of which is s:dateOrDateTime).  here's the detail.

local:f#1 is an instance of function(s:dateOrDateTime) as boolean if the judgement subtype-itemtype(Ai, Bi) determines that the ItemType Ai is a subtype of the ItemType Bi,  where

Ai = function(s:approximateDate) as boolean
Bi = type function(s:dateOrDateTime) as boolean

Ai is a subtype of Bi if and only if at least one of the following conditions applies:

24. Bi is AnnotationsB function(Ba_1, Ba_2, ... Ba_N) as Br, Ai is AnnotationsA function(Aa_1, Aa_2, ... Aa_M) as Ar, where [AnnotationsB] and [AnnotationsA] are optional lists of one or more annotations; N (arity of Bi) equals M (arity of Ai); subtype(Ar, Br); for values of I between 1 and N, subtype(Ba_I, Aa_I) ; and subtype-assertions(AnnotationsA, AnnotationsB) .


Here 
Aa_1 = s:approximdateDate
Ar = boolean
Ba_1 = s:dateOrDateTime
Br = boolean

Clearly subtype(Ar, Br); = subtype(boolean, boolean) holds.

That leaves, subtype(s:dateOrDateTime, s:approximateDate).  This holds if: 

1.  s:dateOrDate and s:approximateDate are AtomicOrUnionTypes, and derives-from(s:dateOrDate, s:approximateDate) returns true.

Now, 

s:dateOrDateTime = xs:date | xs:dateTime
s:approximateDate = xs:date | xs:dateTime | xs:gYear | xs:gYearMonth

So derives-from(s:dateOrDateTime, s:approximateDate) must hold and thus the expected result must be true.
Comment 1 Michael Kay 2012-10-12 14:32:42 UTC
You say:

<quote>
s:dateOrDateTime = xs:date | xs:dateTime
s:approximateDate = xs:date | xs:dateTime | xs:gYear | xs:gYearMonth

So derives-from(s:dateOrDateTime, s:approximateDate) must hold and thus the expected result must be true.
</quote>

But the essence of this test is to show that although it's obvious to you and me that the set of instances of union(A,B) is a subset of the instances of union(A, B, C, D), there is no subtype derivation relationship between the two: neither is derived from the other. In XSD terms, union(A,B) is not validly derived from union(A,B,C,D).

Whether this is something that could be changed is another matter. I don't think it would be wise for our specs to diverge from the XSD definition here.

In XSD 1.1, there are some parts of the spec where the subsumption test is now essentially "A is a valid restriction of B if the set of valid instances of A is a subset of the valid instances of B". That's the rule used when a complex type is derived by restriction. But there are other parts of the spec, notably subsitutability in substitution groups, where it is necessary to show a derivation relationship between the types; here, for example, if you have one type T restricting string with the facet minLength="3", and another type U restricting string with minLength="4", then U is not substitutable for T, even though its instances are a subset. The same applies to these two union types.
Comment 2 Tim Mills 2012-10-12 15:48:58 UTC
Oh dear.  I feared that was the answer which is why my argument resorted to "clearly" once I got to derives-from.

I have a horrible feeling that this means that there is a mismatch between the static typing and dynamic evaluation rules.

declare variable $p as s:dateOrDateTime external;
declare variable $q as s:approximateDate := $p

$p, $q

is a static type error, even though it can never fail.  Do you agree?
Comment 3 Tim Mills 2012-10-12 16:12:21 UTC
I've just looked back at the formal semantics.

The judgment

statEnv |- Type1 <: Type2

holds if the first type is a subtype of the second.

Semantics

This judgment is true if and only if, for every value Value, if Value matches Type1 holds, then Value matches Type2 also holds.

You say

"it's obvious to you and me that the set of instances of union(A,B) is a subset of the instances of union(A, B, C, D)"

and thus this subtype relationship has to hold.  Hence I think we have a problem.
Comment 4 Michael Kay 2012-10-12 17:24:39 UTC
If we had normative rules about static typing or a normative reference to the formal semantics, then I think you would be right. But we don't.

We do have rules for type subsumption that affect the (dynamic) typing rules for first-class functions (and nothing else, which is why this test uses first class functions). This rule is defined in terms of the "Type Derivation OK" rules in XSD, not in terms of the XQuery formal semantics.
Comment 5 Tim Mills 2012-10-12 18:14:49 UTC
I think we have a significant problem.  Having spent much time on type checking, I'd be prepared to object to transition to CR without sorting this out.
Comment 6 Michael Kay 2012-10-12 21:01:04 UTC
There are plenty of other cases where the extent of a type P is clearly a subset of the extent of another type S, but P is not substitutable for S. For example, the pair (P=xs:unsignedByte, S=xs:int), or the pair (P=string(length<5), S=string(length<10)) Agreed, this situation is undesirable (and in the case of the integer subtypes, makes them fairly unusable), but it's a consequence of the original decision to go with "named typing" and I think we have to live with it.

In fact, as XSD 1.1 part 2 notes (in section 3.16.6.3 Type Derivation OK (Simple)), the fact that A is substitutable for union(A, B, C) is a little anomalous; it's a rare concession towards structural rather than named typing.

I'm not sure that anything would break if we recognized union(A,B) as a subtype of union(A,B,C). I think we have to recognize subtype relationships that XSD recognizes, but the converse is not necessary. But it would certainly create an unnecessary incompatibility.
Comment 7 Tim Mills 2012-10-13 10:24:48 UTC
Since a value's dynamic type will never be a union type, but rather one of the union's member types, using derives-from with AT (actual type) being a union type seems a tad strange.

I think things would hang together better if either the anomaly you pointed out regarding A being substitutable for union(A,B) didn't hold,, or if union(A,B) were substitutable for union(A,B,C).  This halfway house leaves things a little incoherent.
Comment 8 Michael Kay 2012-10-13 15:44:17 UTC
>I think things would hang together better if either the anomaly you pointed out regarding A being substitutable for union(A,B) didn't hold,, or if union(A,B) were substitutable for union(A,B,C).  This halfway house leaves things a little incoherent.

I think it makes most sense for our substitutability rules to be the same as those in XSD. We could allow substitutability in cases where XSD doesn't (as we do for example with substituting float for double), but it complicates things.
Comment 9 Tim Mills 2012-10-13 19:17:11 UTC
(In reply to comment #6)
> There are plenty of other cases where the extent of a type P is clearly a
> subset of the extent of another type S, but P is not substitutable for S.
> For example, the pair (P=xs:unsignedByte, S=xs:int), or the pair
> (P=string(length<5), S=string(length<10)) Agreed, this situation is
> undesirable (and in the case of the integer subtypes, makes them fairly
> unusable), but it's a consequence of the original decision to go with "named
> typing" and I think we have to live with it.

To be clear, the 'matches' judgement does not hold for the sort of pairings you are describing here.  The unsigned byte value B does not match xs:int.  However, it does hold that all values which match union(P,Q) also match union(P,Q,R).

I am increasingly convinced that the decision made for XSD doesn't make a great deal of sense for XQuery.
Comment 10 Michael Kay 2012-10-15 10:03:57 UTC
OK, I agree the case of union(A,B) <: union(A,B,C,D) is different from the other cases I mentioned, because after validating an instance against these two types, the instance carries no record of which of the two types was used for validation.

Although the only place we use subtyping judgements in the standard (namely for higher-order functions) isn't important enough to justify special treatment of this case, I can see that an implementation wanting to use the same judgements for static type checking would want to "get this right", and it's reasonable to want to use the same rules for both cases.

I don't think any harm is done if we change our rules to handle this case, other than the slight additional complexity introduced to the specification by virtue of the fact that we no longer simply defer to the substitutability rules given in XSD.

There's another point which occurs to me suggesting that we need to make this change: namely that in XSD, if a type U is defined as union(union(A,B), union(C,D)), then union(A,B) and union(C,D) are substitutable for U, while union(A,C) is not. So in this situation an element E of type union(A,B) can appear in the substitution group of an element H of type union(A,B,C,D), which means that we must accept it. We aren't obliged to accept union(A,C) but it does no harm if we do.
Comment 11 Tim Mills 2012-10-15 10:19:50 UTC
> Although the only place we use subtyping judgements in the standard (namely
> for higher-order functions) isn't important enough to justify special
> treatment of this case, I can see that an implementation wanting to use the
> same judgements for static type checking would want to "get this right", and
> it's reasonable to want to use the same rules for both cases.

Yes, that the problem I was encountering.
Comment 12 O'Neil Delpratt 2012-11-08 14:36:47 UTC
Making this a spec bug. Tim and Mike seems to have concluded that the Union(A, B) should be a sub-type of Union(A, B, C)
Comment 13 Tim Mills 2012-12-20 14:04:04 UTC
This was resolved in the  XML Query/XSL WG Joint Teleconference #525 Agenda 2012-12-18.

It was decided that union(A, B) is to be considered as a subtype of union(A, B, C).  The test suite has been updated accordingly.
Comment 14 Jonathan Robie 2012-12-20 21:53:22 UTC
I just added this - please take a look and confirm that this solves the problem.

<quote>
The judgement subtype-itemtype(Ai, Bi) determines if the ItemType Ai is a subtype of the ItemType Bi. Ai is a subtype of Bi if and only if at least one of the following conditions applies:

1. Ai and Bi are AtomicOrUnionTypes, and derives-from(Ai, Bi) returns true.

<add>
2. Ai and Bi are both [pure union types], and every type t in the transitive membership of Ai is also in the transitive membership of Bi.
</add>

!!! SNIP !!!
</quote>
Comment 15 Tim Mills 2012-12-21 07:58:25 UTC
Sounds good to me - thanks.