24569 2014-02-06 21:34:58 +0000 Least common types and lattices 2014-05-15 14:00:42 +0000 1 1 1 Unclassified XPath / XQuery / XSLT XSLT 3.0 Last Call drafts PC All CLOSED FIXED P2 normal --- 24568 1 cmsmcq mike public-qt-comments oldest_to_newest 99955 0 cmsmcq 2014-02-06 21:34:58 +0000 In section 19.2 [1], XSLT 3.0 says that two items do not necessarily have a unique least common type, because item types form a lattice, not a hierarchy: In some cases the above entries require computation of the least common type of two types T and U. Since item types form a lattice rather than a hierarchy, there may be a set of types V such that T and U are both subtypes of every type in V, and no type in V is unambiguously the "least" common type in the sense that all the others are subtypes of it. In this situation the choice of which type in V to use as the inferred static type is implementation-defined. [1] http://www.w3.org/TR/xslt-30/#determining-static-type I believe that in a lattice, any two points have a unique least upper bound or join -- if they don't, one is not dealing with a lattice. The claim that itemtypes form a lattice is also found in XDM [2]. [2] http://www.w3.org/TR/xpath-datamodel-30/#types-hierarchy If any two types do in fact have a least common type, then the paragraph quoted above from section 19.2 should probably go away; if we do have pairs of types that lack a unique least common type, then the claim that our types form a lattice should be dropped from both specs and any other spec that makes it. 99964 1 mike 2014-02-07 00:41:58 +0000 I have to confess that I was quite unaware of this mathematical definition of a lattice, as will be evident to any informed reader of this paragraph! I'm having trouble reconstructing the thinking that led me to believe that the lowest common supertype of two types was not unique. Certainly, I'm having trouble finding a counter-example. I may have simply been assuming that there was no unique least common supertype on the basis that I could see no proof that there was. Perhaps we need to distinguish the set of types that are actually defined in the in-scope schema definitions from the set of types that the type system permits to be defined. This only really makes a difference for unions, where we can easily construct a counter-example if we restrict ourselves to named union types: if union(A, B, C) and union(A, B, D) both "exist" and union(A, B) does not "exist", then union(A, B, C) and union(A, B, D) are both common supertypes of A and B, and neither is the lesser. But for the purpose of type inference in section 19.2, I don't think we're constrained to infer a type that is present in the in-scope schema definitions, so this counter-example isn't pertinent. And I can't think of another (though I have no idea how to go about proving its absence). 99972 2 cmsmcq 2014-02-07 02:50:23 +0000 A simple case for which two XSD simple types do not have a least common supertype expressible in XSD: given simple types A and B, the union of A and B is a supertype (as defined by the XDM / XPath substitution rules, and also by XSD substitutability), and so is the union of B and A (which is distinct in XSD, since unions are ordered). For a few minutes today I thought that it might be the case that function items might also lack a least common supertype, but I have come to believe that for any functions involving simple types that do have least common supertypes and greatest common subtypes, there will be a least common supertype. However, when functions are defined on unions, the problem mentioned above will prevent a unique solution. The only caveat here is that the example of union(A,B) and union(B,A) applies to types as they are defined by XSD. Whether it also applies to XSD types as they occur filtered through the XDM spec is currently beyond my ken. See also bug 24568 against XDM. 99986 3 mike 2014-02-07 09:31:19 +0000 The only XSD union types that are item types are "pure union types" as defined in http://www.w3.org/TR/xpath-30/#dt-pure-union-type I believe it is a consequence of the rules in http://www.w3.org/TR/xpath-30/#id-itemtype-subtype that given two union types U = union(A, B) and V = union(B, A), then U and V are "equal" in the type graph, that is, they are subtypes of each other and they have the same supertypes and subtypes, even though they are distinct in the sense that they define different cast operations on strings. 99991 4 mike 2014-02-07 09:48:41 +0000 I just blurted: >then U and V are "equal" in the type graph, that is, they are subtypes of each other which has the implication that the type graph contains a cycle. Is that in itself enough to show that the graph is not a lattice? 99996 5 mike 2014-02-07 10:06:24 +0000 Perhaps there is a line of reasoning as follows. The subset relation defines a lattice (see http://www.proofwiki.org/wiki/Power_Set_is_Complete_Lattice). Every type has a value space S(T), and if T is a subtype of U then S(T) is a subset of S(U) (However, the converse does not follow). This means that if we consider the value spaces defined by our types, then the value spaces define a lattice under the subset relation. The types themselves do not necessarily form a lattice, because two different types may have the same value space. When we do type inference for streamability, we are only interested in the value space, and not in any other properties of the type. So we can regard two types that have the same value space as equivalent for the purpose. If equivalent types in this sense are regarded as a single type, then the types (hopefully) form a lattice. 100498 6 cmsmcq 2014-02-13 11:21:32 +0000 We discussed this at our face-to-face meeting in Prague. We concluded that whether we have a type lattice depends on exactly what we wish to regard as a "type" for purposes of this analysis. One solution would be to specify that the points in the lattice are groups of types equivalent in the sense that subtype-itemtype(A,B) and subtype-itemtype(B,A). Another would be to ignore whether itemtypes are or are not a lattice, and simply define an ad hoc type inferencing scheme for use here. We are interested primarily in two bits of information: (a) can there be children? (b) can members of the set be numeric? It might be possible to capture this with a very simple type hierarchy, or a simple 2x2 diagram. A 2x2 matrix could give us better information. For example it would allow us to infer that the static type of (element(), xs:string) is non-numeric, whereas a type hierarchy or lattice that looks anything like what's described in our specs must end up with a least common supertype of item() or something similar. We concluded that this issue needs further study. 100576 7 mike 2014-02-14 07:49:51 +0000 One way forward on this might be to say that the inferred type is always a subset of the set of disjoint types {decimal, double, float, boolean, string, dateTime, ..., document-node(), element(), attribute(), text(), ..., function(*)}. The "lowest common supertype" of two types is then simply the union of the two sets. An expression is then "possibly numeric" if it includes decimal, double, or float in the subset, and it is "possibly parental" if it includes document-node() or element() in the subset. 103683 8 cmsmcq 2014-04-10 16:28:16 +0000 See also some further discussion in the thread beginning at https://lists.w3.org/Archives/Member/w3c-xsl-wg/2014Apr/0019.html (member-only link). 104354 9 mike 2014-04-24 14:45:06 +0000 Following email and telcon discussion, section 19.2 has been recast to use the new concspt of U-types, which can be cleanly composed under union, intersection, and difference.