This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11713 - [XPath 3.0] Rules for union types
Summary: [XPath 3.0] Rules for union types
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XPath 3.0
Version: Working drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Jonathan Robie
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2011-01-10 12:38 UTC by Michael Kay
Modified: 2013-06-19 08:19 UTC

Description Michael Kay 2011-01-10 12:38:31 UTC
We currently say in 2.5.4.2

If the expanded QName of an AtomicOrUnionType is not defined as an atomic type or a union type in the in-scope schema types, a static error is raised [err:XPST0051].

We should restrict this so that the only union types allowed are what I will call "plain union types". A plain union type is one whose base type is xs:anySimpleType (that is, it is defined directly as the union of a number of atomic types, as opposed to a union type that is derived by restriction from another union type).

The reason for this rule is that no atomic value will ever be an instance of a non-plain union type; it is pointless to allow such types to be used, for example in a function signature as the type could never be satisfied.

We can then change this Note:

<old>
Note:

The current (second) edition of XML Schema 1.0 contains an error in respect of the substitutability of a union type by one of its members: it fails to recognize that this is unsafe if the union is derived by restriction from another union. This problem is fixed in the current working draft of XML Schema 1.1, and implementers are advised to adopt the solution given there. It is likely that this specification will be updated to refer normatively to XML Schema 1.1 when that specification reaches Recommendation status.
</old>

to read:

<new>
Note:

The current (second) edition of XML Schema 1.0 contains an error in respect of the substitutability of a union type by one of its members: it fails to recognize that this is unsafe if the union is derived by restriction from another union. This problem is fixed in XSD 1.1, but the effect of the resolution is that an atomic value labeled with an atomic type cannot be treated as being substitutable for a union type without explicit validation. This specification therefore allows union types to be used as item types only if they are defined directly as the union of a number of atomic types.
</new>
Comment 1 Michael Kay 2011-01-10 15:25:01 UTC
Further study of the current XQuery 3.0 draft shows many places where the term "atomic type" should be generalized since union types are now allowed in many places where previously atomic types were required.

I think this will be easier if we introduce the definition:

<new>
A *plain* type is an atomic type, or a union type whose base type is xs:anySimpleType and whose member types are all plain types.

The significance of this definition is that this describes the set of types that can have atomic values as their instances. </new>

I would suggest putting it in the initial section of 2.5.
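
As a rough illustration of the proposed definition (a sketch only, not spec text), the recursive test for a plain type could be modelled as follows. The dictionary layout and the my:* type names are hypothetical; only the rule itself comes from the definition above.

```python
# Hypothetical in-scope schema types, reduced to the properties the
# proposed definition looks at: variety, base type, and member types.
SCHEMA = {
    "xs:integer": {"variety": "atomic"},
    "xs:date": {"variety": "atomic"},
    "xs:time": {"variety": "atomic"},
    "xs:dateTime": {"variety": "atomic"},
    "xs:IDREFS": {"variety": "list"},
    # Union defined directly over atomic types.
    "my:dt": {"variety": "union", "base": "xs:anySimpleType",
              "members": ["xs:date", "xs:time", "xs:dateTime"]},
    # Union whose members include another plain union.
    "my:dt-or-int": {"variety": "union", "base": "xs:anySimpleType",
                     "members": ["my:dt", "xs:integer"]},
    # Union derived by restriction from another union.
    "my:dt-with-z": {"variety": "union", "base": "my:dt",
                     "members": ["xs:date", "xs:time", "xs:dateTime"]},
    # Union with a list type among its members.
    "my:refs-or-int": {"variety": "union", "base": "xs:anySimpleType",
                       "members": ["xs:IDREFS", "xs:integer"]},
}


def is_plain(name, schema=SCHEMA):
    """Atomic type, or a union whose base type is xs:anySimpleType
    and whose member types are all plain types."""
    t = schema[name]
    if t["variety"] == "atomic":
        return True
    if t["variety"] == "union":
        return (t.get("base") == "xs:anySimpleType"
                and all(is_plain(m, schema) for m in t["members"]))
    return False  # list types are never plain
```

Under this model my:dt and my:dt-or-int are plain, while my:dt-with-z (restricted union) and my:refs-or-int (list member) are not.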

Then we need to change "atomic type" (or in some cases "atomic or union type") to "plain type" in the following places:

2.5 -> "Plain types represent the intersection between the categories of sequence type and schema type. A plain type, such as xs:integer or my:hatsize, is both a sequence type and a schema type."

2.5.2 "Instead, if the type annotation of a node is a list type (such as xs:IDREFS), its typed value is treated as a sequence of atomic values belonging to the itemType of the list (or where this is a union, to one of the atomic member types of the union)". 

2.5.3: "Apart from the item type item(), which permits any kind of item, item types divide into node types (such as element()), plain types (such as xs:integer) and function types (such as function() as item()*)."

2.5.4.2 "The names of list [was non-atomic] types such as xs:IDREFS are not accepted in this context, but can often be replaced by an atomic type with an occurrence indicator, such as xs:IDREF+."

3.1.1. Add "Constructor functions are available for all plain types including union types. For example if my:dt is a user-defined union type whose member types are xs:date, xs:time, and xs:dateTime, then the expression my:dt("2011-01-10") creates an atomic value of type xs:date. The rules follow XML Schema validation rules for union types: the effect is to choose the first member type that accepts the given string in its lexical space."

3.1.5.2 "If the expected type is a sequence of a plain type..."; "Each item in the atomic sequence that is of type xs:untypedAtomic is cast to the expected atomic type". (But for numeric/URI promotion, keep "atomic type".)

3.15.3 "The target type must be a plain type that is in the in-scope schema types [err:XPST0051]."

3.15.3 Add a new para 4f "casting from a string or untypedAtomic value is supported if the target type is a plain union type, that is, a union type that imposes no restrictions other than the restrictions imposed by its member types. The semantics of casting follow the XSD rules for validation against a union type; the result of the cast is an atomic value whose type annotation corresponds to the first atomic member type of the union that has the supplied value in its lexical space"

3.15.4. "The target type must be a plain type that is in the in-scope schema types [err:XPST0051]. In addition, the target type cannot be xs:NOTATION or xs:anyAtomicType [err:XPST0080]."

3.15.5. "For every plain type in the in-scope schema types..."

4.11 "For each user-defined plain type in the schema..."

App C.1, row "function signatures", "atomic" -> "plain"

App C.2, row "function implementations", "atomic" -> "plain"
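
The "first member type that accepts the given string" rule in the 3.1.1 and 3.15.3 items above might be sketched like this. It is an illustration under assumptions: the regular expressions are deliberately simplified stand-ins for the XSD lexical spaces (no range checks on months, days, or hours), and cast_to_union is a hypothetical helper name, not an XPath function.

```python
import re

# Simplified lexical patterns (assumption: not the full XSD grammar).
_TZ = r"(Z|[+-]\d{2}:\d{2})?"
LEXICAL = {
    "xs:date": re.compile(r"-?\d{4,}-\d{2}-\d{2}" + _TZ),
    "xs:time": re.compile(r"\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ),
    "xs:dateTime": re.compile(
        r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ),
}

# Member types of the user-defined union my:dt, in declaration order.
MY_DT = ("xs:date", "xs:time", "xs:dateTime")


def cast_to_union(lexical, members):
    """Return (type annotation, lexical form) for the first member type
    whose lexical space contains the supplied string."""
    text = lexical.strip()
    for member in members:
        if LEXICAL[member].fullmatch(text):
            return (member, text)
    raise ValueError(f"{lexical!r} is not in the lexical space of any member type")
```

With this model, cast_to_union("2011-01-10", MY_DT) yields a value annotated as xs:date, matching the my:dt("2011-01-10") example in the 3.1.1 item.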
Comment 2 Michael Kay 2011-02-01 16:53:07 UTC
The WG agreed with the general direction given in 

http://lists.w3.org/Archives/Member/w3c-xsl-query/2011Jan/0240.html

(member-only), subject to approval of detailed text, and to review by the Schema WG. It would be useful if any new terminology ("plain types") is common to QT and XSD.
Comment 3 Michael Kay 2011-03-17 17:51:09 UTC
Further discussion on the Schema WG mailing list established that this approach has problems. For example, if assertions are used to restrict a union type, the assertion on the union does not necessarily achieve the same thing as assertions on each of the member types. Given a union of (xs:integer, xs:string), the assertion ($value instance of xs:string), and the instance value 1234, an assertion at the union level will cause the value to be rejected as invalid, while moving the assertion down to the member types will cause it to be rejected as an xs:integer but accepted as an xs:string.
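
The difference can be made concrete with a small model (a sketch, not XSD validation proper). The assertion is reduced to a predicate on the type label chosen by validation, the lexical patterns are simplified, and both function names are hypothetical.

```python
import re

# Member types of the union, in declaration order, with simplified
# lexical patterns (assumption: xs:string accepts any text).
ORDER = ("xs:integer", "xs:string")
LEXICAL = {
    "xs:integer": re.compile(r"[+-]?\d+"),
    "xs:string": re.compile(r".*", re.DOTALL),
}


def is_string(label):
    """Stand-in for the assertion ($value instance of xs:string)."""
    return label == "xs:string"


def validate_with_assertion_on_union(text, assertion):
    """Pick the first member type that accepts the text, then apply the
    union-level assertion to that single result."""
    for name in ORDER:
        if LEXICAL[name].fullmatch(text):
            return name if assertion(name) else None
    return None


def validate_with_assertion_on_members(text, assertion):
    """Apply the assertion as part of each member type, so a member that
    fails the assertion is skipped and the next one is tried."""
    for name in ORDER:
        if LEXICAL[name].fullmatch(text) and assertion(name):
            return name
    return None
```

For the value 1234, the first function returns None (rejected), while the second returns "xs:string".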

Another approach has been suggested: when a node has been validated against a union type, the typed value should be an atomic value that is annotated with both the atomic member type and the union type against which it was validated (and any intermediate unions as well). The value is known to be a valid instance of each of these types, and it is therefore accepted by a function that requires any of these types. For example, if DT-with-Z restricts DT by requiring a timezone, and DT is union(date, time, dateTime), and an attribute A is validated against an attribute declaration with required type DT-with-Z, then the node will be annotated as being of type DT-with-Z, and its typed value will be labelled both with an atomic type (e.g. xs:date) and with the union types DT and DT-with-Z. (The label DT is redundant, but does no harm.) And therefore the atomic value can be used where the required type is xs:date, where the required type is DT, and where the required type is DT-with-Z.

This still isn't perfect. Will current-date() be acceptable where the expected type is DT? (I think it needs to be, for substitutability reasons). Will it be acceptable where the expected type is DT-with-Z? Probably not. So there's still a difference in treatment between "pure" unions and unions derived by restriction.

I'm inclined to revert to my original proposal: we follow XSD 1.1 by saying that atomic values are substitutable for a "pure" union type but not for a "restricted" union type.
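
To see the difference in treatment discussed in this comment, here is a toy model that combines the labelling idea with substitutability for pure unions. It is a sketch under assumptions: atomic type derivation is ignored, the my: prefix is invented, and matches is a hypothetical helper, not spec text.

```python
# Atomic member types of the "pure" union DT. The restricted union
# my:DT-with-Z is deliberately absent: membership alone never satisfies it.
PURE_UNION_MEMBERS = {
    "my:DT": frozenset({"xs:date", "xs:time", "xs:dateTime"}),
}


def matches(labels, required):
    """A value matches if it carries the required type as a label, or if
    the required type is a pure union containing one of its labels."""
    if required in labels:
        return True
    members = PURE_UNION_MEMBERS.get(required)
    return members is not None and not members.isdisjoint(labels)


# Typed value of attribute A after validation against DT-with-Z.
validated_attribute_value = frozenset({"xs:date", "my:DT", "my:DT-with-Z"})
# Result of current-date(): labelled with its atomic type only.
current_date_value = frozenset({"xs:date"})
```

In this model the validated value matches xs:date, my:DT and my:DT-with-Z, whereas the current-date() value matches my:DT but not my:DT-with-Z.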
Comment 4 Jonathan Robie 2011-05-02 20:35:37 UTC
I believe the best resolution of this bug is to accept the solution in the original description.
Comment 5 Jonathan Robie 2011-05-02 20:38:05 UTC
(In reply to comment #1)

I agree this needs to be fixed. I'm not happy with the term "plain".

If I understand correctly, this new category of type includes all simple types except list types. True?
Comment 6 Michael Kay 2011-05-02 23:24:35 UTC
Re comment #5: yes. Note that comment #1 implements the suggestion in comment #0 in more detail.
Comment 7 Michael Kay 2011-05-02 23:25:33 UTC
>If I understand correctly, this new category of type includes all simple types
>except list types. True?

False. It also excludes union types that are derived by restriction from other union types.
Comment 8 Michael Kay 2011-05-03 08:27:26 UTC
>If I understand correctly, this new category of type includes all simple types
>except list types. True?

False. It also excludes union types that are derived by restriction from other
union types.

And it also excludes unions whose member types are lists.

Perhaps we need a definition like this: a /generalized atomic type/ is either an atomic type, or a union type whose member types are all /generalized atomic types/. The instances of a generalized atomic type are atomic values. The atomic member types of a generalized atomic type are (a) if it is an atomic type, then that type, (b) if it is a union, then the atomic types in its transitive membership. If A and B are generalized atomic types, then derives-from(A, B) is true if for every type T among the member types of A there exists a type U among the member types of B such that derives-from(T, U).

(My /generalized atomic type/ here is the same as /plain type/ in the previous proposal. I'm just experimenting with different terms to see what reads best).
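
A minimal sketch of the derives-from rule suggested above, assuming a toy type hierarchy. The my:* types and the two lookup tables are hypothetical, and "member types" is read here as the atomic member types in the transitive membership.

```python
# Atomic types mapped to their base type (assumed, simplified hierarchy).
ATOMIC_BASE = {
    "xs:anyAtomicType": None,
    "xs:decimal": "xs:anyAtomicType",
    "xs:integer": "xs:decimal",
    "xs:string": "xs:anyAtomicType",
    "xs:date": "xs:anyAtomicType",
    "my:hatsize": "xs:integer",
}

# Union types mapped to their member types (hypothetical).
UNIONS = {
    "my:num-or-date": ("xs:decimal", "xs:date"),
    "my:size-or-date": ("my:hatsize", "xs:date"),
    "my:nested": ("my:size-or-date", "xs:string"),
}


def atomic_members(t):
    """(a) an atomic type is its own member; (b) for a union, collect the
    atomic types in its transitive membership."""
    if t in ATOMIC_BASE:
        return {t}
    result = set()
    for member in UNIONS[t]:
        result |= atomic_members(member)
    return result


def atomic_derives_from(t, u):
    """Walk the base-type chain of atomic type t looking for u."""
    while t is not None:
        if t == u:
            return True
        t = ATOMIC_BASE[t]
    return False


def derives_from(a, b):
    """Every atomic member of A derives from some atomic member of B."""
    return all(
        any(atomic_derives_from(t, u) for u in atomic_members(b))
        for t in atomic_members(a)
    )
```

Here derives_from("my:size-or-date", "my:num-or-date") holds because my:hatsize derives from xs:decimal, but the reverse does not.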
Comment 9 Jonathan Robie 2011-05-03 17:09:58 UTC
In today's telcon, we agreed to the solution in the original description, modulo minor wording.

We also agreed that comment #1 shows errors that need to be fixed. At editorial discretion, it may be done in one of the following ways:

1. Introduce a name for a union type that is directly derived from xs:anySimpleType, and use a phrase like "atomic or XXX union type" throughout.

2. Introduce a name that covers atomic types and union types directly derived from xs:anySimpleType, and use that name throughout.
Comment 10 Michael Kay 2011-05-04 09:08:23 UTC
Re comment #9, "a union type that is directly derived from xs:anySimpleType" isn't actually a precise statement of the subset of union types we want to name, because that subset also includes unions that have list types in their membership. If we use XSD 1.1 terminology, the pertinent set is that containing

(a) all simple types whose {variety} is atomic

(b) all simple types whose {variety} is union, provided they satisfy all the following conditions:

(b.1) the {facets} property of the union type is empty

(b.2) no type in the ·transitive membership· of the union type has {variety} list

(b.3) no type in the ·transitive membership· of the union type is a type with {variety} union having a non-empty {facets} property

[Explanation: (b) defines the set of union types where every valid instance of any of the ·basic member types· is an atomic value that is also a valid instance of the union type.]
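
Conditions (a) and (b.1) to (b.3) can be checked mechanically. The following is a sketch over a hypothetical dictionary model of the XSD properties {variety}, {facets} and member types; the my:* names and the example facets are invented for illustration.

```python
# Hypothetical simple type definitions keyed by name.
TYPES = {
    "xs:integer": {"variety": "atomic"},
    "xs:string": {"variety": "atomic"},
    "xs:date": {"variety": "atomic"},
    "xs:IDREFS": {"variety": "list"},
    "my:u": {"variety": "union", "facets": [],
             "members": ["xs:integer", "xs:date"]},
    "my:u-restricted": {"variety": "union", "facets": ["pattern"],
                        "members": ["xs:integer", "xs:date"]},
    "my:u-with-list": {"variety": "union", "facets": [],
                       "members": ["xs:IDREFS", "xs:string"]},
    "my:u-over-restricted": {"variety": "union", "facets": [],
                             "members": ["my:u-restricted", "xs:string"]},
    "my:u-nested": {"variety": "union", "facets": [],
                    "members": ["my:u", "xs:string"]},
}


def transitive_membership(name, types=TYPES):
    """Member types of a union, plus the members of any member that is
    itself a union, recursively."""
    out = []
    for member in types[name].get("members", ()):
        out.append(member)
        if types[member]["variety"] == "union":
            out.extend(transitive_membership(member, types))
    return out


def in_pertinent_set(name, types=TYPES):
    t = types[name]
    if t["variety"] == "atomic":          # (a)
        return True
    if t["variety"] != "union":
        return False
    if t.get("facets"):                   # (b.1) {facets} must be empty
        return False
    for member in transitive_membership(name, types):
        m = types[member]
        if m["variety"] == "list":        # (b.2) no list in the membership
            return False
        if m["variety"] == "union" and m.get("facets"):  # (b.3)
            return False
    return True
```

In this model my:u and my:u-nested qualify, while my:u-restricted, my:u-with-list and my:u-over-restricted are excluded by (b.1), (b.2) and (b.3) respectively.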