Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document is an editors' copy that has no official standing.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a ↑member-only review version which will in due course become a↑ Public Working Draft of XML Schema 1.1: Datatypes. It ↑has no formal standing within W3C; it↑ is here made available for review by W3C members↓ and the public↓. This version of this document was created on 19 September 2007.↑ It reflects (unless otherwise noted elsewhere) all decisions on this document made by the Working Group through 14 September 2007. The document thus incorporates all decisions made by the Working Group to date. ↑
For those primarily interested in the changes since version 1.0, the Changes since version 1.0 (§H) appendix, which summarizes both changes already made and also those in prospect, with links to the relevant sections of this draft, is the recommended starting point. An accompanying version of this document displays in color all changes to normative text since version 1.0; another shows changes since the previous Working Draft.
Comments on this document should be made in W3C's public installation of Bugzilla, specifying "XML Schema" as the product. Instructions can be found at http://www.w3.org/XML/2006/01/public-bugzilla. If access to Bugzilla is not feasible, please send your comments to the W3C XML Schema comments mailing list, www-xml-schema-comments@w3.org (archive). Each Bugzilla entry and email message should contain only one comment.
The end of the Last Call review period is 8 November 2007; comments received after that date will be considered if time allows, but no guarantees can be offered.
Although feedback based on any aspect of this specification is welcome, there are certain aspects of the design presented herein for which the Working Group is particularly interested in feedback. These are designated 'priority feedback' aspects of the design, and identified as such in editorial notes at appropriate points in this draft.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document has been produced by the W3C XML Schema Working Group as part of the W3C XML Activity. The goals of the XML Schema language version 1.1 are discussed in the Requirements for XML Schema 1.1 document. The authors of this document are the members of the XML Schema Working Group. Different parts of this specification have different editors.
This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures made in connection with this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification must disclose the information in accordance with section 6 of the W3C Patent Policy.
The English version of this specification is the only normative version. Information about translations of this document is available at http://www.w3.org/2003/03/Translations/byTechnology?technology=xmlschema.
The presentation of this document has been augmented to identify changes from a previous version, controlled by dg-b3228.xml, which shows the status-quo text without adornment. Three kinds of changes are highlighted: ↑new, added text↑, ↑changed text↓, and ↓deleted text↓.
This section describes the conceptual framework behind the datatype system defined in this specification. The framework has been influenced by the [ISO 11404] standard on language-independent datatypes as well as the datatypes for [SQL] and for programming languages such as Java.
The datatypes discussed in this specification are for the most part well known abstract concepts such as integer and date. It is not the place of this specification to thoroughly define these abstract concepts; many other publications provide excellent definitions. However, this specification will attempt to describe the abstract concepts well enough that they can be readily recognized and distinguished from other abstractions with which they may be confused.
It is useful to categorize the datatypes defined in this specification along various dimensions, defining terms which can be used to characterize datatypes and the Simple Type Definitions which define them.
First, we distinguish ·atomic·, ·list·, and ·union· datatypes.
For example, a single token which ·matches· Nmtoken from [XML] is in the value space of the ·atomic· datatype NMTOKEN, while a sequence of such tokens is in the value space of the ·list· datatype NMTOKENS.
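The atomic/list distinction can be illustrated with a minimal sketch. It assumes an ASCII-only approximation of the Nmtoken production from [XML] (the real production admits many more name characters), and the function names are hypothetical, not part of this specification.

```python
import re
from typing import List

# ASCII-only approximation of the XML Nmtoken production (an assumption:
# the real production also admits many non-ASCII name characters).
_NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(literal: str) -> bool:
    """Atomic case: the whole literal must be one token."""
    return _NMTOKEN.fullmatch(literal) is not None


def parse_nmtokens(literal: str) -> List[str]:
    """List case: a whitespace-separated sequence of NMTOKEN items."""
    items = literal.split()
    # NMTOKENS requires at least one item, and every item must be an NMTOKEN.
    if not items or not all(is_nmtoken(item) for item in items):
        raise ValueError(f"not a valid NMTOKENS literal: {literal!r}")
    return items
```

Under these assumptions, a literal such as "a1 b.2" is rejected by is_nmtoken but accepted by parse_nmtokens as a two-item sequence.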
Union types may be defined in either of two ways. When a union type is ·constructed· by ·union·, its ·value space·, ·lexical space·, and ·lexical mapping· are the "ordered unions" of the ·value spaces·, ·lexical spaces·, and ·lexical mappings· of its ·member types·. When a union type is defined by ·restricting· another ·union·, its ·value space·, ·lexical space·, and ·lexical mapping· are subsets of the ·value spaces·, ·lexical spaces·, and ·lexical mappings· of its ·base type·. ·Union· datatypes are always ·constructed· from other datatypes; they are never ·primitive·. Currently, there are no ·built-in· ·union· datatypes.
<attributeGroup name="occurs">
  <attribute name="minOccurs" type="nonNegativeInteger" use="optional" default="1"/>
  <attribute name="maxOccurs" use="optional" default="1">
    <simpleType>
      <union>
        <simpleType>
          <restriction base='nonNegativeInteger'/>
        </simpleType>
        <simpleType>
          <restriction base='string'>
            <enumeration value='unbounded'/>
          </restriction>
        </simpleType>
      </union>
    </simpleType>
  </attribute>
</attributeGroup>
Any number ↓(greater than 0)↓↑(zero or more)↑ of ordinary or ·primitive· ·datatypes· can participate in a ·union· type.
[Definition:] The datatypes that participate in the definition of a ·union· datatype are known as the member types of that ·union· datatype.
[Definition:] The transitive membership of a ·union· is the set of its own ·member types·, and the ·member types· of its members, and so on. More formally, if U is a ·union·, then (a) its ·member types· are in the transitive membership of U, and (b) for any datatypes T1 and T2, if T1 is in the transitive membership of U and T2 is one of the ·member types· of T1, then T2 is also in the transitive membership of U.
[Definition:] Those members of the ·transitive membership· of a ·union· datatype U which are themselves not ·union· datatypes are the basic members of U.
[Definition:] If a datatype M is in the ·transitive membership· of a ·union· datatype U, but not one of U's ·member types·, then a sequence of one or more ·union· datatypes necessarily exists, such that the first is one of the ·member types· of U, each is one of the ·member types· of its predecessor in the sequence, and M is one of the ·member types· of the last in the sequence. The ·union· datatypes in this sequence are said to intervene between M and U. When U and M are given by the context, the datatypes in the sequence are referred to as the intervening unions. When M is one of the ·member types· of U, the set of intervening unions is the empty set.
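The three definitions above (transitive membership, basic members, intervening unions) can be sketched over a toy data model. The Datatype class and all function names are hypothetical. The sketch assumes that no union is, directly or indirectly, a member of itself, and that returning the first chain of intervening unions found is sufficient.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare datatypes by identity, not by field values
class Datatype:
    """Hypothetical stand-in for a datatype; only unions have member types."""
    name: str
    is_union: bool = False
    members: List["Datatype"] = field(default_factory=list)


def transitive_membership(u: Datatype) -> List[Datatype]:
    """Member types of u, their member types, and so on (first occurrence kept)."""
    result: List[Datatype] = []
    for member in u.members:
        for t in [member] + transitive_membership(member):
            if not any(t is seen for seen in result):
                result.append(t)
    return result


def basic_members(u: Datatype) -> List[Datatype]:
    """Non-union datatypes in the transitive membership of u."""
    return [t for t in transitive_membership(u) if not t.is_union]


def intervening_unions(u: Datatype, m: Datatype) -> Optional[List[Datatype]]:
    """One chain of unions between m and u; [] if m is a member type of u;
    None if m is not in the transitive membership of u."""
    if any(t is m for t in u.members):
        return []
    for t in u.members:
        if t.is_union:
            chain = intervening_unions(t, m)
            if chain is not None:
                return [t] + chain
    return None


# Example: U = union(integer, V), V = union(date, W), W = union(boolean)
integer, date, boolean = Datatype("integer"), Datatype("date"), Datatype("boolean")
W = Datatype("W", is_union=True, members=[boolean])
V = Datatype("V", is_union=True, members=[date, W])
U = Datatype("U", is_union=True, members=[integer, V])
```

In this example the basic members of U are integer, date and boolean, and the unions intervening between boolean and U are V and W, in that order.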
[Definition:] In a valid instance of any ·union·, the first of its members in order which accepts the instance as valid is the active member type. [Definition:] If the ·active member type· is itself a ·union·, one of its members will be its ·active member type·, and so on, until finally a ·basic (non-union) member· is reached. That ·basic member· is the active basic member of the union.
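A minimal sketch of how the active member type and the active basic member might be located for a nested union follows. The SimpleType class, the function names and the lexical checks are all illustrative assumptions; whitespace processing and facets are ignored.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(eq=False)
class SimpleType:
    """Hypothetical model: a union has `members`; any other type has `accepts`."""
    name: str
    accepts: Optional[Callable[[str], bool]] = None
    members: Optional[List["SimpleType"]] = None

    def is_valid(self, literal: str) -> bool:
        if self.members is not None:  # a union accepts what any member accepts
            return any(m.is_valid(literal) for m in self.members)
        return self.accepts(literal)


def active_member_type(union: SimpleType, literal: str) -> Optional[SimpleType]:
    """First member, in definition order, that accepts the literal."""
    for member in union.members:
        if member.is_valid(literal):
            return member
    return None


def active_basic_member(union: SimpleType, literal: str) -> Optional[SimpleType]:
    """Follow active member types downward until a non-union member is reached."""
    current = active_member_type(union, literal)
    while current is not None and current.members is not None:
        current = active_member_type(current, literal)
    return current


# Simplified lexical checks (assumptions; whitespace handling and facets are ignored).
integer = SimpleType("integer", accepts=lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None)
boolean = SimpleType("boolean", accepts=lambda s: s in {"true", "false", "1", "0"})
string = SimpleType("string", accepts=lambda s: True)

inner = SimpleType("inner", members=[integer, boolean])
outer = SimpleType("outer", members=[inner, string])
```

For the literal "true", the active member type of outer is the union inner, while its active basic member is boolean. For "1", integer is chosen because it precedes boolean in inner.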
The order in which the ·member types· are specified in the definition (that is, in the case of datatypes defined in a schema document, the order of the <simpleType> children of the <union> element, or the order of the QNames in the memberTypes attribute) is significant. During validation, an element or attribute's value is validated against the ·member types· in the order in which they appear in the definition until a match is found. The evaluation order can be overridden with the use of xsi:type.
<xsd:element name='size'>
  <xsd:simpleType>
    <xsd:union>
      <xsd:simpleType>
        <xsd:restriction base='integer'/>
      </xsd:simpleType>
      <xsd:simpleType>
        <xsd:restriction base='string'/>
      </xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>
</xsd:element>
<size>1</size>
<size>large</size>
<size xsi:type='xsd:string'>1</size>
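The effect of member order and of xsi:type on the three size instances can be sketched as follows. The names SIZE_MEMBERS and validating_member are hypothetical, the lexical checks are simplified, and treating xsi:type as a direct selection among the member types is an assumption made for illustration only.

```python
import re
from typing import Optional

# Member types of the anonymous union in the 'size' declaration, in definition order.
# The lexical checks are simplified assumptions, not the full rules for these datatypes.
SIZE_MEMBERS = [
    ("integer", lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None),
    ("string", lambda s: True),
]


def validating_member(literal: str, xsi_type: Optional[str] = None) -> Optional[str]:
    """Name of the member type used to validate `literal`, or None if invalid.

    `xsi_type` is the local name taken from an xsi:type attribute. Treating it
    as a direct selection among the member types is a simplification.
    """
    for name, accepts in SIZE_MEMBERS:
        if xsi_type is not None and name != xsi_type:
            continue  # xsi:type restricts validation to the named type
        if accepts(literal):
            return name
    return None
```

Under these assumptions, the first instance is validated as an integer, the second as a string, and the third as a string even though its content would also match integer.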
The ·canonical mapping· of a ·union· datatype maps each value onto the ·canonical representation· of that value obtained using the ·canonical mapping· of the first ·member type· in whose value space it lies.
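The rule for the canonical mapping of a union can be sketched as a first-match search over the member types. In this illustration Python int and str values stand in for the value spaces of integer and string, and the Member tuple and all names are hypothetical.

```python
from typing import Any, Callable, List, Tuple

# Each member is (value-space test, canonical mapping); both are illustrative.
Member = Tuple[Callable[[Any], bool], Callable[[Any], str]]


def union_canonical(members: List[Member], value: Any) -> str:
    """Canonical representation from the first member whose value space holds `value`."""
    for in_value_space, canonical in members:
        if in_value_space(value):
            return canonical(value)
    raise ValueError(f"{value!r} is not in the value space of the union")


# integer-then-string union; Python int and str stand in for the two value spaces.
INTEGER: Member = (lambda v: isinstance(v, int) and not isinstance(v, bool), str)
STRING: Member = (lambda v: isinstance(v, str), lambda v: v)
SIZE = [INTEGER, STRING]
```

Because the search stops at the first member type whose value space contains the value, reordering the member types can change which canonical mapping is applied when value spaces overlap.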
...
Simple Type Definitions provide for:
The Simple Type Definition schema component has the following properties:
Either an Attribute Declaration, an Element Declaration, a Complex Type Definition or a Simple Type Definition.
With one exception, the {base type definition} of any Simple Type Definition is a Simple Type Definition. The exception is ·anySimpleType·, which has anyType, a Complex Type Definition, as its {base type definition}.
If not absent, must be a ·primitive· built-in definition.
↓Must not be empty↓↑Must be present (but may be empty)↑ if {variety} is union, otherwise must be absent.
Simple type definitions are identified by their {name} and {target namespace}. Except for anonymous Simple Type Definitions (those with no {name}), Simple Type Definitions must be uniquely identified within a schema. Within a valid schema, each Simple Type Definition uniquely determines one datatype. The ·value space·, ·lexical space·, ·lexical mapping·, etc., of a Simple Type Definition are the ·value space·, ·lexical space·, etc., of the datatype uniquely determined (or "defined") by that Simple Type Definition.
If {variety} is ·atomic· then the ·value space· of the datatype defined will be a subset of the ·value space· of {base type definition} (which is a subset of the ·value space· of {primitive type definition}). If {variety} is ·list· then the ·value space· of the datatype defined will be the set of finite-length sequences of values from the ·value space· of {item type definition}. If {variety} is ·union· then the ·value space· of the datatype defined will be a subset (possibly an improper subset) of the union of the ·value spaces· of each Simple Type Definition in {member type definitions}.
If {variety} is ·atomic· then the {variety} of {base type definition} must be ·atomic·, unless the {base type definition} is anySimpleType. If {variety} is ·list· then the {variety} of {item type definition} must be either ·atomic· or ·union·, and if {item type definition} is ·union· then all its ·basic members· must be ·atomic·. If {variety} is ·union· then {member type definitions} must be a list of Simple Type Definitions.
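The variety constraints in the preceding paragraph can be sketched as a small checker. The SimpleTypeDef class is a hypothetical, much reduced model of the component (only the properties needed here), and the error messages are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SimpleTypeDef:
    """Reduced, hypothetical model of a Simple Type Definition."""
    name: str
    variety: Optional[str]  # "atomic", "list" or "union"; None for anySimpleType
    base: Optional["SimpleTypeDef"] = None
    item_type: Optional["SimpleTypeDef"] = None
    member_types: List["SimpleTypeDef"] = field(default_factory=list)


ANY_SIMPLE_TYPE = SimpleTypeDef("anySimpleType", variety=None)


def basic_members(t: SimpleTypeDef) -> List[SimpleTypeDef]:
    """Non-union types reachable through the member types of a union."""
    result: List[SimpleTypeDef] = []
    for m in t.member_types:
        if m.variety == "union":
            result.extend(basic_members(m))
        else:
            result.append(m)
    return result


def variety_errors(t: SimpleTypeDef) -> List[str]:
    """Violations of the variety constraints; an empty list means none found."""
    errors: List[str] = []
    if t.variety == "atomic":
        if t.base is not ANY_SIMPLE_TYPE and (t.base is None or t.base.variety != "atomic"):
            errors.append("base type of an atomic type must be atomic or anySimpleType")
    elif t.variety == "list":
        item = t.item_type
        if item is None or item.variety not in ("atomic", "union"):
            errors.append("item type of a list must be atomic or union")
        elif item.variety == "union" and any(
            m.variety != "atomic" for m in basic_members(item)
        ):
            errors.append("every basic member of a union item type must be atomic")
    return errors


# Example definitions (names are illustrative).
atom = SimpleTypeDef("atom", "atomic", base=ANY_SIMPLE_TYPE)
atom_list = SimpleTypeDef("atomList", "list", item_type=atom)
list_of_lists = SimpleTypeDef("listOfLists", "list", item_type=atom_list)
mixed = SimpleTypeDef("mixed", "union", member_types=[atom, atom_list])
list_of_mixed = SimpleTypeDef("listOfMixed", "list", item_type=mixed)
```

Here atom_list passes, while list_of_lists and list_of_mixed are each reported once: the first because its item type is a list, the second because one basic member of its union item type is a list.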
The {facets} property determines the ·value space· and ·lexical space· of the datatype being defined by imposing constraints which must be satisfied by values and ·lexical representations·.
The {fundamental facets} property provides some basic information about the datatype being defined: its cardinality, whether an ordering is defined for it by this specification, whether it has upper and lower bounds, and whether it is numeric.
If {final} is the empty set then the type can be used in deriving other types; the explicit values restriction, list and union prevent further derivations of Simple Type Definitions by ·facet-based restriction·, ·list· and ·union· respectively; the explicit value extension prevents any derivation of Complex Type Definitions by extension.
The {context} property is only relevant for anonymous type definitions, for which its value is the component in which this type definition appears as the value of a property, e.g. {item type definition} or {base type definition}.