The presentation of this document has been augmented to identify changes from a previous version. Three kinds of changes are highlighted: ↑new, added text↑, ↑changed text↓, and ↓deleted text↓.
Please refer to the errata for this document, which may include some normative corrections.
This document is also available in these non-normative formats: XML, Independent copy of the schema for schema documents, A schema for built-in datatypes only, in a separate namespace, and Independent copy of the DTD for schema documents. See also translations.
Copyright © 2004 Id: datatypes-with-errata.xml,v 1.20 2004/10/07 22:51:40 cmsmcq Exp W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
XML Schema: Datatypes is part 2 of the specification of the XML Schema language. It defines facilities for defining datatypes to be used in XML Schemas as well as other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. ↑A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.↓
↑This is a W3C Recommendation, which forms part of the Second Edition of XML Schema.↑ This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document has been produced by the W3C XML Schema Working Group as part of the W3C XML Activity. The goals of the XML Schema language are discussed in the XML Schema Requirements document. The authors of this document are the members of the XML Schema Working Group. Different parts of this specification have different editors.
This version of this document incorporates some editorial changes from earlier versions.
This document was produced under the 24 January 2002 Current Patent Practice (CPP) as amended by the W3C Patent Policy Transition Procedure. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.
Please report errors in this document to www-xml-schema-comments@w3.org (archive). The list of known errors in this specification is available at http://www.w3.org/2001/05/xmlschema-errata.
The English version of this specification is the only normative version. Information about translations of this document is available at http://www.w3.org/2001/05/xmlschema-translations.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.
This second edition is not a new version; it merely incorporates, as a convenience to readers, the changes dictated by the corrections to errors found in the first edition, as agreed by the XML Schema Working Group. A separate list of all such corrections is available at http://www.w3.org/2001/05/xmlschema-errata.
The errata list for this second edition is available at http://www.w3.org/2004/03/xmlschema-errata.
1 Introduction
1.1 Purpose
1.2 Requirements
1.3 Scope
1.4 Terminology
1.5 Constraints and Contributions
2 Type System
2.1 Datatype
2.2 Value space
2.3 Lexical space
2.4 Facets
2.5 Datatype dichotomies
3 Built-in datatypes
3.1 Namespace considerations
3.2 Primitive datatypes
3.3 Derived datatypes
4 Datatype components
4.1 Simple Type Definition
4.2 Fundamental Facets
4.3 Constraining Facets
5 Conformance
A Schema for Datatype Definitions (normative)
B DTD for Datatype Definitions (non-normative)
C Datatypes and Facets
C.1 Fundamental Facets
D ISO 8601 Date and Time Formats
D.1 ISO 8601 Conventions
D.2 Truncated and Reduced Formats
D.3 Deviations from ISO 8601 Formats
E Adding durations to dateTimes
E.1 Algorithm
E.2 Commutativity and Associativity
F Regular Expressions
F.1 Character Classes
G Glossary (non-normative)
H References
H.1 Normative
H.2 Non-normative
I Acknowledgements (non-normative)
The [XML 1.0 (Second Edition)] specification defines limited facilities for applying datatypes to document content in that documents may contain or refer to DTDs that assign types to elements and attributes. However, document authors, including authors of traditional documents and those transporting data in XML, often require a higher degree of type checking to ensure robustness in document understanding and data interchange.
The two examples below are typical XML instances in which datatypes are implicit: the first (data oriented) represents a billing invoice, the second (document oriented) a memo or perhaps an email message in XML.

Data oriented:

<invoice>
  <orderDate>1999-01-21</orderDate>
  <shipDate>1999-01-25</shipDate>
  <billingAddress>
    <name>Ashok Malhotra</name>
    <street>123 Microsoft Ave.</street>
    <city>Hawthorne</city>
    <state>NY</state>
    <zip>10532-0000</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>

Document oriented:

<memo importance='high'
      date='1999-03-23'>
  <from>Paul V. Biron</from>
  <to>Ashok Malhotra</to>
  <subject>Latest draft</subject>
  <body>
    We need to discuss the latest
    draft <emph>immediately</emph>.
    Either email me at <email>
    mailto:paul.v.biron@kp.org</email>
    or call <phone>555-9876</phone>
  </body>
</memo>
The invoice contains several dates and telephone numbers, the postal abbreviation for a state (which comes from an enumerated list of sanctioned values), and a ZIP code (which takes a definable regular form). The memo contains many of the same types of information: a date, telephone number, email address and an "importance" value (from an enumerated list, such as "low", "medium" or "high"). Applications which process invoices and memos need to raise exceptions if something that was supposed to be a date or telephone number does not conform to the rules for valid dates or telephone numbers.
In both cases, validity constraints exist on the content of the instances that are not expressible in XML DTDs. The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This specification addresses the need of both document authors and applications writers for a robust, extensible datatype system for XML which could be incorporated into XML processors. As discussed below, these datatypes could be used in other XML-related standards as well.
The [XML Schema Requirements] document spells out concrete requirements to be fulfilled by this specification, which state that the XML Schema Language must:
This portion of the XML Schema Language discusses datatypes that can be used in an XML Schema. These datatypes can be specified for element content that would be specified as #PCDATA and attribute values of various types in a DTD. It is the intention of this specification that it be usable outside of the context of XML Schemas for a wide range of other XML-related activities such as [XSL] and [RDF Schema].
The terminology used to describe XML Schema Datatypes is defined in the body of this specification. The terms defined in the following list are used in building those definitions and in describing the actions of a datatype processor:
This specification provides three different kinds of normative statements about schema components, their representations in XML and their contribution to the schema-validation of information items:
This section describes the conceptual framework behind the type system defined in this specification. The framework has been influenced by the [ISO 11404] standard on language-independent datatypes as well as the datatypes for [SQL] and for programming languages such as Java.
The datatypes discussed in this specification are computer representations of well known abstract concepts such as integer and date. It is not the place of this specification to define these abstract concepts; many other publications provide excellent definitions.
[Definition:] In this specification, a datatype is a 3-tuple, consisting of a) a set of distinct values, called its ·value space·, b) a set of lexical representations, called its ·lexical space·, and c) a set of ·facet·s that characterize properties of the ·value space·, individual values or lexical items.
[Definition:] A value space is the set of values for a given datatype. Each value in the value space of a datatype is denoted by one or more literals in its ·lexical space·.
The ·value space· of a given datatype can be defined in one of the following ways:
·value space·s have certain properties. For example, they always have the property of ·cardinality·, some definition of equality and might be ·ordered·, by which individual values within the ·value space· can be compared to one another. The properties of ·value space·s that are recognized by this specification are defined in Fundamental facets (§2.4.1).
In addition to its ·value space·, each datatype also has a lexical space.
[Definition:] A lexical space is the set of valid literals for a datatype.
For example, "100" and "1.0E2" are two different literals from the ·lexical space· of float which both denote the same value. The type system defined in this specification provides a mechanism for schema designers to control the set of values and the corresponding set of acceptable literals of those values for a datatype.
While the datatypes defined in this specification have, for the most part, a single lexical representation (i.e., each value in the datatype's ·value space· is denoted by a single literal in its ·lexical space·), this is not always the case. The example in the previous section showed two literals for the datatype float which denote the same value. Similarly, there ·may· be several literals for one of the date or time datatypes that denote the same value using different timezone indicators.
[Definition:] A canonical lexical representation is a set of literals from among the valid set of literals for a datatype such that there is a one-to-one mapping between literals in the canonical lexical representation and values in the ·value space·.
[Definition:] A facet is a single defining aspect of a ·value space·. Generally speaking, each facet characterizes a ·value space· along independent axes or dimensions.
The facets of a datatype serve to distinguish those aspects of one datatype which differ from other datatypes. Rather than being defined solely in terms of a prose description the datatypes in this specification are defined in terms of the synthesis of facet values which together determine the ·value space· and properties of the datatype.
Facets are of two types: fundamental facets that define the datatype and non-fundamental or constraining facets that constrain the permitted values of a datatype.
[Definition:] A fundamental facet is an abstract property which serves to semantically characterize the values in a ·value space·.
All fundamental facets are fully described in Fundamental Facets (§4.2).
[Definition:] A constraining facet is an optional property that can be applied to a datatype to constrain its ·value space·.
Constraining the ·value space· consequently constrains the ·lexical space·. Adding ·constraining facet·s to a ·base type· is described in Derivation by restriction (§4.1.2.1).
All constraining facets are fully described in Constraining Facets (§4.3).
It is useful to categorize the datatypes defined in this specification along various dimensions, forming a set of characterization dichotomies.
The first distinction to be made is that between ·atomic·, ·list· and ·union· datatypes.
For example, a single token which ·match·es Nmtoken from [XML 1.0 (Second Edition)] could be the value of an ·atomic· datatype (NMTOKEN); while a sequence of such tokens could be the value of a ·list· datatype (NMTOKENS).
·atomic· datatypes can be either ·primitive· or ·derived·. The ·value space· of an ·atomic· datatype is a set of "atomic" values, which for the purposes of this specification, are not further decomposable. The ·lexical space· of an ·atomic· datatype is a set of literals whose internal structure is specific to the datatype in question.
Several type systems (such as the one described in [ISO 11404]) treat ·list· datatypes as special cases of the more general notions of aggregate or collection datatypes.
·list· datatypes are always ·derived·. The ·value space· of a ·list· datatype is a set of finite-length sequences of ·atomic· values. The ·lexical space· of a ·list· datatype is a set of literals whose internal structure is a ↓white ↓space↑-↑separated sequence of literals of the ·atomic· datatype of the items in the ·list·↓ (where whitespace ·match·es S in [XML 1.0 (Second Edition)])↓.
[Definition:] The ·atomic· ↑or ·union·↑ datatype that participates in the definition of a ·list· datatype is known as the itemType of that ·list· datatype.
<simpleType name='sizes'>
  <list itemType='decimal'/>
</simpleType>
<cerealSizes xsi:type='sizes'> 8 10.5 12 </cerealSizes>
A ·list· datatype can be ·derived· from an ·atomic· datatype whose ·lexical space· allows ↓white↓space (such as string or anyURI) ↑or a ·union· datatype any of whose {member type definitions}'s ·lexical space· allows ↓white↓space↑. In such a case, regardless of the input, list items will be separated at ↓white↓space boundaries.
<simpleType name='listOfString'>
  <list itemType='string'/>
</simpleType>
<someElement xsi:type='listOfString'>
this is not list item 1
this is not list item 2
this is not list item 3
</someElement>
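The consequence of separating at space boundaries can be illustrated with a short, non-normative Python sketch. It assumes that whitespace normalization followed by splitting is adequately modelled by replacing the four XML whitespace characters with spaces and discarding empty items; the name split_list_literal is hypothetical and not defined by this specification.

```python
# Non-normative sketch: splitting a list literal into item literals.
XML_WHITESPACE = " \t\n\r"  # characters matched by production S in XML 1.0


def split_list_literal(literal: str) -> list[str]:
    """Return the item literals of a list literal, split at whitespace."""
    for ch in XML_WHITESPACE:
        literal = literal.replace(ch, " ")
    # Runs of spaces produce empty strings, which are not list items.
    return [item for item in literal.split(" ") if item]


items = split_list_literal(
    "\nthis is not list item 1\nthis is not list item 2\nthis is not list item 3\n"
)
print(len(items))  # 18: every word is its own item, not one item per line
```

Under these assumptions the someElement content above yields eighteen items of type string rather than three.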
When a datatype is ·derived· from a ·list· datatype, the following ·constraining facet·s apply:
For each of ·length·, ·maxLength· and ·minLength·, the unit of length is measured in number of list items. The value of ·whiteSpace· is fixed to the value collapse.
↑ For ·list· datatypes the ·lexical space· ↓(and hence, the ·value space·)↓ is composed of ↓white↓space↑-↑separated literals of its ·itemType·. Hence, any ·pattern· specified when a new datatype is ·derived· from a ·list· datatype is matched against each literal of the ·list· datatype and not against the literals of the datatype that serves as its ·itemType·. ↑
<xs:simpleType name='myList'>
  <xs:list itemType='xs:integer'/>
</xs:simpleType>
<xs:simpleType name='myRestrictedList'>
  <xs:restriction base='myList'>
    <xs:pattern value='123 (\d+\s)*456'/>
  </xs:restriction>
</xs:simpleType>
<someElement xsi:type='myRestrictedList'>123 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 567 456</someElement>
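The following non-normative Python sketch checks the three someElement instances against the pattern of myRestrictedList. It assumes that Python's re.fullmatch is an acceptable stand-in for the implicitly anchored regular expressions of this specification (the pattern used here lies in the syntax the two dialects share) and that collapsing whitespace can be modelled with str.split and str.join. The function name matches_restricted_list is hypothetical.

```python
# Non-normative sketch: a pattern on a list type is matched against the
# whole list literal, not against each item literal.
import re

PATTERN = re.compile(r"123 (\d+\s)*456")  # value of the pattern facet above


def matches_restricted_list(literal: str) -> bool:
    """Collapse whitespace, then match the complete list literal."""
    collapsed = " ".join(literal.split())  # whiteSpace is fixed to collapse
    return PATTERN.fullmatch(collapsed) is not None


for literal in ("123 456", "123 987 456", "123 987 567 456"):
    print(literal, matches_restricted_list(literal))  # each one matches
```

A single item such as 123 does not match on its own, which shows that the pattern constrains the list literal as a whole.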
The canonical-lexical-representation for the ·list· datatype is defined as the lexical form in which each item in the ·list· has the canonical lexical representation of its ·itemType·.
The ·value space· and ·lexical space· of a ·union· datatype are the union of the ·value space·s and ·lexical space·s of its ·memberTypes·. ·union· datatypes are always ·derived·. Currently, there are no ·built-in· ·union· datatypes.
<attributeGroup name="occurs">
<attribute name="minOccurs" type="nonNegativeInteger"
↑use="optional"↑ default="1"/>
<attribute name="maxOccurs" ↑use="optional" default="1"↑>
<simpleType>
<union>
<simpleType>
<restriction base='nonNegativeInteger'/>
</simpleType>
<simpleType>
<restriction base='string'>
<enumeration value='unbounded'/>
</restriction>
</simpleType>
</union>
</simpleType>
</attribute>
</attributeGroup>
Any number (greater than 1) of ·atomic· or ·list· ·datatype·s can participate in a ·union· type.
[Definition:] The datatypes that participate in the definition of a ·union· datatype are known as the memberTypes of that ·union· datatype.
The order in which the ·memberTypes· are specified in the definition (that is, the order of the <simpleType> children of the <union> element, or the order of the QNames in the memberTypes attribute) is significant. During validation, an element or attribute's value is validated against the ·memberTypes· in the order in which they appear in the definition until a match is found. The evaluation order can be overridden with the use of xsi:type.
<xsd:element name='size'>
<xsd:simpleType>
<xsd:union>
<xsd:simpleType>
<xsd:restriction base='integer'/>
</xsd:simpleType>
<xsd:simpleType>
<xsd:restriction base='string'/>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
</xsd:element>
<size>1</size>
<size>large</size>
<size xsi:type='xsd:string'>1</size>
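The order-dependent evaluation described above can be sketched in Python as follows. This is a non-normative illustration: the two acceptance functions are simplified stand-ins for the integer and string member types of the size example, and the names MEMBER_TYPES and validating_member are hypothetical.

```python
# Non-normative sketch: validating a literal against the memberTypes of a
# union in definition order, with an optional xsi:type override.
import re


def is_integer(literal: str) -> bool:
    # Simplified lexical check for integer: optional sign followed by digits.
    return re.fullmatch(r"[+-]?[0-9]+", literal) is not None


def is_string(literal: str) -> bool:
    # Every literal considered here is acceptable as a string.
    return True


# Order matters: it mirrors the order of the <simpleType> children of <union>.
MEMBER_TYPES = [("integer", is_integer), ("string", is_string)]


def validating_member(literal, xsi_type=None):
    """Return the name of the first member type that accepts the literal."""
    for name, accepts in MEMBER_TYPES:
        if xsi_type is not None and name != xsi_type:
            continue  # xsi:type restricts validation to the named member
        if accepts(literal):
            return name
    return None


print(validating_member("1"))                     # integer
print(validating_member("large"))                 # string
print(validating_member("1", xsi_type="string"))  # string
```

The three calls correspond to the three size instances: the first is validated as an integer, the second as a string, and the third as a string because xsi:type overrides the default order.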
The canonical-lexical-representation for a ·union· datatype is defined as the lexical form in which the values have the canonical lexical representation of the appropriate ·memberTypes·.
Next, we distinguish between ·primitive· and ·derived· datatypes.
For example, in this specification, float is a well-defined mathematical concept that cannot be defined in terms of other datatypes, while an integer is a special case of the more general datatype decimal.
[Definition:] ↑ The simple ur-type definition is a special restriction of the ur-type definition whose name is anySimpleType in the XML Schema namespace. anySimpleType can be considered as the ·base type· of all ·primitive· datatypes. anySimpleType is considered to have an unconstrained lexical space and a ·value space· consisting of the union of the ·value space·s of all the ·primitive· datatypes and the set of all lists of all members of the ·value space·s of all the ·primitive· datatypes. ↑ ↓ There exists a conceptual datatype, whose name is anySimpleType, that is the simple version of the ur-type definition from [XML Schema Part 1: Structures]. anySimpleType can be considered as the ·base type· of all ·primitive· types. The ·value space· of anySimpleType can be considered to be the ·union· of the ·value space·s of all ·primitive· datatypes. ↓
The datatypes defined by this specification fall into both the ·primitive· and ·derived· categories. It is felt that a judiciously chosen set of ·primitive· datatypes will serve the widest possible audience by providing a set of convenient datatypes that can be used as is, as well as providing a rich enough base from which the variety of datatypes needed by schema designers can be ·derived·.
In the example above, integer is ·derived· from decimal.
As described in more detail in XML Representation of Simple Type Definition Schema Components (§4.1.2), each ·user-derived· datatype ·must· be defined in terms of another datatype in one of three ways: 1) by assigning ·constraining facet·s which serve to restrict the ·value space· of the ·user-derived· datatype to a subset of that of the ·base type·; 2) by creating a ·list· datatype whose ·value space· consists of finite-length sequences of values of its ·itemType·; or 3) by creating a ·union· datatype whose ·value space· consists of the union of the ↓·value space·↓↑·value space·s of↑ its ·memberTypes·.
[Definition:] A datatype is said to be ·derived· by restriction from another datatype when values for zero or more ·constraining facet·s are specified that serve to constrain its ·value space· and/or its ·lexical space· to a subset of those of its ·base type·.
[Definition:] Every datatype that is ·derived· by restriction is defined in terms of an existing datatype, referred to as its base type. base types can be either ·primitive· or ·derived·.
A ·list· datatype can be ·derived· from another datatype (its ·itemType·) by creating a ·value space· that consists of a finite-length sequence of values of its ·itemType·.
One datatype can be ·derived· from one or more datatypes by ·union·ing their ·value space·s and, consequently, their ·lexical space·s.
Conceptually there is no difference between the ·built-in· ·derived· datatypes included in this specification and the ·user-derived· datatypes which will be created by individual schema designers. The ·built-in· ·derived· datatypes are those which are believed to be so common that if they were not defined in this specification many schema designers would end up "reinventing" them. Furthermore, including these ·derived· datatypes in this specification serves to demonstrate the mechanics and utility of the datatype generation facilities of this specification.

Each built-in datatype in this specification (both ·primitive· and ·derived·) can be uniquely addressed via a URI Reference constructed as follows:
For example, to address the int datatype, the URI is:
http://www.w3.org/2001/XMLSchema#int
Additionally, each facet definition element can be uniquely addressed via a URI constructed as follows:
For example, to address the maxInclusive facet, the URI is:
http://www.w3.org/2001/XMLSchema#maxInclusive
Additionally, each facet usage in a built-in datatype definition can be uniquely addressed via a URI constructed as follows:
For example, to address the usage of the maxInclusive facet in the definition of int, the URI is:
http://www.w3.org/2001/XMLSchema#int.maxInclusive
The ·built-in· datatypes defined by this specification are designed to be used with the XML Schema definition language as well as other XML specifications. To facilitate usage within the XML Schema definition language, the ·built-in· datatypes in this specification have the namespace name:
To facilitate usage in specifications other than the XML Schema definition language, such as those that do not want to know anything about aspects of the XML Schema definition language other than the datatypes, each ·built-in· datatype is also defined in the namespace whose URI is:
This applies to both ·built-in· ·primitive· and ·built-in· ·derived· datatypes.
Each ·user-derived· datatype is also associated with a unique namespace. However, ·user-derived· datatypes do not come from the namespace defined by this specification; rather, they come from the namespace of the schema in which they are defined (see XML Representation of Schemas in [XML Schema Part 1: Structures]).
The ·primitive· datatypes defined by this specification are described below. For each datatype, the ·value space· and ·lexical space· are defined, ·constraining facet·s which apply to the datatype are listed and any datatypes ·derived· from this datatype are specified.
·primitive· datatypes can only be added by revisions to this specification.
[Definition:] The string datatype represents character strings in XML. The ·value space· of string is the set of finite-length sequences of characters (as defined in [XML 1.0 (Second Edition)]) that ·match· the Char production from [XML 1.0 (Second Edition)]. A character is an atomic unit of communication; it is not further specified except to note that every character has a corresponding Universal Character Set code point, which is an integer.
string has the following ·constraining facets·:
The following ·built-in· datatypes are ·derived· from string:
[Definition:] boolean has the ·value space· required to support the mathematical concept of binary-valued logic: {true, false}.
An instance of a datatype that is defined as ·boolean· can have the following legal literals {true, false, 1, 0}.
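The mapping from the four legal literals to the two values can be written as a lookup table. The Python sketch below is non-normative; the function name parse_boolean is hypothetical, and the sketch assumes that any whitespace normalization has already been applied to the literal.

```python
# Non-normative sketch: the lexical space of boolean has four literals that
# denote only two values.
_BOOLEAN_LITERALS = {"true": True, "1": True, "false": False, "0": False}


def parse_boolean(literal: str) -> bool:
    """Map a boolean literal to its value; reject anything else."""
    try:
        return _BOOLEAN_LITERALS[literal]
    except KeyError:
        raise ValueError(f"not a legal boolean literal: {literal!r}") from None


print(parse_boolean("1") == parse_boolean("true"))  # True
```

Literals are case-sensitive in this sketch, so a spelling such as TRUE is rejected.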
[Definition:] decimal represents ↑a subset of the real numbers, which can be represented by decimal numerals↑↓arbitrary precision decimal numbers↓. The ·value space· of decimal is the set of ↑ numbers that can be obtained by multiplying an integer by a non-positive power of ten, i.e., expressible as i × 10^-n where i and n are integers and n >= 0↑↓the values i × 10^-n, where i and n are integers such that n >= 0↓. ↑Precision is not reflected in this value space; the number 2.0 is not distinct from the number 2.00.↑ The ·order-relation· on decimal is ↑the order relation on real numbers, restricted to this subset↑↓: x < y iff y - x is positive↓.
↓ [Definition:] The ·value space· of types derived from decimal with a value for ·totalDigits· of p is the set of values i × 10^-n, where n and i are integers such that p >= n >= 0 and the number of significant decimal digits in i is less than or equal to p. ↓
↓ [Definition:] The ·value space· of types derived from decimal with a value for ·fractionDigits· of s is the set of values i × 10^-n, where i and n are integers such that 0 <= n <= s. ↓
decimal has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39) separated by a period as a decimal indicator. ↓If ·totalDigits· is specified, the number of digits must be less than or equal to ·totalDigits·. If ·fractionDigits· is specified, the number of digits following the decimal point must be less than or equal to the ·fractionDigits·.↓ An optional leading sign is allowed. If the sign is omitted, "+" is assumed. Leading and trailing zeroes are optional. If the fractional part is zero, the period and following zero(es) can be omitted. For example: -1.23, 12678967.543233, +100000.00, 210.
The canonical representation for decimal is defined by prohibiting certain options from the Lexical representation (§3.2.3.1). Specifically, the preceding optional "+" sign is prohibited. The decimal point is required. Leading and trailing zeroes are prohibited subject to the following: there must be at least one digit to the right and to the left of the decimal point which may be a zero.
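The canonicalization rules can be sketched as a string transformation. The Python function below is a non-normative illustration: the name canonical_decimal is hypothetical, and the regular expression is an assumption about the lexical form (optional sign, digits, optional period and digits, with at least one digit overall). It also assumes that zero is written without a sign, since the ·value space· contains only one zero.

```python
# Non-normative sketch: deriving the canonical representation of a decimal
# literal by removing the optional parts of the lexical representation.
import re

_DECIMAL = re.compile(r"([+-]?)([0-9]*)(?:\.([0-9]*))?")


def canonical_decimal(literal: str) -> str:
    match = _DECIMAL.fullmatch(literal)
    if match is None or not (match.group(2) or match.group(3)):
        raise ValueError(f"not a decimal literal: {literal!r}")
    sign = match.group(1)
    integer_digits = match.group(2)
    fraction_digits = match.group(3) or ""
    # Drop optional zeroes but keep one digit on each side of the point.
    integer_digits = integer_digits.lstrip("0") or "0"
    fraction_digits = fraction_digits.rstrip("0") or "0"
    is_zero = integer_digits == "0" and fraction_digits == "0"
    prefix = "-" if sign == "-" and not is_zero else ""  # "+" is never kept
    return f"{prefix}{integer_digits}.{fraction_digits}"


for literal in ("-1.23", "12678967.543233", "+100000.00", "210"):
    print(literal, "->", canonical_decimal(literal))
```

Because precision is not reflected in the ·value space·, the literals 2.0 and 2.00 receive the same canonical representation under this sketch.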
decimal has the following ·constraining facets·:
[Definition:] float ↑is patterned after↑↓corresponds to↓ the IEEE single-precision 32-bit floating point type [IEEE 754-1985]. The basic ·value space· of float consists of the values m × 2^e, where m is an integer whose absolute value is less than 2^24, and e is an integer between -149 and 104, inclusive. In addition to the basic ·value space· described above, the ·value space· of float also contains the following ↑three↑ special values: ↓positive and negative zero,↓ positive and negative infinity and not-a-number ↑(NaN)↑. The ·order-relation· on float is: x < y iff y - x is positive ↑for x and y in the value space↑. ↑Positive infinity is greater than all other non-NaN values. NaN equals itself but is ·incomparable· with (neither greater than nor less than) any other value in the ·value space·.↑ ↓ Positive zero is greater than negative zero. Not-a-number equals itself and is greater than all float values including positive infinity.↓
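As a non-normative check of the bounds on m and e, the Python sketch below compares the extreme values of the form m × 2^e with the largest finite and smallest positive IEEE single-precision bit patterns. The helper name float32_from_bits is hypothetical; the bit patterns 0x7F7FFFFF and 0x00000001 are the standard encodings of those two values.

```python
# Non-normative sketch: the bounds |m| < 2^24 and -149 <= e <= 104 reproduce
# the range of IEEE single-precision values.
import struct


def float32_from_bits(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


largest = (2**24 - 1) * 2.0**104   # greatest m with greatest e
smallest_positive = 1 * 2.0**-149  # least positive m with least e

print(largest == float32_from_bits(0x7F7FFFFF))            # True
print(smallest_positive == float32_from_bits(0x00000001))  # True
```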
A literal in the ·lexical space· representing a decimal number d maps to the normalized value in the ·value space· of float that is closest to d in the sense defined by [Clinger, WD (1990)]; if d is exactly halfway between two such values then the even value is chosen.
float values have a lexical representation consisting of a mantissa followed, optionally, by the character "E" or "e", followed by an exponent. The exponent ·must· be an integer. The mantissa must be a decimal number. The representations for exponent and mantissa must follow the lexical rules for integer and decimal. If the "E" or "e" and the following exponent are omitted, an exponent value of 0 is assumed.
The special values ↓positive and negative zero,↓ positive and negative infinity and not-a-number have lexical representations ↓0, -0,↓ INF, -INF and NaN, respectively. ↑Lexical representations for zero may take a positive or negative sign.↑
For example, -1E4, 1267.43233E12, 12.78e-2, 12↑, -0, 0↑ and INF are all legal literals for float.
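The lexical mapping for float can be approximated with the non-normative Python sketch below. Several assumptions apply: the function name parse_xsd_float and the regular expression are illustrative only; the literal is first converted to a double and then rounded to single precision, which can differ in rare cases from rounding the decimal number directly to the closest float; and literals whose magnitude exceeds the single-precision range raise OverflowError in this sketch rather than being handled.

```python
# Non-normative sketch: mapping float literals to single-precision values.
import math
import re
import struct

# Mantissa in decimal form, optionally followed by E or e and an integer.
_NUMERIC = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([Ee][+-]?[0-9]+)?")
_SPECIAL = {"INF": math.inf, "-INF": -math.inf, "NaN": math.nan}


def parse_xsd_float(literal: str) -> float:
    if literal in _SPECIAL:
        return _SPECIAL[literal]
    if _NUMERIC.fullmatch(literal) is None:
        # Rejects spellings Python itself accepts, such as "inf" or "nan".
        raise ValueError(f"not a float literal: {literal!r}")
    as_double = float(literal)
    # Round-trip through 4 bytes to round the double to single precision.
    return struct.unpack("<f", struct.pack("<f", as_double))[0]


print(parse_xsd_float("100") == parse_xsd_float("1.0E2"))  # True
```

The final line shows two different literals from the ·lexical space· denoting the same value.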
The canonical representation for float is defined by prohibiting certain options from the Lexical representation (§3.2.4.1). Specifically, the exponent must be indicated by "E". Leading zeroes and the preceding optional "+" sign are prohibited in the exponent. ↑ If the exponent is zero, it must be indicated by "E0". ↑ For the mantissa, the preceding optional "+" sign is prohibited and the decimal point is required. ↓ For the exponent, the preceding optional "+" sign is prohibited. ↓ Leading and trailing zeroes are prohibited subject to the following: number representations must be normalized such that there is a single digit ↑which is non-zero↑ to the left of the decimal point and at least a single digit to the right of the decimal point ↑ unless the value being represented is zero. The canonical representation for zero is 0.0E0↑.
float has the following ·constraining facets·:
[Definition:] The double datatype ↑is patterned after the↑ ↓corresponds to↓ IEEE double-precision 64-bit floating point type [IEEE 754-1985]. The basic ·value space· of double consists of the values m × 2^e, where m is an integer whose absolute value is less than 2^53, and e is an integer between -1075 and 970, inclusive. In addition to the basic ·value space· described above, the ·value space· of double also contains the following ↑three↑ special values: ↓positive and negative zero,↓ positive and negative infinity and not-a-number ↑(NaN)↑. The ·order-relation· on double is: x < y iff y - x is positive ↑for x and y in the value space↑. ↑Positive infinity is greater than all other non-NaN values. NaN equals itself but is ·incomparable· with (neither greater than nor less than) any other value in the ·value space·.↑ ↓ Positive zero is greater than negative zero. Not-a-number equals itself and is greater than all float values including positive infinity.↓
A literal in the ·lexical space· representing a decimal number d maps to the normalized value in the ·value space· of double that is closest to d; if d is exactly halfway between two such values then the even value is chosen. This is the best approximation of d ([Clinger, WD (1990)], [Gay, DM (1990)]), which is more accurate than the mapping required by [IEEE 754-1985].
double values have a lexical representation consisting of a mantissa followed, optionally, by the character "E" or "e", followed by an exponent. The exponent ·must· be an integer. The mantissa must be a decimal number. The representations for exponent and mantissa must follow the lexical rules for integer and decimal. If the "E" or "e" and the following exponent are omitted, an exponent value of 0 is assumed.
The special values ↓positive and negative zero,↓ positive and negative infinity and not-a-number have lexical representations ↓0, -0,↓ INF, -INF and NaN, respectively. ↑Lexical representations for zero may take a positive or negative sign.↑
For example, -1E4, 1267.43233E12, 12.78e-2, 12↑, -0, 0↑ and INF are all legal literals for double.
The canonical representation for double is defined by prohibiting certain options from the Lexical representation (§3.2.5.1). Specifically, the exponent must be indicated by "E". Leading zeroes and the preceding optional "+" sign are prohibited in the exponent. ↑ If the exponent is zero, it must be indicated by "E0". ↑ For the mantissa, the preceding optional "+" sign is prohibited and the decimal point is required. ↓ For the exponent, the preceding optional "+" sign is prohibited. ↓ Leading and trailing zeroes are prohibited subject to the following: number representations must be normalized such that there is a single digit ↑which is non-zero↑ to the left of the decimal point and at least a single digit to the right of the decimal point ↑ unless the value being represented is zero. The canonical representation for zero is 0.0E0↑.
double has the following ·constraining facets·:
[Definition:] duration represents a duration of time. The ·value space· of duration is a six-dimensional space where the coordinates designate the Gregorian year, month, day, hour, minute, and second components defined in § 5.5.3.2 of [ISO 8601], respectively. These components are ordered in their significance by their order of appearance i.e. as year, month, day, hour, minute, and second.
Note:
All ·minimally conforming· processors ·must· support year values with a minimum of 4 digits (i.e., YYYY) and a minimum fractional second precision of milliseconds or three decimal digits (i.e., s.sss). However, ·minimally conforming· processors ·may· set an application-defined limit on the maximum number of digits they are prepared to support in these two cases, in which case that application-defined maximum number ·must· be clearly documented.
The lexical representation for duration is the [ISO 8601] extended format PnYnMnDTnHnMnS, where nY represents the number of years, nM the number of months, nD the number of days, 'T' is the date/time separator, nH the number of hours, nM the number of minutes and nS the number of seconds. The number of seconds can include decimal digits to arbitrary precision.
The values of the Year, Month, Day, Hour and Minutes components are not restricted but allow an arbitrary ↑unsigned↑ integer↑, i.e., an integer that conforms to the pattern [0-9]+↑. Similarly, the value of the Seconds component allows an arbitrary ↑unsigned↑ decimal. ↑Following [ISO 8601], at least one digit must follow the decimal point if it appears. That is, the value of the Seconds component must conform to the pattern [0-9]+(\.[0-9]+)?.↑ Thus, the lexical representation of duration does not follow the alternative format of § 5.5.3.2.1 of [ISO 8601].
An optional preceding minus sign ('-') is allowed, to indicate a negative duration. If the sign is omitted a positive duration is indicated. See also ISO 8601 Date and Time Formats (§D).
For example, to indicate a duration of 1 year, 2 months, 3 days, 10 hours, and 30 minutes, one would write: P1Y2M3DT10H30M. One could also indicate a duration of minus 120 days as: -P120D.
Reduced precision and truncated representations of this format are allowed provided they conform to the following:
For example, P1347Y, P1347M and P1Y2MT2H are all allowed; P0Y1347M and P0Y1347M0D are allowed. P-1347M is not allowed although -P1347M is allowed. P1Y2MT is not allowed.
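The examples above can be checked mechanically. The following is a minimal sketch, with hypothetical names `DURATION_LEXICAL` and `is_duration_literal`. It assumes that at least one component must be present, that components appear in the fixed order of the PnYnMnDTnHnMnS format, and that 'T' may appear only when a time component follows it. The component patterns [0-9]+ and [0-9]+(\.[0-9]+)? are the ones given in the text.

```python
import re

# Assumed pattern: optional minus sign, "P", optional date components, then an
# optional "T" that must be followed by at least one time component.
DURATION_LEXICAL = re.compile(
    r"-?P(?=[0-9T])"
    r"(?:[0-9]+Y)?(?:[0-9]+M)?(?:[0-9]+D)?"
    r"(?:T(?=[0-9])(?:[0-9]+H)?(?:[0-9]+M)?(?:[0-9]+(?:\.[0-9]+)?S)?)?"
)


def is_duration_literal(text: str) -> bool:
    """Return True if text looks like a legal lexical form for duration."""
    return DURATION_LEXICAL.fullmatch(text) is not None


# Literals the text lists as allowed.
for literal in ["P1347Y", "P1347M", "P1Y2MT2H", "P0Y1347M", "P0Y1347M0D", "-P1347M"]:
    assert is_duration_literal(literal)

# Literals the text lists as not allowed.
for literal in ["P-1347M", "P1Y2MT"]:
    assert not is_duration_literal(literal)
```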
In general, the ·order-relation· on duration is a partial order since there is no determinate relationship between certain durations such as one month (P1M) and 30 days (P30D). The ·order-relation· of two duration values x and y is x < y iff s+x < s+y for each qualified dateTime s in the list below. These values for s cause the greatest deviations in the addition of dateTimes and durations. Addition of durations to time instants is defined in Adding durations to dateTimes (§E).
The following table shows the strongest relationship that can be determined between example durations. The symbol <> means that the order relation is indeterminate. Note that because of leap-seconds, a seconds field can vary from 59 to 60. However, because of the way that addition is defined in Adding durations to dateTimes (§E), they are still totally ordered.
| Relation | | | | | | |
|---|---|---|---|---|---|---|
| P1Y | > P364D | <> P365D | <> P366D | < P367D | | |
| P1M | > P27D | <> P28D | <> P29D | <> P30D | <> P31D | < P32D |
| P5M | > P149D | <> P150D | <> P151D | <> P152D | <> P153D | < P154D |
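The relations in the table can be reproduced with a small sketch of the order relation. Everything here is illustrative. The four reference instants in `REFERENCE` are an assumption taken from the list in the published Recommendation. A duration is modeled as a hypothetical (months, seconds) pair of non-negative numbers. `add_duration` is a simplified stand-in for the algorithm in Adding durations to dateTimes (§E), not a full implementation of it.

```python
import calendar
from datetime import datetime, timedelta

# Assumption: the four qualified dateTimes used to define the order relation.
REFERENCE = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]

DAY = 86400  # seconds in a day, ignoring leap seconds


def add_duration(start: datetime, months: int, seconds: float) -> datetime:
    """Add months first (clamping the day of month), then the seconds."""
    total = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    moved = start.replace(year=year, month=month0 + 1,
                          day=min(start.day, last_day))
    return moved + timedelta(seconds=seconds)


def compare(a, b) -> str:
    """Return '<', '>', '=' or '<>' for two (months, seconds) durations."""
    outcomes = set()
    for s in REFERENCE:
        x, y = add_duration(s, *a), add_duration(s, *b)
        outcomes.add("<" if x < y else ">" if x > y else "=")
    # The relation is determinate only if every reference instant agrees.
    return outcomes.pop() if len(outcomes) == 1 else "<>"


assert compare((12, 0), (0, 364 * DAY)) == ">"   # P1Y > P364D
assert compare((12, 0), (0, 365 * DAY)) == "<>"  # P1Y <> P365D
assert compare((12, 0), (0, 367 * DAY)) == "<"   # P1Y < P367D
```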
Implementations are free to optimize the computation of the ordering relationship. For example, the following table can be used to compare durations of a small number of months against days.
| Months | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Days | Minimum | 28 | 59 | 89 | 120 | 150 | 181 | 212 | 242 | 273 | 303 | 334 | 365 | 393 | ... |
| | Maximum | 31 | 62 | 92 | 123 | 153 | 184 | 215 | 245 | 276 | 306 | 337 | 366 | 397 | ... |
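One possible way to use this table is a direct lookup, sketched below. The names `MIN_DAYS`, `MAX_DAYS` and `compare_months_to_days` are hypothetical; the numbers are copied from the table, and only the 1 to 13 month range it covers is supported.

```python
# Minimum and maximum day counts for 1..13 months, copied from the table.
MIN_DAYS = [28, 59, 89, 120, 150, 181, 212, 242, 273, 303, 334, 365, 393]
MAX_DAYS = [31, 62, 92, 123, 153, 184, 215, 245, 276, 306, 337, 366, 397]


def compare_months_to_days(months: int, days: int) -> str:
    """Return '>', '<' or '<>' for a P{months}M duration against P{days}D."""
    if not 1 <= months <= len(MIN_DAYS):
        raise ValueError("the table covers 1 to 13 months only")
    if days < MIN_DAYS[months - 1]:
        return ">"   # the months are always longer than this many days
    if days > MAX_DAYS[months - 1]:
        return "<"   # the months are always shorter than this many days
    return "<>"      # inside the range the order is indeterminate


assert compare_months_to_days(5, 149) == ">"
assert compare_months_to_days(5, 151) == "<>"
assert compare_months_to_days(5, 154) == "<"
```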
In comparing duration values with minInclusive, minExclusive, maxInclusive and maxExclusive facet values, indeterminate comparisons should be considered as "false".
Certain derived datatypes of durations can be guaranteed to have a total order. For this, they must have fields from only one row in the list below and the time zone must either be required or prohibited.
For example, a datatype could be defined to correspond to the [SQL] datatype Year-Month interval that required a four digit year field and a two digit month field but required all other fields to be unspecified. This datatype could be defined as below and would have a total order.
<simpleType name='SQL-Year-Month-Interval'>
  <restriction base='duration'>
    <pattern value='P\p{Nd}{4}Y\p{Nd}{2}M'/>
  </restriction>
</simpleType>
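To see which literals such a pattern admits, the sketch below approximates it in Python. The names are hypothetical. Python's `re` module does not support \p{Nd}, so \d is used instead, which for str patterns matches Unicode decimal digits. `fullmatch` is used because a schema pattern must match the entire literal.

```python
import re

# Approximation of the pattern P\p{Nd}{4}Y\p{Nd}{2}M from the schema above.
SQL_YEAR_MONTH = re.compile(r"P\d{4}Y\d{2}M")


def is_sql_year_month_interval(text: str) -> bool:
    """Return True if text has a four-digit year and a two-digit month field."""
    return SQL_YEAR_MONTH.fullmatch(text) is not None


assert is_sql_year_month_interval("P0001Y02M")
assert not is_sql_year_month_interval("P1Y2M")         # fields too short
assert not is_sql_year_month_interval("P2004Y10MT1H")  # extra field present
```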