XML Schema 1.1 Part 2: Datatypes

wd-20060831

W3C Working Draft

31 August 2006

This version:
http://www.w3.org/TR/2006/WD-xmlschema11-2-20060217/

This document is also available in these non-normative formats:
XML
XHTML with changes since version 1.0 marked
XHTML with changes since previous Working Draft marked
Independent copy of the schema for schema documents
A schema for built-in datatypes only, in a separate namespace
Independent copy of the DTD for schema documents
List of translations

Latest version:
http://www.w3.org/TR/xmlschema11-2/

Previous versions:
http://www.w3.org/TR/2006/WD-xmlschema11-2-20060217/
http://www.w3.org/TR/2006/WD-xmlschema11-2-20060116/
http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/
http://www.w3.org/TR/2004/WD-xmlschema11-2-20040716/

Editors:
David Peterson, invited expert (SGMLWorks!), davep@iit.edu
Paul V. Biron, Kaiser Permanente, for Health Level Seven, Paul.V.Biron@kp.org
Ashok Malhotra, Oracle Corporation, ashokmalhotra@alum.mit.edu
C. M. Sperberg-McQueen, World Wide Web Consortium, cmsmcq@w3.org

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a member-only review version which will in due course become a Last Call Public Working Draft of XML Schema 1.1: Datatypes. It has no formal standing within W3C; it is here made available for review by W3C members and the public. It is intended to give an indication of the W3C XML Schema Working Group's intentions for this new version of the XML Schema language and our progress in achieving them. It attempts to be complete in indicating what will change from version 1.0, but does not specify in all cases how things will change. This version of this document was created on 31 August 2006. It reflects (unless otherwise noted elsewhere) all decisions on this document made by the Working Group through 23 August 2006.

The text of this draft is essentially that which appeared in the Last Call version of this specification published 17 February 2006. The WG has not approved any changes since that publication; new versions of the status-quo documents have been generated primarily for technical reasons internal to the editorial document-production system.

For those primarily interested in the changes since version 1.0, the appendix, which summarizes both changes already made and also those in prospect, with links to the relevant sections of this draft, is the recommended starting point. An accompanying version of this document displays in color all changes to normative text since version 1.0; another shows changes since the previous Working Draft.

The major changes since version 1.0 include:

Support for XML 1.1 has been added. It is now implementation defined whether datatypes dependent on definitions in XML and Namespaces in XML use the definitions as found in version 1.1 or version 1.0 of those specifications.

A new primitive decimal type has been defined, which retains information about the precision of the value. This type is aligned with the floating-point decimal types which will be part of the next edition of IEEE 754.

In order to align this specification with those being prepared by the XSL and XML Query Working Groups, a new datatype named anyAtomicType, which serves as the base type definition for all primitive atomic datatypes, has been introduced.

The conceptual model of the date- and time-related types has been defined more formally.

A more formal treatment of the fundamental facets of the primitive datatypes has been adopted.

More formal definitions of the lexical space of most types have been provided, with detailed descriptions of the mappings from lexical representation to value and from value to canonical representation.
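As an illustration of what it means for a decimal type to retain information about the precision of a value, the following minimal sketch uses Python's standard `decimal` module. That module is not the datatype defined by this specification; it is used here only as an assumed stand-in because it also keeps trailing zeros. The helper name `precision_of` is hypothetical, and the sketch treats precision as the number of digits after the decimal point.

```python
from decimal import Decimal


def precision_of(literal: str) -> int:
    """Return the number of fractional digits carried by a decimal literal.

    Hypothetical helper for illustration only; a negative result means the
    value is only known to a multiple of ten, hundred, and so on.
    """
    value = Decimal(literal)
    if not value.is_finite():
        raise ValueError("precision is only computed for finite values here")
    # The exponent of the stored value is the negated count of fractional digits.
    return -value.as_tuple().exponent


a = Decimal("1.20")
b = Decimal("1.2")

# The two values compare equal numerically ...
print(a == b)
# ... but each keeps the precision with which it was written.
print(str(a), precision_of("1.20"))
print(str(b), precision_of("1.2"))
```

A type that discards precision would treat `1.20` and `1.2` as the same value in every respect; a precision-retaining type keeps them distinguishable while still ordering them as numerically equal.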

Changes since the previous Working Draft include the following:

Explicit definitions are provided for the lexical spaces, lexical mappings, and canonical mappings of , , , and . In the case of , the mappings are defined by reference to the relevant RFCs.

The validation rule has been recast in briefer, more declarative form. A paraphrase of the constraint in procedural terms, which corrects some errors in the previous versions of this document, has been added as a note.

The rules governing partial implementations of infinite datatypes have been clarified.

Various changes have been made in order to align the relevant parts of this specification more closely with the corresponding sections of XML Schema 1.1 Part 1: Structures.

In order to correct an error in version 1 of this specification and of XML Schema Part 1: Structures, unions are no longer forbidden to be members of other unions. Descriptions of types have also been changed to reflect the fact that unions can be derived by restricting other unions. The concepts of transitive membership (the members of all members, recursively) and basic members (those datatypes in the transitive membership which are not unions) have been introduced and are used.

An error in the prose descriptions of the lexical spaces of unsignedLong, unsignedInt, unsignedShort, and unsignedByte has been corrected by allowing for the possibility of a sign.

The schema for schema documents has been modified; the source declarations for the built-in datatypes have been removed and placed in a separate appendix. They do not need to be present in the schema for schema documents, since they are automatically present in any schema.
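The notions of transitive membership and basic members described in the list above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the specification's component model: the `Datatype` class and the functions `transitive_membership` and `basic_members` are hypothetical names, and a datatype is treated as a union exactly when it has members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datatype:
    """Hypothetical, simplified datatype: a union iff it has members."""
    name: str
    members: tuple = ()

    @property
    def is_union(self) -> bool:
        return bool(self.members)


def transitive_membership(union: Datatype) -> list:
    """Collect the members of a union and, recursively, of member unions."""
    found = []
    for member in union.members:
        if member not in found:
            found.append(member)
        for inner in transitive_membership(member):
            if inner not in found:
                found.append(inner)
    return found


def basic_members(union: Datatype) -> list:
    """Keep only the datatypes in the transitive membership that are not unions."""
    return [m for m in transitive_membership(union) if not m.is_union]


integer = Datatype("integer")
boolean = Datatype("boolean")
string = Datatype("string")
inner_union = Datatype("integerOrBoolean", (integer, boolean))
outer_union = Datatype("anyOfThree", (inner_union, string))  # a union of a union

print([t.name for t in transitive_membership(outer_union)])
print([t.name for t in basic_members(outer_union)])
```

In this model the inner union appears in the transitive membership of the outer union but not among its basic members, which is the distinction the two terms are meant to capture.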

Comments on this document should be made in W3C's public installation of Bugzilla, specifying "XML Schema" as the product. Instructions can be found at http://www.w3.org/XML/2006/01/public-bugzilla. If access to Bugzilla is not feasible, please send your comments to the W3C XML Schema comments mailing list, www-xml-schema-comments@w3.org (archive). Each Bugzilla entry and email message should contain only one comment.

The end of the Last Call review period is 31 March 2006; comments received after that date will be considered if time allows, but no guarantees can be offered.

Although feedback based on any aspect of this specification is welcome, there are certain aspects of the design presented herein for which the Working Group is particularly interested in feedback. These are designated priority feedback aspects of the design, and identified as such in editorial notes at appropriate points in this draft.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced by the W3C XML Schema Working Group as part of the W3C XML Activity. The goals of the XML Schema language version 1.1 are discussed in the Requirements for XML Schema 1.1 document. The authors of this document are the members of the XML Schema Working Group. Different parts of this specification have different editors.

This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures made in connection with this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification must disclose the information in accordance with section 6 of the W3C Patent Policy.

The English version of this specification is the only normative version. Information about translations of this document is available at http://www.w3.org/2003/03/Translations/byTechnology?technology=xmlschema.

XML Schema: Datatypes is part 2 of the specification of the XML Schema language. It defines facilities for defining datatypes to be used in XML Schemas as well as other XML specifications. The datatype language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs) for specifying datatypes on elements and attributes.

This happened in the WD of 200502 and should be marked as an add vis-a-vis that WD and vis-a-vis 1.0.

Introduction

Introduction to Version 1.1

The Working Group has two main goals for this version of W3C XML Schema:

Significant improvements in simplicity of design and clarity of exposition without loss of backward or forward compatibility;

Provision of support for versioning of XML languages defined using the XML Schema specification, including the XML transfer syntax for schemas itself.

These goals are slightly in tension with one another -- the following summarizes the Working Group's strategic guidelines for changes between versions 1.0 and 1.1:

Add support for versioning (acknowledging that this may be slightly disruptive to the XML transfer syntax at the margins)

Allow bug fixes (unless in specific cases we decide that the fix is too disruptive for a point release)

Allow editorial changes

Allow design cleanup to change behavior in edge cases

Allow relatively non-disruptive changes to type hierarchy (to better support current and forthcoming international standards and W3C recommendations)

Allow design cleanup to change component structure (changes to functionality restricted to edge cases)

Do not allow any significant changes in functionality

Do not allow any changes to XML transfer syntax except those required by version control hooks and bug fixes

The overall aim as regards compatibility is that

All schema documents conformant to version 1.0 of this specification should also conform to version 1.1, and should have the same validation behavior across 1.0 and 1.1 implementations (except possibly in edge cases and in the details of the resulting PSVI);

The vast majority of schema documents conformant to version 1.1 of this specification should also conform to version 1.0, leaving aside any incompatibilities arising from support for versioning, and when they are conformant to version 1.0 (or are made conformant by the removal of versioning information), should have the same validation behavior across 1.0 and 1.1 implementations (again except possibly in edge cases and in the details of the resulting PSVI).

Purpose

The XML specification defines limited facilities for applying datatypes to document content in that documents may contain or refer to DTDs that assign types to elements and attributes. However, document authors, including authors of traditional documents and those transporting data in XML, often require a higher degree of type checking to ensure robustness in document understanding and data interchange.

The two examples below show typical XML instances in which datatypes are implicit: the first instance represents a billing invoice, the second a memo or perhaps an email message in XML.

Data oriented:

<invoice>
  <orderDate>1999-01-21</orderDate>
  <shipDate>1999-01-25</shipDate>
  <billingAddress>
    <name>Ashok Malhotra</name>
    <street>123 Microsoft Ave.</street>
    <city>Hawthorne</city>
    <state>NY</state>
    <zip>10532-0000</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>

Document oriented:

<memo importance='high' date='1999-03-23'>
  <from>Paul V. Biron</from>
  <to>Ashok Malhotra</to>
  <subject>Latest draft</subject>
  <body>
    We need to discuss the latest draft <emph>immediately</emph>.
    Either email me at <email>
    mailto:paul.v.biron@kp.org</email> or call <phone>555-9876</phone>
  </body>
</memo>

The invoice contains several dates and telephone numbers, the postal abbreviation for a state (which comes from an enumerated list of sanctioned values), and a ZIP code (which takes a definable regular form). The memo contains many of the same types of information: a date, telephone number, email address and an "importance" value (from an enumerated list, such as "low", "medium" or "high"). Applications which process invoices and memos need to raise exceptions if something that was supposed to be a date or telephone number does not conform to the rules for valid dates or telephone numbers.

In both cases, validity constraints exist on the content of the instances that are not expressible in XML DTDs. The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This specification addresses the need of both document authors and applications writers for a robust, extensible datatype system for XML which could be incorporated into XML processors. As discussed below, these datatypes could be used in other XML-related standards as well.

Dependencies on Other Specifications

Other specifications on which this one depends are listed in the References.

This specification defines some datatypes which depend on definitions in the XML and Namespaces in XML specifications; those definitions, and therefore the datatypes based on them, vary between version 1.0 and version 1.1 of those specifications. In any given use of this specification, the choice of the 1.0 or the 1.1 definition of those datatypes is implementation-defined.

Conforming implementations of this specification may provide either the 1.1-based datatypes or the 1.0-based datatypes, or both. If both are supported, the choice of which datatypes to use in a particular assessment episode should be under user control.

When this specification is used to check the datatype validity of XML input, implementations may provide the heuristic of using the 1.1 datatypes if the input is labeled as XML 1.1, and using the 1.0 datatypes if the input is labeled 1.0, but this heuristic should be subject to override by users, to support cases where users wish to accept XML 1.1 input but validate it using the 1.0 datatypes, or accept XML 1.0 input and validate it using the 1.1 datatypes.
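The selection policy described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `choose_datatype_version`, its parameters, and the fallback to the newest supported version are hypothetical and are not defined by this specification.

```python
from typing import Optional, Set


def choose_datatype_version(supported: Set[str],
                            input_xml_version: Optional[str] = None,
                            user_choice: Optional[str] = None) -> str:
    """Pick the "1.0"-based or "1.1"-based datatypes for one assessment episode."""
    if not supported or not supported <= {"1.0", "1.1"}:
        raise ValueError("supported must be a non-empty subset of {'1.0', '1.1'}")
    # An explicit user choice overrides the heuristic.
    if user_choice is not None:
        if user_choice not in supported:
            raise ValueError(f"datatypes based on XML {user_choice} are not supported")
        return user_choice
    # Heuristic: follow the version label of the input when that version is supported.
    if input_xml_version in supported:
        return input_xml_version
    # Assumed fallback (not prescribed by the text): the newest supported version.
    return max(supported)
```

For instance, `choose_datatype_version({"1.0", "1.1"}, "1.1", user_choice="1.0")` validates XML 1.1 input against the 1.0-based datatypes, which is the override case the paragraph above calls for.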

Requirements

The XML Schema Requirements document spells out concrete requirements to be fulfilled by this specification, which state that the XML Schema Language must:

provide for primitive data typing, including byte, date, integer, sequence, SQL and Java primitive datatypes, etc.;

define a type system that is adequate for import/export from database systems (e.g., relational, object, OLAP);

distinguish requirements relating to lexical data representation vs. those governing an underlying information set;

allow creation of user-defined datatypes, such as datatypes that are derived from existing datatypes and which may constrain certain of its properties (e.g., range, precision, length, format).

Scope

This portion of the XML Schema Language discusses datatypes that can be used in an XML Schema. These datatypes can be specified for element content that would be specified as #PCDATA and attribute values of various types in a DTD. It is the intention of this specification that it be usable outside of the context of XML Schemas for a wide range of other XML-related activities such as XSL and RDF Schema.

Terminology

The terminology used to describe XML Schema Datatypes is defined in the body of this specification. The terms defined in the following list are used in building those definitions and in describing the actions of a datatype processor:

for compatibility

A feature of this specification included solely to ensure that schemas which use this feature remain compatible with XML.

may

Conforming documents and processors are permitted to but need not behave as described.

match

(Of strings or names:) Two strings or names being compared must be identical. Characters with multiple possible representations in ISO/IEC 10646 (e.g. characters with both precomposed and base+diacritic forms) match only if they have the same representation in both strings. No case folding is performed. (Of strings and rules in the grammar:) A string matches a grammatical production if and only if it belongs to the language generated by that production.

must

Conforming documents and processors are required to behave as described; otherwise they are in error.

error

A violation of the rules of this specification; results are undefined. Conforming software may detect and report an error and may recover from it.

Constraints and Contributions

This specification provides three different kinds of normative statements about schema components, their representations in XML and their contribution to the schema-validation of information items:

Constraint on Schemas

Constraints on the schema components themselves, i.e. conditions components must satisfy to be components at all. Largely to be found in the section on datatype components.

Schema Representation Constraint

Constraints on the representation of schema components in XML. Some but not all of these are expressed in the schema for schema documents and the DTD for schema documents.

Validation Rule

Constraints expressed by schema components which information items must satisfy to be schema-valid. Largely to be found in the section on datatype components.

Datatype System

This section describes the conceptual framework behind the datatype system defined in this specification. The framework has been influenced by the ISO 11404 standard on language-independent datatypes as well as the datatypes for SQL and for programming languages such as Java.

The datatypes discussed in this specification are computer representations of, for the most part, well-known abstract concepts such as integer and date. It is not the place of this specification to thoroughly define these abstract concepts; many other publications provide excellent definitions. However, this specification will attempt to describe the abstract concepts well enough that they can be readily recognized and distinguished from other abstractions with which they may be confused.

Only those operations and relations needed for schema processing are defined in this specification. Applications using these datatypes are generally expected to implement appropriate additional functions and/or relations to make the datatype generally useful. For example, the description herein of the float datatype does not define addition or multiplication, much less all of the operations defined for that datatype in IEEE 754, on which it is based. For some datatypes (e.g. language or anyURI) defined in part by reference to other specifications which impose constraints not part of the datatypes as defined here, applications may also wish to check that values conform to the requirements given in the current version of the relevant external specification.

Datatype

In this specification, a datatype is a 3-tuple, consisting of a) a set of distinct values, called its value space, b) a set of lexical representations, called its lexical space, and c) a set of facets that characterize properties of the value space, individual values or lexical items.

In this specification, a datatype has three properties:

A value space, which is a set of values.

A lexical space, which is a set of literals used to denote the values.

A small collection of functions, relations, and procedures associated with the datatype. Included are equality and order relations on the value space, and a lexical mapping, which is a function on the lexical space onto the value space.

This specification only defines the operations and relations needed for schema processing. The choice of terminology for describing/naming the datatypes is selected to guide users and implementers in how to expand the datatype to be generally useful—i.e., how to recognize the real world datatypes and their variants for which the datatypes defined herein are meant to be used for data interchange.

Along with the lexical mapping it is often useful to have an inverse which provides a standard lexical representation for each value. Such a canonical mapping is not required for schema processing, but is described herein for the benefit of users of this specification, and other specifications which might find it useful to reference these descriptions normatively. For some datatypes, notably QName and NOTATION, the mapping from lexical representations to values is context-dependent; for these types, no canonical mapping is defined.

Where canonical mappings are defined in this specification, they are defined for primitive datatypes. When a datatype is derived using facets which directly constrain the value space, then for each value eliminated from the value space, the corresponding lexical representations are dropped from the lexical space. The canonical mapping for such a datatype is a subset of the canonical mapping for its primitive type and provides a canonical representation for each value remaining in the value space.

The pattern facet, on the other hand, restricts the lexical space directly. When more than one lexical representation is provided for a given value, the pattern facet may remove the canonical representation while permitting a different lexical representation; in this case, the value remains in the value space but has no canonical representation. This specification provides no recourse in such situations. Applications are free to deal with it as they see fit.
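The situation can be illustrated with a small sketch. It assumes a boolean-like datatype whose lexical space is the four literals "true", "false", "1" and "0", with "true" and "false" as canonical representations; the dictionaries and the helper `restrict_by_pattern` are hypothetical names used only for this illustration, and Python's `re` module stands in for the regular expression language of this specification.

```python
import re

# Assumed lexical mapping: four literals denote two values.
LEXICAL_MAPPING = {"true": True, "1": True, "false": False, "0": False}
# Assumed canonical mapping: one canonical representation per value.
CANONICAL_MAPPING = {True: "true", False: "false"}


def restrict_by_pattern(lexical_mapping, pattern):
    """Keep only the literals that match the pattern in full."""
    return {literal: value for literal, value in lexical_mapping.items()
            if re.fullmatch(pattern, literal)}


# A derivation whose pattern admits only the one-character literals.
restricted = restrict_by_pattern(LEXICAL_MAPPING, r"[01]")

# Both values are still denoted by some literal, so both stay in the value space.
value_space = set(restricted.values())

# Neither value keeps its canonical representation in the restricted lexical space.
lost_canonical = {value for value in value_space
                  if CANONICAL_MAPPING[value] not in restricted}
```

Here `value_space` still contains both values, while `lost_canonical` contains both as well: each value remains denotable, but only by a non-canonical literal.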

Value space

The value space of a datatype is the set of values for that datatype. Associated with each value space are selected operations and relations necessary to permit proper schema processing. Each value in the value space of a datatype is denoted by one or more character strings in its lexical space, according to the lexical mapping. (If the mapping is restricted during a derivation in such a way that a value has no denotation, that value is dropped from the value space.)

The value spaces of datatypes are abstractions, and are defined to the extent needed to clarify them for readers. For example, in defining the numerical datatypes, we assume some general numerical concepts such as number and integer are known. In many cases we provide references to other documents providing more complete definitions.

The value spaces and the values therein are abstractions. This specification does not prescribe any particular internal representations that must be used when implementing these datatypes. In some cases, there are references to other specifications which do prescribe specific internal representations; these specific internal representations must be used to comply with those other specifications, but need not be used to comply with this specification.

In addition, other applications are expected to define additional appropriate operations and/or relations on these value spaces (e.g., addition and multiplication on the various numerical datatypes' value spaces), and are permitted where appropriate to even redefine the operations and relations defined within this specification, provided that for schema processing the relations and operations used are those defined herein.

The value space of a given datatype can be defined in one of the following ways:

defined elsewhere axiomatically from fundamental notions (intensional definition) [see primitive]

enumerated outright from values of an already defined datatype (extensional definition) [see enumeration]

defined by restricting the value space of an already defined datatype to a particular subset with a given set of properties [see derived]

defined as a combination of values from one or more already defined value space(s) by a specific construction procedure [see list and union]

Value spaces have certain properties. For example, they always have the property of cardinality and some definition of equality, and they might be ordered, so that individual values within the value space can be compared to one another. The properties of value spaces that are recognized by this specification are defined in the section on fundamental facets.

The relations of identity, equality, and order are required for each value space. A very few datatypes have other relations or operations prescribed for the purposes of this specification.

Identity

The identity relation is always defined. Every value space inherently has an identity relation. Two things are identical if and only if they are actually the same thing: i.e., if there is no way whatever to tell them apart. The identity relation is used when making facet-based restrictions by enumeration, when checking identity constraints, and when checking value constraints. These are the only uses of identity for schema processing.

This does not preclude implementing datatypes by using more than one internal representation for a given value, provided no mechanism inherent in the datatype implementation (i.e., other than bit-string-preserving "casting" of the datum to a different datatype) will distinguish between the two representations.

In the identity relation defined herein, values from different datatypes' value spaces are made artificially distinct if they might otherwise be considered identical. For example, there is a number two in the decimal datatype and a number two in the float datatype. In the identity relation defined herein, these two values are considered distinct. Other applications making use of these datatypes may choose to consider values such as these identical, but for the view of datatypes' value spaces used herein, they are distinct.

WARNING: Care must be taken when identifying values across distinct primitive datatypes. The literals 0.1 and 0.10000000009 map to the same value in float (neither is in the value space, and each is mapped to the nearest value, namely 0.100000001490116119384765625), but map to distinct values in decimal.
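The arithmetic behind this warning can be checked with a short sketch. It assumes that the float datatype corresponds to IEEE single precision and uses Python's `struct` and `decimal` modules as stand-ins for the float and decimal lexical mappings; the helper name `to_float32` is hypothetical, and the literal is first parsed as a double, which does not affect the result for these two literals.

```python
import struct
from decimal import Decimal


def to_float32(literal: str) -> float:
    """Round a decimal literal to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(literal)))[0]


a = to_float32("0.1")
b = to_float32("0.10000000009")

# Both literals land on the same single-precision value ...
same_in_float = (a == b)
# ... whose exact decimal expansion is the one quoted in the warning.
exact_value = Decimal(a)
# Read as decimal values, the two literals stay distinct.
same_in_decimal = (Decimal("0.1") == Decimal("0.10000000009"))
```

`exact_value` equals `Decimal("0.100000001490116119384765625")`, `same_in_float` is true, and `same_in_decimal` is false.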

Equality

Each datatype has prescribed an equality relation for its value space. The equality relation for most datatypes is the identity relation. In the few cases where it is not, equality has been carefully defined so that for most operations of interest to the datatype, if two values are equal and one is substituted for the other as an argument to any of the operations, the results will always also be equal.

On the other hand, equality need not cover the entire value space of the datatype (though it usually does). In particular, NaN <> NaN in the float, double, and precisionDecimal datatypes.

The equality relation is used in conjunction with order when making facet-based restrictions involving order. This is the only use of equality for schema processing.

In the prior version of this specification (1.0), equality was always identity. This has been changed to permit the datatypes defined herein to more closely match the real world datatypes for which they are intended to be used as transmission formats.

For example, the float datatype has an equality which is not the identity ( −0 = +0 , but they are not identical, although they were identical in the 1.0 version of this specification), and whose domain excludes one value, NaN, so that NaN ≠ NaN .
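The distinction between equality and identity for the two zeros, and the exclusion of NaN from equality, can be observed in any IEEE 754 arithmetic. The sketch below uses Python floats (IEEE double precision) as an assumed stand-in for the datatype; the variable names are illustrative only.

```python
import math

neg_zero, pos_zero = -0.0, 0.0
nan = float("nan")

# Equal: the comparison does not tell the two zeros apart.
zeros_equal = (neg_zero == pos_zero)
# Not identical: the sign can still be observed, so the values are distinguishable.
zeros_distinguishable = (math.copysign(1.0, neg_zero) != math.copysign(1.0, pos_zero))
# NaN lies outside the domain of equality: it is not even equal to itself.
nan_equals_itself = (nan == nan)
```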

For another example, the dateTime datatype previously lost any timezone information in the lexical representation as the value was converted to UTC; now the timezone is retained and two values representing the same moment in time but with different remembered timezones are now equal but not identical.

In the equality relation defined herein, values from different primitive data spaces are made artificially unequal even if they might otherwise be considered equal. For example, there is a number two in the decimal datatype and a number two in the float datatype. In the equality relation defined herein, these two values are considered unequal. Other applications making use of these datatypes may choose to consider values such as these equal (and must do so if they choose to consider them identical); nonetheless, in the equality relation defined herein, they are unequal.

For the purposes of this specification, there is one equality relation for all values of all datatypes (the union of the various datatypes' individual equalities, if one considers relations to be sets of ordered pairs). The equality relation is denoted by = and its negation by ≠, each used as a binary infix predicate: x = y and x ≠ y . On the other hand, identity relationships are always described in words.

Order

Each datatype has an order relation prescribed. This order may be a partial order, which means that there may be values in the value space which are neither equal, less-than, nor greater-than. Such value pairs are incomparable. In many cases, the prescribed order is the null order: the ultimate partial order, in which no pairs are less-than or greater-than; they are all equal or incomparable. Two values that are neither equal, less-than, nor greater-than are incomparable. Two values that are not incomparable are comparable. The order relation is used in conjunction with equality when making facet-based restrictions involving order. This is the only use of order for schema processing.

In this specification, this less-than order relation is denoted by < (and its inverse by >), the weak order by ≤ (and its inverse by ≥), and the resulting incomparable relation by <>, each used as a binary infix predicate: x < y , x ≤ y , x > y , x ≥ y , and x <> y .

The weak order less-than-or-equal means less-than or equal and one can tell which. For example, the duration P1M (one month) is not less-than-or-equal P31D (thirty-one days) because P1M is not less than P31D, nor is P1M equal to P31D. Instead, P1M is incomparable with P31D. The formal definition of order for duration ensures that this is true.
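A minimal sketch of how such a partial order can arise is given below. It assumes the approach taken in the duration section of this specification, which is not reproduced here: each duration is added to four reference instants and the results are compared. The reference instants, the representation of a duration as a (months, seconds) pair, and the function names are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

# Reference instants assumed from the duration section (all fall on day 1 of a month).
REFERENCE_INSTANTS = [
    datetime(1696, 9, 1), datetime(1697, 2, 1),
    datetime(1903, 3, 1), datetime(1903, 7, 1),
]


def add_duration(start, months, seconds):
    """Add a non-negative (months, seconds) duration to an instant on day 1 of a month."""
    index = start.year * 12 + (start.month - 1) + months
    shifted = start.replace(year=index // 12, month=index % 12 + 1)
    return shifted + timedelta(seconds=seconds)


def compare_durations(d1, d2):
    """Return '<', '=', '>' or '<>' (incomparable) for two (months, seconds) pairs."""
    outcomes = set()
    for ref in REFERENCE_INSTANTS:
        t1, t2 = add_duration(ref, *d1), add_duration(ref, *d2)
        outcomes.add("<" if t1 < t2 else ">" if t1 > t2 else "=")
    # The durations are ordered only if every reference instant agrees.
    return outcomes.pop() if len(outcomes) == 1 else "<>"


DAY = 24 * 60 * 60
P1M, P27D, P31D, P32D = (1, 0), (0, 27 * DAY), (0, 31 * DAY), (0, 32 * DAY)
```

With these definitions, `compare_durations(P1M, P31D)` yields `'<>'`: one month is shorter than thirty-one days from some starting points and exactly as long from others, so no single answer applies. By contrast, `compare_durations(P1M, P27D)` yields `'>'` and `compare_durations(P1M, P32D)` yields `'<'`.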

The value spaces of primitive datatypes are abstractions, which may have values in common. In the order relation defined herein, these value spaces are made artificially incomparable. For example, the numbers two and three are values in both the precisionDecimal datatype and the float datatype. In the order relation defined herein, two in the decimal datatype and three in the float datatype are incomparable values. Other applications making use of these datatypes may choose to consider values such as these comparable.

While it is not an error to attempt to compare values from the value spaces of two different primitive datatypes, they will always be incomparable and therefore unequal: If x and y are in the value spaces of different primitive datatypes then x <> y (and hence x ≠ y ).

Lexical space

In addition to its value space, each datatype also has a lexical space.

A lexical space is the set of valid literals for a datatype.

For example, "100" and "1.0E2" are two different literals from the lexical space of float which both denote the same value. The type system defined in this specification provides a mechanism for schema designers to control the set of values and the corresponding set of acceptable literals of those values for a datatype.

The literals in the lexical spaces defined in this specification have the following characteristics:

Interoperability:

The number of literals for each value has been kept small; for many datatypes there is a one-to-one mapping between literals and values. This makes it easy to exchange the values between different systems. In many cases, conversion from locale-dependent representations will be required on both the originator and the recipient side, both for computer processing and for interaction with humans.

Basic readability:

Textual, rather than binary, literals are used. This makes hand editing, debugging, and similar activities possible.

Ease of parsing and serializing:

Where possible, literals correspond to those found in common programming languages and libraries.

Canonical Lexical Representation

While the datatypes defined in this specification have, for the most part, a single lexical representation (i.e., each value in the datatype's value space is denoted by a single literal in its lexical space), this is not always the case. The example in the previous section showed two literals for the datatype float which denote the same value. Similarly, there may be several literals for one of the date or time datatypes that denote the same value using different timezone indicators.

A canonical lexical representation is a set of literals from among the valid set of literals for a datatype such that there is a one-to-one mapping between literals in the canonical lexical representation and values in the value space.

The Lexical Space and Lexical Mapping

The lexical mapping for a datatype is a prescribed function whose domain is a prescribed set of character strings (the lexical space) and whose range is the value space of that datatype.

The lexical space of a datatype is the prescribed domain of the lexical mapping for that datatype.

The members of the lexical space are lexical representations of the values to which they are mapped.

A sequence of zero or more characters in the Universal Character Set (UCS) which may or may not prove upon inspection to be a member of the lexical space of a given datatype, and thus a lexical representation of a given value in that datatype's value space, is referred to as a literal. The term is used indifferently both for character sequences which are members of a particular lexical space and for those which are not.

Should a derivation be made using a derivation mechanism that removes lexical representations from the lexical space to the extent that one or more values cease to have any lexical representation, then those values are dropped from the value space.

This could happen by means of a pattern facet.

Conversely, should a derivation remove values, then their lexical representations are dropped from the lexical space unless there is a facet value whose impact is defined to cause the otherwise-dropped lexical representation to be mapped to another value instead.

There are currently no facets with such an impact. There may be in the future.

For example, '100' and '1.0E2' are two different lexical representations from the float datatype which both denote the same value. The datatype system defined in this specification provides mechanisms for schema designers to control the value space and the corresponding set of acceptable lexical representations of those values for a datatype.

Canonical Mapping

While the datatypes defined in this specification generally have a single lexical representation for each value (i.e., each value in the datatype's value space is denoted by a single representation in its lexical space), this is not always the case. The example in the previous section shows two lexical representations from the float datatype which denote the same value.

The canonical mapping is a prescribed subset of the inverse of a lexical mapping which is one-to-one and whose domain (where possible) is the entire range of the lexical mapping (the value space). Thus a canonical mapping selects one lexical representation for each value in the value space.

The canonical representation of a value in the value space of a datatype is the lexical representation associated with that value by the datatype's canonical mapping.

Canonical mappings are not available for datatypes whose lexical mappings are context dependent (i.e., mappings for which the value of a lexical representation depends on the context in which it occurs, or for which a character string may or may not be a valid lexical representation similarly depending on its context).

Canonical representations are provided where feasible for the use of other applications; they are not required for schema processing itself. A conforming schema processor implementation is not required to implement canonical mappings.
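The relationship between a lexical space, a lexical mapping and a canonical mapping can be sketched for a hexBinary-like datatype. The sketch assumes that the lexical space consists of an even number of hexadecimal digits in either case and that the canonical representation uses upper-case digits; the names `HEX_LEXICAL_SPACE`, `lexical_mapping` and `canonical_mapping` are hypothetical and are not taken from this specification.

```python
import re
from typing import Optional

# Assumed lexical space: zero or more pairs of hexadecimal digits, in either case.
HEX_LEXICAL_SPACE = re.compile(r"(?:[0-9a-fA-F]{2})*")


def lexical_mapping(literal: str) -> Optional[bytes]:
    """Map a literal to its value, or return None if it is outside the lexical space."""
    if HEX_LEXICAL_SPACE.fullmatch(literal) is None:
        return None
    return bytes.fromhex(literal)


def canonical_mapping(value: bytes) -> str:
    """Select one lexical representation per value: upper-case hexadecimal digits."""
    return value.hex().upper()
```

The literals "0fb7" and "0FB7" are mapped to the same value, and `canonical_mapping` selects "0FB7" for it; "0fb" is a literal but not a lexical representation, since it lies outside the lexical space. Applying `lexical_mapping` to the output of `canonical_mapping` returns the original value, which is what makes the canonical mapping a subset of the inverse of the lexical mapping.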

Datatype Distinctions

It is useful to categorize the datatypes defined in this specification along various dimensions, defining terms which can be used to characterize datatypes and the Simple Type Definitions which define them.

Atomic vs. List vs. Union Datatypes

First, we distinguish atomic, list, and union datatypes.

Atomic datatypes are those having values which are treated by this specification as being indivisible. Atomic datatypes are anyAtomicType and all datatypes derived from it.

List datatypes are those having values each of which consists of a finite-length (possibly empty) sequence of values of an atomic datatype (or a union of atomic datatypes), which is the item type of the list.

Union datatypes are (a) those whose value spaces, lexical spaces, and lexical mappings are the union of the value spaces, lexical spaces, and lexical mappings of one or more other datatypes, which are the member types of the union, or (b) those derived by restriction of another union datatype.

For example, a single token which matches Nmtoken from XML is in the value space of the atomic datatype (NMTOKEN), while a sequence of such tokens is in the value space of the list datatype (NMTOKENS).

Atomic Datatypes

Atomic datatypes can be either primitive or derived. An atomic datatype has a value space consisting of a set of atomic values which for purposes of this specification are not further decomposable. The lexical space of an atomic datatype is a set of literals whose internal structure is specific to the datatype in question. There is one special atomic datatype (anyAtomicType), and a number of primitive atomic datatypes which have anyAtomicType as their base type. All other atomic datatypes are derived either from one of the primitive datatypes or from another atomic datatype. No user-defined datatype may have anyAtomicType as its base type.

List Datatypes

Several type systems (such as the one described in [ISO 11404]) treat list datatypes as special cases of the more general notions of aggregate or collection datatypes.

List datatypes are always constructed from some other type; they are never primitive. The value space of a list datatype is a set of finite-length sequences of atomic values. The lexical space of a list datatype is a set of literals each of which is a space-separated sequence of literals of the item type.

The atomic or union datatype that participates in the definition of a list datatype is known as the item type of that list datatype. If the item type is a union, each of its basic members must be atomic.

A list datatype can be constructed from an ordinary or primitive datatype whose lexical space allows space (such as string or anyURI) or a union datatype any of whose member types' lexical spaces allows space. Since list items are separated at whitespace before the lexical representations of the items are mapped to values, no whitespace will ever occur in the lexical representation of a list item, even when the item type would in principle allow it. For the same reason, when every possible lexical representation of a given value in the value space of the item type includes whitespace, that value can never occur as an item in any value of the list datatype.
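As an illustration of this behavior, a list whose item type is string might be declared and used as in the following sketch. The type name listOfString and the element content are assumptions made for this illustration, and the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<xs:simpleType name='listOfString'>
  <xs:list itemType='xs:string'/>
</xs:simpleType>

<!-- The content is split at whitespace before any item is mapped to a
     value, so each word becomes a separate list item. -->
<someElement xsi:type='listOfString'>
  this is not list item 1
  this is not list item 2
  this is not list item 3
</someElement>
```

Each of the three lines of content contributes six whitespace-separated items.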

In the above example, the value of the someElement element is not a list of length 3; rather, it is a list of length 18.

When a datatype is derived by restricting a list datatype, the following constraining facets apply: length, maxLength, minLength, enumeration, pattern, and whiteSpace.

For each of length, maxLength, and minLength, the unit of length is measured in number of list items. The value of whiteSpace is fixed to the value collapse.
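The way length is counted for lists can be sketched as follows. The type names integerList and atMostThreeIntegers and the element name are hypothetical, chosen only for this illustration; the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<xs:simpleType name='integerList'>
  <xs:list itemType='xs:integer'/>
</xs:simpleType>

<!-- maxLength counts list items, not characters -->
<xs:simpleType name='atMostThreeIntegers'>
  <xs:restriction base='integerList'>
    <xs:maxLength value='3'/>
  </xs:restriction>
</xs:simpleType>

<!-- valid: three items; surrounding and repeated whitespace is collapsed -->
<someElement xsi:type='atMostThreeIntegers'>  10   200
  3000 </someElement>

<!-- not valid: four items -->
<someElement xsi:type='atMostThreeIntegers'>1 2 3 4</someElement>
```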

For list datatypes the lexical space is composed of space-separated literals of the item type. Any pattern specified when a new datatype is derived from a list datatype applies to the members of the list datatype's lexical space, not to the members of the lexical space of the item type.

<xs:simpleType name='myList'>
  <xs:list itemType='xs:integer'/>
</xs:simpleType>
<xs:simpleType name='myRestrictedList'>
  <xs:restriction base='myList'>
    <xs:pattern value='123 (\d+\s)*456'/>
  </xs:restriction>
</xs:simpleType>
<someElement xsi:type='myRestrictedList'>123 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 567 456</someElement>

The canonical mapping of a list datatype maps each value onto the space-separated concatenation of the canonical representations of all the items in the value (in order), using the canonical mapping of the item type.

Union Datatypes

Union types may be defined in either of two ways. When a union type is constructed by union, its value space, lexical space, and lexical mapping are the ordered unions of the value spaces, lexical spaces, and lexical mappings of its member types. When a union type is defined by restricting another union, its value space, lexical space, and lexical mapping are subsets of the value spaces, lexical spaces, and lexical mappings of its base type. Union datatypes are always constructed from other datatypes; they are never primitive. Currently, there are no built-in union datatypes.

A prototypical example of a union type is the maxOccurs attribute on the element element in XML Schema itself: it is a union of nonNegativeInteger and an enumeration with the single member, the string "unbounded".
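A declaration along these lines could be sketched as follows. This is an illustration only, assuming the xs prefix is bound to the XML Schema namespace; the declaration in the actual schema for schema documents may be organized differently.

```xml
<xs:attribute name='maxOccurs' default='1'>
  <xs:simpleType>
    <xs:union>
      <!-- first member type: any non-negative integer -->
      <xs:simpleType>
        <xs:restriction base='xs:nonNegativeInteger'/>
      </xs:simpleType>
      <!-- second member type: the single string "unbounded" -->
      <xs:simpleType>
        <xs:restriction base='xs:string'>
          <xs:enumeration value='unbounded'/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>
</xs:attribute>
```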

Any number (greater than 0) of ordinary or primitive datatypes can participate in a union type.

The datatypes that participate in the definition of a union datatype are known as the member types of that union datatype.

The transitive membership of a union is the set of its own member types, and the member types of its members, and so on. More formally, if U is a union, then (a) its member types are in the transitive membership of U, and (b) for any datatypes T1 and T2, if T1 is in the transitive membership of U and T2 is one of the member types of T1, then T2 is also in the transitive membership of U.

Those members of the transitive membership of a union datatype U which are themselves not union datatypes are the basic members of U.

If a datatype M is in the transitive membership of a union datatype U, but not one of U's member types, then a sequence of one or more union datatypes necessarily exists, such that the first is one of the member types of U, each is one of the member types of its predecessor in the sequence, and M is one of the member types of the last in the sequence. The union datatypes in this sequence are said to intervene between M and U. When U and M are given by the context, the datatypes in the sequence are referred to as the intervening unions. When M is one of the member types of U, the set of intervening unions is the empty set.

In a valid instance of any union, the first of its members in order which accepts the instance as valid is the active member type. If the active member type is itself a union, one of its members will be its active member type, and so on, until finally a basic (non-union) member is reached. That basic member is the active basic member of the union.
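The terms above can be illustrated with a union that has another union among its member types. The type names numberOrFlag and numberOrFlagOrText and the element name are hypothetical, introduced only for this sketch; the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<xs:simpleType name='numberOrFlag'>
  <xs:union memberTypes='xs:integer xs:boolean'/>
</xs:simpleType>

<xs:simpleType name='numberOrFlagOrText'>
  <xs:union memberTypes='numberOrFlag xs:string'/>
</xs:simpleType>

<!-- Literal "true": integer rejects it, boolean accepts it.
     Active member type of numberOrFlagOrText: numberOrFlag.
     Active basic member: boolean.
     Intervening union between boolean and numberOrFlagOrText: numberOrFlag. -->
<someElement xsi:type='numberOrFlagOrText'>true</someElement>

<!-- Literal "large": numberOrFlag rejects it, string accepts it.
     Active member type and active basic member: string. -->
<someElement xsi:type='numberOrFlagOrText'>large</someElement>
```

Here the transitive membership of numberOrFlagOrText is numberOrFlag, string, integer, and boolean; its basic members are string, integer, and boolean.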

The order in which the member types are specified in the definition (that is, in the case of datatypes defined in a schema document, the order of the <simpleType> children of the <union> element, or the order of the QNames in the memberTypes attribute) is significant. During validation, an element or attribute's value is validated against the member types in the order in which they appear in the definition until a match is found. The evaluation order can be overridden with the use of xsi:type.

For example, given the definition below, the first instance of the <size> element validates correctly as an integer, the second and third as string.

<xsd:element name='size'>
  <xsd:simpleType>
    <xsd:union>
      <xsd:simpleType>
        <xsd:restriction base='integer'/>
      </xsd:simpleType>
      <xsd:simpleType>
        <xsd:restriction base='string'/>
      </xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>
</xsd:element>
<size>1</size>
<size>large</size>
<size xsi:type='xsd:string'>1</size>

The canonical mapping of a union datatype maps each value onto the canonical representation of that value obtained using the canonical mapping of the first member type in whose value space it lies.

A datatype which is atomic in this specification need not be an "atomic" datatype in any programming language used to implement this specification. Likewise, a datatype which is a list in this specification need not be a "list" datatype in any programming language used to implement this specification. Furthermore, a datatype which is a union in this specification need not be a "union" datatype in any programming language used to implement this specification.

Special vs. Primitive vs. Ordinary Datatypes

Next, we distinguish special, primitive, and ordinary (or constructed) datatypes.

The special datatypes are anySimpleType and anyAtomicType. They are special by virtue of their position in the type hierarchy.

Primitive datatypes are those datatypes that are not special and are not defined in terms of other datatypes; they exist ab initio. All primitive datatypes have anyAtomicType as their base type, but their value and lexical spaces must be given in prose; they cannot be described as restrictions of anyAtomicType by the application of particular constraining facets.

Ordinary datatypes are all datatypes other than the special and primitive datatypes. Ordinary datatypes can be understood fully in terms of their Simple Type Definition and the properties of the datatypes from which they are constructed.

Derived datatypes are those that are defined in terms of other datatypes.

For example, in this specification, float is a primitive datatype based on a well-defined mathematical concept and not defined in terms of other datatypes, while integer is constructed from the more general datatype decimal.

The simple ur-type definition is a special restriction of the ur-type definition whose name is anySimpleType in the XML Schema namespace. anySimpleType can be considered as the base type of all primitive datatypes. anySimpleType is considered to have an unconstrained lexical space and a value space consisting of the union of the value spaces of all the primitive datatypes and the set of all lists of all members of the value spaces of all the primitive datatypes.

The datatypes defined by this specification fall into the special, primitive, and ordinary categories. It is felt that a judiciously chosen set of primitive datatypes will serve a wide audience by providing a set of convenient datatypes that can be used as is, as well as providing a rich enough base from which a large variety of datatypes needed by schema designers can be constructed.

In the example above, integer is derived from decimal.

A datatype which is primitive in this specification need not be a "primitive" datatype in any programming language used to implement this specification. Likewise, a datatype which is derived in this specification from some other datatype need not be a "derived" datatype in any programming language used to implement this specification.

As described in more detail elsewhere in this specification, each ordinary datatype must be defined in terms of another datatype in one of three ways: 1) by assigning constraining facets which serve to restrict the value space of the datatype to a subset of that of the base type; 2) by creating a list datatype whose value space consists of finite-length sequences of values of its item type; or 3) by creating a union datatype whose value space consists of the union of the value spaces of its member types.

Facet-based Restriction

A datatype is defined by facet-based restriction of another datatype (its base type), when values for zero or more constraining facets are specified that serve to constrain its value space and/or its lexical space to a subset of those of the base type. The base type of a facet-based restriction must be a primitive or ordinary datatype.
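A minimal sketch of facet-based restriction follows. The type names oneToHundred and myInteger are hypothetical, and the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<!-- two constraining facets narrow the value space to the integers 1 through 100 -->
<xs:simpleType name='oneToHundred'>
  <xs:restriction base='xs:integer'>
    <xs:minInclusive value='1'/>
    <xs:maxInclusive value='100'/>
  </xs:restriction>
</xs:simpleType>

<!-- zero constraining facets: the value space and lexical space
     are the same as those of the base type -->
<xs:simpleType name='myInteger'>
  <xs:restriction base='xs:integer'/>
</xs:simpleType>
```

The second definition shows that "zero or more" is meant literally: a restriction that specifies no facets is still a facet-based restriction, and the subset it selects is the whole of the base type's value space.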

Every datatype that is derived by restriction is defined in terms of an existing datatype, referred to as its base type. Base types can be either primitive or derived.

Construction by List

A list datatype can be constructed from another datatype (its item type) by creating a value space that consists of a finite-length sequence of values of its item type. Datatypes so constructed have anySimpleType as their base type. Note that since the value space and lexical space of any list datatype are necessarily subsets of the value space and lexical space of anySimpleType, any datatype constructed as a list is a restriction of its base type.

Construction by Union

One datatype can be constructed from one or more datatypes by unioning their lexical mappings and, consequently, their value spaces and lexical spaces. Datatypes so constructed also have anySimpleType as their base type. Note that since the value space and lexical space of any union datatype are necessarily subsets of the value space and lexical space of anySimpleType, any datatype constructed as a union is a restriction of its base type.

Definition, Derivation, Restriction, and Construction

Definition, derivation, restriction, and construction are conceptually distinct, although in practice they are frequently performed by the same mechanisms.

By definition is meant the explicit identification of the relevant properties of a datatype, in particular its value space, lexical space, and lexical mapping.

The properties of the special and primitive datatypes are defined by this specification. A Simple Type Definition is present for each of these datatypes in every valid schema; it serves as a representation of the datatype, but by itself it does not capture all the relevant information and does not suffice (without knowledge of this specification) to define the datatype.

For all other datatypes, a Simple Type Definition does suffice. The properties of an ordinary datatype can be inferred from the datatype's Simple Type Definition and the properties of the base type, the item type if any, and the member types if any. All ordinary datatypes can be defined in this way.

By derivation is meant the relation of a datatype to its base type, or to the base type of its base type, and so on.

Every datatype is associated with another datatype, its base type. Base types can be special, primitive, or ordinary.

A datatype T is immediately derived from another datatype X if and only if X is the base type of T.

More generally, a datatype R is derived from another datatype B if and only if one of the following is true:

B is the base type of R.

There is some datatype X such that X is the base type of R, and X is derived from B.

It is a consequence of these definitions that every datatype other than anySimpleType is derived from anySimpleType.

Since each datatype has exactly one base type, and every datatype is derived directly or indirectly from anySimpleType, it follows that the base type relation arranges all simple types into a tree structure, which is conventionally referred to as the derivation hierarchy.

By restriction is meant the definition of a datatype whose value space and lexical space are subsets of those of its base type.

Formally, a datatype R is a restriction of another datatype B when

the value space of R is a subset of the value space of B, and

the lexical space of R is a subset of the lexical space of B.

Note that all three forms of datatype construction produce restrictions of the base type: facet-based restriction does so by means of constraining facets, while construction by list or union does so because those constructions take anySimpleType as the base type. It follows that all datatypes are restrictions of anySimpleType. This specification provides no means by which a datatype may be defined so as to have a larger lexical space or value space than its base type.

By construction is meant the creation of a datatype by defining it in terms of another.

All ordinary datatypes are defined in terms of, or constructed from, other datatypes, either by restricting the value space or lexical space of a base type using zero or more constraining facets, or by specifying the new datatype as a list of items of some item type, or by defining it as a union of some specified sequence of member types. These three forms of construction are often called facet-based restriction, construction by list, and construction by union, respectively. Datatypes so constructed may be understood fully (for purposes of a type system) in terms of (a) the properties of the datatype(s) from which they are constructed, and (b) their Simple Type Definition. This distinguishes ordinary datatypes from the special and primitive datatypes, which can be understood only in the light of documentation (namely, their descriptions elsewhere in this specification). All ordinary datatypes are constructed, and all constructed datatypes are ordinary.

Built-in vs. User-Defined Datatypes

Built-in datatypes are those which are defined in this specification; they can be special, primitive, or ordinary datatypes.

User-defined datatypes are those datatypes that are defined by individual schema designers.

Conceptually there is no difference between the ordinary built-in datatypes included in this specification and the user-defined datatypes which will be created by individual schema designers. The built-in ordinary datatypes are those which are believed to be so common that if they were not defined in this specification many schema designers would end up reinventing them. Furthermore, including these ordinary datatypes in this specification serves to demonstrate the mechanics and utility of the datatype generation facilities of this specification.

A datatype which is built-in in this specification need not be a "built-in" datatype in any programming language used to implement this specification. Likewise, a datatype which is user-defined in this specification need not be a "user-defined" datatype in any programming language used to implement this specification.

Built-in Datatypes and Their Definitions

Each built-in datatype in this specification (both primitive and derived) can be uniquely addressed via a URI Reference constructed as follows:

the base URI is the URI of the XML Schema namespace

the fragment identifier is the name of the datatype

For example, to address the int datatype, the URI is:

http://www.w3.org/2001/XMLSchema#int

Additionally, each facet definition element can be uniquely addressed via a URI constructed as follows:

the base URI is the URI of the XML Schema namespace

the fragment identifier is the name of the facet

For example, to address the maxInclusive facet, the URI is:

http://www.w3.org/2001/XMLSchema#maxInclusive

Additionally, each facet usage in a built-in datatype definition can be uniquely addressed via a URI constructed as follows:

the base URI is the URI of the XML Schema namespace

the fragment identifier is the name of the datatype, followed by a period (.) followed by the name of the facet

For example, to address the usage of the maxInclusive facet in the definition of int, the URI is:

http://www.w3.org/2001/XMLSchema#int.maxInclusive

Namespace considerations

The datatypes defined by this specification are designed to be used with the XML Schema definition language as well as other XML specifications. To facilitate usage within the XML Schema definition language, the datatypes in this specification have the namespace name:

http://www.w3.org/2001/XMLSchema

To facilitate usage in specifications other than the XML Schema definition language, such as those that do not want to know anything about aspects of the XML Schema definition language other than the datatypes, each non-special built-in datatype is also defined in the namespace whose URI is:

http://www.w3.org/2001/XMLSchema-datatypes

This applies to both built-in primitive and built-in derived datatypes.

The use of the XMLSchema-datatypes namespace and the definitions therein are deprecated as of XML Schema 1.1.

Each user-defined datatype is also associated with a unique namespace. However, user-defined datatypes do not come from the namespace defined by this specification; rather, they come from the namespace of the schema in which they are defined (see XML Representation of Schemas in the XML Schema Structures specification).

Special Built-in Datatypes

The two datatypes at the root of the hierarchy of simple types are anySimpleType and anyAtomicType.

anySimpleType

The definition of anySimpleType is a special restriction of anyType. anySimpleType has an unconstrained lexical space and a value space consisting of the union of the value spaces of all the primitive datatypes and the set of all lists of all members of the value spaces of all the primitive datatypes.

For further details of anySimpleType and its representation as a Simple Type Definition, see Built-in Simple Type Definitions.

Value space

The value space of anySimpleType is the union of the value spaces of all the primitive datatypes defined here, and of all sets of lists formed from the members of the primitive datatypes.

Lexical mapping

The lexical space of anySimpleType is the set of all finite-length sequences of characters (as defined in [XML]) that match the Char production from [XML]. This is equivalent to the union of the lexical spaces of all primitive and all possible ordinary datatypes.

It is implementation-defined whether an implementation of this specification supports the Char production from [XML], or that from [XML 1.1], or both. See Dependencies on Other Specifications.

The lexical mapping of anySimpleType is the union of the lexical mappings of all primitive datatypes and all list datatypes. It will be noted that this mapping is not a function: a given literal may map to one value or to several values of different primitive datatypes, and it may be indeterminate which value is to be preferred in a particular context. When the datatypes defined here are used in the context of the XML Schema Structures specification, the xsi:type attribute defined by that specification in section xsi:type can be used to indicate which value a literal which is the content of an element should map to. In other contexts, other rules (such as type coercion rules) may be employed to determine which value is to be used.

Facets

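When a new datatype is defined by facet-based restriction, anySimpleType must not be used as the base type. So no constraining facets are directly applicable to anySimpleType.

anyAtomicType

anyAtomicType is a special restriction of anySimpleType. The value and lexical spaces of anyAtomicType are the unions of the value and lexical spaces of all the primitive datatypes, and anyAtomicType is their base type.

For further details of anyAtomicType and its representation as a Simple Type Definition, see Built-in Simple Type Definitions.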

Value space

The value space of anyAtomicType is the union of the value spaces of all the primitive datatypes defined here.

Lexical mapping

The lexical space of anyAtomicType is the set of all finite-length sequences of characters (as defined in [XML]) that match the Char production from [XML]. This is equivalent to the union of the lexical spaces of all primitive datatypes.

It is implementation-defined whether an implementation of this specification supports the Char production from [XML], or that from [XML 1.1], or both. See Dependencies on Other Specifications.

The lexical mapping of anyAtomicType is the union of the lexical mappings of all primitive datatypes. It will be noted that this mapping is not a function: a given literal may map to one value or to several values of different primitive datatypes, and it may be indeterminate which value is to be preferred in a particular context. When the datatypes defined here are used in the context of the XML Schema Structures specification, the xsi:type attribute defined by that specification in section xsi:type can be used to indicate which value a literal which is the content of an element should map to. In other contexts, other rules (such as type coercion rules) may be employed to determine which value is to be used.

Facets

When a new datatype is defined by facet-based restriction, anyAtomicType must not be used as the base type. So no constraining facets are directly applicable to anyAtomicType.

Primitive Datatypes

The primitive datatypes defined by this specification are described below. For each datatype, the value space is described; the lexical space is defined, using an extended Backus Naur Format grammar (and in most cases also a regular expression using the regular expression language defined in this specification); constraining facets which apply to the datatype are listed; and any datatypes constructed from this datatype are specified.

Primitive datatypes can only be added by revisions to this specification.

string

The string datatype represents character strings in XML.

Many human languages have writing systems that require child elements for control of aspects such as bidirectional formatting or ruby annotation (see [Ruby] and Section 8.2.4 Overriding the bidirectional algorithm: the BDO element of [HTML 4.01]). Thus, string, as a simple type that can contain only characters but not child elements, is often not suitable for representing text. In such situations, a complex type that allows mixed content should be considered. For more information, see Section 5.5 Any Element, Any Attribute of [XML Schema Language: Part 0 Primer].

Value Space

The value space of string is the set of finite-length sequences of characters (as defined in [XML]) that match the Char production from [XML]. A character is an atomic unit of communication; it is not further specified except to note that every character has a corresponding Universal Character Set (UCS) code point, which is an integer.

It is implementation-defined whether an implementation of this specification supports the Char production from [XML], or that from [XML 1.1], or both. See Dependencies on Other Specifications.

Equality for string is identity. No order is prescribed.

As noted in the discussion of the ordered facet, the fact that this specification does not specify an order relation for string does not preclude other applications from treating strings as being ordered.

Lexical Mapping

The lexical space of string is the set of finite-length sequences of characters (as defined in [XML]) that match the Char production from [XML].

Lexical Space
stringRep ::= Char* (as defined in [XML])

It is implementation-defined whether an implementation of this specification supports the Char production from [XML], or that from [XML 1.1], or both. See Dependencies on Other Specifications.

The lexical mapping and the canonical mapping for string are each a subset of the identity function.

Derived datatypes

boolean

boolean represents the values of two-valued logic.

Value Space

boolean has the value space of two-valued logic: {true, false}.

Lexical Mapping

boolean's lexical space is a set of four literals:

Lexical Space
booleanRep ::= 'true' | 'false' | '1' | '0'

The lexical mapping for boolean takes the literals true and 1 to the value true, and the literals false and 0 to the value false. The canonical representations for boolean are the literals true and false.
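Because the lexical space and the value space are distinct, a restriction can narrow the set of permitted literals without changing the set of values. The following sketch uses the pattern facet to admit only the literals true and false. The type name strictBoolean and the element name flag are hypothetical, and the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<xs:simpleType name='strictBoolean'>
  <xs:restriction base='xs:boolean'>
    <xs:pattern value='true|false'/>
  </xs:restriction>
</xs:simpleType>

<!-- valid: the literal matches the pattern -->
<flag xsi:type='strictBoolean'>true</flag>

<!-- not valid: 1 is a legal boolean literal, but it does not match the pattern -->
<flag xsi:type='strictBoolean'>1</flag>
```

The value space of strictBoolean is still {true, false}; only the literals 1 and 0 have been removed from its lexical space.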

decimal

decimal represents a subset of the real numbers, which can be represented by decimal numerals. The value space of decimal is the set of numbers that can be obtained by dividing an integer by a non-negative power of ten, i.e., expressible as i / 10^n where i and n are integers and n ≥ 0. Precision is not reflected in this value space; the number 2.0 is not distinct from the number 2.00. (The precisionDecimal datatype may be used for values in which precision is significant.) The order relation on decimal is the order relation on real numbers, restricted to this subset.
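One consequence of precision not being part of the value space can be sketched with an enumeration, which constrains values rather than literals. The type name justTwo and the element name are assumptions made for this illustration; the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<xs:simpleType name='justTwo'>
  <xs:restriction base='xs:decimal'>
    <xs:enumeration value='2.0'/>
  </xs:restriction>
</xs:simpleType>

<!-- all valid: each literal denotes the same decimal value -->
<someElement xsi:type='justTwo'>2</someElement>
<someElement xsi:type='justTwo'>2.0</someElement>
<someElement xsi:type='justTwo'>2.00</someElement>
```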

All minimally conforming processors must support decimal numbers with a minimum of 18 decimal digits (i.e., with a totalDigits of 18). However, minimally conforming processors may set an application-defined limit on the maximum number of decimal digits they are prepared to support, in which case that application-defined maximum number must be clearly documented.

Lexical Mapping

decimal has a lexical representation consisting of a finite-length sequence of decimal digits (#x30–#x39) separated by a period as a decimal indicator. An optional leading sign is allowed. If the sign is omitted, "+" is assumed. Leading and trailing zeroes are optional. If the fractional part is zero, the period and following zero(es) can be omitted. For example: -1.23, 12678967.543233, +100000.00, 210.

The grammar production defining these lexical representations is decimalLexicalRep.

The lexical space of decimal is the set of lexical representations which match that production, or (equivalently) the regular expression (\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+)).

The mapping from lexical representations to values is the usual one for decimal numerals. Both this mapping and the mapping from values to canonical representations are given formally elsewhere in this specification.

Canonical representation

The canonical representation for decimal is defined by prohibiting certain options from the lexical representation. Specifically, the preceding optional "+" sign is prohibited. The decimal point is required. Leading and trailing zeroes are prohibited subject to the following: there must be at least one digit to the right and to the left of the decimal point, which may be a zero.


Datatypes based on decimal

precisionDecimal

The precisionDecimal datatype represents the numeric value and (arithmetic) precision of decimal numbers which retain precision; it also includes special values for positive and negative infinity and not a number, and it differentiates between positive zero and negative zero. This datatype is introduced to provide a variant of decimal that closely corresponds to the floating-point decimal datatypes described by the expected forthcoming revision of IEEE/ANSI 754. Precision of values is retained, and the special values (two zeroes, infinities, and not-a-number) are included.

Precision is sometimes given in absolute, sometimes in relative terms. The arithmetic precision of a value is expressed in absolute quantitative terms, by indicating how many digits to the right of the decimal point are significant. 5 has an arithmetic precision of 0, and 5.01 an arithmetic precision of 2.

See the conformance note in , which applies to this datatype.

Value Space

Properties of precisionDecimal Values

numericalValue: a decimal number, positiveInfinity, negativeInfinity, or notANumber

arithmeticPrecision: an integer or absent; absent if and only if numericalValue is a special value

sign: positive, negative, or absent; must be positive if numericalValue is positive or positiveInfinity, must be negative if numericalValue is negative or negativeInfinity, must be absent if and only if numericalValue is notANumber

The sign property is redundant except when numericalValue is zero; in other cases, the sign value is fully determined by the numericalValue value.

As explained below, the lexical representation of the value object whose numericalValue is notANumber is NaN. Accordingly, in English text we use NaN to refer to that value. Similarly we use INF and −INF to refer to the two value objects whose numericalValue is positiveInfinity and negativeInfinity. These three value objects are also informally called not-a-number, positive infinity, and negative infinity. The latter two together are called the infinities.

Equality and order for precisionDecimal are defined as follows:

Two numerical values are ordered (or equal) as their numericalValue values are ordered (or equal). (This means that two zeroes with different signs are equal; negative zeroes are not ordered less than positive zeroes.)

INF is equal only to itself, and is greater than −INF and all numerical values.

−INF is equal only to itself, and is less than INF and all numerical values.

NaN is incomparable with all values, including itself.

Lexical Mapping

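precisionDecimal's lexical space is the set of all decimal numerals with or without a decimal point, numerals in scientific (exponential) notation, and the character strings INF, +INF, -INF, and NaN. The grammar production defining this lexical space is pDecimalRep.

The lexical mapping and the canonical mapping for precisionDecimal are given formally elsewhere in this specification.

Facets

float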

The float datatype is patterned after the IEEE single-precision 32-bit floating point datatype of IEEE 754-1985. Floating point numbers are certain subsets of the rational numbers, and are often used to approximate arbitrary real numbers.

A literal in the lexical space representing a decimal number d maps to the normalized value in the value space of float that is closest to d in the sense defined by [Clinger, WD (1990)]; if d is exactly halfway between two such values then the even value is chosen.

Value Space

The value space of float contains the non-zero numbers m × 2^e, where m is an integer whose absolute value is less than 2^24, and e is an integer between −149 and 104, inclusive. In addition to these values, the value space of float also contains the following special values: positiveZero, negativeZero, positiveInfinity, negativeInfinity, and notANumber.

As explained below, the lexical representation of the value notANumber is NaN. Accordingly, in English text we generally use NaN to refer to that value. Similarly, we use INF and −INF to refer to the two values positiveInfinity and negativeInfinity, and 0 and −0 to refer to positiveZero and negativeZero.

Equality and order for float are defined as follows:

Equality is identity, except that 0 = −0 (although they are not identical) and NaN ≠ NaN (although NaN is of course identical to itself).

0 and −0 are thus distinct for purposes of enumerations and identity constraints, but equal for purposes of minimum and maximum values.

For the basic values, the order relation on float is the order relation for rational numbers. INF is greater than all other non-NaN values; −INF is less than all other non-NaN values. NaN is incomparable with any value in the value space including itself. 0 and −0 are greater than all the negative numbers and less than all the positive numbers.

Any value incomparable with the value used for the four bounding facets (minInclusive, maxInclusive, minExclusive, and maxExclusive) will be excluded from the resulting restricted value space. In particular, when NaN is used as a facet value for a bounding facet, since no float values are comparable with it, the result is a value space that is empty. If any other value is used for a bounding facet, NaN will be excluded from the resulting restricted value space; to add NaN back in requires union with the NaN-only space (which may be derived by an enumeration).
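The technique of adding NaN back in by union can be sketched as follows. All three type names are hypothetical, introduced only for this illustration, and the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<!-- the bounding facet excludes NaN, which is incomparable with 0 -->
<xs:simpleType name='nonNegativeFloat'>
  <xs:restriction base='xs:float'>
    <xs:minInclusive value='0'/>
  </xs:restriction>
</xs:simpleType>

<!-- the NaN-only space, derived by an enumeration -->
<xs:simpleType name='nanOnly'>
  <xs:restriction base='xs:float'>
    <xs:enumeration value='NaN'/>
  </xs:restriction>
</xs:simpleType>

<!-- union of the two: values not less than 0, plus NaN -->
<xs:simpleType name='nonNegativeFloatOrNaN'>
  <xs:union memberTypes='nonNegativeFloat nanOnly'/>
</xs:simpleType>
```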

The Schema 1.0 version of this datatype did not differentiate between 0 and −0, and NaN was equal to itself. The changes were made to make the datatype more closely mirror [IEEE 754-1985].

Lexical representation

float values have a lexical representation consisting of a mantissa followed, optionally, by the character E or e, followed by an exponent. The exponent must be an integer. The mantissa must be a decimal number. The representations for exponent and mantissa must follow the lexical rules for integer and decimal. If the E or e and the following exponent are omitted, an exponent value of 0 is assumed.

The special values positive and negative infinity and not-a-number have lexical representations INF, -INF and NaN, respectively. Lexical representations for zero may take a positive or negative sign.

For example, -1E4, 1267.43233E12, 12.78e-2, 12, -0, 0 and INF are all legal literals for float.

Canonical representation

The canonical representation for float is defined by prohibiting certain options from the Lexical representation. Specifically, the exponent must be indicated by "E". Leading zeroes and the preceding optional "+" sign are prohibited in the exponent. If the exponent is zero, it must be indicated by "E0". For the mantissa, the preceding optional "+" sign is prohibited and the decimal point is required. Leading and trailing zeroes are prohibited subject to the following: number representations must be normalized such that there is a single digit which is non-zero to the left of the decimal point and at least a single digit to the right of the decimal point unless the value being represented is zero. The canonical representation for zero is 0.0E0.

Lexical Mapping

The lexical space of float is the set of all decimal numerals with or without a decimal point, numerals in scientific (exponential) notation, and the literals INF, -INF, and NaN.

Lexical Space

floatRep | | |

The floatRep production is equivalent to this regular expression:

(-|\+)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))((e|E)(-|\+)?[0-9]+)?|-?INF|NaN
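As an illustration, the regular expression for the float lexical space can be exercised with a short Python sketch. The pattern is written with the plus sign and decimal point escaped, which Python's re module requires; the function name is_float_literal is hypothetical.

```python
import re

# Lexical space of float: optional sign, decimal or scientific numeral,
# or one of the special literals INF, -INF, NaN.
FLOAT_REP = re.compile(
    r"(-|\+)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))((e|E)(-|\+)?[0-9]+)?|-?INF|NaN"
)

def is_float_literal(text: str) -> bool:
    """True when the whole string belongs to the lexical space."""
    return FLOAT_REP.fullmatch(text) is not None

legal = ["-1E4", "1267.43233E12", "12.78e-2", "12", "-0", "0", "INF", "-INF", "NaN"]
illegal = ["+INF", "1e", "", ".", "1.2.3", "nan", " 12"]
```

Note that the pattern is case-sensitive for the special literals and does not tolerate surrounding white space.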

The float datatype is designed to implement for schema processing the single-precision floating-point datatype of [IEEE 754-1985]. That specification does not specify specific lexical representations, but does prescribe requirements on any lexical mapping used. Any lexical mapping that maps the lexical space just described onto the value space, satisfies the requirements of [IEEE 754-1985], and correctly handles the special values (the INF, -INF, and NaN literals), satisfies the conformance requirements of this specification.

Since IEEE allows some variation in rounding of values, processors conforming to this specification may exhibit some variation in their lexical mappings.

The is provided as an example of a simple algorithm that yields a conformant mapping, and that provides the most accurate rounding possible—and is thus useful for insuring inter-implementation reproducibility and inter-implementation round-tripping. The simple rounding algorithm used in may be more efficiently implemented using the algorithms of .

The Schema 1.0 version of this datatype did not permit rounding algorithms whose results differed from .

The is provided as an example of a mapping that does not produce unnecessarily long canonical representations. Other algorithms which do not yield identical results for mapping from float values to character strings are permitted by .

double

The double datatype is patterned after the IEEE double-precision 64-bit floating point datatype with the minor exception noted below. The basic value space of double consists of the values m × 2^e, where m is an integer whose absolute value is less than 2⁵³, and e is an integer between −1074 and 971, inclusive. In addition to the basic value space described above, the value space of double also contains the following three special values: positive and negative infinity and not-a-number (NaN). The order relation on double is: x < y iff y - x is positive for x and y in the value space. Positive infinity is greater than all other non-NaN values. NaN equals itself but is incomparable with (neither greater than nor less than) any other value in the value space. Floating point numbers are certain subsets of the rational numbers, and are often used to approximate arbitrary real numbers.

The only significant differences between float and double are the three defining constants 53 (vs 24), −1074 (vs −149), and 971 (vs 104).
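The three defining constants can be checked numerically. The following Python sketch is only an illustration: it assumes that Python's float is an IEEE double and uses the struct module's 32-bit "f" format as a stand-in for float; the helper names are hypothetical.

```python
import math
import struct
import sys

def largest(mantissa_bits: int, max_exp: int) -> float:
    """Largest finite value, (2^mantissa_bits - 1) x 2^max_exp."""
    return math.ldexp(2**mantissa_bits - 1, max_exp)

def smallest(min_exp: int) -> float:
    """Smallest positive value, 1 x 2^min_exp."""
    return math.ldexp(1.0, min_exp)

def round_trips_as_float32(x: float) -> bool:
    """True when x survives conversion to a 32-bit float unchanged."""
    return struct.unpack("<f", struct.pack("<f", x))[0] == x

FLOAT_MAX, FLOAT_MIN = largest(24, 104), smallest(-149)
DOUBLE_MAX, DOUBLE_MIN = largest(53, 971), smallest(-1074)
```

Under these assumptions DOUBLE_MAX coincides with sys.float_info.max, and FLOAT_MAX and FLOAT_MIN are the extreme positive values that a 32-bit float can hold.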

Any value incomparable with the value used for the four bounding facets (minInclusive, maxInclusive, minExclusive, and maxExclusive) will be excluded from the resulting restricted value space. In particular, when "NaN" is used as a facet value for a bounding facet, since no other double values are comparable with it, the result is a value space either having NaN as its only member (the inclusive cases) or that is empty (the exclusive cases). If any other value is used for a bounding facet, NaN will be excluded from the resulting restricted value space; to add NaN back in requires union with the NaN-only space.

A literal in the lexical space representing a decimal number d maps to the normalized value in the value space of double that is closest to d; if d is exactly halfway between two such values then the even value is chosen. This is the best approximation of d (, ), which is more accurate than the mapping required by [IEEE 754-1985].
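The "closest value, ties to even" rule can be observed with a small Python sketch. It assumes that the platform's float() conversion is correctly rounded (true of CPython on common platforms) and needs Python 3.9 or later for math.nextafter; the function name is_best_approximation is hypothetical.

```python
from fractions import Fraction
import math

def is_best_approximation(literal: str) -> bool:
    """Check that float(literal) is at least as close to the exact
    decimal value as either neighbouring double."""
    exact = Fraction(literal)          # exact rational value of the literal
    value = float(literal)             # assumed correctly rounded
    error = abs(Fraction(value) - exact)
    neighbours = (math.nextafter(value, math.inf),
                  math.nextafter(value, -math.inf))
    return all(abs(Fraction(n) - exact) >= error for n in neighbours)

# 2^53 + 1 lies exactly halfway between 2^53 and 2^53 + 2;
# the neighbour with the even mantissa (2^53) is chosen.
halfway_down = float("9007199254740993")
# 2^53 + 3 lies halfway between 2^53 + 2 and 2^53 + 4; 2^53 + 4 is even.
halfway_up = float("9007199254740995")
```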

Value Space

The value space of double contains the non-zero numbers m × 2^e, where m is an integer whose absolute value is less than 2⁵³, and e is an integer between −1074 and 971, inclusive. In addition to these values, the value space of double also contains the following special values: positiveZero, negativeZero, positiveInfinity, negativeInfinity, and notANumber.

Equality and order for double are defined as follows:

Equality is identity, except that 0 = −0 (although they are not identical) and NaN ≠ NaN (although NaN is of course identical to itself).

0 and −0 are thus distinct for purposes of enumerations and identity constraints, but equal for purposes of minimum and maximum values.

For the basic values, the order relation on double is the order relation for rational numbers. INF is greater than all other non-NaN values; −INF is less than all other non-NaN values. NaN is incomparable with any value in the value space including itself. 0 and −0 are greater than all the negative numbers and less than all the positive numbers.

The Schema 1.0 version of this datatype did not differentiate between 0 and −0, and NaN was equal to itself. The changes were made to make the datatype more closely mirror [IEEE 754-1985].

Lexical representation

double values have a lexical representation consisting of a mantissa followed, optionally, by the character "E" or "e", followed by an exponent. The exponent must be an integer. The mantissa must be a decimal number. The representations for exponent and mantissa must follow the lexical rules for integer and decimal. If the E or e and the following exponent are omitted, an exponent value of 0 is assumed.

For example, -1E4, 1267.43233E12, 12.78e-2, 12, -0, 0 and INF are all legal literals for double.

Canonical representation

The canonical representation for double is defined by prohibiting certain options from the Lexical representation. Specifically, the exponent must be indicated by "E". Leading zeroes and the preceding optional "+" sign are prohibited in the exponent. If the exponent is zero, it must be indicated by "E0". For the mantissa, the preceding optional "+" sign is prohibited and the decimal point is required. Leading and trailing zeroes are prohibited subject to the following: number representations must be normalized such that there is a single digit which is non-zero to the left of the decimal point and at least a single digit to the right of the decimal point unless the value being represented is zero. The canonical representation for zero is 0.0E0.
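The canonical form described above can be sketched in Python as follows. This is one possible mapping, not a normative one: it assumes Python's repr() (which yields the shortest digit string that round-trips) as the source of mantissa digits, and it renders both zeros as 0.0E0 because the text above gives only that form. The name canonical_double is hypothetical.

```python
from decimal import Decimal
import math

def canonical_double(x: float) -> str:
    """Render x with one non-zero digit before the point, at least one
    digit after it, and an 'E' exponent without '+' or leading zeroes."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "INF" if x > 0 else "-INF"
    if x == 0:
        return "0.0E0"  # assumption: sign of zero is not rendered
    sign, digits, exponent = Decimal(repr(x)).as_tuple()
    digits = list(digits)
    # Drop trailing zeroes, compensating in the exponent.
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    # Move the decimal point to just after the first digit.
    scientific_exponent = exponent + len(digits) - 1
    fraction = "".join(str(d) for d in digits[1:]) or "0"
    return f"{'-' if sign else ''}{digits[0]}.{fraction}E{scientific_exponent}"
```

For instance, the literals 100, 12.78e-2 and -1E4 map to 1.0E2, 1.278E-1 and -1.0E4 respectively under this sketch.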

Lexical Mapping

The lexical space of double is the set of all decimal numerals with or without a decimal point, numerals in scientific (exponential) notation, and the literals INF, -INF, and NaN.

Lexical Space

doubleRep | | |

The doubleRep production is equivalent to this regular expression:

(-|\+)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))((e|E)(-|\+)?[0-9]+)?|-?INF|NaN

The double datatype is designed to implement for schema processing the double-precision floating-point datatype of [IEEE 754-1985]. That specification does not specify specific lexical representations, but does prescribe requirements on any lexical mapping used. Any lexical mapping that maps the lexical space just described onto the value space, satisfies the requirements of [IEEE 754-1985], and correctly handles the special values (the INF, -INF, and NaN literals), satisfies the conformance requirements of this specification.

Since IEEE allows some variation in rounding of values, processors conforming to this specification may exhibit some variation in their lexical mappings.

The Schema 1.0 version of this datatype did not permit rounding algorithms whose results differed from .

duration

duration represents a duration of time. The value space of duration is a six-dimensional space where the coordinates designate the Gregorian year, month, day, hour, minute, and second components defined in § 5.5.3.2 of [ISO 8601], respectively. These components are ordered in their significance by their order of appearance, i.e., as year, month, day, hour, minute, and second.

All processors must support year values with a minimum of 4 digits (i.e., YYYY) and a minimum fractional second precision of milliseconds or three decimal digits (i.e., s.sss). However, processors may set an application-defined limit on the maximum number of digits they are prepared to support in these two cases, in which case that application-defined maximum number must be clearly documented.

duration is a datatype that represents durations of time. The concept of duration being captured is drawn from those of [ISO 8601], specifically durations without fixed endpoints. For example, 15 days (whose most common lexical representation in duration is P15D) is a duration value; 15 days beginning 12 July 1995 and 15 days ending 12 July 1995 are not. duration can provide addition and subtraction operations between duration values and between duration/dateTime value pairs, and can be the result of subtracting dateTime values. However, only addition to dateTime is required for XML Schema processing and is defined in the function .

Value Space

Duration values can be modelled as two-property tuples. Each value consists of an integer number of months and a decimal number of seconds. The seconds value must not be negative if the months value is positive and must not be positive if the months value is negative.

Properties of duration Values

months: an integer value
seconds: a decimal value; must not be negative if months is positive, and must not be positive if months is negative.

duration is partially ordered. Equality of duration is defined in terms of equality of dateTime; order for duration is defined in terms of the order of dateTime. Specifically, the equality or order of two duration values is determined by adding each duration in the pair to each of the following four dateTime values:

1696-09-01T00:00:00Z

1697-02-01T00:00:00Z

1903-03-01T00:00:00Z

1903-07-01T00:00:00Z

If all four resulting dateTime value pairs are ordered the same way (less than, equal, or greater than), then the original pair of duration values is ordered the same way; otherwise the original pair is incomparable.

These four values are chosen so as to maximize the possible differences in results that could occur, such as the difference when adding P1M and P30D: 1697-02-01T00:00:00Z + P1M < 1697-02-01T00:00:00Z + P30D, but 1903-03-01T00:00:00Z + P1M > 1903-03-01T00:00:00Z + P30D, so that P1M <> P30D. If two duration values are ordered the same way when added to each of these four dateTime values, they will retain the same order when added to any other dateTime values. Therefore, two duration values are incomparable if and only if they can ever result in different orders when added to any dateTime value.

Under the definition just given, two duration values are equal if and only if they are identical.

Two totally ordered datatypes ( and ) are derived from in .

There are many ways to implement , some of which do not base the implementation on the two-component model. This specification does not prescribe any particular implementation, as long as the visible results are isomorphic to those described herein.

See the conformance notes in , which apply to this datatype.

Lexical representation

The lexical representation for duration is the [ISO 8601] extended format PnYnMnDTnHnMnS, where nY represents the number of years, nM the number of months, nD the number of days, 'T' is the date/time separator, nH the number of hours, nM the number of minutes and nS the number of seconds. The number of seconds can include decimal digits to arbitrary precision.

The values of the Year, Month, Day, Hour and Minutes components are not restricted but allow an arbitrary integer. Similarly, the value of the Seconds component allows an arbitrary decimal. Thus, the lexical representation of duration does not follow the alternative format of § 5.5.3.2.1 of [ISO 8601].

An optional preceding minus sign ('-') is allowed, to indicate a negative duration. If the sign is omitted a positive duration is indicated. See also .

For example, to indicate a duration of 1 year, 2 months, 3 days, 10 hours, and 30 minutes, one would write: P1Y2M3DT10H30M. One could also indicate a duration of minus 120 days as: -P120D.

Reduced precision and truncated representations of this format are allowed provided they conform to the following:

If the number of years, months, days, hours, minutes, or seconds in any expression equals zero, the number and its corresponding designator may be omitted. However, at least one number and its designator must be present.

The seconds part may have a decimal fraction.

The designator 'T' shall be absent if all of the time items are absent. The designator 'P' must always be present.

For example, P1347Y, P1347M and P1Y2MT2H are all allowed; P0Y1347M and P0Y1347M0D are allowed. P-1347M is not allowed although -P1347M is allowed. P1Y2MT is not allowed.

Lexical Space

The lexical representations of are more or less based on the pattern: PnYnMnDTnHnMnS

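More precisely, the lexical space of duration is the set of character strings that satisfy durationLexicalRep as defined by the following productions:

Lexical Representation Fragments

duYearFrag: an unsigned integer numeral followed by Y
duMonthFrag: an unsigned integer numeral followed by M
duDayFrag: an unsigned integer numeral followed by D
duHourFrag: an unsigned integer numeral followed by H
duMinuteFrag: an unsigned integer numeral followed by M
duSecondFrag: an unsigned integer or decimal numeral followed by S
duYearMonthFrag: (duYearFrag duMonthFrag?) | duMonthFrag
duTimeFrag: T ((duHourFrag duMinuteFrag? duSecondFrag?) | (duMinuteFrag duSecondFrag?) | duSecondFrag)
duDayTimeFrag: (duDayFrag duTimeFrag?) | duTimeFrag

Lexical Representation

durationLexicalRep: -? P ((duYearMonthFrag duDayTimeFrag?) | duDayTimeFrag)

Thus, a durationLexicalRep consists of one or more of a duYearFrag, duMonthFrag, duDayFrag, duHourFrag, duMinuteFrag, and/or duSecondFrag, in order, with letters P and T (and perhaps a -) where appropriate.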

The language accepted by the durationLexicalRep production is the set of strings which satisfy all of the following three regular expressions:

The expression -?P([0-9]+Y)?([0-9]+M)?([0-9]+D)?(T([0-9]+H)?([0-9]+M)?((([0-9]+(\.[0-9]*)?)|(\.[0-9]+))S)?)? matches only strings in which the fields occur in the proper order.

The expression .*[YMDHS].* matches only strings in which at least one field occurs.

The expression .*[^T] matches only strings in which T is not the final character, so that if T appears, something follows it. The first rule ensures that what follows T will be an hour, minute, or second field.
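The three-expression test just described can be sketched in Python. A literal is accepted only if it matches all three patterns in full; the decimal point is escaped as Python's re module requires, and the function name is_duration_literal is hypothetical.

```python
import re

# 1. Fields occur in the proper order.
ORDERED_FIELDS = re.compile(
    r"-?P([0-9]+Y)?([0-9]+M)?([0-9]+D)?"
    r"(T([0-9]+H)?([0-9]+M)?((([0-9]+(\.[0-9]*)?)|(\.[0-9]+))S)?)?"
)
# 2. At least one field occurs.
AT_LEAST_ONE_FIELD = re.compile(r".*[YMDHS].*")
# 3. T is not the final character.
T_NOT_FINAL = re.compile(r".*[^T]")

def is_duration_literal(text: str) -> bool:
    """True when text satisfies all three expressions."""
    patterns = (ORDERED_FIELDS, AT_LEAST_ONE_FIELD, T_NOT_FINAL)
    return all(p.fullmatch(text) is not None for p in patterns)

allowed = ["P1347Y", "P1347M", "P1Y2MT2H", "P0Y1347M", "P0Y1347M0D",
           "-P1347M", "P1Y2M3DT10H30M", "-P120D", "PT1.5S"]
not_allowed = ["P-1347M", "P1Y2MT", "P", "PT", "1Y", "P1M2Y"]
```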

The intersection of these three regular expressions is equivalent to the following (after removal of the white space inserted here for legibility):

-?P(((([0-9]+Y([0-9]+M)?)|

      (       ([0-9]+M) ) )(([0-9]+D(T(([0-9]+H([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)|

                                       (       ([0-9]+M) ([0-9]+(\.[0-9]+)?S)?)|

                                       (                 ([0-9]+(\.[0-9]+)?S) ) ))?)|

                            (       (T(([0-9]+H([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)|

                                       (       ([0-9]+M) ([0-9]+(\.[0-9]+)?S)?)|

                                       (                 ([0-9]+(\.[0-9]+)?S) ) )) ) )?)|

    (                      (([0-9]+D(T(([0-9]+H([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)|

                                       (       ([0-9]+M) ([0-9]+(\.[0-9]+)?S)?)|

                                       (                 ([0-9]+(\.[0-9]+)?S) ) ))?)|

                            (       (T(([0-9]+H([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)|

                                       (       ([0-9]+M) ([0-9]+(\.[0-9]+)?S)?)|

                                       (                 ([0-9]+(\.[0-9]+)?S) ) )) ) ) ) )

The for is .

The canonical mapping for is .

Order relation on duration

In general, the order relation on duration is a partial order since there is no determinate relationship between certain durations such as one month (P1M) and 30 days (P30D). The order relation of two duration values x and y is x < y iff s+x < s+y for each qualified dateTime s in the list below. These values for s cause the greatest deviations in the addition of dateTimes and durations. Addition of durations to time instants is defined in .

1696-09-01T00:00:00Z

1697-02-01T00:00:00Z

1903-03-01T00:00:00Z

1903-07-01T00:00:00Z

The following table shows the strongest relationship that can be determined between example durations. The symbol <> means that the order relation is indeterminate. Note that because of leap-seconds, a seconds field can vary from 59 to 60. However, because of the way that addition is defined in , they are still totally ordered.

	Relation
P1Y	> P364D	<> P365D				<> P366D	< P367D
P1M	> P27D	<> P28D	<> P29D		<> P30D	<> P31D	< P32D
P5M	> P149D	<> P150D	<> P151D	<> P152D		<> P153D	< P154D
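The relations in the table can be reproduced with a short Python sketch of the four-reference-point comparison. It is a simplification under stated assumptions: durations are modelled as hypothetical (months, seconds) pairs, the reference dateTimes are held as naive datetime objects, and month addition relies on every reference point falling on the first day of a month, so no day pinning is needed. It does not implement the general addition algorithm.

```python
from datetime import datetime, timedelta

# The four reference dateTimes listed above.
REFERENCE_POINTS = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]

def months(n):
    """Duration of n months as a (months, seconds) pair."""
    return (n, 0)

def days(n):
    """Duration of n days as a (months, seconds) pair."""
    return (0, n * 86400)

def add_duration(start, duration):
    """Add (months, seconds) to a dateTime whose day-of-month is 1."""
    month_count, seconds = duration
    year, month0 = divmod(start.year * 12 + (start.month - 1) + month_count, 12)
    return start.replace(year=year, month=month0 + 1) + timedelta(seconds=seconds)

def compare_durations(x, y):
    """Return -1, 0, or 1 if all four reference points agree, else None."""
    outcomes = set()
    for s in REFERENCE_POINTS:
        a, b = add_duration(s, x), add_duration(s, y)
        outcomes.add((a > b) - (a < b))
    return outcomes.pop() if len(outcomes) == 1 else None
```

A result of None corresponds to the <> symbol in the table; for example, compare_durations(months(1), days(30)) is None, while compare_durations(months(1), days(27)) is 1.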

Implementations are free to optimize the computation of the ordering relationship. For example, the following table can be used to compare durations of a small number of months against days.

	Months	1	2	3	4	5	6	7	8	9	10	11	12	13	...
Days	Minimum	28	59	89	120	150	181	212	242	273	303	334	365	393	...
Days	Maximum	31	62	92	123	153	184	215	245	276	306	337	366	397	...
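The minimum and maximum day counts in this table can be recomputed rather than hard-coded. The sketch below is an assumption-laden illustration: it scans every starting month in one full 400-year Gregorian cycle (1600 to 1999, chosen arbitrarily) using Python's datetime.date, and the function name day_range is hypothetical.

```python
from datetime import date

def day_range(month_count: int, first_year: int = 1600, last_year: int = 1999):
    """Return (minimum, maximum) number of days spanned by month_count
    consecutive calendar months, over all starting months in the range."""
    spans = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            end_year, end_month0 = divmod(year * 12 + (month - 1) + month_count, 12)
            span = date(end_year, end_month0 + 1, 1) - date(year, month, 1)
            spans.append(span.days)
    return min(spans), max(spans)

MINIMUM = [day_range(n)[0] for n in range(1, 14)]
MAXIMUM = [day_range(n)[1] for n in range(1, 14)]
```

A duration of n months is then less than any number of days above MAXIMUM[n-1], greater than any number below MINIMUM[n-1], and incomparable with day counts in between.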

Totally ordered durations

Certain derived datatypes of durations can be guaranteed to have a total order. For this, they must have fields from only one row in the list below and the time zone must either be required or prohibited.

year, month

day, hour, minute, second

For example, a datatype could be defined to correspond to the SQL datatype Year-Month interval that required a four digit year field and a two digit month field but required all other fields to be unspecified. This datatype could be defined as below and would have a total order.

<simpleType name='SQL-Year-Month-Interval'>
  <restriction base='duration'>
    <pattern value='P\p{Nd}{4}Y\p{Nd}{2}M'/>
  </restriction>
</simpleType>

Related Datatypes

dateTime

dateTime values may be viewed as objects with integer-valued year, month, day, hour and minute properties, a decimal-valued second property, and a boolean timezoned property. Each such object also has one decimal-valued method or computed property, timeOnTimeline, whose value is always a decimal number; the values are dimensioned in seconds, the integer 0 is 0001-01-01T00:00:00 and the value of timeOnTimeline for other dateTime values is computed using the Gregorian algorithm as modified for leap-seconds. The timeOnTimeline values form two related "timelines", one for timezoned values and one for non-timezoned values. Each timeline is a copy of the value space of decimal, with integers given units of seconds.

dateTime represents instants of time, optionally marked with a particular timezone. Values representing the same instant but having different timezones are equal but not identical.

The value space of dateTime is closely related to the dates and times described in ISO 8601. For clarity, the text above specifies a particular origin point for the timeline. It should be noted, however, that schema processors need not expose the timeOnTimeline value to schema users, and there is no requirement that a timeline-based implementation use the particular origin described here in its internal representation. Other interpretations of the value space which lead to the same results (i.e., are isomorphic) are of course acceptable.

All timezoned times are Coordinated Universal Time (UTC, sometimes called "Greenwich Mean Time"). Other timezones indicated in lexical representations are converted to UTC during conversion of literals to values. "Local" or untimezoned times are presumed to be the time in the timezone of some unspecified locality as prescribed by the appropriate legal authority; currently there are no legally prescribed timezones which are durations whose magnitude is greater than 14 hours. The value of each numeric-valued property (other than timeOnTimeline) is limited to the maximum value within the interval determined by the next-higher property. For example, the day value can never be 32, and cannot even be 29 for month 02 and year 2002 (February 2002).

The date and time datatypes described in this recommendation were inspired by [ISO 8601]. '0001' is the lexical representation of the year 1 of the Common Era (1 CE, sometimes written "AD 1" or "1 AD"). There is no year 0, and '0000' is not a valid lexical representation. '-0001' is the lexical representation of the year 1 Before Common Era (1 BCE, sometimes written "1 BC").

Those using this (1.0) version of this Recommendation to represent negative years should be aware that the interpretation of lexical representations beginning with a '-' is likely to change in subsequent versions.

makes no mention of the year 0; in the form '0000' was disallowed and this recommendation disallows it as well. However, , which became available just as we were completing version 1.0, allows the form '0000', representing the year 1 BCE. A number of external commentators have also suggested that '0000' be allowed, as the lexical representation for 1 BCE, which is the normal usage in astronomical contexts. It is the intention of the XML Schema Working Group to allow '0000' as a lexical representation in the dateTime, date, gYear, and gYearMonth datatypes in a subsequent version of this Recommendation. '0000' will be the lexical representation of 1 BCE (which is a leap year), '-0001' will become the lexical representation of 2 BCE (not 1 BCE as in this (1.0) version), '-0002' of 3 BCE, etc.

See the conformance note in which applies to this datatype as well.

Value Space

uses the , with no properties except permitted to be absent. The property remains .

In version 1.0 of this specification, the year property was not permitted to have the value zero. The year 1 BCE was represented by a year value of −1, 2 BCE by −2, and so forth. In this version of this specification, two changes are made in order to agree with existing usage. First, year is permitted to have the value zero. Second, the interpretation of year values is changed accordingly: a year value of zero represents 1 BCE, −1 represents 2 BCE, etc. This representation simplifies interval arithmetic and leap-year calculation for dates before the common era.

Note that 1 BCE, 5 BCE, and so on (years 0000, -0004, etc. in the lexical representation defined here) are leap years in the proleptic Gregorian calendar used for the date/time datatypes defined here. Version 1.0 of this specification was unclear about the treatment of leap years before the common era; caution should be used if existing schemas or data specify dates of 29 February for any years before the common era. With that possible exception, schemas and data valid under the old interpretation remain valid under the new.

Day-of-month Values

The day value must be no more than 30 if month is one of 4, 6, 9, or 11; no more than 28 if month is 2 and year is not divisible by 4, or is divisible by 100 but not by 400; and no more than 29 if month is 2 and year is divisible by 400, or by 4 but not by 100.
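The day-of-month constraint translates directly into code. The following Python sketch assumes the year numbering of this version of the specification (a year value of 0 denotes 1 BCE, −4 denotes 5 BCE); the function name max_day is hypothetical.

```python
def max_day(year: int, month: int) -> int:
    """Largest permitted day value for the given year and month."""
    if month in (4, 6, 9, 11):
        return 30
    if month == 2:
        # Divisible by 400, or by 4 but not by 100.
        leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
        return 29 if leap else 28
    return 31
```

Python's % operator returns a non-negative remainder for negative operands, so years before the common era need no special handling: max_day(0, 2) and max_day(-4, 2) both yield 29.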

See the conformance note in which applies to the and values of this datatype.

Equality and order are as prescribed in . dateTime values are ordered by their timeOnTimeline value.

Although and other types related to dates and times have only a partial order, it is possible for datatypes derived from to have total orders, if they are restricted (e.g. using the facet) to the subset of values with, or the subset of values without, timezones. Similar restrictions on other date- and time-related types will similarly produce totally ordered subtypes. Note, however, that such restrictions do not affect the value shown, for a given , in the facet.

Order and equality are essentially the same for in this version of this specification as they were in version 1.0. However, since values now distinguish timezones, equal values with different s are not identical, and values with extreme s may no longer be equal to any value with a smaller .

Lexical representation

The lexical space of dateTime consists of finite-length sequences of characters of the form: '-'? yyyy '-' mm '-' dd 'T' hh ':' mm ':' ss ('.' s+)? (zzzzzz)?, where

'-'? yyyy is a four-or-more digit optionally negative-signed numeral that represents the year; if more than four digits, leading zeroes are prohibited, and '0000' is prohibited (see the Note above; also note that a plus sign is not permitted);

the remaining '-'s are separators between parts of the date portion;

the first mm is a two-digit numeral that represents the month;

dd is a two-digit numeral that represents the day;

'T' is a separator indicating that time-of-day follows;

hh is a two-digit numeral that represents the hour; '24' is permitted if the minutes and seconds represented are zero, and the dateTime value so represented is the first instant of the following day (the hour property of a dateTime object in the value space cannot have a value greater than 23);

':' is a separator between parts of the time-of-day portion;

the second mm is a two-digit numeral that represents the minute;

ss is a two-integer-digit numeral that represents the whole seconds;

'.' s+ (if present) represents the fractional seconds;

zzzzzz (if present) represents the timezone (as described below).

For example, 2002-10-10T12:00:00-05:00 (noon on 10 October 2002, Central Daylight Savings Time as well as Eastern Standard Time in the U.S.) is 2002-10-10T17:00:00Z, five hours later than 2002-10-10T12:00:00Z.

For further guidance on arithmetic with dateTimes and durations, see .

Canonical representation

Except for trailing fractional zero digits in the seconds representation, '24:00:00' time representations, and timezone (for timezoned values), the mapping from literals to values is one-to-one. Where there is more than one possible representation, the canonical representation is as follows:

The 2-digit numeral representing the hour must not be '24';

The fractional second string, if present, must not end in '0';

for timezoned values, the timezone must be represented with 'Z' (all timezoned dateTime values are UTC).

Lexical Mappings

The lexical representations for dateTime are as follows:

Lexical Space

dateTimeLexicalRep - - T (( : : ) | ) ?

Day-of-month Representations

Within a dateTimeLexicalRep, the day numeral must not begin with the digit 3 or be 29 unless the value to which it would map would satisfy the value constraint on day values (Constraint: Day-of-month Values) given above.

In such representations:

is a numeral consisting of at least four decimal digits, optionally preceded by a minus sign; leading 0 digits are prohibited except to bring the digit count up to four. It represents the value.

Subsequent -, T, and :, separate the various numerals.

, , , and are numerals consisting of exactly two decimal digits. They represent the , , , and values respectively.

is a numeral consisting of exactly two decimal digits, or two decimal digits, a decimal point, and one or more trailing digits. It represents the value.

Alternatively, combines the , , , and their separators to represent midnight of the day, which is the first moment of the next day.

, if present, specifies the timezone in which the moment occurs. Timezones are a count of minutes (expressed in as a count of hours and minutes) that are added or subtracted from UTC time to get the local time. Z is an alternative representation of the timezone of UTC, which is, of course, zero minutes from UTC.

For example, 2002-10-10T12:00:00−05:00 (noon on 10 October 2002, Central Daylight Savings Time as well as Eastern Standard Time in the U.S.) is equal to 2002-10-10T17:00:00Z, five hours later than 2002-10-10T12:00:00Z.

The dateTimeLexicalRep production is equivalent to this regular expression once whitespace is removed.

\-?(([1-9][0-9][0-9][0-9]+)|(0[0-9][0-9][0-9]))\-((0[1-9])|(1[0-2]))\-((0[1-9])|([12][0-9])|(3[01])) T(((([01][0-9])|(2[0-3])):[0-5][0-9]:([0-5][0-9])(\.[0-9]+)?)|(24:00:00(\.0+)?)) (Z|([+\-]((0[0-9])|(1[0-4])):[0-5][0-9]))?

Note that neither the dateTimeLexicalRep production nor this regular expression alone enforces the day-of-month constraint given above.
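A lexical check along these lines can be sketched in Python. The pattern below is an assumption-based rendering rather than a normative one: each alternation is grouped explicitly so that it applies only to its own field, and Z is accepted as a timezone because the text describes it as the alternative representation of UTC. The function name is_datetime_literal is hypothetical.

```python
import re

DATETIME_REP = re.compile(
    r"-?(?:[1-9][0-9]{3,}|0[0-9]{3})"          # year: at least four digits
    r"-(?:0[1-9]|1[0-2])"                      # month: 01..12
    r"-(?:0[1-9]|[12][0-9]|3[01])"             # day: 01..31
    r"T(?:(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](?:\.[0-9]+)?"
    r"|24:00:00(?:\.0+)?)"                     # time of day, or end of day
    r"(?:Z|[+-](?:0[0-9]|1[0-4]):[0-5][0-9])?" # optional timezone
)

def is_datetime_literal(text: str) -> bool:
    """Lexical check only; the day-of-month constraint is not enforced."""
    return DATETIME_REP.fullmatch(text) is not None
```

As the note above says, the pattern alone is not sufficient: is_datetime_literal("2002-02-30T00:00:00") is true even though 30 February violates the day-of-month constraint, so a separate check on the day value is still needed.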

The for is . The is .

Timezones

Timezones are durations with (integer-valued) hour and minute properties (with the hour magnitude limited to at most 14, and the minute magnitude limited to at most 59, except that if the hour magnitude is 14, the minute value must be 0); they may be both positive or both negative.

The lexical representation of a timezone is a string of the form: (('+' | '-') hh ':' mm) | 'Z', where

hh is a two-digit numeral (with leading zeroes as required) that represents the hours,

mm is a two-digit numeral that represents the minutes,

'+' indicates a nonnegative duration,

'-' indicates a nonpositive duration.

The mapping so defined is one-to-one, except that '+00:00', '-00:00', and 'Z' all represent the same zero-length duration timezone, UTC; 'Z' is its canonical representation.

When a timezone is added to a dateTime, the result is the date and time "in that timezone". For example, 2002-10-10T12:00:00+05:00 is 2002-10-10T07:00:00Z and 2002-10-10T00:00:00+05:00 is 2002-10-09T19:00:00Z.
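The conversion in these examples can be reproduced with Python's standard datetime module, used here only as an illustration. The sketch assumes literals with a numeric offset (a trailing Z is not accepted by datetime.fromisoformat before Python 3.11), and the function name to_utc is hypothetical.

```python
from datetime import datetime, timezone

def to_utc(literal: str) -> datetime:
    """Parse a timezoned dateTime literal and express it in UTC."""
    parsed = datetime.fromisoformat(literal)
    if parsed.tzinfo is None:
        # Untimezoned values have no determinate UTC equivalent.
        raise ValueError("literal has no timezone")
    return parsed.astimezone(timezone.utc)

noon = to_utc("2002-10-10T12:00:00+05:00")      # 07:00 UTC the same day
midnight = to_utc("2002-10-10T00:00:00+05:00")  # 19:00 UTC the previous day
```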

Order relation on dateTime

dateTime value objects on either timeline are totally ordered by their timeOnTimeline values; between the two timelines, dateTime value objects are ordered by their timeOnTimeline values when their timeOnTimeline values differ by more than fourteen hours, with those whose difference is a duration of 14 hours or less being incomparable.

In general, the order relation on dateTime is a partial order since there is no determinate relationship between certain instants. For example, there is no determinate ordering between (a) 2000-01-20T12:00:00 and (b) 2000-01-20T12:00:00Z. Based on timezones currently in use, (a) could vary from 2000-01-20T12:00:00+12:00 to 2000-01-20T12:00:00-13:00. It is, however, possible for this range to expand or contract in the future, based on local laws. Because of this, the following definition uses a somewhat broader range of indeterminate values: +14:00..-14:00.

The following definition uses the notation S[year] to represent the year field of S, S[month] to represent the month field, and so on. The notation (Q & "-14:00") means adding the timezone -14:00 to Q, where Q did not already have a timezone. This is a logical explanation of the process. Actual implementations are free to optimize as long as they produce the same results.

The ordering between two dateTimes P and Q is defined by the following algorithm:

A. Normalize P and Q. That is, if there is a timezone present, but it is not Z, convert it to Z using the addition operation defined in

Thus 2000-03-04T23:00:00+03:00 normalizes to 2000-03-04T20:00:00Z

B. If P and Q either both have a time zone or both do not have a time zone, compare P and Q field by field from the year field down to the second field, and return a result as soon as it can be determined. That is:

For each i in {year, month, day, hour, minute, second}

If P[i] and Q[i] are both not specified, continue to the next i

If P[i] is not specified and Q[i] is, or vice versa, stop and return P <> Q

If P[i] < Q[i], stop and return P < Q

If P[i] > Q[i], stop and return P > Q

Stop and return P = Q

C. Otherwise, if P contains a time zone and Q does not, compare as follows:

P < Q if P < (Q with time zone +14:00)

P > Q if P > (Q with time zone -14:00)

P <> Q otherwise, that is, if (Q with time zone +14:00) < P < (Q with time zone -14:00)

D. Otherwise, if P does not contain a time zone and Q does, compare as follows:

P < Q if (P with time zone -14:00) < Q.

P > Q if (P with time zone +14:00) > Q.

P <> Q otherwise, that is, if (P with time zone +14:00) < Q < (P with time zone -14:00)

Examples:

Determinate	Indeterminate
2000-01-15T00:00:00 < 2000-02-15T00:00:00	2000-01-01T12:00:00 <> 1999-12-31T23:00:00Z
2000-01-15T12:00:00 < 2000-01-16T12:00:00Z	2000-01-16T12:00:00 <> 2000-01-16T12:00:00Z
	2000-01-16T00:00:00 <> 2000-01-16T12:00:00Z
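The algorithm and the examples in the table can be traced with a simplified Python sketch. It assumes fully specified dateTimes only (steps that deal with unspecified fields are omitted), represents untimezoned values as naive datetime objects and timezoned values as aware ones, and uses the hypothetical name compare_datetimes. A result of None stands for <>.

```python
from datetime import datetime, timedelta, timezone

FOURTEEN_HOURS = timedelta(hours=14)

def _normalize(value: datetime) -> datetime:
    """Step A: convert a timezoned value to UTC and drop the marker."""
    return value.astimezone(timezone.utc).replace(tzinfo=None)

def compare_datetimes(p: datetime, q: datetime):
    """Return -1, 0, 1, or None when the order is indeterminate."""
    p_zoned, q_zoned = p.tzinfo is not None, q.tzinfo is not None
    if p_zoned == q_zoned:                       # step B
        if p_zoned:
            p, q = _normalize(p), _normalize(q)
        return (p > q) - (p < q)
    if p_zoned:                                  # step C
        p = _normalize(p)
        # Q with time zone +14:00 is the UTC instant q - 14h;
        # Q with time zone -14:00 is the UTC instant q + 14h.
        if p < q - FOURTEEN_HOURS:
            return -1
        if p > q + FOURTEEN_HOURS:
            return 1
        return None
    swapped = compare_datetimes(q, p)            # step D, by symmetry
    return None if swapped is None else -swapped

utc = timezone.utc
determinate = compare_datetimes(datetime(2000, 1, 15, 12),
                                datetime(2000, 1, 16, 12, tzinfo=utc))
indeterminate = compare_datetimes(datetime(2000, 1, 16, 12),
                                  datetime(2000, 1, 16, 12, tzinfo=utc))
```

The strict inequalities mean that a difference of exactly fourteen hours is still treated as indeterminate.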

Totally ordered dateTimes

Certain derived types from dateTime can be guaranteed to have a total order. To do so, they must require that a specific set of fields are always specified, and that remaining fields (if any) are always unspecified. For example, the date datatype without time zone is defined to contain exactly year, month, and day. Thus dates without time zone have a total order among themselves.

time

time represents an instant of time that recurs every day. The value space of time is the space of time of day values as defined in § 5.3 of . Specifically, it is a set of zero-duration daily time instances.

time represents instants of time that recur at the same point in each calendar day, or that occur in some arbitrary calendar day.

Since the lexical representation allows an optional time zone indicator, time values are partially ordered because it may not be possible to determine the order of two values one of which has a time zone and the other does not. The order relation on time values is the order relation on dateTime using an arbitrary date. See also . Pairs of time values with or without time zone indicators are totally ordered.

See the conformance note in which applies to the seconds part of this datatype as well.

Value Space

uses the , with , , and required to be absent. remains .

See the conformance note in which applies to the value of this datatype.

Equality and order are as prescribed in . values (points in time in an arbitrary day) are ordered taking into account their .

A calendar (or local time) day with an early timezone begins earlier than the same calendar day with a later timezone. Since the timezones allowed spread over 28 hours, there are timezone pairs for which a given calendar day in the two timezones are totally disjoint: the earlier day ends before the same day starts in the later timezone. The moments in time represented by a single calendar day are spread over a 52-hour interval, from the beginning of the day in the +14:00 timezone to the end of that day in the −14:00 timezone.

Since the order of a value having a timezone with another value whose timezone is absent is determined by imputing timezones of both +14:00 and −14:00 to the untimezoned value, many such combinations will be incomparable because the two imputed timezones yield different orders. However, for a given untimezoned value, there will always be timezoned values at one or both ends of the 52-hour interval that are comparable (because the interval of incomparability is only 24 hours wide).

Examples that show the difference from version 1.0 of this specification (see for the notations):

A day is a calendar (or local time) day in each timezone.

08:00:00+10:00 < 17:00:00+10:00 (just as 08:00:00Z has always been less than 17:00:00Z, but in version 1.0 08:00:00+10:00 > 17:00:00+10:00 )

A value in a calendar day with an early timezone may precede every value in a later calendar day:

00:00:00+01:00 is less than every value with Z

A calendar day with a very early timezone may be completely disjoint from a calendar day with a very late timezone:

Each value with +13:00 is less than every value with −13:00

values do not always convert to in the same way as in 1.0, since a time in a timezone may convert to a time on a different day (whereas time conversions in version 1.0 wrapped around by ignoring the day during conversion):

22:00:00Z > 03:00:00+05:00 (since 1971-12-31T03:00:00+05:00 is 1971-12-30T22:00:00Z, not 1971-12-31T22:00:00Z; in the previous version of this specification 22:00:00Z = 03:00:00+05:00)
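The examples above can be checked with a small Python sketch that places each time value on one arbitrary calendar day and compares the resulting instants. The helper names tz and on_timeline, and the choice of 1971-12-31 as the arbitrary day, are assumptions made for illustration only.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: any fixed calendar day would do; this one is only illustrative.
ARBITRARY_DAY = date(1971, 12, 31)


def tz(hours):
    """Fixed-offset time zone, e.g. tz(5) for +05:00."""
    return timezone(timedelta(hours=hours))


def on_timeline(t):
    """Instant at which the timezoned time value t occurs on the arbitrary day."""
    return datetime.combine(ARBITRARY_DAY, t)


# 08:00:00+10:00 < 17:00:00+10:00
same_zone = on_timeline(time(8, tzinfo=tz(10))) < on_timeline(time(17, tzinfo=tz(10)))

# 22:00:00Z > 03:00:00+05:00, because 03:00 at +05:00 falls on the previous UTC day
cross_day = on_timeline(time(22, tzinfo=timezone.utc)) > on_timeline(time(3, tzinfo=tz(5)))

# The latest value with +13:00 still precedes the earliest value with -13:00
disjoint = on_timeline(time(23, 59, 59, tzinfo=tz(13))) < on_timeline(time(0, tzinfo=tz(-13)))
```

No wrap-around is applied during conversion: the day of the resulting instant is allowed to change, which is what distinguishes this order from the one in version 1.0.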

Lexical representation

The lexical representation for time is the left truncated lexical representation for dateTime: hh:mm:ss.sss with optional following time zone indicator. For example, to indicate 1:20 pm for Eastern Standard Time which is 5 hours behind Coordinated Universal Time (UTC), one would write: 13:20:00-05:00. See also .

Canonical representation

The canonical representation for time is defined by prohibiting certain options from the lexical representation. Specifically, either the time zone must be omitted or, if present, the time zone must be Coordinated Universal Time (UTC) indicated by a "Z". Additionally, the canonical representation for midnight is 00:00:00.

Lexical Mappings

The lexical representations for time are projections of those of dateTime, as follows: Lexical Space timeLexicalRep (( : : ) | ) ? The production is equivalent to this regular expression, once whitespace is removed:

(((([01][0-9])|(2[0-3])):([0-5][0-9]):(([0-5][0-9])(\.[0-9]+)?)) |(24:00:00(\.0+)?)) (Z|((\+|-)(0[0-9]|1[0-4]):[0-5][0-9]))?

Note that neither the production nor this regular expression alone enforces the constraint on given above.
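As an illustration, the regular expression for time literals can be tried out in Python. The sketch below assumes the expression with whitespace removed and with the plus sign escaped; the names TIME_RE and is_time_literal are hypothetical. As stated above, a match does not by itself establish that every constraint on the value is met.

```python
import re

# The expression for time literals, whitespace removed, "+" escaped.
TIME_RE = re.compile(
    r"(((([01][0-9])|(2[0-3])):([0-5][0-9]):(([0-5][0-9])(\.[0-9]+)?))"
    r"|(24:00:00(\.0+)?))"
    r"(Z|((\+|-)(0[0-9]|1[0-4]):[0-5][0-9]))?"
)


def is_time_literal(s):
    """True if the whole string matches the expression (syntax check only)."""
    return TIME_RE.fullmatch(s) is not None
```

For instance, is_time_literal("13:20:00-05:00") holds, while is_time_literal("25:00:00") does not.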

The for is ; the is .

date

date represents top-open intervals of exactly one day in length on the timelines of dateTime, beginning on the beginning moment of each day (in each timezone), up to but not including the beginning moment of the next day. For nontimezoned values, the top-open intervals disjointly cover the nontimezoned timeline, one per day. For timezoned values, the intervals begin at every minute and therefore overlap.

A "date object" is an object with year, month, and day properties just like those of dateTime objects, plus an optional timezone-valued timezone property. (As with values of dateTime, timezones are a special case of durations.) Just as a dateTime object corresponds to a point on one of the timelines, a date object corresponds to an interval on one of the two timelines as just described.

Timezoned date values track the starting moment of their day, as determined by their timezone; said timezone is generally recoverable for canonical representations. The recoverable timezone is that duration which is the result of subtracting the first moment (or any moment) of the timezoned date from the first moment (or the corresponding moment) on the same date. Recoverable timezones are always durations between '+12:00' and '-11:59'. This "timezone normalization" (which follows automatically from the definition of the date value space) is explained more in .

For example: the first moment of 2002-10-10+13:00 is 2002-10-10T00:00:00+13, which is 2002-10-09T11:00:00Z, which is also the first moment of 2002-10-09-11:00. Therefore 2002-10-10+13:00 is 2002-10-09-11:00; they are the same interval.

For most timezones, either the first moment or last moment of the day (a dateTime value, always UTC) will have a date portion different from that of the date itself! However, noon of that date (the midpoint of the interval) in that (normalized) timezone will always have the same date portion as the date itself, even when that noon point in time is normalized to UTC. For example, 2002-10-10+05:00 begins during 2002-10-09Z and 2002-10-10-05:00 ends during 2002-10-11Z, but noon of both 2002-10-10-05:00 and 2002-10-10+05:00 falls in the interval which is 2002-10-10Z.
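The timezone normalization described above, and the 2002-10-10+13:00 example in particular, can be reproduced with a short Python sketch. The function names first_moment and normalize are hypothetical, and offsets are modeled as timedelta values; this is an illustration of the arithmetic, not a definition.

```python
from datetime import date, datetime, timedelta, timezone


def first_moment(d, offset):
    """UTC instant at which calendar day d begins in a zone with this offset."""
    start = datetime(d.year, d.month, d.day, tzinfo=timezone(offset))
    return start.astimezone(timezone.utc)


def normalize(d, offset):
    """Equivalent (date, offset) pair whose offset lies in -11:59 .. +12:00."""
    one_day = timedelta(days=1)
    if offset > timedelta(hours=12):
        # Very large offset: same interval, one day earlier, offset minus 24h.
        return d - one_day, offset - one_day
    if offset < timedelta(hours=-11, minutes=-59):
        # Very negative offset: same interval, one day later, offset plus 24h.
        return d + one_day, offset + one_day
    return d, offset
```

With these helpers, normalize(date(2002, 10, 10), timedelta(hours=13)) gives 2002-10-09 with offset -11:00, and both forms have the same first moment, 2002-10-09T11:00:00Z.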

See the conformance note in which applies to the year part of this datatype as well.

Value Space

uses the , with , , and required to be absent. remains .

Day-of-month Values

The day value must be no more than 30 if month is one of 4, 6, 9, or 11, no more than 28 if month is 2 and year is not divisible by 4, or is divisible by 100 but not by 400, and no more than 29 if month is 2 and year is divisible by 400, or by 4 but not by 100.
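The day-of-month constraint can be written as a small function. The sketch below is illustrative; the name max_day_of_month is hypothetical, and the optional year parameter models datatypes in which the year is absent, where February is allowed 29 days.

```python
def max_day_of_month(month, year=None):
    """Largest permitted day value for the given month.

    Assumption: year=None models a value whose year is absent; in that
    case February is allowed up to 29 days.
    """
    if month in (4, 6, 9, 11):
        return 30
    if month == 2:
        if year is None:
            return 29
        # Divisible by 400, or by 4 but not by 100.
        leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
        return 29 if leap else 28
    return 31
```

For example, max_day_of_month(2, 1900) is 28 because 1900 is divisible by 100 but not by 400, while max_day_of_month(2, 2000) is 29.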

See the conformance note in which applies to the value of this datatype.

Equality and order are as prescribed in .

In version 1.0 of this specification, values did not retain a timezone explicitly, but for timezones not too far from their timezone could be recovered based on their value's first moment on the timeline. The retains all timezones.

Examples that show the difference from version 1.0 (see for the notations):

A day is a calendar (or local time) day in each timezone, including the timezones outside of +12:00 through -11:59 inclusive:

2000-12-12+13:00 < 2000-12-12+11:00 (just as 2000-12-12+12:00 has always been less than 2000-12-12+11:00, but in version 1.0 2000-12-12+13:00 > 2000-12-12+11:00 , since 2000-12-12+13:00's recoverable timezone was −11:00)

Similarly:

2000-12-12+13:00 = 2000-12-13−11:00 (whereas under 1.0, as just stated, 2000-12-12+13:00 = 2000-12-12−11:00)

Lexical representation

For the following discussion, let the "date portion" of a dateTime or date object be an object similar to a dateTime or date object, with similar year, month, and day properties, but no others, having the same value for these properties as the original dateTime or date object.

The lexical space of date consists of finite-length sequences of characters of the form: '-'? yyyy '-' mm '-' dd zzzzzz? where the date and optional timezone are represented exactly the same way as they are for dateTime. The first moment of the interval is that represented by: '-'? yyyy '-' mm '-' dd 'T00:00:00' zzzzzz? and the least upper bound of the interval is the timeline point represented (noncanonically) by: '-'? yyyy '-' mm '-' dd 'T24:00:00' zzzzzz?.

The recoverable timezone of a date will always be a duration between '+12:00' and '-11:59'. Timezone lexical representations, as explained for dateTime, can range from '+14:00' to '-14:00'. The result is that literals of dates with very large or very negative timezones will map to a "normalized" date value with a recoverable timezone different from that represented in the original representation, and a matching difference of +/- 1 day in the date itself.

Canonical representation

Given a member of the date value space, the date portion of the canonical representation (the entire representation for nontimezoned values, and all but the timezone representation for timezoned values) is always the date portion of the dateTime canonical representation of the interval midpoint (the dateTime representation, truncated on the right to eliminate 'T' and all following characters). For timezoned values, append the canonical representation of the recoverable timezone.

Lexical Mappings

The lexical representations for are projections of those of , as follows: Lexical Space dateLexicalRep - - ? Day-of-month Representations

Within a , a must not begin with the digit 3 or be 29 unless the value to which it would map would satisfy the value constraint on values (Constraint: Day-of-month Values) given above.

The production is equivalent to this regular expression:

\-?(([1-9][0-9][0-9][0-9]+)|(0[0-9][0-9][0-9]))\-((0[1-9])|(1[0-2]))\-(([0-2][0-9])|(3[01]))(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

Note that neither the production nor this regular expression alone enforce the constraint on given above.

The for is . The is .

gYearMonth

gYearMonth represents a specific Gregorian month in a specific Gregorian year. The value space of gYearMonth is the set of Gregorian calendar months as defined in § 5.2.1 of . Specifically, it is a set of one-month long, non-periodic instances, e.g. 1999-10 to represent the whole month of 1999-10, independent of how many days this month has.

gYearMonth represents specific whole Gregorian months in specific Gregorian years.

Since the lexical representation allows an optional time zone indicator, gYearMonth values are partially ordered because it may not be possible to unequivocally determine the order of two values one of which has a time zone and the other does not. If gYearMonth values are considered as periods of time, the order relation on gYearMonth values is the order relation on their starting instants. This is discussed in . See also . Pairs of gYearMonth values with or without time zone indicators are totally ordered.

Because month/year combinations in one calendar only rarely correspond to month/year combinations in other calendars, values of this type are not, in general, convertible to simple values corresponding to month/year combinations in other calendars. This type should therefore be used with caution in contexts where conversion to other calendars is desired.

See the conformance note in which applies to the year part of this datatype as well.

Value Space

uses the , with , , , and required to be absent. remains .

See the conformance note in which applies to the value of this datatype.

Equality and order are as prescribed in .

In version 1.0 of this specification, values did not retain a timezone explicitly, but timezones not too far from could be recovered based on the value's first moment on the timeline. The simply retains all timezones.

An example that shows the difference from version 1.0 (see for the notations):

A day is a calendar (or local time) day in each timezone, including the timezones outside of +12:00 through −11:59 inclusive:

2000-12+13:00 < 2000-12+11:00 (just as 2000-12+12:00 has always been less than 2000-12+11:00, but in version 1.0 2000-12+13:00 > 2000-12+11:00, since 2000-12+13:00's recoverable timezone was −11:00)

Lexical Mappings

The lexical representation for gYearMonth is the reduced (right truncated) lexical representation for : CCYY-MM. No left truncation is allowed. An optional following time zone qualifier is allowed. To accommodate year values outside the range from 0001 to 9999, additional digits can be added to the left of this representation and a preceding "-" sign is allowed.

For example, to indicate the month of May 1999, one would write: 1999-05. See also .

The lexical representations for gYearMonth are projections of those of dateTime, as follows: Lexical Space gYearMonthLexicalRep - ? The gYearMonthLexicalRep production is equivalent to this regular expression:

\-?(([1-9][0-9][0-9][0-9]+)|(0[0-9][0-9][0-9]))\-((0[1-9])|(1[0-2]))(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

The and for are the following functions:

The for is . The is .

gYear

gYear represents a Gregorian calendar year. The value space of gYear is the set of Gregorian calendar years as defined in § 5.2.1 of . Specifically, it is a set of one-year long, non-periodic instances, e.g. lexical 1999 to represent the whole year 1999, independent of how many months and days this year has.

gYear represents Gregorian calendar years.

Since the lexical representation allows an optional time zone indicator, gYear values are partially ordered because it may not be possible to unequivocally determine the order of two values one of which has a time zone and the other does not. If gYear values are considered as periods of time, the order relation on gYear values is the order relation on their starting instants. This is discussed in . See also . Pairs of gYear values with or without time zone indicators are totally ordered.

Because years in one calendar only rarely correspond to years in other calendars, values of this type are not, in general, convertible to simple values corresponding to years in other calendars. This type should therefore be used with caution in contexts where conversion to other calendars is desired.

See the conformance note in which applies to the year part of this datatype as well.

Value Space

uses the , with , , , , and required to be absent. remains .

See the conformance note in which applies to the value of this datatype.

Equality and order are as prescribed in .

An example that shows the difference from version 1.0 (see for the notations):

A day is a calendar (or local time) day in each timezone, including the timezones outside of +12:00 through −11:59 inclusive:

2000+13:00 < 2000+11:00 (just as 2000+12:00 has always been less than 2000+11:00, but in version 1.0 2000+13:00 > 2000+11:00 , since 2000+13:00's recoverable timezone was −11:00)

Lexical Mappings

The lexical representation for gYear is the reduced (right truncated) lexical representation for : CCYY. No left truncation is allowed. An optional following time zone qualifier is allowed as for . To accommodate year values outside the range from 0001 to 9999, additional digits can be added to the left of this representation and a preceding "-" sign is allowed.

For example, to indicate 1999, one would write: 1999. See also .

The lexical representations for gYear are projections of those of dateTime, as follows: Lexical Space gYearLexicalRep - ? The gYearLexicalRep production is equivalent to this regular expression:

\-?(([1-9][0-9][0-9][0-9]+)|(0[0-9][0-9][0-9]))(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

The and for are the following functions:

The for is . The is .

gMonthDay

gMonthDay is a Gregorian date that recurs, specifically a day of the year such as the third of May. Arbitrary recurring dates are not supported by this datatype. The value space of gMonthDay is the set of calendar dates, as defined in § 3 of . Specifically, it is a set of one-day long, annually periodic instances.

represents whole calendar days that recur at the same point in each calendar year, or that occur in some arbitrary calendar year.

This datatype can be used, for example, to record birthdays; an instance of the datatype could be used to say that someone's birthday occurs on the 14th of September every year.

Since the lexical representation allows an optional time zone indicator, gMonthDay values are partially ordered because it may not be possible to unequivocally determine the order of two values one of which has a time zone and the other does not. If gMonthDay values are considered as periods of time, in an arbitrary leap year, the order relation on gMonthDay values is the order relation on their starting instants. This is discussed in . See also . Pairs of gMonthDay values with or without time zone indicators are totally ordered.

Because day/month combinations in one calendar only rarely correspond to day/month combinations in other calendars, values of this type do not, in general, have any straightforward or intuitive representation in terms of most other calendars. This type should therefore be used with caution in contexts where conversion to other calendars is desired.

Value Space

uses the , with , , , and required to be absent. remains .

Day-of-month Values

The value must be no more than 30 if is one of 4, 6, 9, or 11, and no more than 29 if is 2.

Equality and order are as prescribed in .

An example that shows the difference from version 1.0 (see for the notations):

A day is a calendar (or local time) day in each timezone, including the timezones outside of +12:00 through −11:59 inclusive:

--12-12+13:00 < --12-12+11:00 (just as --12-12+12:00 has always been less than --12-12+11:00, but in version 1.0 --12-12+13:00 > --12-12+11:00 , since --12-12+13:00's recoverable timezone was −11:00)

Lexical Mappings

The lexical representation for gMonthDay is the left truncated lexical representation for : --MM-DD. An optional following time zone qualifier is allowed as for . No preceding sign is allowed. No other formats are allowed. See also .

The lexical representations for are projections of those of , as follows: Lexical Space gMonthDayLexicalRep -- - ? Day-of-month Representations

Within a , a must not begin with the digit 3 or be 29 unless the value to which it would map would satisfy the value constraint on values (Constraint: Day-of-month Values) given above.

The gMonthDayLexicalRep production is equivalent to this regular expression:

\-\-((0[1-9])|(1[0-2]))\-(([0-2][0-9])|(3[01]))(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

Note that neither the production nor this regular expression alone enforces the constraint on given above.

This datatype can be used to represent a specific day in a month, for example, to say that my birthday occurs on the 14th of September every year.

The and for are the following functions:

The for is . The is .

gDay

gDay is a Gregorian day that recurs, specifically a day of the month such as the 5th of the month. Arbitrary recurring days are not supported by this datatype. The value space of gDay is the space of a set of calendar dates as defined in § 3 of . Specifically, it is a set of one-day long, monthly periodic instances.

gDay represents whole days within an arbitrary month, that is, days that recur at the same point in each (Gregorian) month. This datatype is used to represent a specific day of the month, for example, to indicate that an employee gets a paycheck on the 15th of each month. (Obviously, days beyond 28 cannot occur in all months; they are nonetheless permitted, up to 31.)

Since the lexical representation allows an optional time zone indicator, gDay values are partially ordered because it may not be possible to unequivocally determine the order of two values one of which has a time zone and the other does not. If gDay values are considered as periods of time, in an arbitrary month that has 31 days, the order relation on gDay values is the order relation on their starting instants. This is discussed in . See also . Pairs of gDay values with or without time zone indicators are totally ordered.

Because days in one calendar only rarely correspond to days in other calendars, gDay values do not, in general, have any straightforward or intuitive representation in terms of most non-Gregorian calendars. gDay should therefore be used with caution in contexts where conversion to other calendars is desired.

Value Space

uses the , with , , , , and required to be absent. remains and must be between 1 and 31 inclusive.

Equality and order are as prescribed in . Since gDay values (days) are ordered by their first moments, it is possible for apparent anomalies to appear in the order when timezone values differ by at least 24 hours. (It is possible for timezone values to differ by up to 28 hours.)

Examples that may appear anomalous (see for the notations):

---15 < ---16 , but ---15−13:00 > ---16+13:00

---15−11:00 = ---16+13:00

---15−13:00 <> ---16 , because ---15−13:00 > ---16+14:00 and ---15−13:00 < ---16−14:00

Timezones do not cause wrap-around at the end of the month: the last day of a given month in timezone −13:00 may start after the first day of the next month in timezone +13:00, as measured on the global timeline, but nonetheless ---01+13:00 < ---31−13:00 .
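The apparent anomalies listed above follow directly from ordering gDay values by their first moments. The Python sketch below checks them by placing each day in one fixed 31-day month; the choice of January 2000 and the names tz and gday_start are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone


def tz(hours):
    """Fixed-offset time zone, e.g. tz(-13) for -13:00."""
    return timezone(timedelta(hours=hours))


def gday_start(day, offset_hours):
    """First moment of a timezoned gDay value.

    Assumption: January 2000 stands in for an arbitrary month that has
    31 days; only the relative order of the results is meaningful.
    """
    return datetime(2000, 1, day, tzinfo=tz(offset_hours))


# ---15-13:00 > ---16+13:00
later = gday_start(15, -13) > gday_start(16, 13)

# ---15-11:00 = ---16+13:00
equal = gday_start(15, -11) == gday_start(16, 13)

# ---15-13:00 lies between ---16 imputed +14:00 and ---16 imputed -14:00,
# so it is incomparable with the untimezoned ---16.
between = gday_start(16, 14) < gday_start(15, -13) < gday_start(16, -14)
```

Because every value is placed in the same month, no wrap-around occurs at the end of the month, which matches the last remark above.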

Lexical representation

The lexical representation for gDay is the left truncated lexical representation for : ---DD . An optional following time zone qualifier is allowed as for . No preceding sign is allowed. No other formats are allowed. See also .

Lexical Mappings

The lexical representations for gDay are projections of those of dateTime, as follows: Lexical Space gDayLexicalRep --- ? The gDayLexicalRep production is equivalent to this regular expression:

\-\-\-([0-2][0-9]|3[01])(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

The for is . The is .

gMonth

gMonth is a Gregorian month that recurs every year. The value space of gMonth is the space of a set of calendar months as defined in § 3 of . Specifically, it is a set of one-month long, yearly periodic instances.

gMonth represents whole (Gregorian) months within an arbitrary year, that is, months that recur at the same point in each year. It might be used, for example, to say in what month annual Thanksgiving celebrations fall in different countries (--11 in the United States, --10 in Canada, and possibly other months in other countries).

Since the lexical representation allows an optional time zone indicator, gMonth values are partially ordered because it may not be possible to unequivocally determine the order of two values one of which has a time zone and the other does not. If gMonth values are considered as periods of time, the order relation on gMonth is the order relation on their starting instants. This is discussed in . See also . Pairs of gMonth values with or without time zone indicators are totally ordered.

Because months in one calendar only rarely correspond to months in other calendars, values of this type do not, in general, have any straightforward or intuitive representation in terms of most other calendars. This type should therefore be used with caution in contexts where conversion to other calendars is desired.

Value Space

uses the , with , , , , and required to be absent. remains .

Equality and order are as prescribed in .

An example that shows the difference from version 1.0 (see for the notations):

A month is a calendar (or local time) month in each timezone, including the timezones outside of +12:00 through −11:59 inclusive:

--12+13:00 < --12+11:00 (just as --12+12:00 has always been less than --12+11:00, but in version 1.0 --12+13:00 > --12+11:00 , since --12+13:00's recoverable timezone was −11:00)

Lexical Mappings

The lexical representation for gMonth is the left and right truncated lexical representation for : --MM. An optional following time zone qualifier is allowed as for . No preceding sign is allowed. No other formats are allowed. See also .

The lexical representations for gMonth are projections of those of dateTime, as follows: Lexical Space gMonthLexicalRep -- ? The gMonthLexicalRep production is equivalent to this regular expression:

\-\-((0[1-9])|(1[0-2]))(Z|((\+|\-)(0[0-9]|1[0-4]):[0-5][0-9]))?

The and for are defined as follows:

The for is . The is .

hexBinary

hexBinary represents arbitrary hex-encoded binary data.

Value Space

The of hexBinary is the set of finite-length sequences of binary octets.

Lexical Representation

hexBinary has a lexical representation where each binary octet is encoded as a character tuple, consisting of two hexadecimal digits ([0-9a-fA-F]) representing the octet code. For example, 0FB7 is a hex encoding for the 16-bit integer 4023 (whose binary representation is 111110110111).

More formally, the lexical space of hexBinary is the set of literals matching the hexBinary production.

Lexical space of hexBinary
hexDigit ::= [0-9a-fA-F]
hexOctet ::= hexDigit hexDigit
hexBinary ::= hexOctet*

The set recognized by hexBinary is the same as that recognized by the regular expression ([0-9a-fA-F]{2})*.

The of is .

Canonical Representation

The canonical representation for hexBinary is defined by prohibiting certain options from the lexical representation. Specifically, the lower case hexadecimal digits ([a-f]) are not allowed.
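A minimal Python sketch of the hexBinary lexical check, the mapping to octets, and the upper-case canonical form is given below. The function names parse_hex_binary and canonical_hex_binary are hypothetical; the regular expression is the one quoted above.

```python
import re

# The regular expression given above for hexBinary literals.
HEX_BINARY_RE = re.compile(r"([0-9a-fA-F]{2})*")


def parse_hex_binary(literal):
    """Map a hexBinary literal to the octet sequence it denotes."""
    if HEX_BINARY_RE.fullmatch(literal) is None:
        raise ValueError("not a hexBinary literal: %r" % literal)
    return bytes.fromhex(literal)


def canonical_hex_binary(value):
    """Canonical form: two upper-case hexadecimal digits per octet."""
    return value.hex().upper()
```

For example, parse_hex_binary("0FB7") yields the two octets 0x0F and 0xB7, which read as a big-endian integer give 4023, and canonical_hex_binary maps the literal 0fb7 back to 0FB7.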

The of is given formally in .

base64Binary

base64Binary represents Base64-encoded arbitrary binary data. The value space of base64Binary is the set of finite-length sequences of binary octets. For base64Binary data the entire binary stream is encoded using the Base64 Encoding defined in , which is derived from the encoding described in .

Value Space

The of is the set of finite-length sequences of binary octets.

Lexical Representation

The lexical representations of base64Binary values are limited to the 65 characters of the Base64 Alphabet defined in , i.e., a-z, A-Z, 0-9, the plus sign (+), the forward slash (/) and the equal sign (=), together with the space character (#x20). No other characters are allowed.

For compatibility with older mail gateways, suggests that Base64 data should have lines limited to at most 76 characters in length. This line-length limitation is not required by and is not mandated in the lexical representations of base64Binary data. It must not be enforced by XML Schema processors.

The lexical space of base64Binary is given by the following grammar (the notation is that used in ); it is the set of literals which match the Base64Binary production.

Base64Binary ::= ((B64S B64S B64S B64S)* ((B64S B64S B64S B64) | (B64S B64S B16S '=') | (B64S B04S '=' #x20? '=')))?
B64S ::= B64 #x20?
B16S ::= B16 #x20?
B04S ::= B04 #x20?
B04 ::= [AQgw]
B16 ::= [AEIMQUYcgkosw048]
B64 ::= [A-Za-z0-9+/]

Lexical space of base64Binary
Base64Binary ::= (B64quad* B64final)?
B64quad ::= (B64 B64 B64 B64) /* B64quad represents three octets of binary data. */
B64final ::= B64finalquad | Padded16 | Padded8
B64finalquad ::= (B64 B64 B64 B64char) /* B64finalquad represents three octets of binary data without trailing space. */
Padded16 ::= B64 B64 B16 '=' /* Padded16 represents a two-octet at the end of the data. */
Padded8 ::= B64 B04 '=' #x20? '=' /* Padded8 represents a single octet at the end of the data. */
B64 ::= B64char #x20?
B64char ::= [A-Za-z0-9+/]
B16 ::= B16char #x20?
B16char ::= [AEIMQUYcgkosw048] /* Base64 characters whose bit-string value ends in '00' */
B04 ::= B04char #x20?
B04char ::= [AQgw] /* Base64 characters whose bit-string value ends in '0000' */

Note that this grammar requires the number of non-whitespace characters in the lexical representation to be a multiple of four, and for equals signs to appear only at the end of the lexical representation; literals which do not meet these constraints are not legal lexical representations of base64Binary because they cannot successfully be decoded by Base64 decoders.
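The Base64Binary production can be transcribed into a Python regular expression, which makes the constraints just described easy to try out. This is a sketch for illustration; the names BASE64_BINARY_RE and is_base64_binary_literal are hypothetical, and the input is assumed to have already had its whitespace collapsed.

```python
import re

B64 = r"[A-Za-z0-9+/]"
B16 = r"[AEIMQUYcgkosw048]"  # characters whose bit-string value ends in '00'
B04 = r"[AQgw]"              # characters whose bit-string value ends in '0000'
S = r" ?"                    # optional single space (#x20?)

BASE64_BINARY_RE = re.compile(
    # Any number of complete quads, each character optionally followed by a space
    rf"(?:(?:{B64}{S}{B64}{S}{B64}{S}{B64}{S})*"
    # ... then a final quad without trailing space,
    rf"(?:{B64}{S}{B64}{S}{B64}{S}{B64}"
    # or two octets padded with one '=',
    rf"|{B64}{S}{B64}{S}{B16}{S}="
    # or one octet padded with two '='
    rf"|{B64}{S}{B04}{S}={S}=))?"
)


def is_base64_binary_literal(s):
    """True if s matches the Base64Binary production."""
    return BASE64_BINARY_RE.fullmatch(s) is not None
```

Under this sketch "QQ==" and "QUJD REVG" are accepted, while "QQ=" (not a multiple of four characters) and "Q=Q=" (an equals sign before the end) are rejected.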

The for is as given in and .

The above definition of the lexical space is more restrictive than that given in as regards whitespace, and less restrictive than . This is not an issue in practice. Any string compatible with either RFC can occur in an element or attribute validated by this type, because the whiteSpace facet of this type is fixed to collapse, which means that all leading and trailing whitespace will be stripped, and all internal whitespace collapsed to single space characters, before the above grammar is enforced. The possibility of ignoring whitespace in Base64 data is foreseen in clause 2.3 of , but for the reasons given there this specification does not allow implementations to ignore non-whitespace characters which are not in the Base64 Alphabet.

The canonical lexical form of a base64Binary data value is the Base64 encoding of the value which matches the Canonical-base64Binary production in the following grammar:

Canonical-base64Binary ::= (B64 B64 B64 B64)* ((B64 B64 B16 '=') | (B64 B04 '=='))?

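Canonical representation of base64Binary
Canonical-base64Binary ::= CanonicalQuad* CanonicalPadded?
CanonicalQuad ::= B64char B64char B64char B64char
CanonicalPadded ::= B64char B64char B16char '=' | B64char B04char '=='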

That is, the of a value is the which maps to that value and contains no whitespace. The for is thus the encoding algorithm for Base64 data given in and , with the proviso that no characters except those in the Base64 Alphabet are to be written out.

For some values the canonical representation defined above does not conform to , which requires breaking with linefeeds at appropriate intervals. It does conform with .

The length of a base64Binary value is the number of octets it contains. This may be calculated from the lexical form by removing whitespace and padding characters and performing the calculation shown in the pseudo-code below:

lex2 := killwhitespace(lexform) -- remove whitespace characters
lex3 := strip_equals(lex2) -- strip padding characters at end
length := floor (length(lex3) * 3 / 4) -- calculate length
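The pseudo-code above translates directly into Python. The sketch below is an illustration only; the function name base64_binary_length is hypothetical, and the input is assumed to be a legal lexical representation.

```python
def base64_binary_length(lexform):
    """Number of octets denoted by a legal base64Binary literal."""
    lex2 = "".join(lexform.split())  # remove whitespace characters
    lex3 = lex2.rstrip("=")          # strip padding characters at end
    return len(lex3) * 3 // 4        # floor(length(lex3) * 3 / 4)
```

For the literal "QUJD REVG", which encodes six octets, the result is 6; for "QQ==", which encodes a single octet, it is 1.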

Note on encoding: and explicitly references US-ASCII encoding. However, decoding of base64Binary data in an XML entity is to be performed on the Unicode characters obtained after character encoding processing as specified by .

anyURI

anyURI represents an Internationalized Resource Identifier Reference (IRI). An anyURI value can be absolute or relative, and may have an optional fragment identifier (i.e., it may be an IRI Reference). This type should be used when the value fulfills the role of an IRI, as defined in or its successor(s) in the IETF Standards Track.

IRIs may be used to locate resources or simply to identify them. In the case where they are used to locate resources using a URI, applications should use the mapping from anyURI values to URIs given by Section 3.1 Mapping of IRIs to URIs of or its successor(s) in the IETF Standards Track. This means that a wide range of internationalized resource identifiers can be specified when an anyURI is called for, and still be understood as URIs per and its successor(s).

Section 5.4 Locator Attribute of requires that relative URI references be absolutized as defined in before use. This is an XLink-specific requirement and is not appropriate for XML Schema, since neither the nor the of the type are restricted to absolute URIs. Accordingly absolutization must not be performed by schema processors as part of schema validation.

Each URI scheme imposes specialized syntax rules for URIs in that scheme, including restrictions on the syntax of allowed fragment identifiers. Because it is impractical for processors to check that a value is a context-appropriate URI reference, this specification follows the lead of (as amended by ) in this matter: such rules and restrictions are not part of type validity and are not checked by processors. Thus in practice the above definition imposes only very modest obligations on processors.

Lexical mapping

The of anyURI is finite-length character sequences which, when the algorithm defined in Section 5.4 of is applied to them, result in strings which are legal URIs according to , as amended by .

For an anyURI value to be usable in practice as an IRI, the result of applying to it the algorithm defined in Section 3.1 of [RFC 3987] should be a string which is a legal URI according to [RFC 3986]. (This is true at the time this document is published; if in the future [RFC 3987] and [RFC 3986] are replaced by other specifications in the IETF Standards Track, the relevant constraints will be those imposed by those successor specifications.)

Each URI scheme imposes specialized syntax rules for URIs in that scheme, including restrictions on the syntax of allowed fragment identifiers. Because it is impractical for processors to check that a value is a context-appropriate URI reference, neither the syntactic constraints defined by the definitions of individual schemes nor the generic syntactic constraints defined by and and their successors are part of this datatype as defined here. Applications which depend on values being legal according to the rules of the relevant specifications should make arrangements to check values against the appropriate definitions of IRI, URI, and specific schemes.

Spaces are, in principle, allowed in the lexical space of anyURI; however, their use is highly discouraged (unless they are encoded by %20).

The lexical mapping for anyURI is the identity mapping.

The definitions of URI in the current IETF specifications define certain URIs as equivalent to each other. Those equivalences are not part of this datatype as defined here: if two equivalent URIs or IRIs are different character sequences, they map to different values in this datatype.
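The two points above (the mapping of IRIs to URIs, and the identity lexical mapping) can be illustrated with a minimal Python sketch. It covers only the core step of the mapping, percent-encoding each non-ASCII character as its UTF-8 octets; the function names `iri_to_uri` and `same_anyuri_value` are hypothetical, and the other details of the mapping (for example any special treatment of the host component) are deliberately left out.

```python
def iri_to_uri(iri: str) -> str:
    """Sketch: percent-encode every non-ASCII character as its UTF-8 octets.

    Assumption: this is only the central step of the IRI-to-URI mapping;
    ASCII characters, including spaces, are copied through unchanged.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII characters are left as they are
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


def same_anyuri_value(a: str, b: str) -> bool:
    """Identity lexical mapping: two literals denote the same anyURI value
    only if they are the same character sequence."""
    return a == b
```

Under this sketch, `http://example.org/%7Euser` and `http://example.org/~user` are different anyURI values even though URI-level normalization may treat them as equivalent; an application that needs URI equivalence has to apply it separately.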

QName

QName represents XML qualified names. The value space of QName is the set of tuples {namespace name, local part}, where namespace name is an anyURI and local part is an NCName. The lexical space of QName is the set of strings that match the QName production of [Namespaces in XML].

It is implementation-defined whether an implementation of this specification supports the QName production from , or that from , or both. See .

The mapping between literals in the lexical space and values in the value space of QName requires a namespace declaration to be in scope for the context in which QName is used.

Because the lexical representations available for any value of type QName vary with context, no canonical representation is defined for QName in this specification.
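The dependence on in-scope namespace declarations can be sketched as follows. This is an illustration only: the function name `resolve_qname` and the convention of storing the default namespace under the empty-string key are assumptions, and the sketch does not check that the prefix and local part are well-formed names.

```python
from typing import Mapping, Optional, Tuple


def resolve_qname(literal: str,
                  in_scope: Mapping[str, str]) -> Tuple[Optional[str], str]:
    """Map a QName literal to a (namespace name, local part) tuple.

    `in_scope` maps prefixes to namespace names; by assumption the default
    namespace, if declared, is stored under the key "".
    """
    prefix, colon, local = literal.partition(":")
    if not colon:
        # Unprefixed literal: the whole string is the local part.
        return in_scope.get(""), literal
    if prefix not in in_scope:
        raise ValueError(f"no namespace declaration in scope for prefix {prefix!r}")
    return in_scope[prefix], local
```

Two literals with different prefixes map to the same value whenever the prefixes are bound to the same namespace name, which is why no single canonical representation can be chosen independently of context.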

NOTATION

NOTATION represents the NOTATION attribute type from [XML]. The value space of NOTATION is the set of QNames of notations declared in the current schema. The lexical space of NOTATION is the set of all names of notations declared in the current schema (in the form of QNames).

enumeration facet value required for NOTATION

It is an error for NOTATION to be used directly in a schema. Only datatypes that are derived from NOTATION by specifying a value for enumeration can be used in a schema.

For compatibility (see ) NOTATION should be used only on attributes and should only be used in schemas with no target namespace.

Because the lexical representations available for any given value of NOTATION vary with context, this specification defines no canonical representation for NOTATION values.

Other Built-in Datatypes

This section gives conceptual definitions for all datatypes defined by this specification. The XML representation used to define datatypes (whether or ) is given in section and the complete definitions of the datatypes are provided in the appendix .

normalizedString

normalizedString represents white space normalized strings. The value space of normalizedString is the set of strings that do not contain the carriage return (#xD), line feed (#xA) or tab (#x9) characters. The lexical space of normalizedString is the set of strings that do not contain the carriage return (#xD), line feed (#xA) or tab (#x9) characters. The base type of normalizedString is string.

token

token represents tokenized strings. The value space of token is the set of strings that do not contain the carriage return (#xD), line feed (#xA) or tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The lexical space of token is the set of strings that do not contain the carriage return (#xD), line feed (#xA) or tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The base type of token is normalizedString.
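The membership conditions for normalizedString and token translate directly into code. The following Python sketch is illustrative; the function names are hypothetical, and `collapse` is an assumed helper showing one way to bring an arbitrary string into the token value space, not a definition taken from this section.

```python
import re


def is_normalized_string(s: str) -> bool:
    """True if s contains no carriage return, line feed or tab."""
    return not any(ch in s for ch in "\r\n\t")


def is_token(s: str) -> bool:
    """True if s is a normalizedString with no leading or trailing space
    and no internal run of two or more spaces."""
    return (is_normalized_string(s)
            and not s.startswith(" ")
            and not s.endswith(" ")
            and "  " not in s)


def collapse(s: str) -> str:
    """Assumed helper: turn CR, LF and tab into spaces, squeeze runs of
    spaces to one, and trim leading and trailing spaces."""
    replaced = re.sub(r"[\t\n\r]", " ", s)
    return re.sub(r" +", " ", replaced).strip(" ")
```

Every token is also a normalizedString, but not the reverse: `"a  b "` satisfies the first check and fails the second.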

language

language represents formal natural language identifiers, as defined by [RFC 3066] or its successor(s) in the IETF Standards Track. The value space and lexical space of language are the set of all strings that conform to the pattern [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*. This is the set of strings accepted by the grammar given in [RFC 3066]. The base type of language is token.

The regular expression above provides the only normative constraint on the lexical and value spaces of this type. The additional constraints imposed on language identifiers by [RFC 3066] and its successor(s), and in particular their requirement that language codes be registered with IANA or ISO if not given in ISO 639, are not part of this datatype as defined here.

[RFC 3066] specifies that language codes are to be treated as case insensitive; there exist conventions for capitalization of some of them, but these should not be taken to carry meaning. For instance, [ISO 3166] recommends that country codes are capitalized (MN Mongolia), while [ISO 639] recommends that language codes are written in lower case (mn Mongolian). Since the language datatype is derived from string, it inherits from string a one-to-one mapping from lexical representations to values. The literals MN and mn therefore correspond to distinct values and have distinct canonical forms. Users of this specification should be aware of this fact, the consequence of which is that the case-insensitive treatment of language values prescribed by [RFC 3066] does not follow from the definition of this datatype given here; applications which require case-insensitivity should make appropriate adjustments.
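As an illustration of the two points above, the following Python sketch checks a literal against the pattern and shows the kind of adjustment an application might make for case-insensitive comparison. The names `is_language` and `same_language_tag` are hypothetical; lower-casing both sides is one possible adjustment, not one prescribed here.

```python
import re

# The pattern quoted in the text; fullmatch() anchors it at both ends.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_language(literal: str) -> bool:
    """True if the literal is in the lexical space of language."""
    return LANGUAGE_PATTERN.fullmatch(literal) is not None


def same_language_tag(a: str, b: str) -> bool:
    """Application-level, case-insensitive comparison (an assumption:
    the datatype itself treats "MN" and "mn" as distinct values)."""
    return a.lower() == b.lower()
```

Note that the pattern says nothing about registration: a well-formed but unregistered tag passes `is_language`.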

NMTOKEN

NMTOKEN represents the NMTOKEN attribute type from [XML]. The value space of NMTOKEN is the set of tokens that match the Nmtoken production in [XML]. The lexical space of NMTOKEN is the set of strings that match the Nmtoken production in [XML]. The base type of NMTOKEN is token.

It is implementation-defined whether an implementation of this specification supports the NMTOKEN production from , or that from , or both. See .

For compatibility (see ) NMTOKEN should be used only on attributes.

NMTOKENS

NMTOKENS represents the NMTOKENS attribute type from [XML]. The value space of NMTOKENS is the set of finite, non-zero-length sequences of NMTOKENs. The lexical space of NMTOKENS is the set of space-separated lists of tokens, of which each token is in the lexical space of NMTOKEN. The item type of NMTOKENS is NMTOKEN.

For compatibility (see ) NMTOKENS should be used only on attributes.

Name

Name represents XML Names. The value space of Name is the set of all strings which match the Name production of [XML]. The lexical space of Name is the set of all strings which match the Name production of [XML]. The base type of Name is token.

It is implementation-defined whether an implementation of this specification supports the Name production from , or that from , or both. See .

NCName

NCName represents XML "non-colonized" Names. The value space of NCName is the set of all strings which match the NCName production of [Namespaces in XML]. The lexical space of NCName is the set of all strings which match the NCName production of [Namespaces in XML]. The base type of NCName is Name.

It is implementation-defined whether an implementation of this specification supports the NCName production from , or that from , or both. See .

ID

ID represents the ID attribute type from [XML]. The value space of ID is the set of all strings that match the NCName production in [Namespaces in XML]. The lexical space of ID is the set of all strings that match the NCName production in [Namespaces in XML]. The base type of ID is NCName.

It is implementation-defined whether an implementation of this specification supports the NCName production from , or that from , or both. See .

For compatibility (see ) ID should be used only on attributes.

Uniqueness of items validated as is not part of this datatype as defined here. When this specification is used in conjunction with , uniqueness is enforced at a different level, not as part of datatype validity; see Validation Rule: Validation Root Valid (ID/IDREF) in .

IDREF

IDREF represents the IDREF attribute type from [XML]. The value space of IDREF is the set of all strings that match the NCName production in [Namespaces in XML]. The lexical space of IDREF is the set of strings that match the NCName production in [Namespaces in XML]. The base type of IDREF is NCName.

It is implementation-defined whether an implementation of this specification supports the NCName production from , or that from , or both. See .

For compatibility (see ) this datatype should be used only on attributes.

Existence of referents for items validated as is not part of this datatype as defined here. When this specification is used in conjunction with , referential integrity is enforced at a different level, not as part of datatype validity; see Validation Rule: Validation Root Valid (ID/IDREF) in .

IDREFS

IDREFS represents the IDREFS attribute type from [XML]. The value space of IDREFS is the set of finite, non-zero-length sequences of IDREFs. The lexical space of IDREFS is the set of space-separated lists of tokens, of which each token is in the lexical space of IDREF. The item type of IDREFS is IDREF.

For compatibility (see ) IDREFS should be used only on attributes.

ENTITY

ENTITY represents the ENTITY attribute type from [XML]. The value space of ENTITY is the set of all strings that match the NCName production in [Namespaces in XML] and have been declared as an unparsed entity in a document type definition. The lexical space of ENTITY is the set of all strings that match the NCName production in [Namespaces in XML]. The base type of ENTITY is NCName.

It is implementation-defined whether an implementation of this specification supports the NCName production from , or that from , or both. See .

The value space of ENTITY is scoped to a specific instance document.

For compatibility (see ) ENTITY should be used only on attributes.

ENTITIES

ENTITIES represents the ENTITIES attribute type from [XML]. The value space of ENTITIES is the set of finite, non-zero-length sequences of ENTITY values that have been declared as unparsed entities in a document type definition. The lexical space of ENTITIES is the set of space-separated lists of tokens, of which each token is in the lexical space of ENTITY. The item type of ENTITIES is ENTITY.

The value space of ENTITIES is scoped to a specific instance document.

For compatibility (see ) ENTITIES should be used only on attributes.

integer

integer is derived from decimal by fixing the value of fractionDigits to be 0 and disallowing the trailing decimal point. This results in the standard mathematical concept of the integer numbers. The value space of integer is the infinite set {...,-2,-1,0,1,2,...}. The base type of integer is decimal.

Lexical representation

integer has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39) with an optional leading sign. If the sign is omitted, "+" is assumed. For example: -1, 0, 12678967543233, +100000.

Canonical representation

The canonical representation for integer is defined by prohibiting certain options from the lexical representation. Specifically, the preceding optional "+" sign is prohibited and leading zeroes are prohibited.
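A minimal Python sketch of these two rules follows. The function name `canonical_integer` is hypothetical; the regular expression restates the lexical representation described above (optional sign, then decimal digits), and Python's arbitrary-precision `int` is used as a convenient stand-in for the value space.

```python
import re

# Optional leading sign followed by one or more decimal digits (#x30-#x39).
INTEGER_LITERAL = re.compile(r"[+-]?[0-9]+")


def canonical_integer(literal: str) -> str:
    """Return the canonical representation of an integer literal:
    no "+" sign and no leading zeroes."""
    if INTEGER_LITERAL.fullmatch(literal) is None:
        raise ValueError(f"not in the lexical space of integer: {literal!r}")
    # int() discards "+" and leading zeroes; str() writes "-" only for
    # negative values, so "-0" becomes "0".
    return str(int(literal))
```

Two literals denote the same integer exactly when their canonical forms are equal, for example "+100000" and "0100000".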

nonPositiveInteger

nonPositiveInteger is derived from integer by setting the value of maxInclusive to be 0. This results in the standard mathematical concept of the non-positive integers. The value space of nonPositiveInteger is the infinite set {...,-2,-1,0}. The base type of nonPositiveInteger is integer.

Lexical representation

nonPositiveInteger has a lexical representation consisting of an optional preceding sign followed by a finite-length sequence of decimal digits (#x30-#x39). The sign may be "+" or may be omitted only for lexical forms denoting zero; in all other lexical forms, the negative sign (-) must be present. For example: -1, 0, -12678967543233, -100000.

Canonical representation

The canonical representation for nonPositiveInteger is defined by prohibiting certain options from the lexical representation. In the canonical form for zero, the sign must be omitted. Leading zeroes are prohibited.

negativeInteger

negativeInteger is derived from nonPositiveInteger by setting the value of maxInclusive to be -1. This results in the standard mathematical concept of the negative integers. The value space of negativeInteger is the infinite set {...,-2,-1}. The base type of negativeInteger is nonPositiveInteger.

Lexical representation

negativeInteger has a lexical representation consisting of a negative sign (-) followed by a finite-length sequence of decimal digits (#x30-#x39). For example: -1, -12678967543233, -100000.

Canonical representation

The canonical representation for negativeInteger is defined by prohibiting certain options from the lexical representation. Specifically, leading zeroes are prohibited.

long

long is derived from integer by setting the value of maxInclusive to be 9223372036854775807 and minInclusive to be -9223372036854775808. The base type of long is integer.

Lexical representation

long has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, "+" is assumed. For example: -1, 0, 12678967543233, +100000.

Canonical representation

The canonical representation for long is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.

int

int is derived from long by setting the value of maxInclusive to be 2147483647 and minInclusive to be -2147483648. The base type of int is long.

Lexical representation

int has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, "+" is assumed. For example: -1, 0, 126789675, +100000.

Canonical representation

The canonical representation for int is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.

short

short is derived from int by setting the value of maxInclusive to be 32767 and minInclusive to be -32768. The base type of short is int.

Lexical representation

short has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, "+" is assumed. For example: -1, 0, 12678, +10000.

Canonical representation

The canonical representation for short is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.

byte

byte is derived from short by setting the value of maxInclusive to be 127 and minInclusive to be -128. The base type of byte is short.

Lexical representation

byte has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, "+" is assumed. For example: -1, 0, 126, +100.

Canonical representation

The canonical representation for byte is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.
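The bounds given for long, int, short and byte coincide with the two's-complement ranges of 64, 32, 16 and 8 bits. That correspondence is an observation made for this sketch, not something the definitions above depend on, and the names `SIGNED_BITS`, `bounds` and `is_valid_signed` are hypothetical.

```python
import re

# Assumed bit widths; the resulting bounds match the values quoted above.
SIGNED_BITS = {"long": 64, "int": 32, "short": 16, "byte": 8}

_LITERAL = re.compile(r"[+-]?[0-9]+")


def bounds(type_name: str) -> tuple:
    """Return (minInclusive, maxInclusive) for a signed bounded type."""
    bits = SIGNED_BITS[type_name]
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def is_valid_signed(type_name: str, literal: str) -> bool:
    """True if the literal is lexically an integer and within bounds."""
    if _LITERAL.fullmatch(literal) is None:
        return False
    low, high = bounds(type_name)
    return low <= int(literal) <= high
```

Because each type is derived from the previous one by narrowing the bounds, any literal valid for byte is also valid for short, int and long.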

nonNegativeInteger

nonNegativeInteger is derived from integer by setting the value of minInclusive to be 0. This results in the standard mathematical concept of the non-negative integers. The value space of nonNegativeInteger is the infinite set {0,1,2,...}. The base type of nonNegativeInteger is integer.

Lexical representation

nonNegativeInteger has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign (+) is assumed. If the sign is present, it must be "+" except for lexical forms denoting zero, which may be preceded by a positive (+) or a negative (-) sign. For example: 1, 0, 12678967543233, +100000.

Canonical representation

The canonical representation for nonNegativeInteger is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.

unsignedLong

unsignedLong is derived from nonNegativeInteger by setting the value of maxInclusive to be 18446744073709551615. The base type of unsignedLong is nonNegativeInteger.

Lexical representation

unsignedLong has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign (+) is assumed. If the sign is present, it must be + except for lexical forms denoting zero, which may be preceded by a positive (+) or a negative (-) sign. For example: 0, 12678967543233, 100000.

Canonical representation

The canonical representation for unsignedLong is defined by prohibiting certain options from the lexical representation. Specifically, leading zeroes are prohibited.

unsignedInt

unsignedInt is derived from unsignedLong by setting the value of maxInclusive to be 4294967295. The base type of unsignedInt is unsignedLong.

Lexical representation

unsignedInt has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign (+) is assumed. If the sign is present, it must be + except for lexical forms denoting zero, which may be preceded by a positive (+) or a negative (-) sign. For example: 0, 1267896754, 100000.

Canonical representation

The canonical representation for unsignedInt is defined by prohibiting certain options from the lexical representation. Specifically, leading zeroes are prohibited.

unsignedShort

unsignedShort is derived from unsignedInt by setting the value of maxInclusive to be 65535. The base type of unsignedShort is unsignedInt.

Lexical representation

unsignedShort has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign (+) is assumed. If the sign is present, it must be + except for lexical forms denoting zero, which may be preceded by a positive (+) or a negative (-) sign. For example: 0, 12678, 10000.

Canonical representation

The canonical representation for unsignedShort is defined by prohibiting certain options from the lexical representation. Specifically, leading zeroes are prohibited.

unsignedByte

unsignedByte is derived from unsignedShort by setting the value of maxInclusive to be 255. The base type of unsignedByte is unsignedShort.

Lexical representation

unsignedByte has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign (+) is assumed. If the sign is present, it must be + except for lexical forms denoting zero, which may be preceded by a positive (+) or a negative (-) sign. For example: 0, 126, 100.

Canonical representation

The canonical representation for unsignedByte is defined by prohibiting certain options from the lexical representation. Specifically, leading zeroes are prohibited.
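The sign rule shared by the unsigned types (a "-" sign is tolerated only on lexical forms denoting zero) falls out of a plain range check, as the following Python sketch suggests. The names `UNSIGNED_MAX` and `is_valid_unsigned` are hypothetical; the upper bounds are the maxInclusive values quoted above.

```python
import re

# Upper bounds as quoted in the text; the lower bound is 0 in every case.
UNSIGNED_MAX = {
    "unsignedLong": 18446744073709551615,
    "unsignedInt": 4294967295,
    "unsignedShort": 65535,
    "unsignedByte": 255,
}

_LITERAL = re.compile(r"[+-]?[0-9]+")


def is_valid_unsigned(type_name: str, literal: str) -> bool:
    """True if the literal denotes a value in 0..maxInclusive.

    A literal such as "-0" denotes zero and is accepted; any other literal
    with a "-" sign denotes a negative value and fails the range check.
    """
    if _LITERAL.fullmatch(literal) is None:
        return False
    return 0 <= int(literal) <= UNSIGNED_MAX[type_name]
```

The canonical form of an accepted literal can then be obtained as for integer, by dropping the sign and any leading zeroes.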

positiveInteger

positiveInteger is derived from nonNegativeInteger by setting the value of minInclusive to be 1. This results in the standard mathematical concept of the positive integer numbers. The value space of positiveInteger is the infinite set {1,2,...}. The base type of positiveInteger is nonNegativeInteger.

Lexical representation

positiveInteger has a lexical representation consisting of an optional positive sign (+) followed by a finite-length sequence of decimal digits (#x30-#x39). For example: 1, 12678967543233, +100000.

Canonical representation

The canonical representation for positiveInteger is defined by prohibiting certain options from the lexical representation. Specifically, the optional "+" sign is prohibited and leading zeroes are prohibited.

yearMonthDuration

yearMonthDuration is a datatype derived from duration by restricting its lexical representations to instances of yearMonthDurationLexicalRep. The value space of yearMonthDuration is therefore that of duration restricted to those whose seconds property is 0. This results in a duration datatype which is totally ordered.

The always-zero seconds is formally retained in order that yearMonthDuration's (abstract) value space truly be a subset of that of duration. An obvious implementation optimization is to ignore the zero and implement yearMonthDuration values simply as integer values.

The Lexical Mapping

The lexical space is reduced from that of duration by disallowing duDayFrag and duTimeFrag fragments in the lexical representations.

The yearMonthDuration Lexical Representation
yearMonthDurationLexicalRep ::= '-'? 'P' duYearMonthFrag

The lexical space of yearMonthDuration consists of strings which match the regular expression -?P((([0-9]+Y)([0-9]+M)?)|([0-9]+M)) or the expression -?P[0-9]+(Y([0-9]+M)?|M), but the formal definition of yearMonthDuration uses a simpler regular expression in its pattern facet: [^DT]*. This pattern matches only strings of characters which contain no D and no T, thus restricting the lexical space of duration to strings with no day, hour, minute, or seconds fields.

The canonical mapping is that of duration restricted in its range to the lexical space (which reduces its domain to omit any values not in the yearMonthDuration value space).

The yearMonthDuration value whose months and seconds are both zero has no canonical representation in this datatype since its canonical representation in duration (PT0S) is not in the lexical space of yearMonthDuration.
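The regular expressions above can be exercised with a short Python sketch. It uses the first expression quoted in the text, regrouped so that the year and month digits can be captured, and computes a month count on the assumption that one year contributes twelve months; the function names are hypothetical.

```python
import re
from typing import Optional

# Same language as -?P((([0-9]+Y)([0-9]+M)?)|([0-9]+M)), with capture groups.
YEAR_MONTH = re.compile(r"(-?)P(?:([0-9]+)Y(?:([0-9]+)M)?|([0-9]+)M)")

# The simpler pattern facet quoted in the text: no D and no T anywhere.
PATTERN_FACET = re.compile(r"[^DT]*")


def year_month_duration_months(literal: str) -> Optional[int]:
    """Return the signed month count, or None if the literal is not in
    the lexical space of yearMonthDuration."""
    match = YEAR_MONTH.fullmatch(literal)
    if match is None:
        return None
    sign, years, months_after_years, months_only = match.groups()
    months = 12 * int(years or 0) + int(months_after_years or months_only or 0)
    return -months if sign else months


def passes_pattern_facet(literal: str) -> bool:
    """True if the literal contains neither D nor T."""
    return PATTERN_FACET.fullmatch(literal) is not None
```

Representing values as a single integer month count reflects the optimization mentioned above, and makes the total ordering immediate: durations compare as their month counts do.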

dayTimeDuration

dayTimeDuration is a datatype derived from duration by restricting its lexical representations to instances of dayTimeDurationLexicalRep. The value space of dayTimeDuration is therefore that of duration restricted to those whose months property is 0. This results in a duration datatype which is totally ordered.

The Lexical Space

The lexical space is reduced from that of duration by disallowing duYearFrag and duMonthFrag fragments in the lexical representations.

The dayTimeDuration Lexical Representation
dayTimeDurationLexicalRep ::= '-'? 'P' duDayTimeFrag

The lexical space of dayTimeDuration consists of strings in the lexical space of duration which match the regular expression [^YM]*[DT].*; this pattern eliminates all durations with year or month fields, leaving only those with day, hour, minutes, and/or seconds fields.

The canonical mapping is that of duration restricted in its range to the lexical space (which reduces its domain to omit any values not in the dayTimeDuration value space).
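It may not be obvious why [^YM]*[DT].* still admits minutes, which are also written with M. The Python sketch below applies the pattern to literals that are assumed to be valid duration literals already (the sketch does not check that); the function name is hypothetical.

```python
import re

# The pattern quoted in the text. No Y or M may occur before the first
# D or T; after that point any characters are allowed, so an M that
# follows T (minutes) is accepted while an M before it (months) is not.
DAY_TIME_PATTERN = re.compile(r"[^YM]*[DT].*")


def passes_day_time_pattern(duration_literal: str) -> bool:
    """True if a duration literal also satisfies the dayTimeDuration pattern.

    Assumption: the argument is already in the lexical space of duration.
    """
    return DAY_TIME_PATTERN.fullmatch(duration_literal) is not None
```

For example, PT5M (five minutes) passes, while P1M (one month) and P1M2D (one month, two days) do not.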

Datatype components

The preceding sections of this specification have described datatypes in a way largely independent of their use in the particular context of schema-aware processing as defined in .

This section presents the mechanisms necessary to integrate datatypes into the context of , mostly in terms of the Schema Component abstraction introduced there. The account of datatypes given in this specification is also intended to be useful in other contexts. Any specification or other formal system intending to use datatypes as defined above, particularly if definition of new datatypes via facet-based restriction is envisaged, will need to provide analogous mechanisms for some, but not necessarily all, of what follows below. For example, the and properties are required because of particular aspects of which are not in principle necessary for the use of datatypes as defined here.

The following sections provide full details on the properties and significance of each kind of schema component involved in datatype definitions. For each property, the kinds of values it is allowed to have are specified. Any property not identified as optional is required to be present; optional properties which are not present have absent as their value. Any property identified as having a set, subset or list value may have an empty value unless this is explicitly ruled out: this is not the same as absent. Any property value identified as a superset or a subset of some set may be equal to that set, unless a proper superset or subset is explicitly called for.

For more information on the notion of datatype (schema) components, see Schema Component Details of .

A component may be referred to as the owner of its properties, and of the values of those properties.

Simple Type Definition

Simple Type Definitions provide for:

Establishing the value space and lexical space of a datatype, through the combined set of constraining facets specified in the definition;

Attaching a unique name (actually a QName) to the value space and lexical space.

In the case of datatypes, identifying a datatype with its definition in this specification.

In the case of datatypes, defining the datatype in terms of other datatypes.

Attaching a to the datatype.

The Simple Type Definition Schema Component

The Simple Type Definition schema component has the following properties:

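Simple type definitions are identified by their name and target namespace. Except for anonymous Simple Type Definitions (those with no name), Simple Type Definitions must be uniquely identified within a schema. Within a valid schema, each Simple Type Definition uniquely determines one datatype. The value space, lexical space, lexical mapping, etc., of a Simple Type Definition are the value space, lexical space, etc., of the datatype uniquely determined (or defined) by that Simple Type Definition.

If variety is atomic then the value space of the datatype defined will be a subset of the value space of base type definition (which is a subset of the value space of primitive type definition). If variety is list then the value space of the datatype defined will be the set of finite-length sequences of values from the value space of item type definition. If variety is union then the value space of the datatype defined will be a subset (possibly an improper subset) of the union of the value spaces of each datatype in member type definitions.

If variety is atomic then the variety of base type definition must be atomic, unless the base type definition is anySimpleType. If variety is list then the variety of item type definition must be either atomic or union, and if item type definition is union then all its basic members must be atomic. If variety is union then member type definitions must be a list of Simple Type Definitions.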

The value of facets consists of the set of facets specified directly in the datatype definition unioned with the possibly empty set of facets of base type definition.

The value of fundamental facets consists of the set of fundamental facets and their values.

The facets property determines the value space and lexical space of the datatype being defined by imposing constraints which must be satisfied by values and lexical representations.

The fundamental facets property provides some basic information about the datatype being defined: its cardinality, whether an ordering is defined for it by this specification, whether it has upper and lower bounds, and whether it is numeric.

If final is the empty set then the type can be used in deriving other types; the explicit values restriction, list and union prevent further derivations of Simple Type Definitions by restriction, list and union respectively; the explicit value extension prevents any derivation of Complex Type Definitions by extension.

The property is only relevant for anonymous type definitions, for which its value is the component in which this type definition appears as the value of a property, e.g. or .

XML Representation of Simple Type Definition Schema Components

The XML representation for a Simple Type Definition schema component is a <simpleType> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

name: The actual value of the name attribute, if present on the element, otherwise absent.
target namespace: The actual value of the targetNamespace attribute of the parent schema element information item, if present, otherwise absent.

the alternative is chosen

the type definition resolved to by the actual value of the base attribute of , if present, otherwise the type definition corresponding to the among the children of .

the or alternative is chosen

anySimpleType.

A set corresponding to the actual value of the final attribute, if present, otherwise the actual value of the finalDefault attribute of the ancestor schema element information item, if present, otherwise the empty string, as follows: the empty string

the empty set;

#all

{restriction, list, union};

otherwise

a set with members drawn from the set above, each being present or absent depending on whether the string contains an equivalently named space-delimited substring.

Although the finalDefault attribute of schema may include values other than restriction, list or union, those values are ignored in the determination of

A subset of {restriction, extension, list, union}, determined as follows. Let FS be the actual value of the final attribute, if present, otherwise the actual value of the finalDefault attribute of the ancestor schema element, if present, otherwise the empty string. Then the property value is

FS is the empty string

the empty set;

FS is #all

{restriction, extension, list, union};

Consider FS as a space-separated list, and include restriction if restriction is in that list, and similarly for extension, list and union.
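The case analysis just given can be sketched in Python as follows. The function name `final_property` and the use of `None` to stand for an attribute that is not present are assumptions made for the illustration.

```python
from typing import Optional, Set

ALL_FINAL = frozenset({"restriction", "extension", "list", "union"})


def final_property(final: Optional[str],
                   final_default: Optional[str]) -> Set[str]:
    """Compute the property value from the final attribute and the
    finalDefault attribute of the ancestor schema element.

    None means "attribute not present" (an assumption of this sketch).
    """
    # FS: final if present, otherwise finalDefault if present, otherwise "".
    if final is not None:
        fs = final
    elif final_default is not None:
        fs = final_default
    else:
        fs = ""
    if fs == "":
        return set()
    if fs == "#all":
        return set(ALL_FINAL)
    # Treat FS as a space-separated list; only the four keywords count.
    return {item for item in fs.split() if item in ALL_FINAL}
```

Note that a final attribute which is present but empty takes precedence over finalDefault, so `final_property("", "#all")` yields the empty set.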

the name attribute is present

absent

the parent element information item is

the corresponding

the parent element information item is

the corresponding

the parent element information item is or

the corresponding to the grandparent element information item

(the parent element information item is ),

the grandparent element information item is

the corresponding to the grandparent

(the grandparent element information item is ), the which is the content type of the corresponding to the great-grandparent element information item.

If the alternative is chosen, then list, otherwise if the alternative is chosen, then union, otherwise (the alternative is chosen) the of the . The annotation corresponding to the element information item in the children, if present, otherwise null

the alternative is chosen

a set of components constituting a restriction of the of the with respect to a set of components corresponding to the appropriate element information items among the children of (i.e. those which specify facets, if any), as defined in Schema Component Constraint: Simple Type Restriction (Facets) .

the alternative is chosen

a set with one member, a facet with = collapse and = true.

the empty set

Based on , , and , a set of components, one each as specified in , , and . A sequence of components corresponding to

the element information item in the children, if present;

If the alternative is chosen, then the element information item in the children of the , if present;

If the alternative is chosen, then the element information item in the children of the , if present.

The ancestors of a type definition are its and the ancestors of its . (The ancestors of a T in the type hierarchy are themselves type definitions; they are distinct from the XML elements which may be ancestors, in the XML document hierarchy, of the element which declares T.)

If the is atomic, the following additional property mapping also applies:

From among the ancestors of this , that which corresponds to a datatype.

An electronic commerce schema might define a datatype called SKU (the barcode number that appears on products) from the datatype string by supplying a value for the pattern facet.

In this case, SKU is the name of the new datatype, string is its base type and pattern is the facet.

If the is list, the following additional property mappings also apply:

the is anySimpleType

the (a) resolved to by the actual value of the itemType attribute of , or (b) corresponding to the among the children of , whichever is present.

In this case, a element will invariably be present; it will invariably have either an itemType attribute or a child, but not both.

(that is, the is not anySimpleType), the of the .

In this case, a element will invariably be present.

A system might want to store lists of floating point values.

In this case, listOfFloat is the name of the new datatype, float is its item type and list is the derivation method.

If the is union, the following additional property mappings also apply:

the is anySimpleType

the sequence of (a) the s (a) resolved to by the items in the actual value of the memberTypes attribute of , if any, and (b) those corresponding to the s among the children of , if any, in order.

In this case, a element will invariably be present; it will invariably have either a memberTypes attribute or one or more children, or both.

(that is, the is not anySimpleType), the of the .

In this case, a element will invariably be present.

Note that the rule just given allows unions to be members of other unions. This is a change from version 1.0 of this specification, which prohibited unions in member type definitions and replaced any reference to a union M, in the XML declaration of a second union U, with the members of M. This had the unintended consequence that if M had facets they were lost, and U erroneously accepted values not accepted by M. In order to correct this error, this version of this specification allows unions in member type definitions and removes the wording which replaced references to unions with their members.

The XML Schema Working Group solicits input from implementors and users of this specification as to whether this change is an acceptable way of repairing the problem in version 1.0 of this specification, or whether it would be preferable to allow unions as members of other unions only if they have an empty facets property. If such a change would make this specification more (or less) attractive to users or implementors, please let us know.

As an example, taken from a typical display-oriented text markup language, one might want to express font sizes as an integer between 8 and 72, or with one of the tokens "small", "medium" or "large". The union type definition below would accomplish that.

<xsd:attribute name="size">
  <xsd:simpleType>
    <xsd:union>
      <xsd:simpleType>
        <xsd:restriction base="xsd:positiveInteger">
          <xsd:minInclusive value="8"/>
          <xsd:maxInclusive value="72"/>
        </xsd:restriction>
      </xsd:simpleType>
      <xsd:simpleType>
        <xsd:restriction base="xsd:NMTOKEN">
          <xsd:enumeration value="small"/>
          <xsd:enumeration value="medium"/>
          <xsd:enumeration value="large"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>
</xsd:attribute>

<p>
  <font size='large'>A header</font>
</p>
<p>
  <font size='12'>this is a test</font>
</p>
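To illustrate how a value is checked against such a union, the following Python sketch tries the member types in the order in which they are declared: first the restricted positiveInteger, then the enumerated NMTOKEN. The function name `parse_font_size` is hypothetical, and the initial white space collapsing is an assumption about how the attribute value is normalized before validation.

```python
import re
from typing import Optional, Union

_POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")
_SIZE_TOKENS = {"small", "medium", "large"}


def parse_font_size(literal: str) -> Optional[Union[int, str]]:
    """Return the validated value (int or token), or None if the literal
    is accepted by neither member type of the union."""
    # Assumed normalization: collapse XML white space and trim.
    text = re.sub(r"[ \t\r\n]+", " ", literal).strip(" ")
    # First member type: positiveInteger restricted to 8..72.
    if _POSITIVE_INTEGER.fullmatch(text) and 8 <= int(text) <= 72:
        return int(text)
    # Second member type: NMTOKEN restricted by enumeration.
    if text in _SIZE_TOKENS:
        return text
    return None
```

With this sketch, "12" is accepted by the first member type and "large" by the second, while "7" and "huge" are rejected by both.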

A datatype can be derived from a primitive datatype or another derived datatype by one of three means: by restriction, by list or by union.

Derivation by restriction The actual value of of The union of the set of components resolved to by the facet children merged with from , subject to the Facet Restriction Valid constraints specified in . The component resolved to by the actual value of the base attribute or the children, whichever is present.

An electronic commerce schema might define a datatype called Sku (the barcode number that appears on products) from the datatype by supplying a value for the facet.

In this case, Sku is the name of the new datatype, is its and is the facet.

Derivation by list list The component resolved to by the actual value of the itemType attribute or the children, whichever is present.

A datatype must be from an or a datatype, known as the of the datatype. This yields a datatype whose is composed of finite-length sequences of values from the of the and whose is composed of space-separated lists of literals of the .

A system might want to store lists of floating point values.

In this case, listOfFloat is the name of the new datatype, is its and is the derivation method.

As mentioned in , when a datatype is derived from a datatype, the following constraining facets can be used:

regardless of the constraining facets that are applicable to the datatype that serves as the of the .

For each of , and , the unit of length is measured in number of list items. The value of is fixed to the value collapse.
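As an informal illustration of measuring list length in items, the following Python sketch collapses whitespace and counts the resulting tokens. The helper names are hypothetical, and the sketch does not check that each item is a valid literal of the item type.

```python
import re

# XML whitespace is limited to space, tab, line feed and carriage return.
XML_WHITESPACE = re.compile(r"[ \t\n\r]+")


def list_items(literal: str) -> list[str]:
    """Collapse whitespace, then split the literal into its list items."""
    collapsed = XML_WHITESPACE.sub(" ", literal).strip(" ")
    return collapsed.split(" ") if collapsed else []


def list_length(literal: str) -> int:
    """Length of a list value, measured in number of list items."""
    return len(list_items(literal))
```

For example, a literal containing three floating point numbers separated by arbitrary runs of whitespace has length 3, whatever the number of characters involved.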

Derivation by union union The sequence of components resolved to by the items in the actual value of the memberTypes attribute, if any, in order, followed by the components resolved to by the children, if any, in order. If is union for any components resolved to above, then the is replaced by its .

A datatype can be from one or more , or other datatypes, known as the of that datatype.

As mentioned in , when a datatype is derived from a datatype, only the following constraining facets can be used:

regardless of the constraining facets that are applicable to the datatypes that participate in the

Constraints on XML Representation of Simple Type Definition itemType attribute or simpleType child

Either the itemType attribute or the child of the element must be present, but not both.

base attribute or simpleType child

Either the base attribute or the simpleType child of the element must be present, but not both.

memberTypes attribute or simpleType children

Either the memberTypes attribute of the element must be non-empty or there must be at least one simpleType child.

Simple Type Definition Validation Rules Datatype Valid

A string is datatype-valid with respect to a datatype definition if:

it matches a in the of the datatype, determined as follows:

if is a member of , then the string must be ;

if is not a member of , then

if is then the string must match a in the of

if is then the string must be a sequence of space-separated tokens, each of which matches a in the of

if is then the string must match a in the of at least one member of

the value denoted by the matched in the previous step is a member of the of the datatype, as determined by it being with respect to each member of (except for ).

A is datatype-valid with respect to a if and only if it is a member of the of the corresponding datatype.

Since every value in the is denoted by some , and every in the maps to some value, the requirement that the be in the entails the requirement that the value it maps to should fulfill all of the constraints imposed by the of the datatype. If the datatype is a , the Datatype Valid constraint also entails that each whitespace-delimited token in the list be datatype-valid against the of the list. If the datatype is a , the Datatype Valid constraint entails that the be datatype-valid against at least one of the .

That is, the constraints on s and on datatype derivation defined in this specification have as a consequence that a L is datatype-valid with respect to a T if and only if either T corresponds to a datatype or

If there is a in , then L is with respect to the .

The appropriate case among the following is true:

If the of T is , then L is in the of the of T, as defined in the appropriate section of this specification. Let V be the member of the of the of T mapped to by L, as defined in the appropriate section of this specification.

If the of T is , then each space-delimited substring of L is Datatype Valid with respect to the of T. Let V be the sequence consisting of the values identified by Datatype Valid for each of those substrings, in order.

If the of T is , then L is Datatype Valid with respect to at least one member of the of T. Let B be the of T for L. Let V be the value identified by Datatype Valid for L with respect to B.

V, as determined by the appropriate sub-clause of above, is with respect to each member of the of T which is not a or a facet.

Note that facets do not take part in checking Datatype Valid. In cases where this specification is used in conjunction with schema-validation of XML documents, facets are used to normalize infoset values before the normalized results are checked for datatype validity. In the case of unions the facet to use is the one associated with B in above.

Constraints on Simple Type Definition Schema Components list of atomic

If is , then the of be or .

no circular unions

If is , then it is an if and and of any member of .

Simple Type Definition for anySimpleType

There is a simple type definition nearly equivalent to the simple version of the ur-type definition present in every schema by definition. It has the following properties:

Simple Type Definition of the Ur-Type anySimpleType http://www.w3.org/2001/XMLSchema the ur-type definition The empty set null Built-in Simple Type Definitions

The of is present in every schema. It has the following properties:

Simple type definition of anySimpleType anySimpleType http://www.w3.org/2001/XMLSchema The empty set absent anyType The empty set The empty set absent absent absent absent The empty sequence

The definition of is the root of the Simple Type Definition hierarchy; as such it mediates between the other simple type definitions, which all eventually trace back to it via their properties, and the definition of , which is its .

The of is present in every schema. It has the following properties:

Simple type definition of anyAtomicType anyAtomicType http://www.w3.org/2001/XMLSchema The empty set absent The empty set The empty set atomic absent absent absent The empty sequence

Simple type definitions for all the built-in primitive datatypes, namely , , , , , , , , , , , , , , , , , are present by definition in every schema. All have a very similar structure, with only the , the (which is self-referential), the and in one case the varying from one to the next:

corresponding to the built-in primitive datatypes [as appropriate] http://www.w3.org/2001/XMLSchema The empty set atomic [this itself] {a facet with = collapse and = true in all cases except , which has = preserve and = false} [as appropriate] absent absent absent The empty sequence

Similarly, s for all the built-in datatypes are present by definition in every schema, with properties as specified in and as represented in XML in .

corresponding to the built-in datatypes [as appropriate] http://www.w3.org/2001/XMLSchema [as specified in the appropriate sub-section of ] The empty set [atomic or list, as specified in the appropriate sub-section of ] [if is atomic, then the of the , otherwise absent] [as specified in the appropriate sub-section of ] [as specified in the appropriate sub-section of ] absent if is atomic, then absent, otherwise as specified in the appropriate sub-section of ] absent As shown in the XML representations of the ordinary built-in datatypes in Fundamental Facets

Each fundamental facet is a schema component that provides a limited piece of information about some aspect of each datatype. For example, is a . Most fundamental facets are given a value fixed with each primitive datatype's definition, and this value is not changed by subsequent derivations (even when it would perhaps be reasonable to expect an application to give a more accurate value based on the constraining facets used to define the derivation). The and facets are exceptions to this rule; their values may change as a result of certain derivations.

Schema components are identified by kind. Fundamental is not a kind of component. Each kind of (ordered, bounded, etc.) is a separate kind of schema component.

The term refers to any of the components defined in this section.

A can occur only in the of a , and this is the only place where components occur. Each kind of component occurs (once) in each 's set.

The value of any component can always be calculated from other properties of its . Fundamental facets are not required for schema processing, but some applications use them.

equal

Every supports the notion of equality, with the following rules:

for any a and b in the , either a is equal to b, denoted a = b, or a is not equal to b, denoted a != b

there is no pair a and b from the such that both a = b and a != b

for all a in the , a = a

for any a and b in the , a = b if and only if b = a

for any a, b and c in the , if a = b and b = c, then a = c

for any a and b in the if a = b, then a and b cannot be distinguished (i.e., equality is identity)

the value spaces of all datatypes are disjoint (they do not share any values)

On every datatype, the operation Equal is defined in terms of the equality property of the : for any values a, b drawn from the , Equal(a,b) is true if a = b, and false otherwise.

Note that in consequence of the above:

given A and B where A and B are disjoint, every pair of values a from A and b from B, a != b

two values which are members of the of the same datatype may always be compared with each other

if a datatype T is derived by from A, B, ... then the of T is the union of value spaces of its A, B, .... Some values in the of T are also values in the of A. Other values in the of T will be values in the of B and so on. Values in the of T which are also in the of A can be compared with other values in the of A according to the above rules. Similarly for values of type T and B and all the other .

if a datatype T' is derived by from an atomic datatype T then the of T' is a subset of the of T. Values in the value spaces of T and T' can be compared according to the above rules

if datatypes T' and T'' are derived by from a common atomic ancestor T then the value spaces of T' and T'' may overlap. Values in the value spaces of T' and T'' can be compared according to the above rules

There is no schema component corresponding to the equal .

ordered

An order relation on a is a mathematical relation that imposes a or a on the members of the .

A , and hence a datatype, is said to be ordered if there exists an defined for that .

A partial order is an that is irreflexive, asymmetric and transitive.

A has the following properties:

for no a in the , a < a (irreflexivity)

for all a and b in the , a < b implies not(b < a) (asymmetry)

for all a, b and c in the , a < b and b < c implies a < c (transitivity)

The notation a <> b is used to indicate the case when a != b and neither a < b nor b < a.

When a <> b, a and b are incomparable; otherwise they are comparable.

A total order is an such that for no a and b is it the case that a <> b.

A has all of the properties specified above for , plus the following property:

for all a and b in the , either a < b or b < a or a = b
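The properties listed above can be checked mechanically on a small finite set of values. The Python sketch below is an illustration only: the function names are hypothetical, and the proper-subset relation on sets is chosen merely as a familiar example of a partial order that is not total.

```python
from itertools import product


def is_strict_partial_order(values, less_than) -> bool:
    """Check irreflexivity, asymmetry and transitivity over a finite set."""
    irreflexive = all(not less_than(a, a) for a in values)
    asymmetric = all(
        not (less_than(a, b) and less_than(b, a))
        for a, b in product(values, repeat=2)
    )
    transitive = all(
        less_than(a, c)
        for a, b, c in product(values, repeat=3)
        if less_than(a, b) and less_than(b, c)
    )
    return irreflexive and asymmetric and transitive


def incomparable(a, b, less_than) -> bool:
    """a <> b: a != b and neither a < b nor b < a."""
    return a != b and not less_than(a, b) and not less_than(b, a)


def is_total_order(values, less_than) -> bool:
    """A total order is a partial order with no incomparable pair."""
    return is_strict_partial_order(values, less_than) and not any(
        incomparable(a, b, less_than) for a, b in product(values, repeat=2)
    )


# Example data: proper subset is a partial order; '<' on integers is total.
SETS = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
proper_subset = lambda a, b: a < b
```

With this data, the sets {1} and {2} are incomparable, so the proper-subset relation is a partial order but not a total order.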

The fact that this specification does not define an for some datatype does not mean that some other application cannot treat that datatype as being ordered by imposing its own order relation.

provides for:

indicating whether an is defined on a , and if so, whether that is a or a

Some datatypes have a nontrivial order relation associated with their value spaces (see ). (There is always a trivial partial ordering wherein every value pair that is not equal is incomparable, which could be associated with any value space.) The ordered facet value is a "near-boolean": one of false, partial, and total, as prescribed in for datatypes; all datatypes inherit this value without change. The value for a is always false and the value for a is computed as described below.

A false value means no order is prescribed; a total value assures that the prescribed order is a total order; a partial value means that the prescribed order is a partial order, but not (for the primitive type in question) a total order. Derivation of new datatypes from datatypes with partial orders may impose constraints which make the effective ordering either a trivial order or a non-trivial total order, but the value of the facet is not changed to reflect this.

A , and hence a datatype, is said to be ordered if this specification prescribes a non-trivial order for that .

Some of the real-world datatypes which are the basis for those defined herein are ordered in some applications, even though no order is prescribed for schema-processing purposes. For example, is sometimes ordered, and and datatypes from ordered datatypes are sometimes given lexical orderings. They are not ordered for schema-processing purposes.

The ordered Schema Component

depends on , and in the component in which a component appears as a member of .

When is , is inherited from of . For all types is as specified in the table in .

When is , is false.

When is , is partial unless one of the following:

If every member of is derived from a common ancestor other than the simple ur-type, then is the same as that ancestor's ordered facet

If every member of has a of false for the ordered facet, then is false

depends on the owner's , , and .

the owner's is atomic

the is

is as specified in the table in .

is the owner's 's .

the owner's is list

is false.

the owner's is union;

every of the has atomic and has the same

is the same as the component's in that primitive type definition's .

each member of the owner's has an component in its whose is false

is false.

is partial.

bounded

A value u in an U is said to be an inclusive upper bound of a V (where V is a subset of U) if for all v in V, u >= v.

A value u in an U is said to be an exclusive upper bound of a V (where V is a subset of U) if for all v in V, u > v.

A value l in an L is said to be an inclusive lower bound of a V (where V is a subset of L) if for all v in V, l <= v.

A value l in an L is said to be an exclusive lower bound of a V (where V is a subset of L) if for all v in V, l < v.

A datatype is bounded if its has either an or an and either an or an .

provides for:

indicating whether a is

Some ordered datatypes have the property that there is one value greater than or equal to every other value, and another that is less than or equal to every other value. (In the case of datatypes, these two values are not necessarily in the value space of the derived datatype, but they must be in the value space of the primitive datatype from which they have been derived.) The bounded facet value is and is generally true for such bounded datatypes. However, it will remain false when the mechanism for imposing such a bound is difficult to detect, as, for example, when the boundedness occurs because of derivation using a component.

The bounded Schema Component

depends on the owner's , and in the component in which a component appears as a member of .

When the is , is as specified in the table in . Otherwise, when the owner's is atomic, if one of or and one of or are among members of the owner's set, then is true; otherwise is false.

When the owner's is list, if or both of and are among , then is true; else is false.

When the owner's is union, if is true for every member of the owner's set and all of the owner's basic members have the same , then is true; otherwise is false.

cardinality

Every has associated with it the concept of cardinality. Some value spaces are finite, some are countably infinite while still others could conceivably be uncountably infinite (although no defined by this specification is uncountably infinite). A datatype is said to have the cardinality of its .

It is sometimes useful to categorize value spaces (and hence, datatypes) as to their cardinality. There are two significant cases:

value spaces that are finite

value spaces that are countably infinite

provides for:

indicating whether the of a is finite or countably infinite

Every value space has a specific number of members. This number can be characterized as finite or infinite. (Currently there are no datatypes with infinite value spaces larger than countable.) The cardinality facet value is either finite or countably infinite and is generally finite for datatypes with finite value spaces. However, it will remain countably infinite when the mechanism for causing finiteness is difficult to detect, as, for example, when finiteness occurs because of a derivation using a component.

The cardinality Schema Component

depends on the owner's , , and in the component in which a component appears as a member of .

When is and of is finite, then is finite.

When is and of is countably infinite and either of the following conditions are true, then is finite; else is countably infinite:

one of , , is among ,

all of the following are true:

one of or is among

either of the following are true:

is among

is one of , , , , or or any type derived from them

When the is , is as specified in the table in . Otherwise, when the owner's is atomic, is countably infinite unless any of the following conditions are true, in which case is finite:

the owner's 's is finite,

at least one of , , or is a member of the owner's set,

all of the following are true:

one of or is a member of the owner's set

either of the following are true:

is a member of the owner's set

is one of , , , , or

When the owner's is list, if or both of and are among members of the owner's set and the owner's 's is finite then is finite; otherwise is countably infinite.

When the owner's is union, if 's is finite for every member of the owner's set then is finite; otherwise is countably infinite.

numeric

A datatype is said to be numeric if its values are conceptually quantities (in some mathematical number system).

A datatype whose values are not is said to be non-numeric.

provides for:

indicating whether a is

Some value spaces are made up of things that are conceptually numeric, others are not. The numeric facet value indicates which are considered numeric.

The numeric Schema Component

depends on the owner's , , and in the component in which a component appears as a member of .

When the is , is as specified in the table in . Otherwise, when the owner's is atomic, is inherited from the owner's 's of . For all types is as specified in the table in .

When the owner's is list, is false.

When the owner's is union, if 's is true for every member of the owner's set then is true; otherwise is false.

Constraining Facets

constraining facets are schema components whose values may be set or changed during derivation (subject to facet-specific controls) to control various aspects of the derived datatype. For example, is a constraining facet. Constraining Facets are given a value as part of the derivation when an datatype is defined by restricting a or datatype; a few constraining facets have default values that are also provided for datatypes.

Schema components are identified by kind. Constraining is not a kind of component. Each kind of constraining facet (whiteSpace, length, etc.) is a separate kind of schema component.

The term refers to any of the components defined in this section.

length

length is the number of units of length, where units of length varies depending on the type that is being derived from. The value of length be a .

For and datatypes derived from , length is measured in units of characters as defined in . For , length is measured in units of characters (as for ). For and and datatypes derived from them, length is measured in octets (8 bits) of binary data. For datatypes by , length is measured in number of list items.

For and datatypes derived from , length will not always coincide with "string length" as perceived by some users or with the number of storage units in some digital representation. Therefore, care should be taken when specifying a value for length and in attempting to infer storage requirements from a given value for length.

provides for:

Constraining a to values with a specific number of units of length, where units of length varies depending on .

The following is the definition of a datatype to represent product codes which must be exactly 8 characters in length. By fixing the value of the length facet we ensure that types derived from productCode can change or set the values of other facets, such as pattern, but cannot change the length.

<simpleType name='productCode'>
  <restriction base='string'>
    <length value='8' fixed='true'/>
  </restriction>
</simpleType>

The length Schema Component

If is true, then types for which the current type is the cannot specify a value for other than .

XML Representation of length Schema Components

The XML representation for a schema component is a element information item. The correspondences between the properties of the information item and properties of the component are as follows:

The actual value of the value attribute The actual value of the fixed attribute, if present, otherwise false The annotations corresponding to all the element information items in the children, if any. length Validation Rules Length Valid

A value in a is facet-valid with respect to if and only if:

if the is then

if is or , then the length of the value, as measured in characters be equal to ;

if is or , then the length of the value, as measured in octets of the binary data, be equal to ;

if is or , then any is facet-valid.

if the is , then the length of the value, as measured in list items, be equal to
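The three units of length can be compared side by side in a short Python sketch. It is an illustration under assumptions: the kind names "string", "hexBinary", "base64Binary" and "list" are labels chosen for this sketch, the function names are hypothetical, and the literals are assumed to be well formed already.

```python
import base64


def length_in_units(kind: str, lexical: str) -> int:
    """Measure a value in the unit of length that applies to its kind."""
    if kind == "string":
        # Python strings are sequences of code points, so len() counts
        # characters rather than bytes or UTF-16 code units.
        return len(lexical)
    if kind == "hexBinary":
        # Two hexadecimal digits encode one octet.
        return len(bytes.fromhex(lexical))
    if kind == "base64Binary":
        # Length is measured on the decoded binary data, not on the literal.
        return len(base64.b64decode(lexical))
    if kind == "list":
        # Assumption: items are separated by XML whitespace only.
        return len(lexical.split())
    raise ValueError(f"no unit of length assumed for kind {kind!r}")


def is_length_valid(kind: str, lexical: str, length: int) -> bool:
    """Length Valid: the measured length must equal the facet value."""
    return length_in_units(kind, lexical) == length
```

The hexBinary literal "0FB7" has length 2 even though it is four characters long, which is why the length facet on binary types says little about the size of the lexical form.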

The use of on datatypes derived from and , , and datatypes derived from them is deprecated. Future versions of this specification may remove this facet for these datatypes.

Constraints on length Schema Components length and minLength or maxLength

If is a member of then

It is an error for to be a member of unless

the of <= the of and

there is type definition from which this one is derived by one or more restriction steps in which has the same and is not specified.

It is an error for to be a member of unless

the of <= the of and

there is type definition from which this one is derived by one or more restriction steps in which has the same and is not specified.

length valid restriction

It is an if is among the members of of and is not equal to the of the parent .

minLength

minLength is the minimum number of units of length, where units of length varies depending on the type that is being derived from. The value of minLength be a .

For and datatypes derived from , minLength is measured in units of characters as defined in . For and and datatypes derived from them, minLength is measured in octets (8 bits) of binary data. For datatypes by , minLength is measured in number of list items.

For and datatypes derived from , minLength will not always coincide with "string length" as perceived by some users or with the number of storage units in some digital representation. Therefore, care should be taken when specifying a value for minLength and in attempting to infer storage requirements from a given value for minLength.

provides for:

Constraining a to values with at least a specific number of units of length, where units of length varies depending on .

The following is the definition of a datatype which requires strings to have at least one character (i.e., the empty string is not in the of this datatype).

<simpleType name='non-empty-string'>
  <restriction base='string'>
    <minLength value='1'/>
  </restriction>
</simpleType>

The minLength Schema Component

If is true, then types for which the current type is the cannot specify a value for other than .

XML Representation of minLength Schema Component

The XML representation for a schema component is a element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in a is facet-valid with respect to , determined as follows:

if the is then

if is or , then the length of the value, as measured in characters be greater than or equal to ;

if is or , then the length of the value, as measured in octets of the binary data, be greater than or equal to ;

if is or , then any is facet-valid.

if the is , then the length of the value, as measured in list items, be greater than or equal to

The use of on datatypes derived from and , , and datatypes derived from them is deprecated. Future versions of this specification may remove this facet for these datatypes.

Constraints on minLength Schema Components minLength <= maxLength

If both and are members of , then the of be less than or equal to the of .

minLength valid restriction

It is an if is among the members of of and is less than the of the parent .

maxLength

maxLength is the maximum number of units of length, where units of length varies depending on the type that is being derived from. The value of maxLength be a .

For and datatypes derived from , maxLength is measured in units of characters as defined in . For and and datatypes derived from them, maxLength is measured in octets (8 bits) of binary data. For datatypes by , maxLength is measured in number of list items.

For and datatypes derived from , maxLength will not always coincide with "string length" as perceived by some users or with the number of storage units in some digital representation. Therefore, care should be taken when specifying a value for maxLength and in attempting to infer storage requirements from a given value for maxLength.

provides for:

Constraining a to values with at most a specific number of units of length, where units of length varies depending on .

The following is the definition of a datatype which might be used to accept form input with an upper limit to the number of characters that are acceptable.

<simpleType name='form-input'>
  <restriction base='string'>
    <maxLength value='50'/>
  </restriction>
</simpleType>

The maxLength Schema Component

If is true, then types for which the current type is the cannot specify a value for other than .

XML Representation of maxLength Schema Components

The XML representation for a schema component is a element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in a is facet-valid with respect to , determined as follows:

if the is then

if is or , then the length of the value, as measured in characters be less than or equal to ;

if is or , then the length of the value, as measured in octets of the binary data, be less than or equal to ;

if is or , then any is facet-valid.

if the is , then the length of the value, as measured in list items, be less than or equal to

The use of on datatypes derived from and , , and datatypes derived from them is deprecated. Future versions of this specification may remove this facet for these datatypes.

Constraints on maxLength Schema Components maxLength valid restriction

It is an if is among the members of of and is greater than the of the parent .
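The restriction constraints on minLength and maxLength can be summarized as a consistency check between a type and its base. The Python sketch below is a simplified reading of those constraints; the function name and the use of None for an unspecified facet are assumptions made for illustration, and fixed facets are not modeled.

```python
from typing import Optional


def check_length_facets(
    base_min: Optional[int],
    base_max: Optional[int],
    new_min: Optional[int],
    new_max: Optional[int],
) -> list[str]:
    """Return the constraint violations found; an empty list means none."""
    errors = []
    # minLength valid restriction: may not be less than the base value.
    if new_min is not None and base_min is not None and new_min < base_min:
        errors.append("minLength is less than the base type's minLength")
    # maxLength valid restriction: may not be greater than the base value.
    if new_max is not None and base_max is not None and new_max > base_max:
        errors.append("maxLength is greater than the base type's maxLength")
    # minLength <= maxLength, using inherited values where none is given.
    effective_min = new_min if new_min is not None else base_min
    effective_max = new_max if new_max is not None else base_max
    if (
        effective_min is not None
        and effective_max is not None
        and effective_min > effective_max
    ):
        errors.append("minLength exceeds maxLength")
    return errors
```

A derivation that narrows 1..50 to 2..40 yields no errors, while one that tries to lower an inherited minLength of 2 to 1 is reported.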

pattern

pattern is a constraint on the of a datatype which is achieved by constraining the to literals which match each member of a set of patterns. The value of pattern must be a set of regular expressions.

provides for:

Constraining a to values that are denoted by literals which match each of a set of regular expressions.

The following is the definition of a datatype which is a better representation of postal codes in the United States, by limiting strings to those which are matched by a specific .

<simpleType name='better-us-zipcode'>
  <restriction base='string'>
    <pattern value='[0-9]{5}(-[0-9]{4})?'/>
  </restriction>
</simpleType>

The pattern Schema Component

XML Representation of pattern Schema Components

The XML representation for a schema component is a element information item. The correspondences between the properties of the information item and properties of the component are as follows:

be a valid . The actual value of the value attribute. Let R be a regular expression given by

there is only one among the children of a

the actual value of its value attribute

the concatenation of the actual values of all the children's value attributes, in order, separated by |, so forming a single regular expression with multiple branches.

The value is then given by

the of the has a facet among its

the union of that facet's and {R}

just {R}

The annotations corresponding to all the element information items in the children, if any. A (possibly empty) sequence of components, one for each among the children of the s among the children of a , in order.

The property will only have more than one member when involves a facet at more than one step in a type derivation. During validation, lexical forms will be checked against every member of the resulting , effectively creating a conjunction of patterns.

In summary, pattern facets specified on the same step in a type derivation are ORed together, while pattern facets specified on different steps of a type derivation are ANDed together.

Thus, to impose two constraints simultaneously, schema authors may either write a single which expresses the intersection of the two s they wish to impose, or define each on a separate type derivation step.

Constraints on XML Representation of pattern Pattern value

The actual value of the value attribute must be a as defined in .

Multiple patterns

If multiple element information items appear as children of a , the normalized values should be combined as if they appeared in a single as separate branches.

It is a consequence of the schema representation constraint and of the rules for restriction that pattern facets specified on the same step in a type derivation are ORed together, while pattern facets specified on different steps of a type derivation are ANDed together.

Thus, to impose two pattern constraints simultaneously, schema authors may either write a single pattern which expresses the intersection of the two patterns they wish to impose, or define each pattern on a separate type derivation step.
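The OR-within-a-step, AND-across-steps behavior can be sketched in a few lines of Python. This is an approximation only: Python's re module is not the regular expression language defined by this specification, the patterns used here happen to mean the same thing in both, and the function name and sample data are hypothetical.

```python
import re


def pattern_valid(literal: str, derivation_steps: list[list[str]]) -> bool:
    """Each step lists the patterns given together in one restriction.

    Patterns within a step are alternatives (OR); every step must be
    satisfied (AND). A pattern must match the entire literal.
    """
    return all(
        any(re.fullmatch(pattern, literal) for pattern in step)
        for step in derivation_steps
    )


# Step 1: the better-us-zipcode pattern.
# Step 2 (a hypothetical further restriction): two patterns on one step.
STEPS = [
    [r"[0-9]{5}(-[0-9]{4})?"],
    [r"9.*", r"1.*"],
]
```

Here "90210" and "10001-1234" satisfy both steps, "20500" satisfies the first step but neither alternative of the second, and "9021" fails the first step.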

pattern Validation Rules pattern valid

A in a is facet-valid with respect to if and only if for each in its , the is among the set of character sequences denoted by the specified in .

As noted in , certain uses of the facet may eliminate from the lexical space the canonical forms of some values in the value space; this can be inconvenient for applications which write out the canonical form of a value and rely on being able to read it in again as a legal lexical form. This specification provides no recourse in such situations; applications are free to deal with it as they see fit. Caution is advised.

enumeration

enumeration constrains the to a specified set of values.

enumeration does not impose an order relation on the it creates; the value of the property of the derived datatype remains that of the datatype from which it is derived.

provides for:

Constraining a to a specified set of values.

The following example is a datatype definition for a datatype which limits the values of dates to the three US holidays enumerated. This datatype definition would appear in a schema authored by an "end-user" and shows how to define a datatype by enumerating the values in its . The enumerated values must be type-valid literals for the .

<simpleType name='holidays'>
  <annotation>
    <documentation>some US holidays</documentation>
  </annotation>
  <restriction base='gMonthDay'>
    <enumeration value='--01-01'>
      <annotation>
        <documentation>New Year's day</documentation>
      </annotation>
    </enumeration>
    <enumeration value='--07-04'>
      <annotation>
        <documentation>4th of July</documentation>
      </annotation>
    </enumeration>
    <enumeration value='--12-25'>
      <annotation>
        <documentation>Christmas</documentation>
      </annotation>
    </enumeration>
  </restriction>
</simpleType>

The enumeration Schema Component

XML Representation of enumeration Schema Components

The XML representation for an enumeration schema component is an <enumeration> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

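{value} must be in the value space of {base type definition}. The correspondence for {value} depends on the children of the <restriction>:

If there is only one <enumeration> among the children of a <restriction>, {value} is a set with one member, the actual value of its value attribute.

Otherwise, {value} is a set of the actual values of all the <enumeration> children's value attributes.

The normalized value of the value attribute must be datatype-valid with respect to the {base type definition} of the simple type definition corresponding to the nearest <simpleType> ancestor element.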

Multiple enumerations

If multiple <enumeration> element information items appear as children of a <simpleType>, the {value} of the enumeration component should be the set of all such normalized values.

enumeration Validation Rules

enumeration valid

A value in a value space is facet-valid with respect to enumeration if and only if the value is one of the values specified in {value}.

Constraints on enumeration Schema Components

enumeration valid restriction

It is an error if any member of {value} is not in the value space of {base type definition}.
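A sketch of this constraint, reusing the holidays type from the example above (the two derived type names are hypothetical):

```xml
<!-- Permitted: both values are in the value space of holidays -->
<simpleType name='winter-holidays'>
  <restriction base='holidays'>
    <enumeration value='--01-01'/>
    <enumeration value='--12-25'/>
  </restriction>
</simpleType>

<!-- Error: 14 February is not in the value space of holidays -->
<simpleType name='more-holidays'>
  <restriction base='holidays'>
    <enumeration value='--02-14'/>
  </restriction>
</simpleType>
```

An enumeration in a restriction step can therefore only narrow the set of values permitted by the base type; it cannot add to it.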

whiteSpace

whiteSpace constrains the value space of types derived from string such that the various behaviors specified in Attribute Value Normalization in the XML specification are realized. The value of whiteSpace must be one of {preserve, replace, collapse}.

preserve

No normalization is done; the value is not changed (this is the behavior required by XML for element content).

replace

All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return) are replaced with #x20 (space)

collapse

After the processing implied by replace, contiguous sequences of #x20's are collapsed to a single #x20, and leading and trailing #x20's are removed.

The notation #xA used here (and elsewhere in this specification) represents the Universal Character Set (UCS) code point hexadecimal A (line feed), which is denoted by U+000A. This notation is to be distinguished from &#xA;, which is the XML character reference to that same UCS code point.

whiteSpace is applicable to all atomic and list datatypes. For all atomic datatypes other than string (and types derived by restriction from it) the value of whiteSpace is collapse and cannot be changed by a schema author; for string the value of whiteSpace is preserve; for any type derived by restriction from string the value of whiteSpace can be any of the three legal values. For all datatypes derived by list the value of whiteSpace is collapse and cannot be changed by a schema author. For all datatypes derived by union whiteSpace does not apply directly; however, the normalization behavior of union types is controlled by the value of whiteSpace on that one of the basic members against which the union is successfully validated.

For more information on whiteSpace, see the discussion on white space normalization in Schema Component Details in the Structures part of this specification.

whiteSpace provides for:

Constraining a value space according to the white space normalization rules.

The following example is the datatype definition for the token derived datatype.
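A definition along the following lines would fit this description. It is a sketch that assumes the derived datatype meant is token and that token is derived from normalizedString:

```xml
<simpleType name='token'>
  <restriction base='normalizedString'>
    <whiteSpace value='collapse'/>
  </restriction>
</simpleType>
```

With collapse in effect, a literal containing tabs, line feeds, or runs of spaces is normalized to single spaces, with leading and trailing spaces removed, before it is checked against the type.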

The values replace and collapse may appear to provide a convenient way to unwrap text (i.e. undo the effects of pretty-printing and word-wrapping). In some cases, especially highly constrained data consisting of lists of artificial tokens such as part numbers or other identifiers, this appearance is correct. For natural-language data, however, the whitespace processing prescribed for these values is not only unreliable but will systematically remove the information needed to perform unwrapping correctly. For Asian scripts, for example, a correct unwrapping process will replace line boundaries not with blanks but with zero-width separators or nothing. In consequence, it is normally unwise to use these values for natural-language data, or for any data other than lists of highly constrained tokens.

The whiteSpace Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} cannot specify a value for whiteSpace other than {value}.

XML Representation of whiteSpace Schema Components

The XML representation for a whiteSpace schema component is a <whiteSpace> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

There are no validation rules associated with whiteSpace. For more information, see the discussion on white space normalization in Schema Component Details in the Structures part of this specification.

Constraints on whiteSpace Schema Components

whiteSpace valid restriction

It is an error if whiteSpace is among the members of {facets} of {base type definition} and any of the following conditions is true:

{value} is replace or preserve and the {value} of the parent whiteSpace is collapse

{value} is preserve and the {value} of the parent whiteSpace is replace
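The two conditions above amount to saying that a derivation step may only move from preserve towards replace and collapse, never back. A sketch with hypothetical type names:

```xml
<!-- Permitted: each step normalizes at least as much as its base -->
<simpleType name='line'>
  <restriction base='string'>
    <whiteSpace value='replace'/>
  </restriction>
</simpleType>

<simpleType name='trimmed-line'>
  <restriction base='line'>
    <whiteSpace value='collapse'/>
  </restriction>
</simpleType>

<!-- Error: the base type's whiteSpace is collapse, so preserve is not allowed -->
<simpleType name='untrimmed-line'>
  <restriction base='trimmed-line'>
    <whiteSpace value='preserve'/>
  </restriction>
</simpleType>
```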

maxInclusive

maxInclusive is the inclusive upper bound of the value space for a datatype with the ordered property. The value of maxInclusive must be equal to some value in the value space of the base type.

maxInclusive provides for:

Constraining a value space to values with a specific inclusive upper bound.

The following is the definition of a datatype which limits values to integers less than or equal to 100, using maxInclusive.

<simpleType name='one-hundred-or-less'>
  <restriction base='integer'>
    <maxInclusive value='100'/>
  </restriction>
</simpleType>

The maxInclusive Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} cannot specify a value for maxInclusive other than {value}.

XML Representation of maxInclusive Schema Components

The XML representation for a maxInclusive schema component is a <maxInclusive> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

{value} must be equal to some value in the value space of {base type definition}.

{value}: The actual value of the value attribute
{fixed}: The actual value of the fixed attribute, if present, otherwise false
{annotations}: The annotations corresponding to all the <annotation> element information items in the children, if any.

maxInclusive Validation Rules

maxInclusive Valid

A value in an ordered value space is facet-valid with respect to maxInclusive, determined as follows:

if the numeric property in {fundamental facets} is true, then the value must be numerically less than or equal to {value};

if the numeric property in {fundamental facets} is false (i.e., {base type definition} is one of the date and time related datatypes), then the value must be chronologically less than or equal to {value}.

Constraints on maxInclusive Schema Components

minInclusive <= maxInclusive

It is an error for the value specified for minInclusive to be greater than the value specified for maxInclusive for the same datatype.

maxInclusive valid restriction

It is an error if any of the following conditions is true:

maxInclusive is among the members of {facets} of {base type definition} and {value} is greater than the {value} of that maxInclusive.

maxExclusive is among the members of {facets} of {base type definition} and {value} is greater than or equal to the {value} of that maxExclusive.

minInclusive is among the members of {facets} of {base type definition} and {value} is less than the {value} of that minInclusive.

minExclusive is among the members of {facets} of {base type definition} and {value} is less than or equal to the {value} of that minExclusive.
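As a sketch of the first condition, the following restrictions use the one-hundred-or-less type defined earlier in this section as their base (the derived type names are hypothetical):

```xml
<!-- Permitted: 50 is not greater than the base type's maxInclusive of 100 -->
<simpleType name='fifty-or-less'>
  <restriction base='one-hundred-or-less'>
    <maxInclusive value='50'/>
  </restriction>
</simpleType>

<!-- Error: 200 is greater than the base type's maxInclusive of 100 -->
<simpleType name='two-hundred-or-less'>
  <restriction base='one-hundred-or-less'>
    <maxInclusive value='200'/>
  </restriction>
</simpleType>
```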

maxExclusive

maxExclusive is the exclusive upper bound of the value space for a datatype with the ordered property. The value of maxExclusive must be equal to some value in the value space of the base type or be equal to {value} in {base type definition}.

maxExclusive provides for:

Constraining a value space to values with a specific exclusive upper bound.

The following is the definition of a datatype which limits values to integers less than or equal to 100, using maxExclusive.

Note that the value space of this datatype is identical to the previous one (named 'one-hundred-or-less').
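A definition matching the description above (integers less than or equal to 100, expressed with an exclusive upper bound) could look like the following sketch; the type name is hypothetical:

```xml
<simpleType name='less-than-one-hundred-and-one'>
  <restriction base='integer'>
    <maxExclusive value='101'/>
  </restriction>
</simpleType>
```

Because the base type is integer, excluding 101 and everything above it leaves exactly the integers up to and including 100.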

The maxExclusive Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} cannot specify a value for maxExclusive other than {value}.

XML Representation of maxExclusive Schema Components

The XML representation for a maxExclusive schema component is a <maxExclusive> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

{value} must be equal to some value in the value space of {base type definition}.

{value}: The actual value of the value attribute
{fixed}: The actual value of the fixed attribute, if present, otherwise false
{annotations}: The annotations corresponding to all the <annotation> element information items in the children, if any.

maxExclusive Validation Rules

maxExclusive Valid

A value in an ordered value space is facet-valid with respect to maxExclusive, determined as follows:

if the numeric property in {fundamental facets} is true, then the value must be numerically less than {value};

if the numeric property in {fundamental facets} is false (i.e., {base type definition} is one of the date and time related datatypes), then the value must be chronologically less than {value}.

Constraints on maxExclusive Schema Components

maxInclusive and maxExclusive

It is an error for both maxInclusive and maxExclusive to be specified in the same derivation step of a datatype definition.
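For instance, a definition such as the following sketch (with a hypothetical type name) would be in error, because it specifies both an inclusive and an exclusive upper bound in one restriction step:

```xml
<!-- Error: both upper-bound facets appear in the same derivation step -->
<simpleType name='conflicting-bounds'>
  <restriction base='integer'>
    <maxInclusive value='100'/>
    <maxExclusive value='200'/>
  </restriction>
</simpleType>
```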

minExclusive <= maxExclusive

It is an error for the value specified for minExclusive to be greater than the value specified for maxExclusive for the same datatype.

maxExclusive valid restriction

It is an error if any of the following conditions is true:

is among the members of of and is greater than the of the parentthat .

is among the members of of and is less than or equal to the of the parentthat .

minExclusive

minExclusive is the exclusive lower bound of the value space for a datatype with the ordered property. The value of minExclusive must be equal to some value in the value space of the base type or be equal to {value} in {base type definition}.

minExclusive provides for:

Constraining a value space to values with a specific exclusive lower bound.

The following is the definition of a datatype which limits values to integers greater than or equal to 100, using minExclusive.

Note that the value space of this datatype is identical to the following one (named 'one-hundred-or-more').
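A definition matching the description above (integers greater than or equal to 100, expressed with an exclusive lower bound) could look like the following sketch; the type name is hypothetical:

```xml
<simpleType name='more-than-ninety-nine'>
  <restriction base='integer'>
    <minExclusive value='99'/>
  </restriction>
</simpleType>
```

Because the base type is integer, excluding 99 and everything below it leaves exactly the integers from 100 upward.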

The minExclusive Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} cannot specify a value for minExclusive other than {value}.

XML Representation of minExclusive Schema Components

The XML representation for a minExclusive schema component is a <minExclusive> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in an ordered value space is facet-valid with respect to minExclusive if and only if:

if the numeric property in {fundamental facets} is true, then the value must be numerically greater than {value};

if the numeric property in {fundamental facets} is false (i.e., {base type definition} is one of the date and time related datatypes), then the value must be chronologically greater than {value}.

Constraints on minExclusive Schema Components

minInclusive and minExclusive

It is an error for both minInclusive and minExclusive to be specified in the same derivation step of a datatype definition.

minExclusive < maxInclusive

It is an error for the value specified for minExclusive to be greater than or equal to the value specified for maxInclusive for the same datatype.

minExclusive valid restriction

It is an error if any of the following conditions is true:

is among the members of of and is less than the of the parentthat .

is among the members of of and is greater than or equal to the of the parentthat .

minInclusive

minInclusive is the inclusive lower bound of the value space for a datatype with the ordered property. The value of minInclusive must be equal to some value in the value space of the base type.

minInclusive provides for:

Constraining a value space to values with a specific inclusive lower bound.

The following is the definition of a datatype which limits values to integers greater than or equal to 100, using minInclusive.

<simpleType name='one-hundred-or-more'>
  <restriction base='integer'>
    <minInclusive value='100'/>
  </restriction>
</simpleType>

The minInclusive Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} cannot specify a value for minInclusive other than {value}.

XML Representation of minInclusive Schema Components

The XML representation for a minInclusive schema component is a <minInclusive> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in an ordered value space is facet-valid with respect to minInclusive if and only if:

if the numeric property in {fundamental facets} is true, then the value must be numerically greater than or equal to {value};

if the numeric property in {fundamental facets} is false (i.e., {base type definition} is one of the date and time related datatypes), then the value must be chronologically greater than or equal to {value}.

Constraints on minInclusive Schema Components

minInclusive < maxExclusive

It is an error for the value specified for minInclusive to be greater than or equal to the value specified for maxExclusive for the same datatype.

minInclusive valid restriction

It is an error if any of the following conditions is true:

minInclusive is among the members of {facets} of {base type definition} and {value} is less than the {value} of that minInclusive.

maxInclusive is among the members of {facets} of {base type definition} and {value} is greater than the {value} of that maxInclusive.

minExclusive is among the members of {facets} of {base type definition} and {value} is less than or equal to the {value} of that minExclusive.

maxExclusive is among the members of {facets} of {base type definition} and {value} is greater than or equal to the {value} of that maxExclusive.

totalDigits

totalDigits controls the maximum number of values in the value space of datatypes derived from decimal, by restricting it to numbers that are expressible as i × 10^-n where i and n are integers such that |i| < 10^totalDigits and 0 <= n <= totalDigits. The value of totalDigits must be a positiveInteger.

totalDigits restricts the magnitude and arithmeticPrecision of values in the value spaces of decimal and precisionDecimal and datatypes derived from them. The effect must be described separately for the two primitive types.

For decimal, if the {value} of totalDigits is t, the effect is to require that values be equal to i / 10ⁿ, for some integers i and n, with | i | < 10^t and 0 ≤ n ≤ t. This has as a consequence that the values are expressible using at most t digits in decimal notation.

For precisionDecimal, values with numericalValue of nV and arithmeticPrecision of aP, if the {value} of totalDigits is t, the effect is to require that (aP + 1 + log₁₀(| nV |) div 1) ≤ t, for values other than zero, NaN, and the infinities. This means in effect that values are expressible in scientific notation using at most t digits for the coefficient.

The {value} of totalDigits must be a positiveInteger.

The term totalDigits is chosen to reflect the fact that it restricts the value space to those values that can be represented lexically using at most totalDigits digits in decimal notation, or at most totalDigits digits for the coefficient, in scientific notation. Note that it does not restrict the lexical space directly; a lexical representation that adds non-significant leading or trailing zero digits is still permitted. It also has no effect on the values NaN, INF, and -INF.
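A small worked sketch for the decimal case, using a hypothetical type name and a totalDigits of 3, so that values must equal i / 10ⁿ with | i | < 1000 and 0 ≤ n ≤ 3:

```xml
<simpleType name='three-digit-decimal'>
  <restriction base='decimal'>
    <totalDigits value='3'/>
  </restriction>
</simpleType>
<!-- In the value space: 999 (i = 999, n = 0), 12.3 (i = 123, n = 1), 0.001 (i = 1, n = 3) -->
<!-- Not in the value space: 1234, 12.34, 0.0001 -->
<!-- The literal '012.30' is still permitted: it denotes the value 12.3 -->
```

The last comment illustrates that the facet constrains values rather than literals: non-significant zero digits do not count against the limit.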

The totalDigits Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} must not specify a value for totalDigits other than {value}.

XML Representation of totalDigits Schema Components

The XML representation for a totalDigits schema component is a <totalDigits> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in a value space is facet-valid with respect to totalDigits if:

that value is expressible as i × 10^-n where i and n are integers such that |i| < 10^totalDigits and 0 <= n <= totalDigits.

A value v is facet-valid with respect to a totalDigits facet with a {value} of t if and only if one of the following is true:

v is a precisionDecimal value with numericalValue of positiveInfinity, negativeInfinity, notANumber, or zero.

v is a precisionDecimal value with numericalValue of nV and arithmeticPrecision of aP, and is not NaN, INF, -INF, or zero, and (aP + 1 + log₁₀(| nV |) div 1) ≤ t.

v is a decimal value equal to i / 10ⁿ, for some integers i and n, with | i | < 10^t and 0 ≤ n ≤ t.

Constraints on totalDigits Schema Components

totalDigits valid restriction

It is an error if totalDigits is among the members of {facets} of {base type definition} and {value} is greater than the {value} of the parent totalDigits.

It is an error if the owner's {base type definition} has a totalDigits facet among its {facets} and {value} is greater than the {value} of that totalDigits facet.

fractionDigits

fractionDigits controls the size of the minimum difference between values in the value space of datatypes derived from decimal, by restricting the value space to numbers that are expressible as i × 10^-n where i and n are integers and 0 <= n <= fractionDigits. fractionDigits places an upper limit on the arithmeticPrecision of decimal values: if the {value} of fractionDigits = f, then the value space is restricted to values equal to i / 10ⁿ for some integers i and n and 0 ≤ n ≤ f. The value of fractionDigits must be a nonNegativeInteger.

The term fractionDigits is chosen to reflect the fact that it restricts the value space to those values that can be represented lexically in decimal notation using at most fractionDigits digits to the right of the decimal point. Note that it does not restrict the lexical space directly; a lexical representation that adds non-significant leading or trailing zero digits is still permitted.

The following is the definition of a datatype which could be used to represent the magnitude of a person's body temperature on the Celsius scale. This definition would appear in a schema authored by an "end-user" and shows how to define a datatype by specifying facet values which constrain the range of the value space.

<simpleType name='celsiusBodyTemp'>
  <restriction base='decimal'>
    <totalDigits value='4'/>
    <fractionDigits value='1'/>
    <minInclusive value='36.4'/>
    <maxInclusive value='40.5'/>
  </restriction>
</simpleType>

The fractionDigits Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} must not specify a value for fractionDigits other than {value}.

XML Representation of fractionDigits Schema Components

The XML representation for a fractionDigits schema component is a <fractionDigits> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A value in a value space is facet-valid with respect to fractionDigits if and only if that value is equal to i / 10ⁿ for integers i and n, with 0 ≤ n ≤ {value}.

Constraints on fractionDigits Schema Components

fractionDigits less than or equal to totalDigits

It is an error for the {value} of fractionDigits to be greater than that of totalDigits.

fractionDigits valid restriction

It is an error if fractionDigits is among the members of {facets} of {base type definition} and {value} is greater than the {value} of that fractionDigits.

maxScale

maxScale places an upper limit on the arithmeticPrecision of precisionDecimal values: if the {value} of maxScale = m, then only values with arithmeticPrecision ≤ m are retained in the value space. As a consequence, every value in the value space will have numericalValue equal to i / 10ⁿ for some integers i and n, with n ≤ m. The {value} of maxScale must be an integer. If it is negative, the numeric values of the datatype are restricted to multiples of 10 (or 100, or …).

The term maxScale is chosen to reflect the fact that it restricts the value space to those values that can be represented lexically in scientific notation using an integer coefficient and a scale (or negative exponent) no greater than maxScale. (It has nothing to do with the use of the term scale to denote the radix or base of a notation.) Note that maxScale does not restrict the lexical space directly; a lexical representation that adds non-significant leading or trailing zero digits, or that uses a lower exponent with a non-integer coefficient, is still permitted.

The following is the definition of a user-defined datatype which could be used to represent a floating-point decimal datatype which allows seven decimal digits for the coefficient and exponents between −95 and 96. Note that the scale is −1 times the exponent.

<simpleType name='decimal32'>
  <restriction base='precisionDecimal'>
    <totalDigits value='7'/>
    <maxScale value='95'/>
    <minScale value='-96'/>
  </restriction>
</simpleType>

The maxScale Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} must not specify a value for maxScale other than {value}.

XML Representation of maxScale Schema Components

The XML representation for a maxScale schema component is a <maxScale> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A precisionDecimal value v is facet-valid with respect to maxScale if and only if one of the following is true:

v has arithmeticPrecision less than or equal to the {value} of maxScale.

The arithmeticPrecision of v is absent.

Constraints on maxScale Schema Components

maxScale valid restriction

It is an error if maxScale is among the members of {facets} of {base type definition} and {value} is greater than the {value} of that maxScale.

minScale

minScale places a lower limit on the arithmeticPrecision of precisionDecimal values. If the {value} of minScale is m, then the value space is restricted to values with arithmeticPrecision ≥ m. As a consequence, every value in the value space will have numericalValue equal to i / 10ⁿ for some integers i and n, with n ≥ m.

The term minScale is chosen to reflect the fact that it restricts the value space to those values that can be represented lexically in exponential form using an integer coefficient and a scale (negative exponent) at least as large as minScale. Note that it does not restrict the lexical space directly; a lexical representation that adds additional leading zero digits, or that uses a larger exponent (and a correspondingly smaller coefficient) is still permitted.

The following is the definition of a user-defined datatype which could be used to represent amounts in a decimal currency; it corresponds to a SQL column definition of DECIMAL(8,2). The effect is to allow values between -999,999.99 and 999,999.99, with a fixed interval of 0.01 between values.

<simpleType name='price'>
  <restriction base='precisionDecimal'>
    <totalDigits value='8'/>
    <minScale value='2'/>
    <maxScale value='2'/>
  </restriction>
</simpleType>

The minScale Schema Component

If {fixed} is true, then types for which the current type is the {base type definition} must not specify a value for minScale other than {value}.

XML Representation of minScale Schema Components

The XML representation for a minScale schema component is a <minScale> element information item. The correspondences between the properties of the information item and properties of the component are as follows:

A precisionDecimal value v is facet-valid with respect to minScale if and only if one of the following is true:

v has arithmeticPrecision greater than or equal to the {value} of minScale.

The arithmeticPrecision of v is absent.

Constraints on minScale Schema Components

minScale less than or equal to maxScale

It is an error for minScale to be greater than maxScale.

Note that it is not an error for minScale to be greater than totalDigits.

minScale valid restriction

It is an error if minScale is among the members of {facets} of {base type definition} and {value} is less than the {value} of that minScale.

Conformance

This specification describes two levels of conformance for datatype processors. The first is required of all processors. Support for the other will depend on the application environments for which the processor is intended.

Minimally conforming processors must completely and correctly implement the Constraints on Schemas and Validation Rules in this specification.

Processors which accept schemas in the form of XML documents as described in XML Representation of Simple Type Definition Schema Components (and other relevant portions of Datatype components) are additionally said to provide conformance to the XML Representation of Schemas, and must, when processing schema documents, completely and correctly implement all Schema Representation Constraints in this specification, and must adhere exactly to the specifications in XML Representation of Simple Type Definition Schema Components (and other relevant portions of Datatype components) for mapping the contents of such documents to schema components for use in validation.

By separating the conformance requirements relating to the concrete syntax of XML schema documents, this specification admits processors which validate using schemas stored in optimized binary representations, dynamically created schemas represented as programming language data structures, or implementations in which particular schemas are compiled into executable code such as C or Java. Such processors can be said to be minimally conforming but not necessarily in conformance to the XML Representation of Schemas.

Partial Implementation of Infinite Datatypes

Some datatypes defined in this specification have infinite value spaces; no finite implementation can completely handle all their possible values. For some such datatypes, minimum implementation limits are specified below. For other infinite types such as string, hexBinary, and base64Binary, no minimum implementation limits are specified.

When this specification is used in the context of other languages (as it is, for example, by ), the host language may specify other minimum implementation limits.

When presented with a literal or value exceeding the capacity of its partial implementation of a datatype, a minimally conforming implementation of this specification will sometimes be unable to determine with certainty whether the value is datatype-valid or not. Sometimes it will be unable to represent the value correctly through its interface to any downstream application.

When either of these is so, a conforming processor must indicate to the user and/or downstream application that it cannot process the input data with assured correctness (much as it would indicate if it ran out of memory). When the datatype validity of a value or literal is uncertain because it exceeds the capacity of a partial implementation, the literal or value must not be treated as invalid, and the unsupported value must not be quietly changed to a supported value.

This specification does not constrain the method used to indicate that a literal or value in the input data has exceeded the capacity of the implementation, or the form such indications take.

Minimally conforming processors which set an application- or implementation-defined limit on the size of the values supported must clearly document that limit.

These are the partial-implementation minimal conformance requirements:

All processors must support values whose absolute value is less than 10¹⁶ (i.e., those expressible with sixteen total digits).

All processors must support nonnegative values less than 10000 (i.e., those expressible with four digits).

All processors must support values to milliseconds (i.e. those expressible with three fraction digits).

All processors must support fractional-second values to milliseconds (i.e. those expressible with three fraction digits).

All processors must support values with from -2,000,000,000 to 2,000,000,000 months and from -2,000,000 to 2,000,000 seconds.

All processors must support all precisionDecimal values in the value space of the otherwise unconstrained derived datatype for which totalDigits is set to sixteen, maxScale to 369, and minScale to −398.

The conformance limits given in the text correspond to those of the decimal64 type defined in the current draft of IEEE 754R, which can be stored in a 64-bit field. The XML Schema Working Group recommends that implementors support limits corresponding to those of the decimal128 type. This entails supporting the values in the value space of the otherwise unconstrained precisionDecimal datatype for which totalDigits is set to 34, maxScale to 6176, and minScale to −6111.
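Expressed as schema definitions, the two sets of limits could be sketched as follows. The type names are hypothetical, and the sketch assumes that the three facets referred to are totalDigits, maxScale, and minScale, in that order:

```xml
<!-- Minimum limits stated above (corresponding to decimal64) -->
<simpleType name='decimal64-limits'>
  <restriction base='precisionDecimal'>
    <totalDigits value='16'/>
    <maxScale value='369'/>
    <minScale value='-398'/>
  </restriction>
</simpleType>

<!-- Recommended limits (corresponding to decimal128) -->
<simpleType name='decimal128-limits'>
  <restriction base='precisionDecimal'>
    <totalDigits value='34'/>
    <maxScale value='6176'/>
    <minScale value='-6111'/>
  </restriction>
</simpleType>
```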

The XML Schema Working Group requests feedback from implementors and users of XML Schema concerning the minimum and recommended implementation limits for precisionDecimal. If other limits, larger or smaller, would make this datatype more attractive to users or implementors, please let us know.

Schema for Schema Documents (Datatypes) (normative)

The XML representation of the datatypes-relevant part of the schema for schema documents is presented here as a normative part of the specification.

Like any other XML document, schema documents may carry XML and document type declarations. An XML declaration and a document type declaration are provided here for convenience. Since this schema document describes the XML Schema language, the targetNamespace attribute on the schema element refers to the XML Schema namespace itself.

Schema documents conforming to this specification may be in XML 1.0 or XML 1.1. Conforming implementations may accept input in XML 1.0 or XML 1.1 or both. See .

Schema for Schema Documents (Datatypes) <?xml version='1.0'?> <!DOCTYPE xs:schema PUBLIC "-//W3C//DTD XMLSCHEMA 200102//EN" "XMLSchema.dtd" [  <!ENTITY % schemaAttrs 'xmlns:hfp CDATA #IMPLIED'> <!ELEMENT hfp:hasFacet EMPTY> <!ATTLIST hfp:hasFacet name NMTOKEN #REQUIRED> <!ELEMENT hfp:hasProperty EMPTY> <!ATTLIST hfp:hasProperty name NMTOKEN #REQUIRED value CDATA #REQUIRED>  <!ATTLIST xs:simpleType id ID #IMPLIED> <!ATTLIST xs:maxExclusive id ID #IMPLIED> <!ATTLIST xs:minExclusive id ID #IMPLIED> <!ATTLIST xs:maxInclusive id ID #IMPLIED> <!ATTLIST xs:minInclusive id ID #IMPLIED> <!ATTLIST xs:totalDigits id ID #IMPLIED> <!ATTLIST xs:fractionDigits id ID #IMPLIED> <!ATTLIST xs:maxScale id ID #IMPLIED> <!ATTLIST xs:minScale id ID #IMPLIED> <!ATTLIST xs:length id ID #IMPLIED> <!ATTLIST xs:minLength id ID #IMPLIED> <!ATTLIST xs:maxLength id ID #IMPLIED> <!ATTLIST xs:enumeration id ID #IMPLIED> <!ATTLIST xs:pattern id ID #IMPLIED> <!ATTLIST xs:appinfo id ID #IMPLIED> <!ATTLIST xs:documentation id ID #IMPLIED> <!ATTLIST xs:list id ID #IMPLIED> <!ATTLIST xs:union id ID #IMPLIED> ]> <?xml version='1.0'?> <xs:schema xmlns:hfp="http://www.w3.org/2001/XMLSchema-hasFacetAndProperty" xmlns:xs="http://www.w3.org/2001/XMLSchema" blockDefault="#all" elementFormDefault="qualified" xml:lang="en" targetNamespace="http://www.w3.org/2001/XMLSchema" version="Id: datatypes.xsd,v 1.4 2004/05/29 10:26:33 ht Exp datatypes.xsd (wd-20060831)"> <xs:annotation> <xs:documentation source="../datatypes/datatypes.html"> The schema corresponding to this document is normative, with respect to the syntactic constraints it expresses in the XML Schema language. The documentation (within <documentation> elements) below, is not normative, but rather highlights important aspects of the W3C Recommendation of which this is a part </xs:documentation> </xs:annotation> <xs:annotation> <xs:documentation> First the built-in primitive datatypes. 
These definitions are for information only, the real built-in definitions are magic. </xs:documentation> <xs:documentation> For each built-in datatype in this schema (both primitive and derived) can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the datatype For example, to address the int datatype, the URI is: http://www.w3.org/2001/XMLSchema#int Additionally, each facet definition element can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the facet For example, to address the maxInclusive facet, the URI is: http://www.w3.org/2001/XMLSchema#maxInclusive Additionally, each facet usage in a built-in datatype definition can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the datatype, followed by a period (".") followed by the name of the facet For example, to address the usage of the maxInclusive facet in the definition of int, the URI is: http://www.w3.org/2001/XMLSchema#int.maxInclusive </xs:documentation> </xs:annotation> <xs:simpleType name="string" id="string"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#string"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace value="preserve" id="string.preserve"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="boolean" 
id="boolean"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#boolean"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="boolean.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="float" id="float"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#float"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="float.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="double" id="double"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#double"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" 
value="collapse" id="double.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="decimal" id="decimal"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="totalDigits"/> <hfp:hasFacet name="fractionDigits"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="total"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#decimal"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="decimal.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="duration" id="duration"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#duration"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="duration.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="dateTime" id="dateTime"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> 
<hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#dateTime"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="dateTime.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="time" id="time"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#time"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="time.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="date" id="date"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#date"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="date.whiteSpace"/> 
</xs:restriction> </xs:simpleType> <xs:simpleType name="gYearMonth" id="gYearMonth"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gYearMonth"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="gYearMonth.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gYear" id="gYear"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gYear"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="gYear.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gMonthDay" id="gMonthDay"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" 
value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gMonthDay"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="gMonthDay.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gDay" id="gDay"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gDay"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="gDay.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gMonth" id="gMonth"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gMonth"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="gMonth.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="hexBinary" id="hexBinary"> <xs:annotation> <xs:appinfo> 
<hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#binary"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="hexBinary.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="base64Binary" id="base64Binary"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#base64Binary"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="base64Binary.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="anyURI" id="anyURI"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#anyURI"/> </xs:annotation> <xs:restriction 
base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="anyURI.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="QName" id="QName"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#QName"/> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="QName.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="NOTATION" id="NOTATION"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NOTATION"/> <xs:documentation> NOTATION cannot be used directly in a schema; rather a type must be derived from it by specifying at least one enumeration facet whose value is the name of a NOTATION declared in the schema. 
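<!-- Illustrative sketch only; not part of the schema for schema documents.
     Assuming a schema with no target namespace, a usable type could be
     derived from NOTATION as shown here. The notation names "jpeg" and "png",
     their public identifiers, and the type name "imageFormat" are
     hypothetical.

     <xs:notation name="jpeg" public="image/jpeg"/>
     <xs:notation name="png" public="image/png"/>

     <xs:simpleType name="imageFormat">
       <xs:restriction base="xs:NOTATION">
         <xs:enumeration value="jpeg"/>
         <xs:enumeration value="png"/>
       </xs:restriction>
     </xs:simpleType>
-->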
</xs:documentation> </xs:annotation> <xs:restriction base="xs:anySimpleType"> <xs:whiteSpace fixed="true" value="collapse" id="NOTATION.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:annotation> <xs:documentation> Now the derived primitive types </xs:documentation> </xs:annotation> <xs:simpleType name="normalizedString" id="normalizedString"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#normalizedString"/> </xs:annotation> <xs:restriction base="xs:string"> <xs:whiteSpace value="replace" id="normalizedString.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="token" id="token"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#token"/> </xs:annotation> <xs:restriction base="xs:normalizedString"> <xs:whiteSpace value="collapse" id="token.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="language" id="language"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#language"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*" id="language.pattern"> <xs:annotation> <xs:documentation source="http://www.ietf.org/rfc/rfc3066.txt"> pattern specifies the content of section 2.12 of XML 1.0e2 and RFC 3066 (Revised version of RFC 1766). 
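<!-- Illustrative sketch only; not part of the schema for schema documents.
     Because a pattern facet must match the whole lexical form, the
     expression above can be read as: one subtag of 1 to 8 letters, followed
     by zero or more subtags of 1 to 8 letters or digits, each introduced by
     a single hyphen. The sample values are hypothetical and are judged
     against the pattern only, not against any language tag registry.

     matches:        en    en-US    zh-Hant-TW    x-klingon
     does not match: en_US (underscore)    en- (empty subtag)
                     123 (first subtag must be letters)
                     interlingua (first subtag longer than 8)
-->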
</xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="IDREFS" id="IDREFS"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#IDREFS"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:IDREF"/> </xs:simpleType> <xs:minLength value="1" id="IDREFS.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="ENTITIES" id="ENTITIES"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ENTITIES"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:ENTITY"/> </xs:simpleType> <xs:minLength value="1" id="ENTITIES.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="NMTOKEN" id="NMTOKEN"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NMTOKEN"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="\c+" id="NMTOKEN.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml#NT-Nmtoken"> pattern matches production 7 from the XML spec </xs:documentation> </xs:annotation> </xs:pattern> 
</xs:restriction> </xs:simpleType> <xs:simpleType name="NMTOKENS" id="NMTOKENS"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NMTOKENS"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:NMTOKEN"/> </xs:simpleType> <xs:minLength value="1" id="NMTOKENS.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="Name" id="Name"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#Name"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="\i\c*" id="Name.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml#NT-Name"> pattern matches production 5 from the XML spec </xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="NCName" id="NCName"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NCName"/> </xs:annotation> <xs:restriction base="xs:Name"> <xs:pattern value="[\i-[:]][\c-[:]]*" id="NCName.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml-names/#NT-NCName"> pattern matches production 4 from the Namespaces in XML spec </xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="ID" id="ID"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ID"/> </xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="IDREF" id="IDREF"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#IDREF"/> 
</xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="ENTITY" id="ENTITY"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ENTITY"/> </xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="integer" id="integer"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#integer"/> </xs:annotation> <xs:restriction base="xs:decimal"> <xs:fractionDigits fixed="true" value="0" id="integer.fractionDigits"/> <xs:pattern value="[\-+]?[0-9]+" id="integer.pattern"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="nonPositiveInteger" id="nonPositiveInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#nonPositiveInteger"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:maxInclusive value="0" id="nonPositiveInteger.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="negativeInteger" id="negativeInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#negativeInteger"/> </xs:annotation> <xs:restriction base="xs:nonPositiveInteger"> <xs:maxInclusive value="-1" id="negativeInteger.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="long" id="long"> <xs:annotation> <xs:appinfo> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#long"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:minInclusive value="-9223372036854775808" id="long.minInclusive"/> <xs:maxInclusive value="9223372036854775807" id="long.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="int" id="int"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#int"/> </xs:annotation> <xs:restriction base="xs:long"> <xs:minInclusive value="-2147483648" id="int.minInclusive"/> <xs:maxInclusive value="2147483647" 
id="int.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="short" id="short"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#short"/> </xs:annotation> <xs:restriction base="xs:int"> <xs:minInclusive value="-32768" id="short.minInclusive"/> <xs:maxInclusive value="32767" id="short.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="byte" id="byte"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#byte"/> </xs:annotation> <xs:restriction base="xs:short"> <xs:minInclusive value="-128" id="byte.minInclusive"/> <xs:maxInclusive value="127" id="byte.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="nonNegativeInteger" id="nonNegativeInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#nonNegativeInteger"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:minInclusive value="0" id="nonNegativeInteger.minInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedLong" id="unsignedLong"> <xs:annotation> <xs:appinfo> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedLong"/> </xs:annotation> <xs:restriction base="xs:nonNegativeInteger"> <xs:maxInclusive value="18446744073709551615" id="unsignedLong.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedInt" id="unsignedInt"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedInt"/> </xs:annotation> <xs:restriction base="xs:unsignedLong"> <xs:maxInclusive value="4294967295" id="unsignedInt.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedShort" id="unsignedShort"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedShort"/> </xs:annotation> <xs:restriction base="xs:unsignedInt"> <xs:maxInclusive 
value="65535" id="unsignedShort.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedByte" id="unsignedByte"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedByte"/> </xs:annotation> <xs:restriction base="xs:unsignedShort"> <xs:maxInclusive value="255" id="unsignedByte.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="positiveInteger" id="positiveInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#positiveInteger"/> </xs:annotation> <xs:restriction base="xs:nonNegativeInteger"> <xs:minInclusive value="1" id="positiveInteger.minInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="derivationControl"> <xs:annotation> <xs:documentation> A utility type, not for public use</xs:documentation> </xs:annotation> <xs:restriction base="xs:NMTOKEN"> <xs:enumeration value="substitution"/> <xs:enumeration value="extension"/> <xs:enumeration value="restriction"/> <xs:enumeration value="list"/> <xs:enumeration value="union"/> </xs:restriction> </xs:simpleType> <xs:group name="simpleDerivation"> <xs:choice> <xs:element ref="xs:restriction"/> <xs:element ref="xs:list"/> <xs:element ref="xs:union"/> </xs:choice> </xs:group> <xs:simpleType name="simpleDerivationSet"> <xs:annotation> <xs:documentation> #all or (possibly empty) subset of {restriction, extension, union, list} </xs:documentation> <xs:documentation> A utility type, not for public use</xs:documentation> </xs:annotation> <xs:union> <xs:simpleType> <xs:restriction base="xs:token"> <xs:enumeration value="#all"/> </xs:restriction> </xs:simpleType> <xs:simpleType> <xs:list> <xs:simpleType> <xs:restriction base="xs:derivationControl"> <xs:enumeration value="list"/> <xs:enumeration value="union"/> <xs:enumeration value="restriction"/> <xs:enumeration value="extension"/> </xs:restriction> </xs:simpleType> </xs:list> </xs:simpleType> </xs:union> </xs:simpleType> <xs:complexType name="simpleType" 
abstract="true"> <xs:complexContent> <xs:extension base="xs:annotated"> <xs:group ref="xs:simpleDerivation"/> <xs:attribute name="final" type="xs:simpleDerivationSet"/> <xs:attribute name="name" type="xs:NCName"> <xs:annotation> <xs:documentation> Can be restricted to required or forbidden </xs:documentation> </xs:annotation> </xs:attribute> </xs:extension> </xs:complexContent> </xs:complexType> <xs:complexType name="topLevelSimpleType"> <xs:complexContent> <xs:restriction base="xs:simpleType"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> <xs:group ref="xs:simpleDerivation"/> </xs:sequence> <xs:attribute name="name" type="xs:NCName" use="required"> <xs:annotation> <xs:documentation> Required at the top level </xs:documentation> </xs:annotation> </xs:attribute> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> <xs:complexType name="localSimpleType"> <xs:complexContent> <xs:restriction base="xs:simpleType"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> <xs:group ref="xs:simpleDerivation"/> </xs:sequence> <xs:attribute name="name" use="prohibited"> <xs:annotation> <xs:documentation> Forbidden when nested </xs:documentation> </xs:annotation> </xs:attribute> <xs:attribute name="final" use="prohibited"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> <xs:element name="simpleType" type="xs:topLevelSimpleType" id="simpleType"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-simpleType"/> </xs:annotation> </xs:element> <xs:group name="facets"> <xs:annotation> <xs:documentation> We should use a substitution group for facets, but that's ruled out because it would allow users to add their own, which we're not ready for yet. 
</xs:documentation> </xs:annotation> <xs:choice> <xs:element ref="xs:minExclusive"/> <xs:element ref="xs:minInclusive"/> <xs:element ref="xs:maxExclusive"/> <xs:element ref="xs:maxInclusive"/> <xs:element ref="xs:totalDigits"/> <xs:element ref="xs:fractionDigits"/> <xs:element ref="xs:maxScale"/> <xs:element ref="xs:minScale"/> <xs:element ref="xs:length"/> <xs:element ref="xs:minLength"/> <xs:element ref="xs:maxLength"/> <xs:element ref="xs:enumeration"/> <xs:element ref="xs:whiteSpace"/> <xs:element ref="xs:pattern"/> </xs:choice> </xs:group> <xs:group name="simpleRestrictionModel"> <xs:sequence> <xs:element name="simpleType" type="xs:localSimpleType" minOccurs="0"/> <xs:group ref="xs:facets" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> </xs:group> <xs:element name="restriction" id="restriction"> <xs:complexType> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-restriction"> base attribute and simpleType child are mutually exclusive, but one or other is required </xs:documentation> </xs:annotation> <xs:complexContent> <xs:extension base="xs:annotated"> <xs:group ref="xs:simpleRestrictionModel"/> <xs:attribute name="base" type="xs:QName" use="optional"/> </xs:extension> </xs:complexContent> </xs:complexType> </xs:element> <xs:element name="list" id="list"> <xs:complexType> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-list"> itemType attribute and simpleType child are mutually exclusive, but one or other is required </xs:documentation> </xs:annotation> <xs:complexContent> <xs:extension base="xs:annotated"> <xs:sequence> <xs:element name="simpleType" type="xs:localSimpleType" minOccurs="0"/> </xs:sequence> <xs:attribute name="itemType" type="xs:QName" use="optional"/> </xs:extension> </xs:complexContent> </xs:complexType> </xs:element> <xs:element name="union" id="union"> <xs:complexType> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-union"> 
memberTypes attribute must be non-empty or there must be at least one simpleType child </xs:documentation> </xs:annotation> <xs:complexContent> <xs:extension base="xs:annotated"> <xs:sequence> <xs:element name="simpleType" type="xs:localSimpleType" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="memberTypes" use="optional"> <xs:simpleType> <xs:list itemType="xs:QName"/> </xs:simpleType> </xs:attribute> </xs:extension> </xs:complexContent> </xs:complexType> </xs:element> <xs:complexType name="facet"> <xs:complexContent> <xs:extension base="xs:annotated"> <xs:attribute name="value" use="required"/> <xs:attribute name="fixed" type="xs:boolean" default="false" use="optional"/> </xs:extension> </xs:complexContent> </xs:complexType> <xs:complexType name="noFixedFacet"> <xs:complexContent> <xs:restriction base="xs:facet"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="fixed" use="prohibited"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> <xs:element name="minExclusive" type="xs:facet" id="minExclusive"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-minExclusive"/> </xs:annotation> </xs:element> <xs:element name="minInclusive" type="xs:facet" id="minInclusive"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-minInclusive"/> </xs:annotation> </xs:element> <xs:element name="maxExclusive" type="xs:facet" id="maxExclusive"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-maxExclusive"/> </xs:annotation> </xs:element> <xs:element name="maxInclusive" type="xs:facet" id="maxInclusive"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-maxInclusive"/> </xs:annotation> </xs:element> <xs:complexType name="numFacet"> <xs:complexContent> <xs:restriction base="xs:facet"> <xs:sequence> <xs:element 
ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="value" type="xs:nonNegativeInteger" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> <xs:complexType name="intFacet"> <xs:complexContent> <xs:restriction base="xs:facet"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="value" type="xs:integer" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> <xs:element name="totalDigits" id="totalDigits"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-totalDigits"/> </xs:annotation> <xs:complexType> <xs:complexContent> <xs:restriction base="xs:numFacet"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="value" type="xs:positiveInteger" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> </xs:element> <xs:element name="fractionDigits" type="xs:numFacet" id="fractionDigits"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-fractionDigits"/> </xs:annotation> </xs:element> <xs:element name="maxScale" type="xs:intFacet" id="maxScale"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-maxScale"/> </xs:annotation> </xs:element> <xs:element name="minScale" type="xs:intFacet" id="minScale"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-minScale"/> </xs:annotation> </xs:element> <xs:element name="length" type="xs:numFacet" id="length"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-length"/> </xs:annotation> </xs:element> <xs:element name="minLength" type="xs:numFacet" id="minLength"> <xs:annotation> <xs:documentation 
source="http://www.w3.org/TR/xmlschema-2/#element-minLength"/> </xs:annotation> </xs:element> <xs:element name="maxLength" type="xs:numFacet" id="maxLength"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-maxLength"/> </xs:annotation> </xs:element> <xs:element name="enumeration" type="xs:noFixedFacet" id="enumeration"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-enumeration"/> </xs:annotation> </xs:element> <xs:element name="whiteSpace" id="whiteSpace"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-whiteSpace"/> </xs:annotation> <xs:complexType> <xs:complexContent> <xs:restriction base="xs:facet"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="value" use="required"> <xs:simpleType> <xs:restriction base="xs:NMTOKEN"> <xs:enumeration value="preserve"/> <xs:enumeration value="replace"/> <xs:enumeration value="collapse"/> </xs:restriction> </xs:simpleType> </xs:attribute> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> </xs:element> <xs:element name="pattern" id="pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#element-pattern"/> </xs:annotation> <xs:complexType> <xs:complexContent> <xs:restriction base="xs:noFixedFacet"> <xs:sequence> <xs:element ref="xs:annotation" minOccurs="0"/> </xs:sequence> <xs:attribute name="value" type="xs:string" use="required"/> <xs:anyAttribute namespace="##other" processContents="lax"/> </xs:restriction> </xs:complexContent> </xs:complexType> </xs:element> </xs:schema>

DTD for Datatype Definitions (non-normative)

The DTD for the datatypes-specific aspects of schema documents is given below. Note that there is no implication here that schema must be the root element of a document.

DTD for datatype definitions    <!ENTITY % simpleType "%p;simpleType"> <!ENTITY % restriction "%p;restriction"> <!ENTITY % list "%p;list"> <!ENTITY % union "%p;union"> <!ENTITY % maxExclusive "%p;maxExclusive"> <!ENTITY % minExclusive "%p;minExclusive"> <!ENTITY % maxInclusive "%p;maxInclusive"> <!ENTITY % minInclusive "%p;minInclusive"> <!ENTITY % totalDigits "%p;totalDigits"> <!ENTITY % fractionDigits "%p;fractionDigits"> <!ENTITY % maxScale "%p;maxScale"> <!ENTITY % minScale "%p;minScale"> <!ENTITY % length "%p;length"> <!ENTITY % minLength "%p;minLength"> <!ENTITY % maxLength "%p;maxLength"> <!ENTITY % enumeration "%p;enumeration"> <!ENTITY % whiteSpace "%p;whiteSpace"> <!ENTITY % pattern "%p;pattern">  <!ENTITY % simpleTypeAttrs ""> <!ENTITY % restrictionAttrs ""> <!ENTITY % listAttrs ""> <!ENTITY % unionAttrs ""> <!ENTITY % maxExclusiveAttrs ""> <!ENTITY % minExclusiveAttrs ""> <!ENTITY % maxInclusiveAttrs ""> <!ENTITY % minInclusiveAttrs ""> <!ENTITY % totalDigitsAttrs ""> <!ENTITY % fractionDigitsAttrs ""> <!ENTITY % lengthAttrs ""> <!ENTITY % minLengthAttrs ""> <!ENTITY % maxLengthAttrs ""> <!ENTITY % maxScaleAttrs ""> <!ENTITY % minScaleAttrs ""> <!ENTITY % enumerationAttrs ""> <!ENTITY % whiteSpaceAttrs ""> <!ENTITY % patternAttrs "">  <!ENTITY % URIref "CDATA"> <!ENTITY % XPathExpr "CDATA"> <!ENTITY % QName "NMTOKEN"> <!ENTITY % QNames "NMTOKENS"> <!ENTITY % NCName "NMTOKEN"> <!ENTITY % nonNegativeInteger "NMTOKEN"> <!ENTITY % boolean "(true|false)"> <!ENTITY % simpleDerivationSet "CDATA">   <!ENTITY % minBound "(%minInclusive; | %minExclusive;)"> <!ENTITY % maxBound "(%maxInclusive; | %maxExclusive;)"> <!ENTITY % bounds "%minBound; | %maxBound;"> <!ENTITY % numeric "%totalDigits; | %fractionDigits; | %minScale; | %maxScale;"> <!ENTITY % ordered "%bounds; | %numeric;"> <!ENTITY % unordered "%pattern; | %enumeration; | %whiteSpace; | %length; | %maxLength; | %minLength;"> <!ENTITY % facet "%ordered; | %unordered;"> <!ENTITY % facetAttr "value CDATA 
#REQUIRED id ID #IMPLIED"> <!ENTITY % fixedAttr "fixed %boolean; #IMPLIED"> <!ENTITY % facetModel "(%annotation;)?"> <!ELEMENT %simpleType; ((%annotation;)?, (%restriction; | %list; | %union;))> <!ATTLIST %simpleType; name %NCName; #IMPLIED final %simpleDerivationSet; #IMPLIED id ID #IMPLIED %simpleTypeAttrs;>  <!ELEMENT %restriction; ((%annotation;)?, (%restriction1; | ((%simpleType;)?,(%facet;)*)), (%attrDecls;))> <!ATTLIST %restriction; base %QName; #IMPLIED id ID #IMPLIED %restrictionAttrs;>  <!ELEMENT %list; ((%annotation;)?,(%simpleType;)?)> <!ATTLIST %list; itemType %QName; #IMPLIED id ID #IMPLIED %listAttrs;>  <!ELEMENT %union; ((%annotation;)?,(%simpleType;)*)> <!ATTLIST %union; id ID #IMPLIED memberTypes %QNames; #IMPLIED %unionAttrs;>  <!ELEMENT %maxExclusive; %facetModel;> <!ATTLIST %maxExclusive; %facetAttr; %fixedAttr; %maxExclusiveAttrs;> <!ELEMENT %minExclusive; %facetModel;> <!ATTLIST %minExclusive; %facetAttr; %fixedAttr; %minExclusiveAttrs;> <!ELEMENT %maxInclusive; %facetModel;> <!ATTLIST %maxInclusive; %facetAttr; %fixedAttr; %maxInclusiveAttrs;> <!ELEMENT %minInclusive; %facetModel;> <!ATTLIST %minInclusive; %facetAttr; %fixedAttr; %minInclusiveAttrs;> <!ELEMENT %totalDigits; %facetModel;> <!ATTLIST %totalDigits; %facetAttr; %fixedAttr; %totalDigitsAttrs;> <!ELEMENT %fractionDigits; %facetModel;> <!ATTLIST %fractionDigits; %facetAttr; %fixedAttr; %fractionDigitsAttrs;> <!ELEMENT %maxScale; %facetModel;> <!ATTLIST %maxScale; %facetAttr; %fixedAttr; %maxScaleAttrs;> <!ELEMENT %minScale; %facetModel;> <!ATTLIST %minScale; %facetAttr; %fixedAttr; %minScaleAttrs;> <!ELEMENT %length; %facetModel;> <!ATTLIST %length; %facetAttr; %fixedAttr; %lengthAttrs;> <!ELEMENT %minLength; %facetModel;> <!ATTLIST %minLength; %facetAttr; %fixedAttr; %minLengthAttrs;> <!ELEMENT %maxLength; %facetModel;> <!ATTLIST %maxLength; %facetAttr; %fixedAttr; %maxLengthAttrs;>  <!ELEMENT %enumeration; %facetModel;> <!ATTLIST %enumeration; %facetAttr; %enumerationAttrs;> 
<!ELEMENT %whiteSpace; %facetModel;> <!ATTLIST %whiteSpace; %facetAttr; %fixedAttr; %whiteSpaceAttrs;>  <!ELEMENT %pattern; %facetModel;> <!ATTLIST %pattern; %facetAttr; %patternAttrs;> Illustrative XML representations for the built-in simple type definitions Illustrative XML representations for the built-in primitive type definitions

The following, although in the form of a schema document, does not conform to the rules for schema documents defined in this specification. It contains explicit XML representations of the primitive datatypes, which need not be declared in a schema document, since they are automatically included in every schema, and indeed must not be declared in a schema document, since it is forbidden to try to derive types with anyAtomicType as the base type definition. It is included here as a form of documentation.

The (not a) schema document for primitive built-in type definitions <?xml version='1.0'?> <!DOCTYPE xs:schema SYSTEM "../namespace/XMLSchema.dtd" [  <!ENTITY % schemaAttrs 'xmlns:hfp CDATA #IMPLIED'> <!ELEMENT hfp:hasFacet EMPTY> <!ATTLIST hfp:hasFacet name NMTOKEN #REQUIRED> <!ELEMENT hfp:hasProperty EMPTY> <!ATTLIST hfp:hasProperty name NMTOKEN #REQUIRED value CDATA #REQUIRED> ]> <xs:schema xmlns:hfp="http://www.w3.org/2001/XMLSchema-hasFacetAndProperty" xmlns:xs="http://www.w3.org/2001/XMLSchema" blockDefault="#all" elementFormDefault="qualified" xml:lang="en" targetNamespace="http://www.w3.org/2001/XMLSchema"> <xs:annotation> <xs:documentation> This document contains XML elements which look like definitions for the primitive datatypes. These definitions are for information only; the real built-in definitions are magic. </xs:documentation> <xs:documentation> Each built-in datatype in this schema (both primitive and derived) can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the datatype For example, to address the int datatype, the URI is: http://www.w3.org/2001/XMLSchema#int Additionally, each facet definition element can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the facet For example, to address the maxInclusive facet, the URI is: http://www.w3.org/2001/XMLSchema#maxInclusive Additionally, each facet usage in a built-in datatype definition can be uniquely addressed via a URI constructed as follows: 1) the base URI is the URI of the XML Schema namespace 2) the fragment identifier is the name of the datatype, followed by a period (".") followed by the name of the facet For example, to address the usage of the maxInclusive facet in the definition of int, the URI is: 
http://www.w3.org/2001/XMLSchema#int.maxInclusive </xs:documentation> </xs:annotation> <xs:simpleType name="string" id="string"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#string"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace value="preserve" id="string.preserve"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="boolean" id="boolean"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#boolean"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="boolean.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="float" id="float"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation 
source="http://www.w3.org/TR/xmlschema-2/#float"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="float.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="double" id="double"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#double"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="double.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="decimal" id="decimal"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="totalDigits"/> <hfp:hasFacet name="fractionDigits"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="total"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#decimal"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="decimal.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="precisionDecimal" id="precisionDecimal"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="totalDigits"/> 
<hfp:hasFacet name="maxScale"/> <hfp:hasFacet name="minScale"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="true"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#precisionDecimal"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="precisionDecimal.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="duration" id="duration"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#duration"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="duration.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="dateTime" id="dateTime"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> 
<hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#dateTime"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="dateTime.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="time" id="time"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#time"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="time.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="date" id="date"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#date"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="date.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gYearMonth" id="gYearMonth"> 
<xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gYearMonth"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="gYearMonth.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gYear" id="gYear"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gYear"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="gYear.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gMonthDay" id="gMonthDay"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" 
value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gMonthDay"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="gMonthDay.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gDay" id="gDay"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gDay"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="gDay.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="gMonth" id="gMonth"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="maxInclusive"/> <hfp:hasFacet name="maxExclusive"/> <hfp:hasFacet name="minInclusive"/> <hfp:hasFacet name="minExclusive"/> <hfp:hasProperty name="ordered" value="partial"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#gMonth"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="gMonth.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="hexBinary" id="hexBinary"> <xs:annotation> <xs:appinfo> 
<hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#binary"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="hexBinary.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="base64Binary" id="base64Binary"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#base64Binary"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="base64Binary.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="anyURI" id="anyURI"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#anyURI"/> 
</xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="anyURI.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="QName" id="QName"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#QName"/> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="QName.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="NOTATION" id="NOTATION"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="pattern"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NOTATION"/> <xs:documentation> NOTATION cannot be used directly in a schema; rather a type must be derived from it by specifying at least one enumeration facet whose value is the name of a NOTATION declared in the schema. </xs:documentation> </xs:annotation> <xs:restriction base="xs:anyAtomicType"> <xs:whiteSpace fixed="true" value="collapse" id="NOTATION.whiteSpace"/> </xs:restriction> </xs:simpleType> </xs:schema> Illustrative XML representations for the built-in ordinary type definitions

The following, although in the form of a schema document, contains XML representations of components already present in all schemas by definition. It is included here as a form of documentation.

These datatypes do not need to be declared in a schema document, since they are automatically included in every schema.

It is an open question whether this and similar XML documents should be accepted or rejected by software conforming to this specification. The XML Schema Working Group expects to resolve this question in connection with its work on issues relating to schema composition.

In the meantime, some existing schema processors will accept declarations for these datatypes; other existing processors will reject such declarations as duplicates.

Illustrative schema document for derived built-in type definitions <?xml version='1.0'?> <!DOCTYPE xs:schema SYSTEM "../namespace/XMLSchema.dtd" [  <!ENTITY % schemaAttrs 'xmlns:hfp CDATA #IMPLIED'> <!ELEMENT hfp:hasFacet EMPTY> <!ATTLIST hfp:hasFacet name NMTOKEN #REQUIRED> <!ELEMENT hfp:hasProperty EMPTY> <!ATTLIST hfp:hasProperty name NMTOKEN #REQUIRED value CDATA #REQUIRED> ]> <xs:schema xmlns:hfp="http://www.w3.org/2001/XMLSchema-hasFacetAndProperty" xmlns:xs="http://www.w3.org/2001/XMLSchema" blockDefault="#all" elementFormDefault="qualified" xml:lang="en" targetNamespace="http://www.w3.org/2001/XMLSchema"> <xs:annotation> <xs:documentation> This document contains XML representations for the ordinary non-primitive built-in datatypes. </xs:documentation> </xs:annotation> <xs:simpleType name="normalizedString" id="normalizedString"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#normalizedString"/> </xs:annotation> <xs:restriction base="xs:string"> <xs:whiteSpace value="replace" id="normalizedString.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="token" id="token"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#token"/> </xs:annotation> <xs:restriction base="xs:normalizedString"> <xs:whiteSpace value="collapse" id="token.whiteSpace"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="language" id="language"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#language"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*" id="language.pattern"> <xs:annotation> <xs:documentation source="http://www.ietf.org/rfc/rfc3066.txt"> pattern specifies the content of section 2.12 of XML 1.0e2 and RFC 3066 (Revised version of RFC 1766). 
</xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="IDREFS" id="IDREFS"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#IDREFS"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:IDREF"/> </xs:simpleType> <xs:minLength value="1" id="IDREFS.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="ENTITIES" id="ENTITIES"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ENTITIES"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:ENTITY"/> </xs:simpleType> <xs:minLength value="1" id="ENTITIES.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="NMTOKEN" id="NMTOKEN"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NMTOKEN"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="\c+" id="NMTOKEN.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml#NT-Nmtoken"> pattern matches production 7 from the XML spec </xs:documentation> </xs:annotation> </xs:pattern> 
</xs:restriction> </xs:simpleType> <xs:simpleType name="NMTOKENS" id="NMTOKENS"> <xs:annotation> <xs:appinfo> <hfp:hasFacet name="length"/> <hfp:hasFacet name="minLength"/> <hfp:hasFacet name="maxLength"/> <hfp:hasFacet name="enumeration"/> <hfp:hasFacet name="whiteSpace"/> <hfp:hasFacet name="pattern"/> <hfp:hasProperty name="ordered" value="false"/> <hfp:hasProperty name="bounded" value="false"/> <hfp:hasProperty name="cardinality" value="countably infinite"/> <hfp:hasProperty name="numeric" value="false"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NMTOKENS"/> </xs:annotation> <xs:restriction> <xs:simpleType> <xs:list itemType="xs:NMTOKEN"/> </xs:simpleType> <xs:minLength value="1" id="NMTOKENS.minLength"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="Name" id="Name"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#Name"/> </xs:annotation> <xs:restriction base="xs:token"> <xs:pattern value="\i\c*" id="Name.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml#NT-Name"> pattern matches production 5 from the XML spec </xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="NCName" id="NCName"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#NCName"/> </xs:annotation> <xs:restriction base="xs:Name"> <xs:pattern value="[\i-[:]][\c-[:]]*" id="NCName.pattern"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/REC-xml-names/#NT-NCName"> pattern matches production 4 from the Namespaces in XML spec </xs:documentation> </xs:annotation> </xs:pattern> </xs:restriction> </xs:simpleType> <xs:simpleType name="ID" id="ID"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ID"/> </xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="IDREF" id="IDREF"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#IDREF"/> 
</xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="ENTITY" id="ENTITY"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#ENTITY"/> </xs:annotation> <xs:restriction base="xs:NCName"/> </xs:simpleType> <xs:simpleType name="integer" id="integer"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#integer"/> </xs:annotation> <xs:restriction base="xs:decimal"> <xs:fractionDigits fixed="true" value="0" id="integer.fractionDigits"/> <xs:pattern value="[\-+]?[0-9]+" id="integer.pattern"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="nonPositiveInteger" id="nonPositiveInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#nonPositiveInteger"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:maxInclusive value="0" id="nonPositiveInteger.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="negativeInteger" id="negativeInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#negativeInteger"/> </xs:annotation> <xs:restriction base="xs:nonPositiveInteger"> <xs:maxInclusive value="-1" id="negativeInteger.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="long" id="long"> <xs:annotation> <xs:appinfo> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#long"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:minInclusive value="-9223372036854775808" id="long.minInclusive"/> <xs:maxInclusive value="9223372036854775807" id="long.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="int" id="int"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#int"/> </xs:annotation> <xs:restriction base="xs:long"> <xs:minInclusive value="-2147483648" id="int.minInclusive"/> <xs:maxInclusive value="2147483647" 
id="int.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="short" id="short"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#short"/> </xs:annotation> <xs:restriction base="xs:int"> <xs:minInclusive value="-32768" id="short.minInclusive"/> <xs:maxInclusive value="32767" id="short.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="byte" id="byte"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#byte"/> </xs:annotation> <xs:restriction base="xs:short"> <xs:minInclusive value="-128" id="byte.minInclusive"/> <xs:maxInclusive value="127" id="byte.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="nonNegativeInteger" id="nonNegativeInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#nonNegativeInteger"/> </xs:annotation> <xs:restriction base="xs:integer"> <xs:minInclusive value="0" id="nonNegativeInteger.minInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedLong" id="unsignedLong"> <xs:annotation> <xs:appinfo> <hfp:hasProperty name="bounded" value="true"/> <hfp:hasProperty name="cardinality" value="finite"/> </xs:appinfo> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedLong"/> </xs:annotation> <xs:restriction base="xs:nonNegativeInteger"> <xs:maxInclusive value="18446744073709551615" id="unsignedLong.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedInt" id="unsignedInt"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedInt"/> </xs:annotation> <xs:restriction base="xs:unsignedLong"> <xs:maxInclusive value="4294967295" id="unsignedInt.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedShort" id="unsignedShort"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedShort"/> </xs:annotation> <xs:restriction base="xs:unsignedInt"> <xs:maxInclusive 
value="65535" id="unsignedShort.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="unsignedByte" id="unsignedByte"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#unsignedByte"/> </xs:annotation> <xs:restriction base="xs:unsignedShort"> <xs:maxInclusive value="255" id="unsignedByte.maxInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="positiveInteger" id="positiveInteger"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#positiveInteger"/> </xs:annotation> <xs:restriction base="xs:nonNegativeInteger"> <xs:minInclusive value="1" id="positiveInteger.minInclusive"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="yearMonthDuration"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#yearMonthDuration"> This type includes just those durations expressed in years and months. Since the pattern given excludes days, hours, minutes, and seconds, the values of this type have a seconds property of zero. They are totally ordered. </xs:documentation> </xs:annotation> <xs:restriction base="xs:duration"> <xs:pattern id="yearMonthDuration.pattern" value="[^DT]*"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="dayTimeDuration"> <xs:annotation> <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#dayTimeDuration"> This type includes just those durations expressed in days, hours, minutes, and seconds. The pattern given excludes years and months, so the values of this type have a months property of zero. They are totally ordered. </xs:documentation> </xs:annotation> <xs:restriction base="xs:duration"> <xs:pattern id="dayTimeDuration.pattern" value="[^YM]*(T.*)?"/> </xs:restriction> </xs:simpleType> </xs:schema>

Built-up Value Spaces

Some datatypes, such as integer, describe well-known mathematically abstract systems. Others, such as the date/time datatypes, describe real-life, applied systems. Certain of the systems described by datatypes, both abstract and applied, have values in their value spaces most easily described as things having several properties, which in turn have values which are in some sense primitive or are from the value spaces of simpler datatypes.

In this document, the arguments to functions are assumed to be call by value unless explicitly noted to the contrary, meaning that if the argument is modified during the processing of the algorithm, that modification is not reflected in the outside world. On the other hand, the arguments to procedures are assumed to be call by location, meaning that modifications are so reflected, since that is the only way the processing of the algorithm can have any effect.

Properties always have values. An optional property is permitted but not required to have the special value absent.

The more primitive values that are used herein (among other things) to construct object value spaces, but that are not explicitly defined, are described here:

A number (without precision) is an ordinary mathematical number; 1, 1.0, and 1.000000000000 are the same number. The decimal numbers and integers generally used in the algorithms of the Function Definitions appendix are such ordinary numbers, not carrying precision.

An enumerated constant is an undefined thing whose only property is that it is unequal to any other constants and to any member of any defined datatype.

(There are a few constants which are specified by name to be members of the value space of more than one primitive datatype. Such constants are differentiated by their name and associated datatype; this is because members of the value space of distinct primitive datatypes are always distinct. Apart from that, constants are differentiated one from the other by their name. They have no other inherent properties; their effect is defined in the context in which they occur. Examples of constants are positiveInfinity and absent.)

Numerical Values

The following standard operators are defined here in case the reader is unsure of their definition:

If m and n are numbers, then m div n is the greatest integer less than or equal to m / n .

If m and n are numbers, then m mod n is m − n × (m div n).

n div 1 is a convenient and short way of expressing the greatest integer less than or equal to n.
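The two operators can be sketched in Python as follows. This is an illustration only: the function names `div` and `mod` and the use of `fractions.Fraction` for exact arithmetic are assumptions of the sketch, not part of this specification.

```python
from fractions import Fraction
import math


def div(m, n):
    """Greatest integer less than or equal to m / n (floor, not truncation)."""
    return math.floor(Fraction(m) / Fraction(n))


def mod(m, n):
    """m - n * (m div n); for positive n the result lies in [0, n)."""
    return Fraction(m) - Fraction(n) * div(m, n)


# n div 1 and n mod 1 split a number into integer and fractional parts.
whole = div(Fraction("123.4567"), 1)
fraction = mod(Fraction("123.4567"), 1)
```

Because div rounds toward negative infinity, `div(-7, 2)` is −4 and `mod(-7, 2)` is 1, which differs from operators that truncate toward zero.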

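Exact Lexical Mappings

Numerals and Fragments Thereof

digit ::= [0-9]

unsignedNoDecimalPtNumeral ::= digit+

noDecimalPtNumeral ::= ('+' | '-')? unsignedNoDecimalPtNumeral

fracFrag ::= digit+

unsignedDecimalPtNumeral ::= (unsignedNoDecimalPtNumeral '.' fracFrag?) | ('.' fracFrag)

unsignedFullDecimalPtNumeral ::= unsignedNoDecimalPtNumeral '.' fracFrag

decimalPtNumeral ::= ('+' | '-')? unsignedDecimalPtNumeral

unsignedScientificNotationNumeral ::= (unsignedNoDecimalPtNumeral | unsignedDecimalPtNumeral) ('e' | 'E') noDecimalPtNumeral

scientificNotationNumeral ::= ('+' | '-')? unsignedScientificNotationNumeral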
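As an informal aid, the numeral productions can be transcribed into Python regular expressions. The sketch assumes the productions read: digit is [0-9]; a noDecimalPtNumeral is an optional sign and one or more digits; an unsignedDecimalPtNumeral is either digits, a point, and optional fraction digits, or a point followed by fraction digits; and a scientific numeral is a coefficient, 'e' or 'E', and a noDecimalPtNumeral. The constant names and the `matches` helper are hypothetical.

```python
import re

DIGIT = r"[0-9]"
UNSIGNED_NO_DECIMAL_PT = rf"{DIGIT}+"
NO_DECIMAL_PT = rf"[+-]?{UNSIGNED_NO_DECIMAL_PT}"
FRAC_FRAG = rf"{DIGIT}+"
UNSIGNED_DECIMAL_PT = (
    rf"(?:{UNSIGNED_NO_DECIMAL_PT}\.(?:{FRAC_FRAG})?|\.{FRAC_FRAG})"
)
DECIMAL_PT = rf"[+-]?{UNSIGNED_DECIMAL_PT}"
UNSIGNED_SCIENTIFIC = (
    rf"(?:{UNSIGNED_NO_DECIMAL_PT}|{UNSIGNED_DECIMAL_PT})[eE]{NO_DECIMAL_PT}"
)
SCIENTIFIC = rf"[+-]?{UNSIGNED_SCIENTIFIC}"


def matches(production: str, literal: str) -> bool:
    """True when the whole literal matches the given production."""
    return re.fullmatch(production, literal) is not None
```

Under these assumptions `12.` and `.5` are both decimal-point numerals, while a bare `.` and a numeral without a point are not.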

Some numerical datatypes include some or all of three non-numerical values: positiveInfinity, negativeInfinity, and notANumber. Their lexical spaces include non-numeral lexical representations for these non-numeric values:

Special Non-numerical Lexical Representations Used With Numerical Datatypes

minimalNumericalSpecialRep ::= 'INF' | '-INF' | 'NaN'

numericalSpecialRep ::= '+INF' | minimalNumericalSpecialRep

Date/time Values

There are several different primitive but related datatypes defined in the specification which pertain to various combinations of dates and times, and parts thereof. They all use related value-space models, which are described in detail in this section. It is not difficult for a casual reader of the descriptions of the individual datatypes elsewhere in this specification to misunderstand some of the details of just what the datatypes are intended to represent, so more detail is presented here in this section.

All of the value spaces for dates and times described here represent moments or periods of time in Universal Coordinated Time (UTC). Universal Coordinated Time (UTC) is an adaptation of TAI which closely approximates UT1 by adding leap-seconds to selected days.

A leap-second is an additional second added to the last day of December, June, September, or March, when such an adjustment is deemed necessary by the International Earth Rotation and Reference Systems Service in order to keep UTC within 0.9 seconds of observed astronomical time. When leap seconds are introduced, the last minute in the day has more than sixty seconds. In theory leap seconds can also be removed from a day, but this has not yet occurred. Leap seconds are not supported by the types defined here.

Because the dateTime type and other date- and time-related types defined in this specification do not support leap seconds, there are portions of the timeline which cannot be represented by values of these types. Users whose applications require that leap seconds be represented and that date/time arithmetic take historically occurring leap seconds into account will wish to make appropriate adjustments at the application level, or to use other types.

The Seven-property Model

There are two distinct ways to model moments in time: either by tracking their year, month, day, hour, minute and second (with fractional seconds as needed), or by tracking their time (measured generally in seconds or days) from some starting moment. Each has its advantages. The two are isomorphic. For definiteness, we choose to model the first using five integer and one decimal number properties. We superimpose the second by providing one decimal number-valued function which gives the corresponding count of seconds from zero (the time on the time line).

There is also a seventh property which specifies the timezone, which is conceptually a duration measuring the offset of times with that timezone from UTC. Values for the six primary properties are always stored in their local values (the values shown in the lexical representations), rather than converted to UTC.

Properties of Date/time Seven-property Models

year: an integer

month: an integer between 1 and 12 inclusive

day: an integer between 1 and 31 inclusive, possibly restricted further depending on month and year

hour: an integer between 0 and 23 inclusive

minute: an integer between 0 and 59 inclusive

second: a decimal number greater than or equal to 0 and less than 60

timezone: an optional integer between −840 and 840 inclusive

Non-negative values of the properties map to the years, months, days of month, etc. of the Gregorian calendar in the obvious way. Values less than 1582 in the year property represent years in the proleptic Gregorian calendar. A value of zero in the year property represents the year 1 BCE; a value of −1 represents the year 2 BCE, −2 is 3 BCE, etc.
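The year numbering just described can be illustrated with a short Python sketch. The helper name `describe_year` and the CE/BCE label format are hypothetical and serve only to show the offset of one between zero or negative property values and BCE year numbers.

```python
def describe_year(year: int) -> str:
    """Render a year property value as a conventional CE/BCE label."""
    if year > 0:
        return f"{year} CE"
    # 0 is 1 BCE, -1 is 2 BCE, -2 is 3 BCE, and so on.
    return f"{1 - year} BCE"
```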

The model just described is called herein the seven-property model for date/time datatypes. It is used as is for dateTime; all other date/time datatypes except duration use the same model except that some of the six primary properties are required to have the value absent, instead of being required to have a numerical value. (An optional property, like timezone, is always permitted to have the value absent.)

timezone values are limited to 14 hours, which is 840 (= 60 × 14) minutes.

Leap-seconds are not permitted

As of the time this specification was published, leap-seconds (always one leap-second) have been introduced by the responsible authorities at the end (in UTC) of the following days:

1972-06-30

1972-12-31

1973-12-31

1974-12-31

1975-12-31

1976-12-31

1977-12-31

1978-12-31

1979-12-31

1981-06-30

1982-06-30

1983-06-30

1985-06-30

1987-12-31

1989-12-31

1990-12-31

1992-06-30

1993-06-30

1994-06-30

1995-12-31

1997-06-30

1998-12-31

2005-12-31

Because the simple types defined here do not support leap seconds, they cannot be used to represent the final second, in UTC, of any of the days listed above. If it is important, at the application level, to track the occurrence of leap seconds, then users will need to make special arrangements for special handling of the dates above and of time intervals crossing them.

While calculating, property values from the dateTime 1972-12-31T00:00:00 are used to fill in for those that are absent, except that if day is absent but month is not, the largest permitted day for that month is used. 1972-12-31T00:00:00 happens to permit both the maximum number of days and the maximum number of seconds.

Values from any one date/time datatype using the seven-component model (all except duration) are ordered the same as their timeOnTimeline values, except that if one value's timezone is absent and the other's is not, and using maximum and minimum timezone values for the one whose timezone is actually absent changes the resulting (strict) inequality, the original two values are incomparable.

Lexical Mappings

Each lexical representation is made up of certain date/time fragments, each of which corresponds to a particular property of the datatype value. They are defined by the following productions.

Date/time Lexical Representation Fragments

yearFrag ::= '-'? (([1-9] digit digit digit+) | ('0' digit digit digit))

monthFrag ::= ('0' [1-9]) | ('1' [0-2])

dayFrag ::= ('0' [1-9]) | ([12] digit) | ('3' [01])

hourFrag ::= ([01] digit) | ('2' [0-3])

minuteFrag ::= [0-5] digit

secondFrag ::= ([0-5] digit) ('.' digit+)?

endOfDayFrag ::= '24:00:00' ('.' '0'+)?

timezoneFrag ::= 'Z' | (('+' | '-') ('0' digit | '1' [0-4]) ':' minuteFrag)
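For illustration, the fragment productions can be approximated with Python regular expressions. This is a sketch under stated assumptions: it takes a yearFrag to have at least four digits with no leading zero when there are more than four, and takes the timezone fragment to be either 'Z' or a signed hours and minutes pair with hours 00 through 14. The constant names and the `matches` helper are hypothetical.

```python
import re

YEAR = r"-?(?:[1-9][0-9]{3,}|0[0-9]{3})"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12][0-9]|3[01])"
HOUR = r"(?:[01][0-9]|2[0-3])"
MINUTE = r"[0-5][0-9]"
SECOND = r"[0-5][0-9](?:\.[0-9]+)?"
END_OF_DAY = r"24:00:00(?:\.0+)?"
TIMEZONE = rf"(?:Z|[+-](?:0[0-9]|1[0-4]):{MINUTE})"


def matches(fragment: str, literal: str) -> bool:
    """True when the whole literal matches the given fragment pattern."""
    return re.fullmatch(fragment, literal) is not None
```

The patterns check lexical shape only. Constraints that depend on several properties at once, such as the number of days in a given month or the 840-minute limit on timezone values, are not expressible at this level.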

Each fragment other than timezoneFrag defines a subset of the lexical space of decimal; the corresponding lexical mapping is the decimal lexical mapping restricted to that subset. These fragment lexical mappings are combined separately for each date/time datatype (other than duration) to make up the complete lexical mapping for that datatype. The mapping for yearFrag is used to obtain the value of the year property, the mapping for monthFrag is used to obtain the value of the month property, etc. Each datatype which specifies some properties to be mandatorily absent also does not permit the corresponding lexical fragments in its lexical representations.

(The redundancy between 'Z', '+00:00', and '-00:00', and the possibility of trailing fractional '0' digits for secondFrag, are the only redundancies preventing these mappings from being one-to-one.)

The following fragment canonical mappings for each value-object property are combined as appropriate to make the canonical mapping for each date/time datatype (other than duration):

Function Definitions

The more important functions and procedures defined here are summarized in the text. When there is a text summary, the name of the function in each is a hot-link to the same name in the other. All other links to these functions link to the complete definition in this section.

Generic Number-related Functions

The following functions are used with various numeric and date/time datatypes.

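Auxiliary Functions for Operating on Numeral Fragments

digitValue (d) → integer

Arguments: d matches digit. Result: a nonnegative integer less than ten.

Maps each digit to its numerical value.

Return

0 when d = '0',

1 when d = '1',

2 when d = '2',

etc.

digitSequenceValue (S) → integer

Arguments: S is a finite sequence of literals, each term matching digit. Result: a nonnegative integer.

Maps a sequence of digits to the position-weighted sum of the terms' numerical values.

Return the sum of digitValue(S_i) × 10^{length(S)−i} where i runs over the domain of S (indexed here from 1, so that the last term has weight 10^0).

fractionDigitSequenceValue (S) → decimal number

Arguments: S is a finite sequence of literals, each term matching digit. Result: a nonnegative decimal number.

Maps a sequence of digits to the position-weighted sum of the terms' numerical values, weighted appropriately for fractional digits.

Return the sum of digitValue(S_i) × 10^{−i} where i runs over the domain of S (indexed here from 1, so that the first term has weight 10^{−1}).

fractionFragValue (N) → decimal number

Arguments: N matches fracFrag. Result: a nonnegative decimal number.

Maps a fracFrag to the appropriate fractional decimal number.

N is necessarily the left-to-right concatenation of a finite sequence S of literals, each term matching digit. Return fractionDigitSequenceValue(S).

Generic Numeral-to-Number Lexical Mappings

unsignedNoDecimalMap (N) → integer

Arguments: N matches unsignedNoDecimalPtNumeral. Result: a nonnegative integer.

Maps an unsignedNoDecimalPtNumeral to its numerical value.

N is the left-to-right concatenation of a finite sequence S of literals, each term matching digit. Return digitSequenceValue(S).

noDecimalMap (N) → integer

Arguments: N matches noDecimalPtNumeral. Result: an integer.

Maps a noDecimalPtNumeral to its numerical value.

N necessarily consists of an optional sign ('+' or '-') and then a literal U that matches unsignedNoDecimalPtNumeral. Return

−1 × unsignedNoDecimalMap(U) when '-' is present, and

unsignedNoDecimalMap(U) otherwise.

unsignedDecimalPtMap (D) → decimal number

Arguments: D matches unsignedDecimalPtNumeral. Result: a nonnegative decimal number.

Maps an unsignedDecimalPtNumeral to its numerical value.

D necessarily consists of an optional literal N matching unsignedNoDecimalPtNumeral, a decimal point, and then an optional literal F matching fracFrag. Return

unsignedNoDecimalMap(N) when F is not present,

fractionFragValue(F) when N is not present, and

unsignedNoDecimalMap(N) + fractionFragValue(F) otherwise.

decimalPtMap (N) → decimal number

Arguments: N matches decimalPtNumeral. Result: a decimal number.

Maps a decimalPtNumeral to its numerical value.

N necessarily consists of an optional sign ('+' or '-') and then an instance U of unsignedDecimalPtNumeral. Return

−unsignedDecimalPtMap(U) when '-' is present, and

unsignedDecimalPtMap(U) otherwise.

scientificMap (N) → decimal number

Arguments: N matches scientificNotationNumeral. Result: a decimal number.

Maps a scientificNotationNumeral to its numerical value.

N necessarily consists of an instance C of either noDecimalPtNumeral or decimalPtNumeral, either an 'e' or an 'E', and then an instance E of noDecimalPtNumeral. Return

decimalPtMap(C) × 10 ^ noDecimalMap(E) when a '.' is present in N, and

noDecimalMap(C) × 10 ^ noDecimalMap(E) otherwise.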
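The numeral-to-number mappings can be sketched in Python as below. This is a hedged illustration, not a normative implementation: the snake_case function names are renderings chosen for this sketch, `fractions.Fraction` stands in for the specification's exact decimal numbers, and input literals are assumed to have already been checked against the relevant productions.

```python
from fractions import Fraction


def digit_value(d: str) -> int:
    """Numerical value of a single digit character."""
    return "0123456789".index(d)


def digit_sequence_value(s: str) -> int:
    """Position-weighted sum; with i counted from 1 the last digit has weight 10^0."""
    n = len(s)
    return sum(digit_value(d) * 10 ** (n - i) for i, d in enumerate(s, start=1))


def fraction_frag_value(s: str) -> Fraction:
    """Sum of digit values weighted by 10^-i, with i counted from 1."""
    terms = (Fraction(digit_value(d), 10 ** i) for i, d in enumerate(s, start=1))
    return sum(terms, Fraction(0))


def no_decimal_map(n: str) -> int:
    sign = -1 if n.startswith("-") else 1
    return sign * digit_sequence_value(n.lstrip("+-"))


def decimal_pt_map(n: str) -> Fraction:
    sign = -1 if n.startswith("-") else 1
    whole, _, frac = n.lstrip("+-").partition(".")
    # An absent integer part or fraction part contributes zero.
    return sign * (Fraction(digit_sequence_value(whole)) + fraction_frag_value(frac))


def scientific_map(n: str) -> Fraction:
    coeff, exponent = n.replace("e", "E").split("E")
    c = decimal_pt_map(coeff) if "." in coeff else Fraction(no_decimal_map(coeff))
    return c * Fraction(10) ** no_decimal_map(exponent)
```

Exact rationals are used so that, for example, `scientific_map("1.5E-3")` is exactly 3/2000 rather than a binary floating-point approximation.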

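Auxiliary Functions for Producing Numeral Fragments

digit (i) → a literal matching digit

Arguments: i is an integer between 0 and 9 inclusive.

Maps each integer between 0 and 9 to the corresponding digit.

Return

'0' when i = 0,

'1' when i = 1,

'2' when i = 2,

etc.

digitRemainderSeq (i) → sequence of integers

Arguments: i is a nonnegative integer. Result: a sequence of nonnegative integers.

Maps each nonnegative integer to a sequence of integers used by digitSeq to ultimately create an unsignedNoDecimalPtNumeral.

Return that sequence s for which

s_0 = i and

s_{j+1} = s_j div 10.

digitSeq (i) → sequence of integers

Arguments: i is a nonnegative integer. Result: a sequence of integers where each term is between 0 and 9 inclusive.

Maps each nonnegative integer to a sequence of integers used by unsignedNoDecimalPtCanonicalMap to create an unsignedNoDecimalPtNumeral.

Return that sequence s for which s_j = digitRemainderSeq(i)_j mod 10.

lastSignificantDigit (s) → integer

Arguments: s is a sequence of nonnegative integers. Result: a nonnegative integer.

Maps a sequence of nonnegative integers to the index of the term immediately preceding the first zero term after the initial term.

Return the smallest nonnegative integer j such that s_{j+1} is 0.

FractionDigitRemainderSeq (f) → sequence of decimal numbers

Arguments: f is nonnegative and less than 1. Result: a sequence of nonnegative decimal numbers.

Maps each nonnegative decimal number less than 1 to a sequence of decimal numbers used by fractionDigitSeq to ultimately create the fraction digits of a numeral.

Return that sequence s for which

s_0 = f × 10, and

s_{j+1} = (s_j mod 1) × 10.

fractionDigitSeq (f) → sequence of integers

Arguments: f is nonnegative and less than 1. Result: a sequence of integers where each term is between 0 and 9 inclusive.

Maps each nonnegative decimal number less than 1 to a sequence of integers used by fractionDigitsCanonicalFragmentMap to ultimately create the fraction digits of a numeral.

Return that sequence s for which s_j = FractionDigitRemainderSeq(f)_j div 1.

fractionDigitsCanonicalFragmentMap (f) → a literal matching fracFrag

Arguments: f is nonnegative and less than 1.

Maps each nonnegative decimal number less than 1 to a fracFrag used by unsignedDecimalPtCanonicalMap to create an unsignedDecimalPtNumeral.

Return digit(fractionDigitSeq(f)_0) & . . . & digit(fractionDigitSeq(f)_{lastSignificantDigit(FractionDigitRemainderSeq(f))}).

Generic Number to Numeral Canonical Mappings

unsignedNoDecimalPtCanonicalMap (i) → a literal matching unsignedNoDecimalPtNumeral

Arguments: i is a nonnegative integer.

Maps a nonnegative integer to an unsignedNoDecimalPtNumeral, its canonical representation.

Return digit(digitSeq(i)_{lastSignificantDigit(digitRemainderSeq(i))}) & . . . & digit(digitSeq(i)_0). (Note that the concatenation is in reverse order.)

noDecimalPtCanonicalMap (i) → a literal matching noDecimalPtNumeral

Arguments: i is an integer.

Maps an integer to a noDecimalPtNumeral, its canonical representation.

Return

'-' & unsignedNoDecimalPtCanonicalMap(−i) when i is negative,

unsignedNoDecimalPtCanonicalMap(i) otherwise.

unsignedDecimalPtCanonicalMap (n) → a literal matching unsignedDecimalPtNumeral

Arguments: n is a nonnegative decimal number.

Maps a nonnegative decimal number to an unsignedDecimalPtNumeral, its canonical representation.

Return unsignedNoDecimalPtCanonicalMap(n div 1) & '.' & fractionDigitsCanonicalFragmentMap(n mod 1).

decimalPtCanonicalMap (n) → a literal matching decimalPtNumeral

Arguments: n is a decimal number.

Maps a decimal number to a decimalPtNumeral, its canonical representation.

Return

'-' & unsignedDecimalPtCanonicalMap(−n) when n is negative,

unsignedDecimalPtCanonicalMap(n) otherwise.

unsignedScientificCanonicalMap (n) → a literal matching unsignedScientificNotationNumeral

Arguments: n is a nonnegative decimal number.

Maps a nonnegative decimal number to an unsignedScientificNotationNumeral, its canonical representation.

Return unsignedDecimalPtCanonicalMap(n / 10^(log(n) div 1)) & 'E' & noDecimalPtCanonicalMap(log(n) div 1).

scientificCanonicalMap (n) → a literal matching scientificNotationNumeral

Arguments: n is a decimal number.

Maps a decimal number to a scientificNotationNumeral, its canonical representation.

Return

'-' & unsignedScientificCanonicalMap(−n) when n is negative,

unsignedScientificCanonicalMap(n) otherwise.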

For example:

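123.4567 mod 1 = 0.4567 and 123.4567 div 1 = 123.

digitRemainderSeq(123) is 123, 12, 1, 0, 0, . . . .

digitSeq(123) is 3, 2, 1, 0, 0, . . . .

lastSignificantDigit(digitRemainderSeq(123)) = 2 (Sequences count from 0.)

unsignedNoDecimalPtCanonicalMap(123) = '123'

FractionDigitRemainderSeq(0.4567) is 4.567, 5.67, 6.7, 7, 0, 0, . . . .

fractionDigitSeq(0.4567) is 4, 5, 6, 7, 0, 0, . . . .

lastSignificantDigit(FractionDigitRemainderSeq(0.4567)) = 3

fractionDigitsCanonicalFragmentMap(0.4567) = '4567'

unsignedDecimalPtCanonicalMap(123.4567) = '123.4567'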
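The examples above can be reproduced with a small Python sketch of the number-to-numeral direction. It is an illustration under assumptions: the function names are hypothetical snake_case renderings, `fractions.Fraction` represents decimal numbers exactly, and the infinite remainder sequences are walked only up to the last significant digit.

```python
from fractions import Fraction

DIGITS = "0123456789"


def unsigned_no_decimal_pt_canonical_map(i: int) -> str:
    out = []
    s = i                         # s walks the integer remainder sequence
    while True:
        out.append(DIGITS[s % 10])
        if s // 10 == 0:          # the next term is 0: last significant digit
            break
        s //= 10
    return "".join(reversed(out))  # digits were produced least significant first


def _is_finite_decimal(f: Fraction) -> bool:
    d = f.denominator
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1


def fraction_digits_canonical_fragment_map(f: Fraction) -> str:
    if not (0 <= f < 1) or not _is_finite_decimal(f):
        raise ValueError("expected a decimal number f with 0 <= f < 1")
    out = []
    s = f * 10                    # first term of the fraction remainder sequence
    while True:
        out.append(DIGITS[s // 1])
        s = (s % 1) * 10
        if s == 0:
            break
    return "".join(out)


def unsigned_decimal_pt_canonical_map(n: Fraction) -> str:
    n = Fraction(n)
    return (unsigned_no_decimal_pt_canonical_map(n // 1) + "."
            + fraction_digits_canonical_fragment_map(n % 1))


def decimal_pt_canonical_map(n: Fraction) -> str:
    n = Fraction(n)
    if n < 0:
        return "-" + unsigned_decimal_pt_canonical_map(-n)
    return unsigned_decimal_pt_canonical_map(n)
```

The guard in `fraction_digits_canonical_fragment_map` is an addition of this sketch: a rational such as 1/3 has no finite decimal expansion and would otherwise never reach a zero remainder.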

Lexical Mapping for Non-numerical special values Used With Numerical Datatypes

specialRepValue (S) → a special value

Arguments: S matches numericalSpecialRep. Result: one of positiveInfinity, negativeInfinity, or notANumber.

Maps the lexical representations of special values used with some numerical datatypes to those special values.

Return

positiveInfinity when S is 'INF' or '+INF',

negativeInfinity when S is '-INF', and

notANumber when S is 'NaN'.

Canonical Mapping for Non-numerical special values Used With Numerical Datatypes

specialRepCanonicalMap (c) → a literal matching numericalSpecialRep

Arguments: c is one of positiveInfinity, negativeInfinity, and notANumber.

Maps the special values used with some numerical datatypes to their canonical representations.

Return

'INF' when c is positiveInfinity,

'-INF' when c is negativeInfinity, and

'NaN' when c is notANumber.
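A minimal Python sketch of the two mappings follows. The `Special` enumeration and the function names are assumptions made for illustration; the constants stand for positiveInfinity, negativeInfinity, and notANumber.

```python
from enum import Enum


class Special(Enum):
    """Non-numerical values; each member's value is its canonical representation."""
    POSITIVE_INFINITY = "INF"
    NEGATIVE_INFINITY = "-INF"
    NOT_A_NUMBER = "NaN"


def special_rep_value(s: str) -> Special:
    if s in ("INF", "+INF"):      # two lexical forms map to one value
        return Special.POSITIVE_INFINITY
    if s == "-INF":
        return Special.NEGATIVE_INFINITY
    if s == "NaN":
        return Special.NOT_A_NUMBER
    raise ValueError(f"not a special representation: {s!r}")


def special_rep_canonical_map(c: Special) -> str:
    return c.value
```

The lexical mapping is not one-to-one: both 'INF' and '+INF' denote positiveInfinity, and only 'INF' is produced by the canonical mapping.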

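Auxiliary Functions for Reading Instances of precisionDecimal

decimalPtPrecision (LEX) → integer

Arguments: LEX matches decimalPtNumeral. Result: an integer.

Maps a decimalPtNumeral onto an integer; used in calculating the arithmeticPrecision of a precisionDecimal value.

LEX necessarily contains a decimal point ('.') and may optionally contain a following fracFrag F consisting of some number n of digits. Return

n when F is present, and

0 otherwise.

scientificPrecision (LEX) → integer

Arguments: LEX matches scientificNotationNumeral. Result: an integer.

Maps a scientificNotationNumeral onto an integer; used in calculating the arithmeticPrecision of a precisionDecimal value.

LEX necessarily contains a noDecimalPtNumeral or decimalPtNumeral C preceding an exponent indicator ('E' or 'e'), and a following noDecimalPtNumeral E. Return

−1 × noDecimalMap(E) when C is a noDecimalPtNumeral, and

decimalPtPrecision(C) − noDecimalMap(E) otherwise.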

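Lexical Mapping

precisionDecimalLexicalMap (LEX) → a precisionDecimal value

Arguments: LEX is a lexical representation of precisionDecimal.

Maps a precisionDecimal lexical representation onto a complete precisionDecimal value.

Let pD be a complete precisionDecimal value.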

Set pD's to

(LEX) when LEX is an instance of ,

(LEX) when LEX is an instance of and

(LEX) otherwise.

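Set pD's arithmeticPrecision to

0 when LEX is a noDecimalPtNumeral,

decimalPtPrecision(LEX) when LEX is a decimalPtNumeral,

scientificPrecision(LEX) when LEX is a scientificNotationNumeral, and

absent otherwise.

Set pD's sign to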

absent when LEX is NaN

negative when the first character of LEX is -, and

positive otherwise.

Return pD.

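Lexical Mapping

decimalLexicalMap (LEX) → a decimal value

Arguments: LEX matches decimalLexicalRep.

Maps a decimalLexicalRep onto a decimal value.

Let d be a decimal value.

Set d to

noDecimalMap(LEX) when LEX is an instance of noDecimalPtNumeral, and

decimalPtMap(LEX) when LEX is an instance of decimalPtNumeral.

Return d.

Canonical Mapping

decimalCanonicalMap (d) → a literal matching decimalLexicalRep

Arguments: d is a decimal value.

Maps a decimal to its canonical representation, a decimalLexicalRep.

If d is an integer, then return noDecimalPtCanonicalMap(d).

Otherwise, return decimalPtCanonicalMap(d).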

Auxiliary Functions for Binary Floating-point Lexical/Canonical Mappings

floatingPointRound (nV, cWidth, eMin, eMax) → decimal number or special value

Arguments: nV is an initially non-zero decimal number (may be set to zero during calculations); cWidth is a positive integer; eMin is an integer; eMax is an integer greater than eMin. Result: a decimal number or a special value (INF or −INF).

Rounds a non-zero decimal number to the nearest floating-point value.

Let

s be an integer initially 1,

c be a nonnegative integer, and

e be an integer.

Set s to −1 when nV < 0.

So select e that 2^cWidth × 2^(e − 1) ≤ |nV| < 2^cWidth × 2^e.

So select c that (c − 1) × 2^e ≤ |nV| < c × 2^e and 2^(cWidth − 1) < c ≤ 2^cWidth.

When eMax < e (overflow) return:

positiveInfinity when s is positive, and

negativeInfinity otherwise.

Otherwise:

When e < eMin (underflow):

Set e = eMin.

So select c that (c − 1) × 2^e ≤ |nV| < c × 2^e.

Set nV to

c × 2^e when |nV| > c × 2^e − 2^(e − 1);

(c − 1) × 2^e when |nV| < c × 2^e − 2^(e − 1);

c × 2^e or (c − 1) × 2^e according to whether c is even or c − 1 is even, otherwise (i.e., |nV| = c × 2^e − 2^(e − 1), the midpoint between the two values).

Return

s × nV when nV < 2^cWidth × 2^eMax,

positiveInfinity when s is positive, and

negativeInfinity otherwise.

Implementers will find the algorithms of more efficient in memory than the simple abstract algorithm employed above.
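The abstract rounding algorithm can be sketched in Python with exact rational arithmetic. This is an illustration, not an efficient implementation: the function name, the use of `fractions.Fraction`, and the use of `math.inf` to stand for positiveInfinity and negativeInfinity are assumptions of the sketch. It selects e so that 2^cWidth × 2^(e − 1) ≤ |nV| < 2^cWidth × 2^e, which keeps the selected c within 2^(cWidth − 1) < c ≤ 2^cWidth.

```python
from fractions import Fraction
import math


def floating_point_round(nV, c_width: int, e_min: int, e_max: int):
    """Round a non-zero rational to the nearest value c * 2^e (ties to even c)."""
    nV = Fraction(nV)
    if nV == 0:
        raise ValueError("nV must be non-zero")
    s = -1 if nV < 0 else 1
    a = abs(nV)

    # Exact floor(log2(a)), derived from numerator/denominator bit lengths.
    k = a.numerator.bit_length() - a.denominator.bit_length()
    if Fraction(2) ** k > a:
        k -= 1
    e = k + 1 - c_width           # 2^cWidth * 2^(e-1) <= a < 2^cWidth * 2^e

    if e > e_max:                 # overflow
        return s * math.inf
    e = max(e, e_min)             # underflow: clamp the exponent
    unit = Fraction(2) ** e
    c = math.floor(a / unit) + 1  # (c - 1) * 2^e <= a < c * 2^e

    midpoint = c * unit - unit / 2
    if a > midpoint:
        rounded = c * unit
    elif a < midpoint:
        rounded = (c - 1) * unit
    else:                         # exactly halfway: choose the even multiplier
        rounded = c * unit if c % 2 == 0 else (c - 1) * unit

    if rounded < Fraction(2) ** (c_width + e_max):
        return s * rounded
    return s * math.inf           # rounding up crossed the largest finite value
```

With the double parameters (53, −1074, 971), rounding the exact decimal 0.1 yields the same rational that Python reports for its own double nearest to 0.1; the float parameters are (24, −149, 104).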

round (n, k) → decimal number

Arguments: n is a decimal number; k is a nonnegative integer. Result: a decimal number.

Maps a decimal number to that value rounded by some power of 10.

Return ((n / 10^k + 0.5) div 1) × 10^k.

floatApprox (c, e, j) → decimal number

Arguments: c is a nonnegative integer; e is an integer; j is a nonnegative integer. Result: a decimal number.

Maps a decimal number (c × 10^e) to successive approximations.

Return round(c, j) × 10^e.

Lexical Mapping

floatLexicalMap (LEX) → a float value

Arguments: LEX matches floatRep.

Maps a floatRep onto a float value.

Let nV be a decimal number or special value (INF or −INF).

Return specialRepValue(LEX) when LEX is an instance of numericalSpecialRep;

otherwise (LEX is a numeral):

Set nV to

noDecimalMap(LEX) when LEX is an instance of noDecimalPtNumeral,

decimalPtMap(LEX) when LEX is an instance of decimalPtNumeral, and

scientificMap(LEX) otherwise (LEX is an instance of scientificNotationNumeral).

Set nV to floatingPointRound(nV, 24, −149, 104) when nV is not zero. (floatingPointRound may nonetheless return zero, or INF or −INF.)

Return:

When nV is zero:

negativeZero when the first character of LEX is '-', and

positiveZero otherwise.

nV otherwise.

This specification permits the substitution of any other rounding algorithm which conforms to the requirements of .

Lexical Mapping

doubleLexicalMap (LEX) → a double value

Arguments: LEX matches doubleRep.

Maps a doubleRep onto a double value.

Let nV be a decimal number or special value (INF or −INF).

Return specialRepValue(LEX) when LEX is an instance of numericalSpecialRep;

otherwise (LEX is a numeral):

Set nV to

noDecimalMap(LEX) when LEX is an instance of noDecimalPtNumeral,

decimalPtMap(LEX) when LEX is an instance of decimalPtNumeral, and

scientificMap(LEX) otherwise (LEX is an instance of scientificNotationNumeral).

Set nV to floatingPointRound(nV, 53, −1074, 971) when nV is not zero. (floatingPointRound may nonetheless return zero, or INF or −INF.)

Return:

When nV is zero:

negativeZero when the first character of LEX is '-', and

positiveZero otherwise.

nV otherwise.

This specification permits the substitution of any other rounding algorithm which conforms to the requirements of .

Canonical Mapping

floatCanonicalMap (f) → a literal matching floatRep

Arguments: f is a float value.

Maps a float to its canonical representation, a floatRep.

Let

l be a nonnegative integer,

s be an integer initially 1,

c be a positive integer, and

e be an integer.

Return specialRepCanonicalMap(f) when f is one of positiveInfinity, negativeInfinity, or notANumber;

return '0.0E0' when f is positiveZero;

return '-0.0E0' when f is negativeZero;

otherwise (f is numeric and non-zero):

Set s to −1 when f < 0.

Let c be the smallest integer for which there exists an integer e for which |f| = c × 10^e.

Let e be log₁₀(|f| / c) (so that |f| = c × 10^e).

Let l be the largest nonnegative integer for which c × 10^e = floatingPointRound(floatApprox(c, e, l), 24, −149, 104).

Return scientificCanonicalMap(s × floatApprox(c, e, l)).

Canonical Mapping

doubleCanonicalMap (f) → a literal matching doubleRep

Arguments: f is a double value.

Maps a double to its canonical representation, a doubleRep.

Let

l be a nonnegative integer,

s be an integer initially 1,

c be a positive integer, and

e be an integer.

Return specialRepCanonicalMap(f) when f is one of positiveInfinity, negativeInfinity, or notANumber;

return '0.0E0' when f is positiveZero;

return '-0.0E0' when f is negativeZero;

otherwise (f is numeric and non-zero):

Set s to −1 when f < 0.

Let c be the smallest integer for which there exists an integer e for which |f| = c × 10^e.

Let e be log₁₀(|f| / c) (so that |f| = c × 10^e).

Let l be the largest nonnegative integer for which c × 10^e = floatingPointRound(floatApprox(c, e, l), 53, −1074, 971).

Return scientificCanonicalMap(s × floatApprox(c, e, l)).

Canonical Mapping

precisionDecimalCanonicalMap (pD) → a literal, the canonical representation of pD

Arguments: pD is a precisionDecimal value.

Maps a precisionDecimal to its canonical representation.

Let nV be the numericalValue of pD.

Let aP be the arithmeticPrecision of pD.

If pD is one of NaN, INF, or -INF, then return specialRepCanonicalMap(nV).

Otherwise, if nV is an integer and aP is zero and 1E-6 ≤ nV ≤ 1E6, then return noDecimalPtCanonicalMap(nV).

Otherwise, if aP is greater than zero and 1E-6 ≤ nV ≤ 1E6, then let s be decimalPtCanonicalMap(nV). Let f be the number of fractional digits in s; f will invariably be less than or equal to aP. Return the concatenation of s with aP − f occurrences of the digit '0'.

Otherwise, it will be the case that nV is less than 1E−6 or greater than 1E6. Let

s be scientificCanonicalMap(nV).

m be the part of s which precedes the E.

n be the part of s which follows the E.

p be the integer denoted by n.

f be the number of fractional digits in m; note that f will invariably be less than or equal to aP + p.

t be a string consisting of aP + p − f occurrences of the digit 0, preceded by a decimal point if and only if m contains no decimal point and aP + p − f is greater than zero.

Return the concatenation m & t & E & n.

Duration-related Definitions

The following functions are primarily used with the duration datatype and its derivatives.

Auxiliary duration-related Functions Operating on Representation Fragments

duYearFragmentMap (Y) → integer

Arguments: Y matches duYearFrag. Result: a nonnegative integer.

Maps a duYearFrag to an integer, intended as part of the value of the months property of a duration value.

Y is necessarily a numeral N followed by the letter 'Y'. Return noDecimalMap(N).

duMonthFragmentMap (M) → integer

Arguments: M matches duMonthFrag. Result: a nonnegative integer.

Maps a duMonthFrag to an integer, intended as part of the value of the months property of a duration value.

M is necessarily a numeral N followed by the letter 'M'. Return noDecimalMap(N).

duDayFragmentMap (D) → integer

Arguments: D matches duDayFrag. Result: a nonnegative integer.

Maps a duDayFrag to an integer, intended as part of the value of the seconds property of a duration value.

D is necessarily a numeral N followed by the letter 'D'. Return noDecimalMap(N).

duHourFragmentMap (H) → integer

Arguments: H matches duHourFrag. Result: a nonnegative integer.

Maps a duHourFrag to an integer, intended as part of the value of the seconds property of a duration value.

H is necessarily a numeral N followed by the letter 'H'. Return noDecimalMap(N).

duMinuteFragmentMap (M) → integer

Arguments: M matches duMinuteFrag. Result: a nonnegative integer.

Maps a duMinuteFrag to an integer, intended as part of the value of the seconds property of a duration value.

M is necessarily a numeral N followed by the letter 'M'. Return noDecimalMap(N).

duSecondFragmentMap (S) → decimal number

Arguments: S matches duSecondFrag. Result: a nonnegative decimal number.

Maps a duSecondFrag to a decimal number, intended as part of the value of the seconds property of a duration value.

S is necessarily a numeral N followed by the letter 'S'. Return

decimalPtMap(N) when '.' occurs in N, and

noDecimalMap(N) otherwise.

duYearMonthFragmentMap (YM) → integer

Arguments: YM matches duYearMonthFrag. Result: a nonnegative integer.

Maps a duYearMonthFrag into an integer, intended as part of the months property of a duration value.

YM necessarily consists of an instance Y of duYearFrag and/or an instance M of duMonthFrag. Let

y be duYearFragmentMap(Y) (or 0 if Y is not present) and

m be duMonthFragmentMap(M) (or 0 if M is not present).

Return 12 × y + m.

duTimeFragmentMap (T) → decimal number

Arguments: T matches duTimeFrag. Result: a nonnegative decimal number.

Maps a duTimeFrag into a decimal number, intended as part of the seconds property of a duration value.

T necessarily consists of an instance H of duHourFrag, and/or an instance M of duMinuteFrag, and/or an instance S of duSecondFrag. Let

h be duHourFragmentMap(H) (or 0 if H is not present),

m be duMinuteFragmentMap(M) (or 0 if M is not present), and

s be duSecondFragmentMap(S) (or 0 if S is not present).

Return 3600 × h + 60 × m + s.

duDayTimeFragmentMap (DT) → decimal number

Arguments: DT matches duDayTimeFrag. Result: a nonnegative decimal number.

Maps a duDayTimeFrag into a decimal number, which is the potential value of the seconds property of a duration value.

DT necessarily consists of an instance D of duDayFrag and/or an instance T of duTimeFrag. Let

d be duDayFragmentMap(D) (or 0 if D is not present) and

t be duTimeFragmentMap(T) (or 0 if T is not present).

Return 86400 × d + t.

The duration Lexical Mapping

durationMap (DUR) → a complete duration value

Arguments: DUR matches durationLexicalRep.

Separates the durationLexicalRep into the month part and the seconds part, then maps them into the months and seconds of the duration value.

DUR consists of possibly a leading '-', followed by 'P' and then an instance Y of duYearMonthFrag and/or an instance D of duDayTimeFrag. Return a duration whose

months value is

0 if Y is not present,

−duYearMonthFragmentMap(Y) if both '-' and Y are present, and

duYearMonthFragmentMap(Y) otherwise,

and whose

seconds value is

0 if D is not present,

−duDayTimeFragmentMap(D) if both '-' and D are present, and

duDayTimeFragmentMap(D) otherwise.
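The duration lexical mapping can be sketched in Python as below. Several things here are assumptions of the sketch rather than statements of this specification: the regular expression is a conventional rendering of the duration lexical form (for example `P1Y2M3DT4H5M6.7S`), with seconds written as digits optionally followed by a point and more digits; the function returns a plain `(months, seconds)` pair; and `fractions.Fraction` is used for exact seconds.

```python
import re
from fractions import Fraction

_DURATION = re.compile(
    r"(?P<sign>-)?P"
    r"(?:(?P<Y>[0-9]+)Y)?(?:(?P<Mo>[0-9]+)M)?"
    r"(?:(?P<D>[0-9]+)D)?"
    r"(?:T(?:(?P<H>[0-9]+)H)?(?:(?P<Mi>[0-9]+)M)?"
    r"(?:(?P<S>[0-9]+(?:\.[0-9]+)?)S)?)?"
)


def duration_map(dur: str):
    """Return (months, seconds) for a duration literal."""
    m = _DURATION.fullmatch(dur)
    # 'P' needs at least one fragment, and 'T' at least one time fragment.
    if m is None or dur.endswith(("P", "T")):
        raise ValueError(f"not a duration literal: {dur!r}")

    def num(name):
        text = m.group(name)
        return Fraction(text) if text is not None else Fraction(0)

    months = 12 * int(num("Y")) + int(num("Mo"))            # year/month fragment
    seconds = (86400 * num("D") + 3600 * num("H")
               + 60 * num("Mi") + num("S"))                 # day/time fragment
    if m.group("sign"):
        months, seconds = -months, -seconds
    return months, seconds
```

For example, `duration_map("P1Y2M")` gives 14 months and 0 seconds, and a leading '-' negates both properties.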

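The yearMonthDuration Lexical Mapping

yearMonthDurationMap (YM) → a complete yearMonthDuration value

Arguments: YM matches yearMonthDurationLexicalRep.

Maps the lexical representation into the months of a yearMonthDuration value. (A yearMonthDuration's seconds is always zero.) yearMonthDurationMap is a restriction of durationMap.

YM necessarily consists of an optional leading '-', followed by 'P' and then an instance Y of duYearMonthFrag. Return a yearMonthDuration whose

months value is

−duYearMonthFragmentMap(Y) if '-' is present in YM and

duYearMonthFragmentMap(Y) otherwise, and

seconds value is (necessarily) 0.

The dayTimeDuration Lexical Mapping

dayTimeDurationMap (DT) → a complete dayTimeDuration value

Arguments: DT matches dayTimeDurationLexicalRep.

Maps the lexical representation into the seconds of a dayTimeDuration value. (A dayTimeDuration's months is always zero.) dayTimeDurationMap is a restriction of durationMap.

DT necessarily consists of possibly a leading '-', followed by 'P' and then an instance D of duDayTimeFrag. Return a dayTimeDuration whose

months value is (necessarily) 0, and

seconds value is

−duDayTimeFragmentMap(D) if '-' is present in DT and

duDayTimeFragmentMap(D) otherwise.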

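Auxiliary duration-related Functions Producing Representation Fragments

duYearMonthCanonicalFragmentMap (ym) → a literal matching duYearMonthFrag

Arguments: ym is a nonnegative integer.

Maps a nonnegative integer, presumably the absolute value of the months of a duration value, to a duYearMonthFrag, a fragment of a duration lexical representation.

Let

y be ym div 12, and

m be ym mod 12.

Return

unsignedNoDecimalPtCanonicalMap(y) & 'Y' & unsignedNoDecimalPtCanonicalMap(m) & 'M' when neither y nor m is zero,

unsignedNoDecimalPtCanonicalMap(y) & 'Y' when y is not zero but m is, and

unsignedNoDecimalPtCanonicalMap(m) & 'M' when y is zero.

duDayCanonicalFragmentMap (d) → a literal matching duDayFrag, or the empty string

Arguments: d is a nonnegative integer.

Maps a nonnegative integer, presumably the day normalized value from the seconds of a duration value, to a duDayFrag, a fragment of a duration lexical representation.

Return

unsignedNoDecimalPtCanonicalMap(d) & 'D' when d is not zero, and

the empty string ('') when d is zero.

duHourCanonicalFragmentMap (h) → a literal matching duHourFrag, or the empty string

Arguments: h is a nonnegative integer.

Maps a nonnegative integer, presumably the hour normalized value from the seconds of a duration value, to a duHourFrag, a fragment of a duration lexical representation.

Return

unsignedNoDecimalPtCanonicalMap(h) & 'H' when h is not zero, and

the empty string ('') when h is zero.

duMinuteCanonicalFragmentMap (m) → a literal matching duMinuteFrag, or the empty string

Arguments: m is a nonnegative integer.

Maps a nonnegative integer, presumably the minute normalized value from the seconds of a duration value, to a duMinuteFrag, a fragment of a duration lexical representation.

Return

unsignedNoDecimalPtCanonicalMap(m) & 'M' when m is not zero, and

the empty string ('') when m is zero.

duSecondCanonicalFragmentMap (s) → a literal matching duSecondFrag, or the empty string

Arguments: s is a nonnegative decimal number.

Maps a nonnegative decimal number, presumably the second normalized value from the seconds of a duration value, to a duSecondFrag, a fragment of a duration lexical representation.

Return

unsignedNoDecimalPtCanonicalMap(s) & 'S' when s is a non-zero integer,

unsignedDecimalPtCanonicalMap(s) & 'S' when s is not an integer, and

the empty string ('') when s is zero.

duTimeCanonicalFragmentMap (h, m, s) → a literal matching duTimeFrag, or the empty string

Arguments: h is a nonnegative integer; m is a nonnegative integer; s is a nonnegative decimal number.

Maps three nonnegative numbers, presumably the hour, minute, and second normalized values from a duration's seconds, to a duTimeFrag, a fragment of a duration lexical representation.

Return

'T' & duHourCanonicalFragmentMap(h) & duMinuteCanonicalFragmentMap(m) & duSecondCanonicalFragmentMap(s) when h, m, and s are not all zero, and

the empty string ('') when all arguments are zero.

duDayTimeCanonicalFragmentMap (ss) → a literal matching duDayTimeFrag

Arguments: ss is a nonnegative decimal number.

Maps a nonnegative decimal number, presumably the absolute value of the seconds of a duration value, to a duDayTimeFrag, a fragment of a duration lexical representation.

Let

d be ss div 86400,

h be (ss mod 86400) div 3600,

m be (ss mod 3600) div 60, and

s be ss mod 60.

Return

duDayCanonicalFragmentMap(d) & duTimeCanonicalFragmentMap(h, m, s) when ss is not zero and

'T0S' when ss is zero.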

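The duration Canonical Mapping

durationCanonicalMap (v) → a literal matching durationLexicalRep

Arguments: v is a complete duration value.

Maps a duration's property values to durationLexicalRep fragments and combines the fragments into a complete durationLexicalRep.

Let

m be v's months,

s be v's seconds, and

sgn be '-' if m or s is negative and the empty string ('') otherwise.

Return

sgn & 'P' & duYearMonthCanonicalFragmentMap(| m |) & duDayTimeCanonicalFragmentMap(| s |) when neither m nor s is zero,

sgn & 'P' & duYearMonthCanonicalFragmentMap(| m |) when m is not zero but s is, and

sgn & 'P' & duDayTimeCanonicalFragmentMap(| s |) when m is zero.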
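A compact Python sketch of the duration canonical mapping follows. The names are hypothetical, the per-unit fragment functions are folded into two helpers, and `decimal.Decimal` is assumed as the representation of the seconds value (subject to the default context precision), so that fractional seconds print without binary rounding noise.

```python
from decimal import Decimal


def du_year_month_fragment(ym: int) -> str:
    y, m = ym // 12, ym % 12
    if y and m:
        return f"{y}Y{m}M"
    if y:
        return f"{y}Y"
    return f"{m}M"


def du_day_time_fragment(ss: Decimal) -> str:
    if ss == 0:
        return "T0S"
    d = int(ss // 86400)
    h = int((ss % 86400) // 3600)
    m = int((ss % 3600) // 60)
    s = ss % 60
    # normalize() drops trailing zeros; the 'f' format avoids exponent notation.
    sec = f"{format(s.normalize(), 'f')}S" if s else ""
    time = (f"{h}H" if h else "") + (f"{m}M" if m else "") + sec
    return (f"{d}D" if d else "") + ("T" + time if time else "")


def duration_canonical_map(months: int, seconds) -> str:
    seconds = Decimal(seconds)
    sgn = "-" if months < 0 or seconds < 0 else ""
    if months != 0 and seconds != 0:
        return (sgn + "P" + du_year_month_fragment(abs(months))
                + du_day_time_fragment(abs(seconds)))
    if months != 0:
        return sgn + "P" + du_year_month_fragment(abs(months))
    return sgn + "P" + du_day_time_fragment(abs(seconds))
```

A zero duration is rendered as `PT0S`, and zero-valued units are omitted everywhere else, so 93784.5 seconds becomes `P1DT2H3M4.5S`.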

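The yearMonthDuration Canonical Mapping

yearMonthDurationCanonicalMap (ym) → a literal matching yearMonthDurationLexicalRep

Arguments: ym is a complete yearMonthDuration value.

Maps a yearMonthDuration's months value to a yearMonthDurationLexicalRep. (The seconds value is necessarily zero and is ignored.) yearMonthDurationCanonicalMap is a restriction of durationCanonicalMap.

Let

m be ym's months and

sgn be '-' if m is negative and the empty string ('') otherwise.

Return sgn & 'P' & duYearMonthCanonicalFragmentMap(| m |).

The dayTimeDuration Canonical Mapping

dayTimeDurationCanonicalMap (dt) → a literal matching dayTimeDurationLexicalRep

Arguments: dt is a complete dayTimeDuration value.

Maps a dayTimeDuration's seconds value to a dayTimeDurationLexicalRep. (The months value is necessarily zero and is ignored.) dayTimeDurationCanonicalMap is a restriction of durationCanonicalMap.

Let

s be dt's seconds and

sgn be '-' if s is negative and the empty string ('') otherwise.

Return sgn & 'P' & duDayTimeCanonicalFragmentMap(| s |).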

Date/time-related Definitions Normalization of property values

When adding and subtracting numbers from date/time properties, the immediate results may not conform to the limits specified. Accordingly, the following procedures are used to normalize potential property values to corresponding values that do conform to the appropriate limits. Normalization is required when dealing with timezone changes (as when converting to UTC from local values) and when adding values to or subtracting them from values.

Date/time Datatype Normalizing Procedures

normalizeMonth (yr: an integer, mo: an integer)

If month (mo) is out of range, adjust month and year (yr) accordingly; otherwise, make no change.

Add (mo − 1) div 12 to yr.

Set mo to (mo − 1) mod 12 + 1 .
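The two steps of normalizeMonth can be sketched as follows. This is a minimal illustration; the function name `normalize_month` and the tuple return value are assumptions. It relies on div and mod rounding toward negative infinity, which is how Python's `//` and `%` behave for integers.

```python
def normalize_month(yr: int, mo: int) -> tuple[int, int]:
    """Carry month over- or underflow into the year."""
    yr += (mo - 1) // 12        # add (mo - 1) div 12 to yr
    mo = (mo - 1) % 12 + 1      # set mo to (mo - 1) mod 12 + 1
    return yr, mo
```

For example, month 13 of 2000 becomes month 1 of 2001, and month 0 of 2000 becomes month 12 of 1999.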

normalizeDay (yr: an integer, mo: an integer, da: an integer)

If month (mo) is out of range, or day (da) is out of range for the appropriate month, then adjust values accordingly; otherwise make no change.

normalizeMonth(yr, mo)

Repeat until da is positive and not greater than daysInMonth(yr, mo):

If da exceeds the upper limit from the table then:

Subtract that limit from da.

Add 1 to mo.

normalizeMonth(yr, mo)

If da is not positive then:

Subtract 1 from mo.

normalizeMonth(yr, mo)

Add the new upper limit from the table to da.

normalizeMinute (yr: an integer, mo: an integer, da: an integer, hr: an integer, mi: an integer)

Normalizes minute, hour, month, and year values to values that obey the appropriate constraints.

Add mi div 60 to hr.

Set mi to mi mod 60 .

Add hr div 24 to da.

Set hr to hr mod 24 .

normalizeDay(yr, mo, da).

normalizeSecond (yr: an integer, mo: an integer, da: an integer, hr: an integer, mi: an integer, se: a decimal number)

Normalizes second, minute, hour, month, and year values to values that obey the appropriate constraints. (This algorithm ignores leap seconds.)

Add se div 60 to mi.

Set se to se mod 60 .

normalizeMinute(yr, mo, da, hr, mi).

Auxiliary Functions

Date/time Auxiliary Functions

daysInMonth (y: an integer, m: an integer between 1 and 12): an integer between 28 and 31 inclusive

Returns the number of the last day of the month for any combination of year and month.

Return:

28 when m is 2 and y is not evenly divisible by 4, or is evenly divisible by 100 but not by 400, or is absent,

29 when m is 2 and y is evenly divisible by 400, or is evenly divisible by 4 but not by 100,

30 when m is 4, 6, 9, or 11,

31 otherwise (m is 1, 3, 5, 7, 8, 10, or 12)
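The four cases above translate directly into a small function. This is a sketch under the assumption that an absent year is modeled as `None`; the name `days_in_month` is illustrative.

```python
from typing import Optional


def days_in_month(y: Optional[int], m: int) -> int:
    """Number of the last day of month m in year y (y may be absent)."""
    if m == 2:
        if y is None:
            return 28                       # absent year: never a leap year
        leap = y % 400 == 0 or (y % 4 == 0 and y % 100 != 0)
        return 29 if leap else 28
    if m in (4, 6, 9, 11):
        return 30
    return 31                               # m is 1, 3, 5, 7, 8, 10, or 12
```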

newDateTime an instance of the Yran integer Moan integer between 1 and 12 inclusive Daan integer between 1 and 31 inclusive Hran integer between 0 and 24 inclusive Mian integer between 0 and 59 inclusive Sean decimal number greater than or equal to 0 and less than 60 Tzan duration between -PT14H and PT14H, inclusive.

Returns an instance of the with property values as specified in the arguments. If an argument is omitted, the corresponding property is set to absent.

dt be an instance of the

the property of dt be Yr

the property of dt be Mo

the property of dt be Da

the property of dt be Hr

the property of dt be Mi

the property of dt be Se

the property of dt be Tz

Return dt.

Adding durations to dateTimes

Given a S and a D, function specifies how to compute a E, where E is the end of the time period with start S and duration D i.e. E = S + D. Such computations are used, for example, to determine whether a is within a specific time period. This algorithm can also be applied, when applications need the operation, to the addition of s to the datatypes , , , and , each of which can be viewed as denoting a set of s. In such cases, the addition is made to the first or starting in the set. Note that the extension of this algorithm to types other than is not needed for schema-validity assessment.

Essentially, this calculation adds the and properties of the value separately to the value. The value is added to the starting value first. If the day is out of range for the new month value, it is pinned to be within range. Thus April 31 turns into April 30. Then the value is added. This latter addition can cause the year, month, day, hour, and minute to change.

Leap seconds are ignored by the computation. All calculations use 60 seconds per minute.

Thus the addition of either PT1M or PT60S to any dateTime will always produce the same result. This is a special definition of addition which is designed to match common practice, and—most importantly—be stable over time.

A definition that attempted to take leap-seconds into account would need to be constantly updated, and could not predict the results of future implementation's additions. The decision to introduce a leap second in is the responsibility of the . They make periodic announcements as to when leap seconds are to be added, but this is not known more than a year in advance. For more information on leap seconds, see .

Adding to dateTimePlusDuration a value dua valuedta value

Adds a to a value, producing another value.

yr be dt's ,

mo be dt's ,

da be dt's ,

hr be dt's ,

mi be dt's , and

se be dt's .

tz be dt's .

Add du's to mo.

normalizeMonth(yr, mo). (I.e., carry any over- or underflow, adjust month.)

Set da to min(da, daysInMonth(yr, mo)). (I.e., pin the value if necessary.)

Add du's to se.

normalizeSecond(yr, mo, da, hr, mi, se). (I.e., carry over- or underflow of seconds up to minutes, hours, etc.)

Return newDateTime(yr, mo, da, hr, mi, se, tz)

This algorithm may be applied to date/time types other than , by

For each absent property, supply the minimum legal value for that property (1 for years, months, days, 0 for hours, minutes, seconds).

Call the function.

For each property absent in the initial value, set the corresponding property in the result value to absent.

Examples:

dateTime	duration	result
2000-01-12T12:13:14Z	P1Y3M5DT7H10M3.3S	2001-04-17T19:23:17.3Z
2000-01	-P3M	1999-10
2000-01-12	PT33H	2000-01-13

Note that the addition defined by differs from addition on integers or real numbers in not being commutative. The order of addition of durations to instants is significant. For example, there are cases where: ((dateTime + duration1) + duration2) != ((dateTime + duration2) + duration1)

Example:

(2000-03-30 + P1D) + P1M = 2000-03-31 + P1M = 2000-04-30

(2000-03-30 + P1M) + P1D = 2000-04-30 + P1D = 2000-05-01
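The addition procedure and the examples above can be sketched end to end in Python. This is an illustration under stated assumptions, not a normative implementation: a duration is passed as a hypothetical pair (months, seconds), a dateTime as a seven-element tuple, absent properties are assumed to have been filled in already, and all function names are invented for the sketch. Seconds use `Fraction` so that `//` and `%` round toward negative infinity for negative durations.

```python
from fractions import Fraction


def days_in_month(y, m):
    if m == 2:
        leap = y is not None and (y % 400 == 0 or (y % 4 == 0 and y % 100 != 0))
        return 29 if leap else 28
    return 30 if m in (4, 6, 9, 11) else 31


def normalize_month(yr, mo):
    return yr + (mo - 1) // 12, (mo - 1) % 12 + 1


def normalize_day(yr, mo, da):
    yr, mo = normalize_month(yr, mo)
    while not 1 <= da <= days_in_month(yr, mo):
        if da > days_in_month(yr, mo):
            da -= days_in_month(yr, mo)
            yr, mo = normalize_month(yr, mo + 1)
        else:
            yr, mo = normalize_month(yr, mo - 1)
            da += days_in_month(yr, mo)
    return yr, mo, da


def normalize_second(yr, mo, da, hr, mi, se):
    mi, se = mi + se // 60, se % 60     # carry seconds into minutes
    hr, mi = hr + mi // 60, mi % 60     # carry minutes into hours
    da, hr = da + hr // 24, hr % 24     # carry hours into days
    yr, mo, da = normalize_day(yr, mo, da)
    return yr, mo, da, hr, mi, se


def date_time_plus_duration(du_months, du_seconds, dt):
    yr, mo, da, hr, mi, se, tz = dt
    yr, mo = normalize_month(yr, mo + du_months)   # add months, carry
    da = min(da, days_in_month(yr, mo))            # pin the day
    se = Fraction(se) + Fraction(du_seconds)       # add seconds
    return normalize_second(yr, mo, da, hr, mi, se) + (tz,)
```

With these definitions, adding P1D and then P1M to 2000-03-30 gives 2000-04-30, while adding P1M and then P1D gives 2000-05-01, which reproduces the non-commutativity example.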

Time on timeline Time on Timeline for Date/time Seven-property Model Datatypes timeOnTimeline decimal numbera decimal number dta value

Maps a value to the decimal number representing its position on the time line.

yr be 1971 when dt's is absent, and dt's − 1 otherwise,

mo be 12 or dt's , similarly,

da be daysInMonth(yr+1, mo) − 1 or (dt's ) − 1 , similarly,

hr be 0 or dt's , similarly,

mi be 0 or dt's , similarly, and

se be 0 or dt's , similarly.

Subtract from mi when is not absent.

(year)

Set ToTl to 31536000 × yr .

(Leap-year days, month, and day)

Add 86400 × (yr div 400 − yr div 100 + yr div 4) to ToTl.

Add 86400 × Sum_{m < mo} daysInMonth(yr + 1, m) to ToTl.

Add 86400 × da to ToTl.

(hour, minute, and second)

Add 3600 × hr + 60 × mi + se to ToTl.

Return ToTl.
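The timeline computation can be sketched as below. Assumptions made for the sketch: absent properties are modeled as `None`, the timezone offset is an integer number of minutes, and the names `time_on_timeline` and `days_in_month` are illustrative.

```python
def days_in_month(y, m):
    if m == 2:
        leap = y is not None and (y % 400 == 0 or (y % 4 == 0 and y % 100 != 0))
        return 29 if leap else 28
    return 30 if m in (4, 6, 9, 11) else 31


def time_on_timeline(year=None, month=None, day=None,
                     hour=None, minute=None, second=None, tz_minutes=None):
    """Position on the time line, in seconds, following the steps above."""
    yr = 1971 if year is None else year - 1
    mo = 12 if month is None else month
    da = days_in_month(yr + 1, mo) - 1 if day is None else day - 1
    hr = 0 if hour is None else hour
    mi = 0 if minute is None else minute
    se = 0 if second is None else second
    if tz_minutes is not None:
        mi -= tz_minutes                                    # shift to UTC
    to_tl = 31536000 * yr                                   # whole 365-day years
    to_tl += 86400 * (yr // 400 - yr // 100 + yr // 4)      # leap-year days
    to_tl += 86400 * sum(days_in_month(yr + 1, m) for m in range(1, mo))
    to_tl += 86400 * da
    to_tl += 3600 * hr + 60 * mi + se
    return to_tl
```

Under this sketch 0001-01-01T00:00:00 maps to 0, and two values that denote the same instant in different timezones map to the same number.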

Lexical mappings Partial Date/time Lexical Mappings yearFragValue integeran integer YRmatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

Return (YR)

monthFragValue integeran integer MOmatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

Return (MO)

dayFragValue integeran integer DAmatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

Return (DA)

hourFragValue integeran integer HRmatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

Return (HR)

minuteFragValue integeran integer MImatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

Return (MI)

secondFragValue decimal numbera decimal number SEmatches

Maps a , part of a 's , onto a decimal number, presumably the property of a value.

Return

(SE) when no decimal point occurs in SE, and

(SE) otherwise.

timezoneFragValue integeran integer TZmatches

Maps a , part of a 's , onto an integer, presumably the property of a value.

TZ necessarily consists of either just Z, or a sign (+ or -) followed by an instance H of , a colon, and an instance M of .

Return

0 when TZ is Z,

−((H) × 60 + (M)) when the sign is -, and

(H) × 60 + (M) otherwise.
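The three cases above can be sketched as a small function. This is a minimal sketch assuming the input has already been validated against the timezone lexical form; the name `timezone_frag_value` mirrors the mapping but the Python signature is an assumption.

```python
def timezone_frag_value(tz: str) -> int:
    """Map 'Z' or '(+|-)hh:mm' to an offset in minutes."""
    if tz == "Z":
        return 0
    sign, hours, minutes = tz[0], int(tz[1:3]), int(tz[4:6])
    offset = hours * 60 + minutes
    return -offset if sign == "-" else offset
```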

Lexical Mapping dateTimeLexicalMap a complete value LEXmatches

Maps a to a value.

LEX necessarily includes an instance Y of , an instance MO of , and an instance D of hyphen-separated, an instance H of , an instance MI of , and an instance S of , colon-separated and optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return ((Y), (MO), (D), (H), (MI), (S), tz)
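A lexical mapping of this shape can be sketched with a regular expression that isolates the seven fragments. The pattern below is an approximation written for this illustration, not the specification's grammar: it does not check value ranges such as month 1 to 12, and the function name and tuple layout are assumptions.

```python
import re
from decimal import Decimal

# Year, month, day, 'T', hour, minute, second, optional timezone.
_DATE_TIME = re.compile(
    r"(?P<Y>-?[0-9]{4,})-(?P<MO>[0-9]{2})-(?P<D>[0-9]{2})"
    r"T(?P<H>[0-9]{2}):(?P<MI>[0-9]{2}):(?P<S>[0-9]{2}(?:\.[0-9]+)?)"
    r"(?P<T>Z|(?P<sign>[+-])(?P<tzh>[0-9]{2}):(?P<tzm>[0-9]{2}))?"
)


def date_time_lexical_map(lex: str):
    m = _DATE_TIME.fullmatch(lex)
    if m is None:
        raise ValueError(f"not a dateTime literal: {lex!r}")
    if m["T"] is None:
        tz = None                      # timezone absent
    elif m["T"] == "Z":
        tz = 0
    else:
        tz = int(m["tzh"]) * 60 + int(m["tzm"])
        if m["sign"] == "-":
            tz = -tz
    return (int(m["Y"]), int(m["MO"]), int(m["D"]),
            int(m["H"]), int(m["MI"]), Decimal(m["S"]), tz)
```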

Lexical Mapping timeLexicalMap a complete value LEXmatches

Maps a to a value.

LEX necessarily includes an instance H of , an instance M of , and an instance S of , colon-separated and optionally followed by an instance T of . tz be (T) when T is present, otherwise absent

Return (absent, absent, absent, (H), (M), (S), tz).

Lexical Mapping dateLexicalMap a complete value LEXmatches

Maps a to a value.

LEX necessarily includes an instance Y of , an instance M of , and an instance D of , hyphen-separated and optionally followed by an instance T of . tz be (T) when T is present, otherwise absent

Return ((Y), (M), (D), absent, absent, absent, tz).

Lexical Mapping gYearMonthLexicalMap a complete value LEXmatches

Maps a to a value.

LEX necessarily includes an instance Y of and an instance M of , hyphen-separated and optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return ((Y), (M), absent, absent, absent, absent, tz)

Lexical Mapping gYearLexicalMap a complete value LEX matches

Maps a to a value.

LEX necessarily includes an instance Y of , optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return ((Y), absent, absent, absent, absent, absent, tz).

Lexical Mapping gMonthDayLexicalMap a complete value LEXmatches

Maps a to a value.

LEX necessarily includes an instance M of and an instance D of , hyphen-separated and optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return (absent, (M), (D), absent, absent, absent, tz)

Lexical Mapping gDayLexicalMap a complete value LEX matches

Maps a to a value.

LEX necessarily includes an instance D of , optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return (absent, absent, (D), absent, absent, absent, tz).

Lexical Mapping gMonthLexicalMap a complete value LEX matches

Maps a to a value.

LEX necessarily includes an instance M of , optionally followed by an instance T of . tz be (T) when T is present, otherwise absent.

Return (absent, (M), absent, absent, absent, absent, tz)

Canonical Mappings Auxiliary Functions for Date/time Canonical Mappings unsTwoDigitCanonicalFragmentMap matches ia nonnegative integer less than 100

Maps a nonnegative integer less than 100 onto an unsigned always-two-digit numeral.

Return (i div 10) & (i mod 10)

fourDigitCanonicalFragmentMap matches ian integer whose absolute value is less than 10000

Maps an integer between -10000 and 10000 onto an always-four-digit numeral.

Return

- & (−i div 100) & (−i mod 100) when i is negative,

(i div 100) & (i mod 100) otherwise.
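The two auxiliary mappings can be sketched as follows. The function names are illustrative, and the sketch assumes the four-digit form is built from two two-digit halves, as the div 100 and mod 100 split suggests.

```python
def uns_two_digit(i: int) -> str:
    """Nonnegative integer below 100 as exactly two digits."""
    return str(i // 10) + str(i % 10)


def four_digit(i: int) -> str:
    """Integer with |i| < 10000 as an optional '-' and exactly four digits."""
    if i < 0:
        return "-" + uns_two_digit((-i) // 100) + uns_two_digit((-i) % 100)
    return uns_two_digit(i // 100) + uns_two_digit(i % 100)
```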

Partial Date/time Canonical Mappings yearCanonicalFragmentMap matches yan integer

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return

(y) when |y| > 9999 .

(y) otherwise.

monthCanonicalFragmentMap matches man integer between 1 and 12 inclusive

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return (m)

dayCanonicalFragmentMap matches dan integer between 1 and 31 inclusive (may be limited further depending on associated and )

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return (d)

hourCanonicalFragmentMap matches han integer between 0 and 23 inclusive.

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return (h)

minuteCanonicalFragmentMap matches man integer between 0 and 59 inclusive.

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return (m)

secondCanonicalFragmentMap matches sa nonnegative decimal number less than 70

Maps a decimal number, presumably the property of a value, onto a , part of a 's .

Return

(s) when s is an integer, and

(s div 1) & . & (s mod 1) otherwise.
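The split into a whole part and a fractional part can be sketched with `Decimal`. This is an illustration only: the name `second_canonical_fragment` is hypothetical, and dropping trailing zeros from the fraction is an assumption about the canonical form.

```python
from decimal import Decimal


def second_canonical_fragment(s: Decimal) -> str:
    """Two-digit whole seconds, plus '.' and fraction digits when needed."""
    whole = int(s // 1)            # s div 1
    frac = s % 1                   # s mod 1
    if frac == 0:
        return f"{whole:02d}"
    # format(..., "f") yields '0.xxx'; keep only the digits after '0.'.
    digits = format(frac.normalize(), "f")[2:]
    return f"{whole:02d}.{digits}"
```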

timezoneCanonicalFragmentMap matches tan integer between −840 and 840 inclusive

Maps an integer, presumably the property of a value, onto a , part of a 's .

Return

Z when t is zero,

- & (−t div 60) & : & (−t mod 60) when t is negative, and

+ & (t div 60) & : & (t mod 60) otherwise.
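A sketch of the timezone canonical fragment, assuming the offset is an integer number of minutes between −840 and 840 and that hours and minutes are each written with two digits. The function name is illustrative.

```python
def timezone_canonical_fragment(t: int) -> str:
    """Offset in minutes to 'Z', '+hh:mm', or '-hh:mm'."""
    if t == 0:
        return "Z"
    sign = "-" if t < 0 else "+"
    t = abs(t)
    return f"{sign}{t // 60:02d}:{t % 60:02d}"
```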

Canonical Mapping dateTimeCanonicalMap matches dta complete value

Maps a value to a .

DT be (dt's ) & - & (dt's ) & - & (dt's ) & T & (dt's ) & : & (dt's ) & : & (dt's ) . Return

DT when dt's is absent, and

DT & (dt's ) otherwise.

Canonical Mapping timeCanonicalMap matches tia complete value

Maps a value to a .

T be (ti's ) & : & (ti's ) & : & (ti's ) . Return

T when ti's is absent, and

T & (ti's ) otherwise.

Canonical Mapping dateCanonicalMap matches daa complete value

Maps a value to a .

D be (da's ) & - & (da's ) & - & (da's ) . Return

D when da's is absent, and

D & (da's ) otherwise.

Canonical Mapping gYearMonthCanonicalMap matches yma complete value

Maps a value to a .

YM be (ym's ) & - & (ym's ) . Return

YM when ym's is absent, and

YM & (ym's ) otherwise.

Canonical Mapping gYearCanonicalMap matches gYa complete value

Maps a value to a .

Return

(gY's ) when gY's is absent, and

(gY's ) & (gY's ) otherwise.

Canonical Mapping gMonthDayCanonicalMap matches mda complete value

Maps a value to a .

MD be -- & (md's ) & - & (md's ) . Return

MD when md's is absent, and

MD & (md's ) otherwise.

Canonical Mapping gDayCanonicalMap matches gDa complete value

Maps a value to a .

Return

--- & (gD's ) when gD's is absent, and

--- & (gD's ) & (gD's ) otherwise.

Canonical Mapping gMonthCanonicalMap matches gM a complete value

Maps a value to a .

Return

-- & (gM's ) when gM's is absent, and

-- & (gM's ) & (gM's ) otherwise.

Lexical and Canonical Mappings for Other Datatypes

The following functions are used with various datatypes neither numeric nor date/time related.

Lexical Mapping stringLexicalMap A value LEXa matching

Maps a matching the production to a value.

Return LEX. (The function is the identity function on the domain.)

Lexical Mapping booleanLexicalMap A value LEXa matching

Maps a matching the production to a value.

Return

true when LEX is true or 1 , and

false otherwise (LEX is false or 0).

Canonical Mapping stringCanonicalMap matches sa value

Maps a value to a .

Return s. (The function is the identity function on the domain.)

Canonical Mapping booleanCanonicalMap matches ba value

Maps a value to a .

Return

true when b is true, and

false otherwise (b is false).

Lexical and canonical mappings for

The for maps each pair of hexadecimal digits to an octet, in the conventional way:

Lexical Mapping for hexBinary hexBinaryMap A sequence of binary octets in the form of a value LEXa matching

Maps a matching the production to a sequence of octets in the form of a value.

LEX necessarily includes a sequence of zero or more substrings matching the production. o be the sequence of octets formed by applying to each in LEX, in order, and concatenating the results. Return o.

The auxiliary functions and are used by .

Mappings for hexadecimal digits hexOctetMap octet A single binary octet LEXa matching

Maps a matching the production to a single octet.

LEX necessarily includes exactly two hexadecimal digits. Let d0 be the first hexadecimal digit in LEX. Let d1 be the second hexadecimal digit in LEX. Return the octet whose four high-order bits are (d0) and whose four low-order bits are (d1).

hexDigitMap a bit-sequence of length four a sequence of four binary digits d a hexadecimal digit

Maps a hexadecimal digit (a character matching the production) to a sequence of four binary digits.

Return

0000 when d = 0,

0001 when d = 1,

0010 when d = 2,

0011 when d = 3,

...

1110 when d = E or e,

1111 when d = F or f.
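The lexical mapping for hexadecimal literals can be sketched by composing a digit map and an octet map. The function names follow the mappings above, but the Python signatures, the use of `bytes` for the octet sequence, and the error handling are assumptions of this sketch.

```python
def hex_digit_map(d: str) -> int:
    """Hexadecimal digit (either case) to its four-bit value."""
    return "0123456789abcdef".index(d.lower())   # ValueError if not a hex digit


def hex_octet_map(pair: str) -> int:
    """Two hexadecimal digits to one octet: high nibble, then low nibble."""
    return (hex_digit_map(pair[0]) << 4) | hex_digit_map(pair[1])


def hex_binary_map(lex: str) -> bytes:
    """Whole literal to a sequence of octets."""
    if len(lex) % 2 != 0:
        raise ValueError("hexBinary literals have an even number of digits")
    return bytes(hex_octet_map(lex[i:i + 2]) for i in range(0, len(lex), 2))
```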

The for uses only the uppercase forms of A-F.

Canonical Mapping for hexBinary hexBinaryCanonical matches oa value

Maps a value to a literal matching the production.

h be the sequence of literals formed by applying to each octet in o, in order, and concatenating the results. Return h.

Auxiliary procedures for canonical mapping of

hexOctetCanonical matches oa binary octet

Maps a binary octet to a literal matching the production.

lo be the four low-order bits of o, and hi be the four high-order bits. Return (hi) & (lo).

hexDigitCanonical matches ba sequence of four binary digits

Maps a four-bit sequence to a hexadecimal digit (a literal matching the production).

Return

0 when d = 0000,

1 when d = 0001,

2 when d = 0010,

3 when d = 0011,

...

E when d = 1110,

F when d = 1111.
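The canonical direction can be sketched the same way, splitting each octet into its high and low four bits. The names and the use of `bytes` as input are assumptions of this sketch; only uppercase A-F are produced, as stated above.

```python
def hex_digit_canonical(nibble: int) -> str:
    """Four-bit value (0..15) to one uppercase hexadecimal digit."""
    return "0123456789ABCDEF"[nibble]


def hex_octet_canonical(octet: int) -> str:
    hi, lo = octet >> 4, octet & 0x0F
    return hex_digit_canonical(hi) + hex_digit_canonical(lo)


def hex_binary_canonical(octets: bytes) -> str:
    return "".join(hex_octet_canonical(o) for o in octets)
```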

Datatypes and Facets ISO 8601 Date and Time Formats ISO 8601 Conventions

The datatypes , , , , , , , and use lexical formats inspired by . Following , the lexical forms of these datatypes can include only the characters #20 through #7F. This appendix provides more detail on the ISO formats and discusses some deviations from them for the datatypes defined in this specification.

"specifies the representation of dates in the proleptic Gregorian calendar and times and representations of periods of time". The proleptic Gregorian calendar includes dates prior to 1582 (the year it came into use as an ecclesiastical calendar). It should be pointed out that the datatypes described in this specification do not cover all the types of data covered by , nor do they support all the lexical representations for those types of data.

lexical formats are described using "pictures" in which characters are used in place of decimal digits. The allowed decimal digits are (#x30-#x39). For the primitive datatypes , , , , , , and , these characters have the following meanings:

C -- represents a digit used in the thousands and hundreds components, the "century" component, of the time element "year". Legal values are from 0 to 9.

Y -- represents a digit used in the tens and units components of the time element "year". Legal values are from 0 to 9.

M -- represents a digit used in the time element "month". The two digits in a MM format can have values from 1 to 12.

D -- represents a digit used in the time element "day". The two digits in a DD format can have values from 1 to 28 if the month value equals 2, 1 to 29 if the month value equals 2 and the year is a leap year, 1 to 30 if the month value equals 4, 6, 9 or 11, and 1 to 31 if the month value equals 1, 3, 5, 7, 8, 10 or 12.

h -- represents a digit used in the time element "hour". The two digits in a hh format can have values from 0 to 24. If the value of the hour element is 24 then the values of the minutes element and the seconds element must be 00 and 00.

m -- represents a digit used in the time element "minute". The two digits in a mm format can have values from 0 to 59.

s -- represents a digit used in the time element "second". The two digits in a ss format can have values from 0 to 60. In the formats described in this specification the whole number of seconds may be followed by decimal seconds to an arbitrary level of precision. This is represented in the picture by "ss.sss". A value of 60 or more is allowed only in the case of leap seconds.

Strictly speaking, a value of 60 or more is not sensible unless the month and day could represent March 31, June 30, September 30, or December 31 in UTC. Because the leap second is added or subtracted as the last second of the day in UTC time, the long (or short) minute could occur at other times in local time. In cases where the leap second is used with an inappropriate month and day, it, and any fractional seconds, should be considered as added to or subtracted from the following minute.

For all the information items indicated by the above characters, leading zeroes are required where indicated.

In addition to the above, certain characters are used as designators and appear as themselves in lexical formats.

T -- is used as time designator to indicate the start of the representation of the time of day in .

Z -- is used as time-zone designator, immediately (without a space) following a data element expressing the time of day in Coordinated Universal Time () in , , , , , , , and .

In the lexical format for the following characters are also used as designators and appear as themselves in lexical formats:

P -- is used as the time duration designator, preceding a data element representing a given duration of time.

Y -- follows the number of years in a time duration.

M -- follows the number of months or minutes in a time duration.

D -- follows the number of days in a time duration.

H -- follows the number of hours in a time duration.

S -- follows the number of seconds in a time duration.

The values of the Year, Month, Day, Hour and Minutes components are not restricted but allow an arbitrary integer. Similarly, the value of the Seconds component allows an arbitrary decimal. Thus, the lexical format for and datatypes derived from it does not follow the alternative format of § 5.5.3.2.1 of .

Truncated and Reduced Formats

supports a variety of "truncated" formats in which some of the characters on the left of specific formats, for example, the century, can be omitted. Truncated formats are, in general, not permitted for the datatypes defined in this specification with three exceptions. The datatype uses a truncated format for which represents an instant of time that recurs every day. Similarly, the and datatypes use left-truncated formats for . The datatype uses a right and left truncated format for .

also supports a variety of "reduced" or right-truncated formats in which some of the characters to the right of specific formats, such as the time specification, can be omitted. Right truncated formats are also, in general, not permitted for the datatypes defined in this specification with the following exceptions: right-truncated representations of are used as lexical representations for , , .

Deviations from ISO 8601 Formats Sign Allowed

An optional minus sign is allowed immediately preceding, without a space, the lexical representations for , , , , .

No Year Zero

The year "0000" is an illegal year value.

More Than 9999 Years

To accommodate year values greater than 9999, more than four digits are allowed in the year representations of , , , and . This follows .

Time zone permitted

The lexical representations for the datatypes , , , , and permit an optional trailing time zone specification.

Adding durations to dateTimes

Given a S and a D, this appendix specifies how to compute a E where E is the end of the time period with start S and duration D i.e. E = S + D. Such computations are used, for example, to determine whether a is within a specific time period. This appendix also addresses the addition of s to the datatypes , , , and , which can be viewed as a set of s. In such cases, the addition is made to the first or starting in the set.

This is a logical explanation of the process. Actual implementations are free to optimize as long as they produce the same results. The calculation uses the notation S[year] to represent the year field of S, S[month] to represent the month field, and so on. It also depends on the following functions:

fQuotient(a, b) = the greatest integer less than or equal to a/b

fQuotient(-1,3) = -1

fQuotient(0,3)...fQuotient(2,3) = 0

fQuotient(3,3) = 1

fQuotient(3.123,3) = 1

modulo(a, b) = a - fQuotient(a,b)*b

modulo(-1,3) = 2

modulo(0,3)...modulo(2,3) = 0...2

modulo(3,3) = 0

modulo(3.123,3) = 0.123

fQuotient(a, low, high) = fQuotient(a - low, high - low)

fQuotient(0, 1, 13) = -1

fQuotient(1, 1, 13) ... fQuotient(12, 1, 13) = 0

fQuotient(13, 1, 13) = 1

fQuotient(13.123, 1, 13) = 1

modulo(a, low, high) = modulo(a - low, high - low) + low

modulo(0, 1, 13) = 12

modulo(1, 1, 13) ... modulo(12, 1, 13) = 1...12

modulo(13, 1, 13) = 1

modulo(13.123, 1, 13) = 1.123

maximumDayInMonthFor(yearValue, monthValue) =

M := modulo(monthValue, 1, 13)

Y := yearValue + fQuotient(monthValue, 1, 13)

Return a value based on M and Y:

31 M = January, March, May, July, August, October, or December

30 M = April, June, September, or November

29 M = February AND (modulo(Y, 400) = 0 OR (modulo(Y, 100) != 0) AND modulo(Y, 4) = 0)

28 Otherwise
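The helper functions above can be sketched in Python. Python has no overloading by argument count, so the three-argument forms get the hypothetical names `f_quotient_range` and `modulo_range`. The sketch relies on `//` rounding toward negative infinity for `int`, `float`, and `Fraction` operands (it does not for `Decimal`).

```python
from fractions import Fraction


def f_quotient(a, b):
    """Greatest integer less than or equal to a/b."""
    return a // b


def modulo(a, b):
    return a - f_quotient(a, b) * b


def f_quotient_range(a, low, high):
    return f_quotient(a - low, high - low)


def modulo_range(a, low, high):
    return modulo(a - low, high - low) + low


def maximum_day_in_month_for(year_value, month_value):
    m = modulo_range(month_value, 1, 13)
    y = year_value + f_quotient_range(month_value, 1, 13)
    if m in (1, 3, 5, 7, 8, 10, 12):
        return 31
    if m in (4, 6, 9, 11):
        return 30
    if modulo(y, 400) == 0 or (modulo(y, 100) != 0 and modulo(y, 4) == 0):
        return 29
    return 28


# Exact rational input avoids binary floating-point error in the remainder.
exact_remainder = modulo(Fraction("3.123"), 3)
```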

Algorithm

Essentially, this calculation is equivalent to separating D into <year,month> and <day,hour,minute,second> fields. The <year,month> is added to S. If the day is out of range, it is pinned to be within range. Thus April 31 turns into April 30. Then the <day,hour,minute,second> is added. This latter addition can cause the year and month to change.

Leap seconds are handled by the computation by treating them as overflows. Essentially, a value of 60 seconds in S is treated as if it were a duration of 60 seconds added to S (with a zero seconds field). All calculations thereafter use 60 seconds per minute.

The following is the precise specification. These steps must be followed in the same order. If a field in D is not specified, it is treated as if it were zero. If a field in S is not specified, it is treated in the calculation as if it were the minimum allowed value in that field; however, after the calculation is concluded, the corresponding field in E is removed (set to unspecified).

Months (may be modified additionally below)

    temp := S[month] + D[month]
    E[month] := modulo(temp, 1, 13)
    carry := fQuotient(temp, 1, 13)

Years (may be modified additionally below)

    E[year] := S[year] + D[year] + carry

Zone

    E[zone] := S[zone]

Seconds

    temp := S[second] + D[second]
    E[second] := modulo(temp, 60)
    carry := fQuotient(temp, 60)

Minutes

    temp := S[minute] + D[minute] + carry
    E[minute] := modulo(temp, 60)
    carry := fQuotient(temp, 60)

Hours

    temp := S[hour] + D[hour] + carry
    E[hour] := modulo(temp, 24)
    carry := fQuotient(temp, 24)

Days

    if S[day] > maximumDayInMonthFor(E[year], E[month])
        tempDays := maximumDayInMonthFor(E[year], E[month])
    else if S[day] < 1
        tempDays := 1
    else
        tempDays := S[day]
    E[day] := tempDays + D[day] + carry
    START LOOP
        IF E[day] < 1
            E[day] := E[day] + maximumDayInMonthFor(E[year], E[month] - 1)
            carry := -1
        ELSE IF E[day] > maximumDayInMonthFor(E[year], E[month])
            E[day] := E[day] - maximumDayInMonthFor(E[year], E[month])
            carry := 1
        ELSE EXIT LOOP
        temp := E[month] + carry
        E[month] := modulo(temp, 1, 13)
        E[year] := E[year] + fQuotient(temp, 1, 13)
        GOTO START LOOP
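The field-by-field steps above can be transcribed into runnable Python. This is a sketch under stated assumptions: S is a seven-element tuple (year, month, day, hour, minute, second, zone), D is a six-element tuple with unspecified fields already set to zero, and the names `add_duration` and `max_day_in_month` are illustrative. Seconds use `Fraction` so that the floor-based quotient and remainder stay exact.

```python
from fractions import Fraction


def max_day_in_month(year, month):
    """maximumDayInMonthFor; tolerates month values outside 1..12."""
    m = (month - 1) % 12 + 1
    y = year + (month - 1) // 12
    if m == 2:
        leap = y % 400 == 0 or (y % 100 != 0 and y % 4 == 0)
        return 29 if leap else 28
    return 30 if m in (4, 6, 9, 11) else 31


def add_duration(s, d):
    s_year, s_month, s_day, s_hour, s_minute, s_second, s_zone = s
    d_year, d_month, d_day, d_hour, d_minute, d_second = d
    # Months and years
    temp = s_month + d_month
    e_month = (temp - 1) % 12 + 1
    e_year = s_year + d_year + (temp - 1) // 12
    # Seconds, minutes, hours, each passing a carry upward
    temp = Fraction(s_second) + Fraction(d_second)
    e_second, carry = temp % 60, temp // 60
    temp = s_minute + d_minute + carry
    e_minute, carry = temp % 60, temp // 60
    temp = s_hour + d_hour + carry
    e_hour, carry = temp % 24, temp // 24
    # Days: pin the start day into range, then walk months as needed
    temp_days = min(max(s_day, 1), max_day_in_month(e_year, e_month))
    e_day = temp_days + d_day + carry
    while True:
        if e_day < 1:
            e_day += max_day_in_month(e_year, e_month - 1)
            carry = -1
        elif e_day > max_day_in_month(e_year, e_month):
            e_day -= max_day_in_month(e_year, e_month)
            carry = 1
        else:
            break
        temp = e_month + carry
        e_month = (temp - 1) % 12 + 1
        e_year += (temp - 1) // 12
    return (e_year, e_month, e_day, e_hour, e_minute, e_second, s_zone)
```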

Examples:

dateTime	duration	result
2000-01-12T12:13:14Z	P1Y3M5DT7H10M3.3S	2001-04-17T19:23:17.3Z
2000-01	-P3M	1999-10
2000-01-12	PT33H	2000-01-13

Commutativity and Associativity

Time durations are added by simply adding each of their fields, respectively, without overflow.

The order of addition of durations to instants is significant. For example, there are cases where:

((dateTime + duration1) + duration2) != ((dateTime + duration2) + duration1)

Example:

(2000-03-30 + P1D) + P1M = 2000-03-31 + P1M = 2000-04-30

(2000-03-30 + P1M) + P1D = 2000-04-30 + P1D = 2000-05-01

Regular Expressions

A regular expression R is a sequence of characters that denotes a set of strings L(R). When used to constrain a , a regular expression R asserts that only strings in L(R) are valid literals for values of that type.

Unlike some popular regular expression languages (including those defined by Perl and standard Unix utilities), the regular expression language defined here implicitly anchors all regular expressions at the head and tail, as the most common use of regular expressions in is to match entire literals. For example, a datatype derived from such that all values must begin with the character A (#x41) and end with the character Z (#x5a) would be defined with the pattern A.*Z.

In regular expression languages that are not implicitly anchored at the head and tail, it is customary to write the equivalent regular expression as:

^A.*Z$

where "^" anchors the pattern at the head and "$" anchors at the tail.

In those rare cases where an unanchored match is desired, including .* at the beginning and ending of the regular expression will achieve the desired results. For example, a datatype derived from string such that all values must contain at least 3 consecutive A (#x41) characters somewhere within the value could be defined with the pattern .*AAA.* .
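The difference between implicit anchoring and unanchored search can be illustrated with Python's `re` module. This is only an analogy: Python's regular expression syntax is a different language from the one defined here, and the sketch assumes patterns that mean the same thing in both. The helper name `anchored_match` is hypothetical.

```python
import re


def anchored_match(pattern: str, literal: str) -> bool:
    """Whole-literal matching, as if the pattern were written ^(pattern)$."""
    return re.fullmatch(pattern, literal) is not None


# Anchored: the entire literal must start with A and end with Z.
begins_a_ends_z = anchored_match("A.*Z", "ABCZ")

# Unanchored search finds A...Z anywhere inside the string.
found_inside = re.search("A.*Z", "xABCZy") is not None

# Wrapping the pattern in .* recovers "contains" semantics under anchoring.
contains_aaa = anchored_match(".*AAA.*", "xxAAAyy")
```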

A regular expression is composed from zero or more branches, separated by | characters.

Regular Expression

regExp ::= branch ( '|' branch )*

For all branches S, and for all regular expressions T, valid regular expressions R are:	Denoting the set of strings L(R) containing:
(empty string)	the set containing just the empty string
S	all strings in L(S)
S|T	all strings in L(S) and all strings in L(T)

A branch consists of zero or more pieces, concatenated together.

Branch

branch ::= piece*

For all pieces S, and for all branches T, valid branches R are:	Denoting the set of strings L(R) containing:
S	all strings in L(S)
ST	all strings st with s in L(S) and t in L(T)

A piece is an atom, possibly followed by a quantifier.

Piece

piece ::= atom quantifier?

For all atoms S and non-negative integers n, m such that n <= m, valid pieces R are:	Denoting the set of strings L(R) containing:
S	all strings in L(S)
S?	the empty string, and all strings in L(S).
S*	All strings in L(S?) and all strings st with s in L(S*) and t in L(S). ( all concatenations of zero or more strings from L(S) )
S+	All strings st with s in L(S) and t in L(S*). ( all concatenations of one or more strings from L(S) )
S{n,m}	All strings st with s in L(S) and t in L(S{n-1,m-1}). ( All sequences of at least n, and at most m, strings from L(S) )
S{n}	All strings in L(S{n,n}). ( All sequences of exactly n strings from L(S) )
S{n,}	All strings in L(S{n}S*) ( All sequences of at least n, strings from L(S) )
S{0,m}	All strings st with s in L(S?) and t in L(S{0,m-1}). ( All sequences of at most m, strings from L(S) )
S{0,0}	The set containing only the empty string
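The recursive definitions in the table can be checked on small finite languages. The sketch below models L(S) as a Python set of strings and implements only the bounded forms S?, S{n,m}, and S{n}; the unbounded forms S*, S+, and S{n,} denote infinite sets and are left out. All function names are illustrative.

```python
def concat(a: set, b: set) -> set:
    """All strings st with s in a and t in b."""
    return {s + t for s in a for t in b}


def optional(lang: set) -> set:
    """L(S?): the empty string, and all strings in L(S)."""
    return {""} | lang


def repeat(lang: set, n: int, m: int) -> set:
    """L(S{n,m}) for 0 <= n <= m, following the table's recursion."""
    if n == 0 and m == 0:
        return {""}                                        # S{0,0}
    if n == 0:
        return concat(optional(lang), repeat(lang, 0, m - 1))   # S{0,m}
    return concat(lang, repeat(lang, n - 1, m - 1))        # S{n,m}


def exactly(lang: set, n: int) -> set:
    """L(S{n}) is L(S{n,n})."""
    return repeat(lang, n, n)
```

For instance, with L(S) = {"a"}, `repeat({"a"}, 2, 3)` yields the strings "aa" and "aaa".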

The regular expression language in the Perl Programming Language does not include a quantifier of the form S{,m}, since it is logically equivalent to S{0,m}. We have, therefore, left this logical possibility out of the regular expression language defined by this specification.

A quantifier is one of ?, *, +, {n,m} or {n,}, which have the meanings defined in the table above.

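Quantifier

quantifier ::= [?*+] | ( '{' quantity '}' )
quantity ::= quantRange | quantMin | QuantExact
quantRange ::= QuantExact ',' QuantExact
quantMin ::= QuantExact ','
QuantExact ::= [0-9]+

An atom is either a normal character, a character class, or a parenthesized regular expression.

Atom

atom ::= Char | charClass | ( '(' regExp ')' )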

For all normal characters c, character classes C, and regular expressions S, valid atoms R are:	Denoting the set of strings L(R) containing:
c	the single string consisting only of c
C	all strings in L(C)
(S)	all strings in L(S)

A metacharacter is either ., \, ?, *, +, {, }, (, ), |, [, or ]. These characters have special meanings in regular expressions, but can be escaped to form atoms that denote the sets of strings containing only themselves, i.e., an escaped metacharacter behaves like a normal character.

A normal character is any XML character that is not a metacharacter. In regular expressions, a normal character is an atom that denotes the singleton set of strings containing only itself.

Normal Character

Char ::= [^.\?*+{}()|#x5B#x5D]

Note that a normal character can be represented either as itself, or with a character reference.

Character Classes

A character class is an atom R that identifies a set of characters C(R). The set of strings L(R) denoted by a character class R contains one single-character string "c" for each character c in C(R).

Character Class
charClass ::= charClassEsc | charClassExpr | WildcardEsc

A character class is either a character class escape or a character class expression.

A character class expression is a character group surrounded by [ and ] characters. For all character groups G, [G] is a valid character class expression, identifying the set of characters C([G]) = C(G).

Character Class Expression
charClassExpr ::= '[' charGroup ']'

A character group is either a positive character group, a negative character group, or a character class subtraction.

Character Group
charGroup ::= posCharGroup | negCharGroup | charClassSub

A positive character group consists of one or more character ranges or character class escapes, concatenated together. A positive character group identifies the set of characters containing all of the characters in all of the sets identified by its constituent ranges or escapes.

Positive Character Group
posCharGroup ::= ( charRange | charClassEsc )+

For all character ranges R, all character class escapes E, and all positive character groups P, valid positive character groups G are:	Identifying the set of characters C(G) containing:
R	all characters in C(R).
E	all characters in C(E).
RP	all characters in C(R) and all characters in C(P).
EP	all characters in C(E) and all characters in C(P).

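A negative character group is a positive character group preceded by the ^ character. For all positive character groups P, ^P is a valid negative character group, and C(^P) contains all XML characters that are not in C(P).

Negative Character Group
negCharGroup ::= '^' posCharGroup

A character class subtraction is a character class expression subtracted from a positive character group or negative character group, using the - character.

Character Class Subtraction
charClassSub ::= ( posCharGroup | negCharGroup ) '-' charClassExpr

For any positive character group or negative character group G, and any character class expression C, G-C is a valid character class subtraction, identifying the set of all characters in C(G) that are not also in C(C).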
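
The set identified by a subtraction G-C is an ordinary set difference. The Python sketch below models C(G) and C(C) as Python sets of characters; the helper names `char_range` and `subtract` are assumptions made for illustration, and the example expression [a-z-[aeiou]] is not taken from this specification.

```python
def char_range(start, end):
    """Return the set of characters with code points from start to end inclusive."""
    return {chr(cp) for cp in range(ord(start), ord(end) + 1)}


def subtract(group, char_class):
    """C(G-C): all characters in C(G) that are not also in C(C)."""
    return group - char_class


# [a-z-[aeiou]]: lowercase ASCII letters other than the five vowels.
consonants = subtract(char_range("a", "z"), set("aeiou"))
print(sorted(consonants))
```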

A character range R identifies a set of characters C(R) containing all XML characters with UCS code points in a specified range.

Character Range
charRange ::= seRange | XmlCharIncDash
seRange ::= charOrEsc '-' charOrEsc
charOrEsc ::= XmlChar | SingleCharEsc
XmlChar ::= [^\#x2D#x5B#x5D]
XmlCharIncDash ::= [^\#x5B#x5D]

A single XML character is a character range that identifies the set of characters containing only itself. All XML characters are valid character ranges, except as follows:

The [, ], - and \ characters are not valid character ranges;

The ^ character is only valid at the beginning of a positive character group if it is part of a negative character group; and

The - character is a valid character range only at the beginning or end of a positive character group.

The grammar for character range as given above is ambiguous, but the second and third bullets above together remove the ambiguity.

A character range may also be written in the form s-e, identifying the set that contains all XML characters with UCS code points greater than or equal to the code point of s, but not greater than the code point of e.

s-e is a valid character range iff:

s is a single character escape, or an XML character;

s is not \;

If s is the first character in a character class expression, then s is not ^;

e is a single character escape, or an XML character;

e is not \ or [; and

The code point of e is greater than or equal to the code point of s.

The code point of a single character escape is the code point of the single character in the set of characters that it identifies.
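
The conditions on s-e listed above can be sketched as a small predicate. This Python example is a simplification under stated assumptions: it handles only the case where s and e are each a single XML character (escapes are not resolved), and the function name `is_valid_range` and its `s_is_first` flag are hypothetical.

```python
def is_valid_range(s, e, s_is_first=False):
    """Check the listed conditions for s-e when s and e are single characters.

    s_is_first says whether s is the first character of the enclosing
    character class expression.
    """
    if len(s) != 1 or len(e) != 1:
        raise ValueError("this sketch handles single characters only")
    if s == "\\" or e in ("\\", "["):
        return False
    if s_is_first and s == "^":
        return False
    # The code point of e must not be less than the code point of s.
    return ord(e) >= ord(s)
```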

Character Class Escapes

A character class escape is a short sequence of characters that identifies a predefined character class. The valid character class escapes are the single character escapes, the multi-character escapes, and the category escapes (including the block escapes).

Character Class Escape
charClassEsc ::= ( SingleCharEsc | MultiCharEsc | catEsc | complEsc )

A single character escape identifies a set containing only one character, usually because that character is difficult or impossible to write directly into a regular expression.

Single Character Escape
SingleCharEsc ::= '\' [nrt\|.?*+(){}#x2D#x5B#x5D#x5E]

The valid single character escapes are:	Identifying the set of characters C(R) containing:
`\n`	the newline character (#xA)
`\r`	the return character (#xD)
`\t`	the tab character (#x9)
`\\`	\
`\|`	|
`\.`	.
`\-`	-
`\^`	^
`\?`	?
`\*`	*
`\+`	+
`\{`	{
`\}`	}
`\(`	(
`\)`	)
`\[`	[
`\]`	]
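
The table above amounts to a fixed lookup from each escape to the one character it identifies. A minimal Python sketch of that lookup follows; the names `SINGLE_CHAR_ESCAPES` and `identified_character` are assumptions introduced for this example.

```python
# Each single character escape mapped to the one character it identifies.
SINGLE_CHAR_ESCAPES = {"\\n": "\n", "\\r": "\r", "\\t": "\t"}
SINGLE_CHAR_ESCAPES.update({"\\" + ch: ch for ch in "\\|.-^?*+{}()[]"})


def identified_character(escape):
    """Return the only character in C(R) for a single character escape R."""
    try:
        return SINGLE_CHAR_ESCAPES[escape]
    except KeyError:
        raise ValueError(f"not a single character escape: {escape!r}") from None
```

Anything outside the table, such as a multi-character escape, is rejected rather than guessed at.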

The Unicode Character Database specifies a number of possible values for the "General Category" property and provides mappings from code points to specific character properties. The set containing all characters that have property X can be identified with a category escape \p{X}. The complement of this set is specified with the category escape \P{X}. ([\P{X}] = [^\p{X}]).

Category Escape
catEsc ::= '\p{' charProp '}'
complEsc ::= '\P{' charProp '}'
charProp ::= IsCategory | IsBlock

The Unicode Character Database is subject to future revision. For example, the mapping from code points to character properties might be updated. All processors support the character properties defined in the version of the Unicode Character Database cited in the normative references. However, implementors are encouraged to support the character properties defined in any future version.

The following table specifies the recognized values of the "General Category" property.

Category	Property	Meaning
Letters	L	All Letters
	Lu	uppercase
	Ll	lowercase
	Lt	titlecase
	Lm	modifier
	Lo	other

Marks	M	All Marks
	Mn	nonspacing
	Mc	spacing combining
	Me	enclosing

Numbers	N	All Numbers
	Nd	decimal digit
	Nl	letter
	No	other

Punctuation	P	All Punctuation
	Pc	connector
	Pd	dash
	Ps	open
	Pe	close
	Pi	initial quote (may behave like Ps or Pe depending on usage)
	Pf	final quote (may behave like Ps or Pe depending on usage)
	Po	other

Separators	Z	All Separators
	Zs	space
	Zl	line
	Zp	paragraph

Symbols	S	All Symbols
	Sm	math
	Sc	currency
	Sk	modifier
	So	other

Other	C	All Others
	Cc	control
	Cf	format
	Co	private use
	Cn	not assigned

Categories
IsCategory ::= Letters | Marks | Numbers | Punctuation | Separators | Symbols | Others
Letters ::= 'L' [ultmo]?
Marks ::= 'M' [nce]?
Numbers ::= 'N' [dlo]?
Punctuation ::= 'P' [cdseifo]?
Separators ::= 'Z' [slp]?
Symbols ::= 'S' [mcko]?
Others ::= 'C' [cfon]?

The properties mentioned above exclude the Cs property. The Cs property identifies surrogate characters, which do not occur at the level of the character abstraction that XML instance documents operate on.
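
Membership in a category escape can be approximated with Python's standard `unicodedata` module, as in the sketch below. Two caveats apply: the function name `in_category` is hypothetical, and Python ships its own version of the Unicode Character Database, which may differ from the version this specification cites, so results for some code points may not agree.

```python
import unicodedata


def in_category(ch, prop):
    r"""True if ch is in the set identified by \p{prop} for a General Category prop."""
    category = unicodedata.category(ch)  # two-letter value such as "Lu"
    if len(prop) == 1:
        # A one-letter property such as L covers Lu, Ll, Lt, Lm and Lo.
        return category[0] == prop
    return category == prop
```

The complement escape \P{X} is simply the negation of this test.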

The Unicode Character Database groups code points into a number of blocks such as Basic Latin (i.e., ASCII), Latin-1 Supplement, Hangul Jamo, CJK Compatibility, etc. The set containing all characters that have block name X (with all white space stripped out) can be identified with a block escape \p{IsX}. The complement of this set is specified with the block escape \P{IsX}. ([\P{IsX}] = [^\p{IsX}]).

Block Escape
IsBlock ::= 'Is' [a-zA-Z0-9#x2D]+

The following table specifies the recognized block names (for more information, see the "Blocks.txt" file in the Unicode Character Database).

Start Code	End Code	Block Name	Start Code	End Code	Block Name
#x0000	#x007F	BasicLatin	#x0080	#x00FF	Latin-1Supplement
#x0100	#x017F	LatinExtended-A	#x0180	#x024F	LatinExtended-B
#x0250	#x02AF	IPAExtensions	#x02B0	#x02FF	SpacingModifierLetters
#x0300	#x036F	CombiningDiacriticalMarks	#x0370	#x03FF	Greek
#x0400	#x04FF	Cyrillic	#x0500	#x052F	CyrillicSupplement
#x0530	#x058F	Armenian	#x0590	#x05FF	Hebrew
#x0600	#x06FF	Arabic	#x0700	#x074F	Syriac
#x0750	#x077F	ArabicSupplement	#x0780	#x07BF	Thaana
#x0900	#x097F	Devanagari	#x0980	#x09FF	Bengali
#x0A00	#x0A7F	Gurmukhi	#x0A80	#x0AFF	Gujarati
#x0B00	#x0B7F	Oriya	#x0B80	#x0BFF	Tamil
#x0C00	#x0C7F	Telugu	#x0C80	#x0CFF	Kannada
#x0D00	#x0D7F	Malayalam	#x0D80	#x0DFF	Sinhala
#x0E00	#x0E7F	Thai	#x0E80	#x0EFF	Lao
#x0F00	#x0FFF	Tibetan	#x1000	#x109F	Myanmar
#x10A0	#x10FF	Georgian	#x1100	#x11FF	HangulJamo
#x1200	#x137F	Ethiopic	#x1380	#x139F	EthiopicSupplement
#x13A0	#x13FF	Cherokee	#x1400	#x167F	UnifiedCanadianAboriginalSyllabics
#x1680	#x169F	Ogham	#x16A0	#x16FF	Runic
#x1700	#x171F	Tagalog	#x1720	#x173F	Hanunoo
#x1740	#x175F	Buhid	#x1760	#x177F	Tagbanwa
#x1780	#x17FF	Khmer	#x1800	#x18AF	Mongolian
#x1900	#x194F	Limbu	#x1950	#x197F	TaiLe
#x1980	#x19DF	NewTaiLue	#x19E0	#x19FF	KhmerSymbols
#x1A00	#x1A1F	Buginese	#x1D00	#x1D7F	PhoneticExtensions
#x1D80	#x1DBF	PhoneticExtensionsSupplement	#x1DC0	#x1DFF	CombiningDiacriticalMarksSupplement
#x1E00	#x1EFF	LatinExtendedAdditional	#x1F00	#x1FFF	GreekExtended
#x2000	#x206F	GeneralPunctuation	#x2070	#x209F	SuperscriptsandSubscripts
#x20A0	#x20CF	CurrencySymbols	#x20D0	#x20FF	CombiningMarksforSymbols
#x2100	#x214F	LetterlikeSymbols	#x2150	#x218F	NumberForms
#x2190	#x21FF	Arrows	#x2200	#x22FF	MathematicalOperators
#x2300	#x23FF	MiscellaneousTechnical	#x2400	#x243F	ControlPictures
#x2440	#x245F	OpticalCharacterRecognition	#x2460	#x24FF	EnclosedAlphanumerics
#x2500	#x257F	BoxDrawing	#x2580	#x259F	BlockElements
#x25A0	#x25FF	GeometricShapes	#x2600	#x26FF	MiscellaneousSymbols
#x2700	#x27BF	Dingbats	#x27C0	#x27EF	MiscellaneousMathematicalSymbols-A
#x27F0	#x27FF	SupplementalArrows-A	#x2800	#x28FF	BraillePatterns
#x2900	#x297F	SupplementalArrows-B	#x2980	#x29FF	MiscellaneousMathematicalSymbols-B
#x2A00	#x2AFF	SupplementalMathematicalOperators	#x2B00	#x2BFF	MiscellaneousSymbolsandArrows
#x2C00	#x2C5F	Glagolitic	#x2C80	#x2CFF	Coptic
#x2D00	#x2D2F	GeorgianSupplement	#x2D30	#x2D7F	Tifinagh
#x2D80	#x2DDF	EthiopicExtended	#x2E00	#x2E7F	SupplementalPunctuation
#x2E80	#x2EFF	CJKRadicalsSupplement	#x2F00	#x2FDF	KangxiRadicals
#x2FF0	#x2FFF	IdeographicDescriptionCharacters	#x3000	#x303F	CJKSymbolsandPunctuation
#x3040	#x309F	Hiragana	#x30A0	#x30FF	Katakana
#x3100	#x312F	Bopomofo	#x3130	#x318F	HangulCompatibilityJamo
#x3190	#x319F	Kanbun	#x31A0	#x31BF	BopomofoExtended
#x31C0	#x31EF	CJKStrokes	#x31F0	#x31FF	KatakanaPhoneticExtensions
#x3200	#x32FF	EnclosedCJKLettersandMonths	#x3300	#x33FF	CJKCompatibility
#x3400	#x4DB5	CJKUnifiedIdeographsExtensionA	#x4DC0	#x4DFF	YijingHexagramSymbols
#x4E00	#x9FFF	CJKUnifiedIdeographs	#xA000	#xA48F	YiSyllables
#xA490	#xA4CF	YiRadicals	#xA700	#xA71F	ModifierToneLetters
#xA800	#xA82F	SylotiNagri	#xAC00	#xD7A3	HangulSyllables
		[See note following this table.]			[See note following this table.]
		[See note following this table.]	#xE000	#xF8FF	PrivateUse
#xF900	#xFAFF	CJKCompatibilityIdeographs	#xFB00	#xFB4F	AlphabeticPresentationForms
#xFB50	#xFDFF	ArabicPresentationForms-A	#xFE00	#xFE0F	VariationSelectors
#xFE10	#xFE1F	VerticalForms	#xFE20	#xFE2F	CombiningHalfMarks
#xFE30	#xFE4F	CJKCompatibilityForms	#xFE50	#xFE6F	SmallFormVariants
#xFE70	#xFEFF	ArabicPresentationForms-B	#xFEFF	#xFEFF	Specials
#xFF00	#xFFEF	HalfwidthandFullwidthForms	#xFFF0	#xFFFF	Specials
#x10000	#x1007F	LinearBSyllabary	#x10080	#x100FF	LinearBIdeograms
#x10100	#x1013F	AegeanNumbers	#x10140	#x1018F	AncientGreekNumbers
#x10300	#x1032F	OldItalic	#x10330	#x1034F	Gothic
#x10380	#x1039F	Ugaritic	#x103A0	#x103DF	OldPersian
#x10400	#x1044F	Deseret	#x10450	#x1047F	Shavian
#x10480	#x104AF	Osmanya	#x10800	#x1083F	CypriotSyllabary
#x10A00	#x10A5F	Kharoshthi	#x1D000	#x1D0FF	ByzantineMusicalSymbols
#x1D100	#x1D1FF	MusicalSymbols	#x1D200	#x1D24F	AncientGreekMusicalNotation
#x1D300	#x1D35F	TaiXuanJingSymbols	#x1D400	#x1D7FF	MathematicalAlphanumericSymbols
#x20000	#x2A6DF	CJKUnifiedIdeographsExtensionB	#x2F800	#x2FA1F	CJKCompatibilityIdeographsSupplement
#xE0000	#xE007F	Tags	#xE0100	#xE01EF	VariationSelectorsSupplement
#xF0000	#xFFFFF	SupplementaryPrivateUseArea-A	#x100000	#x10FFFF	SupplementaryPrivateUseArea-B

The blocks mentioned above exclude the HighSurrogates, LowSurrogates and HighPrivateUseSurrogates blocks. These blocks identify "surrogate" characters, which do not occur at the level of the "character abstraction" that XML instance documents operate on.

The Unicode Character Database is subject to future revision. For example, the grouping of code points into blocks might be updated. All processors support the blocks defined in the version of the Unicode Character Database cited in the normative references. However, implementors are encouraged to support the blocks defined in any future version of the Unicode Standard.

For example, the block escape for identifying the ASCII characters is \p{IsBasicLatin}.
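
The relationship between a Unicode block name, the name used in a block escape, and the code point range can be sketched as follows. The four entries in `BLOCKS` are copied from the table above; the dictionary and the two helper names are assumptions introduced for illustration, and a real processor would need the full table.

```python
# A few rows copied from the table above: block name -> (start, end) code points.
BLOCKS = {
    "BasicLatin": (0x0000, 0x007F),
    "Latin-1Supplement": (0x0080, 0x00FF),
    "Greek": (0x0370, 0x03FF),
    "HangulJamo": (0x1100, 0x11FF),
}


def block_escape_name(unicode_block_name):
    r"""Strip all white space from a block name, as required for \p{IsX}."""
    return "".join(unicode_block_name.split())


def in_block(ch, name):
    r"""True if ch is in the set identified by the block escape \p{Is<name>}."""
    start, end = BLOCKS[name]
    return start <= ord(ch) <= end
```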

A multi-character escape provides a simple way to identify a commonly used set of characters:

Multi-Character Escape
MultiCharEsc ::= '\' [sSiIcCdDwW]
WildcardEsc ::= '.'

Character sequence	Equivalent
.	[^\n\r]
\s	[#x20\t\n\r]
\S	[^\s]
\i	the set of initial name characters, those matched by NameStartChar in XML 1.1 or by Letter | '_' | ':' in XML 1.0
\I	[^\i]
\c	the set of name characters, those matched by NameChar
\C	[^\c]
\d	\p{Nd}
\D	[^\d]
\w	[#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}] (all characters except the set of "punctuation", "separator" and "other" characters)
\W	[^\w]
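
Three of the equivalences in the table above can be expressed directly as predicates. The Python sketch below relies on the standard `unicodedata` module, whose Unicode version may differ from the one this specification cites; the function names are hypothetical.

```python
import unicodedata


def matches_s(ch):
    r"""\s: space (#x20), tab, newline or return."""
    return ch in " \t\n\r"


def matches_d(ch):
    r"""\d: the same set as \p{Nd}."""
    return unicodedata.category(ch) == "Nd"


def matches_w(ch):
    r"""\w: every character except those in \p{P}, \p{Z} and \p{C}."""
    return unicodedata.category(ch)[0] not in "PZC"
```

Under this definition the underscore is not matched by \w, because its General Category is Pc, while the plus sign (Sm) is. Regular expression engines from other languages often define \w differently.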

The language defined here does not attempt to provide a general solution to "regular expressions" over UCS character sequences. In particular, it does not easily provide for matching sequences of base characters and combining marks. The language is targeted at support of "Level 1" features as defined in the Unicode Regular Expression Guidelines. It is hoped that future versions of this specification will provide support for "Level 2" features.

Changes since version 1.0

Datatypes and Facets

In order to align this specification with those being prepared by the XSL and XML Query Working Groups, a new datatype named has been introduced; it serves as the base type definition for all datatypes.

The treatment of datatypes has been made more precise and explicit; most of these changes affect the section on . Definitions have been revised thoroughly and technical terms are used more consistently.

The (numeric) equality of values is now distinguished from the identity of the values themselves; this allows and to treat positive and negative zero as distinct values for purposes of enumeration, but nevertheless to treat them as equal for purposes of bounds checking. This allows a better alignment with the expectations of users working with IEEE floating-point binary numbers.

The of the component for list datatypes is now always false, reflecting the fact that no ordering is prescribed for datatypes, and so they cannot be bounded using the facets defined by this specification.

Units of length have been specified for all datatypes that are permitted the length constraining facet.

The use of the namespace http://www.w3.org/2001/XMLSchema-datatypes has been deprecated. The definition of a namespace separate from the main namespace defined by this specification proved not to be necessary or helpful in facilitating the use, by other specifications, of the datatypes defined here, and its use raises a number of difficult unsolved practical questions.

Numerical Datatypes

The datatype has been added. It is intended to support the floating-point decimal datatypes defined in the forthcoming version of IEEE 754. The datatype differs from in that values carry not only a numeric value but also an (arithmetic) precision.

As noted above, positive and negative zero, and are now treated as distinct but arithmetically equal values.

The description of the lexical spaces of , , , and has been revised to agree with the schema for schemas by allowing for the possibility of a leading sign.

The and datatypes now follow IEEE 754 implementation practice more closely; in particular, negative and positive zero are now distinct values, although arithmetically equal. Conversely, NaN is identical but not arithmetically equal to itself.
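
The distinction between equality and identity drawn above can be observed with any IEEE 754 binary floating-point implementation. The Python sketch below assumes that Python's `float` is an IEEE 754 binary64 value, which holds on mainstream platforms; the helper name `same_value` is hypothetical and is only one possible way to model identity.

```python
import math

pos_zero, neg_zero = 0.0, -0.0
nan = float("nan")


def same_value(a, b):
    """Identity as described above: zeros keep their sign, NaN is identical to NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


print(pos_zero == neg_zero, same_value(pos_zero, neg_zero))
print(nan == nan, same_value(nan, nan))
```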

The minimum requirements for implementation support of the datatype have been clarified.

Date/time Datatypes

The treatment of and related datatypes has been changed to provide a more explicit account of the value space in terms of seven numeric properties. The most important substantive change is that values now explicitly retain information about the time zone indicated in the lexical form; this allows better alignment with the treatment of such values in .

The treatment of the date/time datatype includes a carefully revised definition of order that ensures that for repeating datatypes (, , etc.), timezoned values will be compared as though they are on the same calendar day (local property values) so that in any given timezone, the days start at the local midnight and end just before local midnight. Days do not run from 00:00:00Z to 24:00:00Z in timezones other than Z.

The lexical representation 0000 for years is recognized and maps to the year 1 BCE; -0001 maps to 2 BCE, etc. This is a change from version 1.0 of this specification, in order to align with established practice (the so-called astronomical year numbering) and .
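
The year mapping described above is a one-line adjustment. In the Python sketch below, the function name `describe_year` and the "BCE"/"CE" label format are assumptions chosen for illustration.

```python
def describe_year(lexical_year):
    """Map a year as written (e.g. "0000", "-0001", "2006") to a BCE/CE label."""
    year = int(lexical_year)
    if year > 0:
        return f"{year} CE"
    # 0000 is 1 BCE, -0001 is 2 BCE, and so on.
    return f"{1 - year} BCE"
```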

Algorithms for arithmetic involving and values have been provided, and corrections made to the function.

The treatment of leap seconds is no longer implementation-defined: the date/time types described here do not include leap-second values.

Other changes

Support has been added for XML version 1.1 and Namespaces in XML version 1.1. The datatypes which depend on XML and Namespaces in XML may now be used with the definitions provided by the 1.1 versions of those specifications, as well as with the definitions in the 1.0 versions. It is implementation-defined whether software conforming to this specification supports the definitions given in version 1.0, or in version 1.1, of XML and Namespaces in XML.

The account of the value space of has been changed to specify that values consist only of two numbers (the number of months and the number of seconds) rather than six (years, months, days, hours, minutes, seconds). This allows clearly equivalent durations like P2Y and P24M to have the same value.
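
The two-number value described above can be illustrated by reducing a lexical form such as P2Y or P24M to a (months, seconds) pair. The Python sketch below is an assumption-laden simplification: the pattern `_DURATION` and the function `duration_value` are hypothetical, the pattern is a hand-written approximation of the lexical form rather than the normative definition, and a day is taken to be 86400 seconds.

```python
import re
from decimal import Decimal

# Approximate lexical form: optional sign, P, date fields, optional T and time fields.
_DURATION = re.compile(
    r"(-)?P(?:([0-9]+)Y)?(?:([0-9]+)M)?(?:([0-9]+)D)?"
    r"(?:T(?:([0-9]+)H)?(?:([0-9]+)M)?(?:([0-9]+(?:\.[0-9]+)?)S)?)?"
)


def duration_value(lexical):
    """Return (months, seconds) for a duration written in the form PnYnMnDTnHnMnS."""
    match = _DURATION.fullmatch(lexical)
    # Reject strings with no fields at all ("P") or an empty time part ("P1YT").
    if match is None or lexical[-1] in "PT":
        raise ValueError(f"not a duration: {lexical!r}")
    sign, years, months, days, hours, minutes, seconds = match.groups()
    total_months = 12 * int(years or 0) + int(months or 0)
    total_seconds = (
        Decimal(days or 0) * 86400
        + Decimal(hours or 0) * 3600
        + Decimal(minutes or 0) * 60
        + Decimal(seconds or 0)
    )
    if sign:
        total_months, total_seconds = -total_months, -total_seconds
    return total_months, total_seconds


print(duration_value("P2Y") == duration_value("P24M"))
```

Months and seconds are kept apart because a month has no fixed length in seconds, so P1Y and P365D remain different values.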

Two new totally ordered restrictions of have been defined: , defined in , and , defined in . This allows better alignment with the treatment of durations in .

The XML representations of the and built-in datatypes have been moved out of the schema document for schema documents in and into a different appendix ().

Numerous minor corrections have been made in response to comments on earlier working drafts.

The treatment of topics handled both in this specification and in has been revised to align the two specifications more closely.

Several references to other specifications have been updated to refer to current versions of those specifications, including , , , , and .

Requirements for the datatype-validity of values of type have been clarified.

Explicit definitions have been provided for the lexical and canonical mappings of most of the primitive datatypes.

Some errors in the definition of regular-expression metacharacters have been corrected.

The descriptions of the and facets have been revised to make clearer how values from different derivation steps are combined.

A warning against using the whitespace facet for tokenizing natural-language data has been added on the request of the W3C Internationalization Working Group.

Glossary (non-normative)

The listing below is for the benefit of readers of a printed version of this document: it collects together all the definitions which appear in the document above.

An XSL macro is used to collect definitions from throughout the spec and gather them here for easy reference.

References

Normative

World Wide Web Consortium. XML Base. Available at: http://www.w3.org/TR/2001/REC-xmlbase-20010627/

IEEE. IEEE Standard for Binary Floating-Point Arithmetic. See http://standards.ieee.org//reading/ieee/std_public/description/busarch/754-1985_desc.html

International Telecommunication Union (ITU). Recommendation ITU-R TF.460-6: Standard-frequency and time-signal emissions. [Geneva: ITU, February 2002.]

World Wide Web Consortium. XML Linking Language (XLink). Available at: http://www.w3.org/TR/2001/REC-xlink-20010627/. Note: only the URI reference escaping procedure defined in Section 5.4 is normatively referenced.

Extensible Markup Language (XML) 1.0, Third Edition, Tim Bray et al., eds., W3C, 4 February 2004. See http://www.w3.org/TR/2004/REC-xml-20040204 For details of the dependency of this specification on XML, see .

Extensible Markup Language (XML) 1.1, Tim Bray et al., eds., W3C, 15 April 2004. See http://www.w3.org/TR/xml11/ For details of the dependency of this specification on XML 1.1, see .

XML Schema Version 1.1 Part 1: Structures. Available at: http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ http://www.w3.org/TR/2006/WD-xmlschema11-1-20060330/structures.html

XML Schema Requirements, Ashok Malhotra and Murray Maloney, eds., W3C, 15 February 1999. See http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215

World Wide Web Consortium. Namespaces in XML. Available at: http://www.w3.org/TR/REC-xml-names/ For details of the dependency of this specification on Namespaces in XML 1.0, see .

World Wide Web Consortium. Namespaces in XML 1.1. Available at: http://www.w3.org/TR/xml-names11/ For details of the dependency of this specification on Namespaces in XML 1.1, see .

Tim Berners-Lee, et al. RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. 1998. Available at: http://www.ietf.org/rfc/rfc2396.txt

RFC 2732: Format for Literal IPv6 Addresses in URL's. 1999. Available at: http://www.ietf.org/rfc/rfc2732.txt

N. Freed and N. Borenstein. RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. 1996. Available at: http://www.ietf.org/rfc/rfc2045.txt

H. Alvestrand, ed. RFC 3066: Tags for the Identification of Languages 1995. Available at: http://www.ietf.org/rfc/rfc3066.txt

S. Josefsson, ed. RFC 3548: The Base16, Base32, and Base64 Data Encodings. July 2003. Available at: http://www.ietf.org/rfc/rfc3548.txt

William D Clinger. How to Read Floating Point Numbers Accurately. In Proceedings of Conference on Programming Language Design and Implementation, pages 92-101. Available at: ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps

The Unicode Consortium. The Unicode Character Database. Available at: http://www.unicode.org/Public/3.1-Update/UnicodeCharacterDatabase-3.1.0.html

The Unicode Consortium. Unicode Character Database. Revision 4.1.0, by Mark Davis and Ken Whistler, 2005-03-30. Available at: http://www.unicode.org/Public/4.1.0/ucd/UCD.html

Non-normative

T. Berners-Lee, R. Fielding, and L. Masinter, RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. January 2005. Available at: http://www.ietf.org/rfc/rfc3986.txt

M. Duerst and M. Suignard. RFC 3987: Internationalized Resource Identifiers (IRIs). January 2005. Available at: http://www.ietf.org/rfc/rfc3987.txt

XML Schema Requirements, Ashok Malhotra and Murray Maloney, eds., W3C, 15 February 1999. See http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215

M. Dürst and M. Suignard. Internationalized Resource Identifiers 2002. Available at: http://www.w3.org/International/iri-edit/draft-duerst-iri-04.txt http://www.w3.org/International/iri-edit/draft-duerst-iri-08.txt

World Wide Web Consortium. Ruby Annotation. Available at: http://www.w3.org/TR/2001/WD-ruby-20010216 http://www.w3.org/TR/2001/REC-ruby-20010531

World Wide Web Consortium. Hypertext Markup Language, version 4.01. Available at: http://www.w3.org/TR/1999/REC-html401-19991224/

World Wide Web Consortium. XML Schema Language: Part 0 Primer. Available at: http://www.w3.org/TR/xmlschema-0/

Mark Davis. Unicode Regular Expression Guidelines, 1988. Available at: http://www.unicode.org/unicode/reports/tr18/

The Perl Programming Language. See http://www.perl.com/pub/language/info/software.html

ISO (International Organization for Standardization). ISO/IEC 9075-2:1999, Information technology --- Database languages --- SQL --- Part 2: Foundation (SQL/Foundation). [Geneva]: International Organization for Standardization, 1999. See http://www.iso.org/iso/en/ISOOnline.frontpage

International Earth Rotation Service (IERS). See http://maia.usno.navy.mil

ISO (International Organization for Standardization). Representations of dates and times, 1988-06-15.

ISO (International Organization for Standardization). Representations of dates and times, draft revision, 1998.

ISO (International Organization for Standardization). Representations of dates and times, second edition, 2000-12-15.

ISO (International Organization for Standardization). Language-independent Datatypes. See http://www.iso.org/iso/en/ISOOnline.frontpage

World Wide Web Consortium. RDF Schema Specification. Available at: http://www.w3.org/TR/2000/CR-rdf-schema-20000327/

World Wide Web Consortium. XQuery 1.0 and XPath 2.0 Functions and Operators, ed. Ashok Malhotra, Jim Melton, and Norman Walsh. W3C Candidate Recommendation 3 November 2005. Available at: http://www.w3.org/TR/2005/CR-xpath-functions-20051103/.

N. Freed and N. Borenstein. RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. 1996. Available at: http://www.ietf.org/rfc/rfc2045.txt

Information about Leap Seconds. Available at: http://tycho.usno.navy.mil/leapsec.html

U.S. Naval Observatory Time Service Department, Historical list of leap seconds. Available at: ftp://maia.usno.navy.mil/ser7/tai-utc.dat

World Wide Web Consortium. Extensible Stylesheet Language (XSL). Available at: http://www.w3.org/TR/2000/CR-xsl-20001121 http://www.w3.org/TR/2001/REC-xsl-20011015/

Martin J. Dürst and François Yergeau, eds. Character Model for the World Wide Web. World Wide Web Consortium Working Draft 1.0: Fundamentals. 2001. Available at: http://www.w3.org/TR/2004/WD-charmod-20040225

David M. Gay. Correctly Rounded Binary-Decimal and Decimal-Binary Conversions. AT&T Bell Laboratories Numerical Analysis Manuscript 90-10, November 1990. Available at: http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz

H. Alvestrand, ed. RFC 3066: Tags for the Identification of Languages 1995. Available at: http://www.ietf.org/rfc/rfc3066.txt

Acknowledgements (non-normative)

Along with the editors thereof, the following have contributed material to the first version of this specification:

Asir S. Vedamuthu, webMethods, Inc
Mark Davis, IBM

Co-editor Ashok Malhotra's work on this specification from March 1999 until February 2001 was supported by IBM, and from then until May 2004 by Microsoft. Since July 2004 his work on this specification has been supported by Oracle Corporation.

The XML Schema Working Group acknowledges with thanks the members of other W3C Working Groups and industry experts in other forums who have contributed directly or indirectly to the creation of this document and its predecessor.

At the time this Working Draft is published, the members in good standing of the XML Schema Working Group are:

Leonid Arbouzov, Sun Microsystems
Peter Chen, Bootstrap Alliance and LSU
David Ezell, National Association of Convenience Stores (chair)
Shudi (Sandy) Gao, IBM
Mary Holstege, Mark Logic
Kohsuke Kawaguchi, Sun Microsystems
Ashok Malhotra, Oracle Corporation
Noah Mendelsohn, IBM
Ravi Murthy, Oracle Corporation
Dave Peterson, Invited Expert
Anli Shundi, TIBCO Extensibility
C. M. Sperberg-McQueen, W3C (staff contact)
Hoylen Sue, Distributed Systems Technology Centre (DSTC Pty Ltd)
Henry S. Thompson, University of Edinburgh
Kongyi Zhou, Oracle Corp.

The XML Schema Working Group has benefited in its work from the participation and contributions of a number of people who are no longer members of the Working Group in good standing at the time of publication of this Working Draft. Their names are given below. In particular we note with sadness the accidental death of Mario Jeckle shortly before publication of the first Working Draft of XML Schema 1.1. Affiliations given are those current at the time of their (first) work with the WG.

Paula Angerstein, Vignette Corporation
Jim Barnette, Defense Information Systems Agency (DISA)
David Beech, Oracle Corp.
Gabe Beged-Dov, Rogue Wave Software
Laila Benhlima, Ecole Mohammadia d'Ingenieurs Rabat (EMI)
Doris Bernardini, Defense Information Systems Agency (DISA)
Paul V. Biron, Health Level Seven
Don Box, DevelopMentor
Allen Brown, Microsoft
Lee Buck, TIBCO Extensibility
Greg Bumgardner, Rogue Wave Software
Dean Burson, Lotus Development Corporation
Charles E. Campbell, Invited expert
Oriol Carbo, University of Edinburgh
Wayne Carr, Intel
Tyng-Ruey Chuang, Academia Sinica
Tony Cincotta, NIST
David Cleary, Progress Software
Mike Cokus, MITRE
Dan Connolly, W3C (staff contact)
Ugo Corda, Xerox
Roger L. Costello, MITRE
Joey Coyle, Health Level Seven
Haavard Danielson, Progress Software
Josef Dietl, Mozquito Technologies
Kenneth Dolson, Defense Information Systems Agency (DISA)
Andrew Eisenberg, Progress Software
Rob Ellman, Calico Commerce
Tim Ewald, Developmentor
Alexander Falk, Altova GmbH
David Fallside, IBM
George Feinberg, Object Design
Dan Fox, Defense Logistics Information Service (DLIS)
Charles Frankston, Microsoft
Matthew Fuchs, Commerce One
Andrew Goodchild, Distributed Systems Technology Centre (DSTC Pty Ltd)
Xan Gregg, TIBCO Extensibility
Paul Grosso, Arbortext, Inc
Martin Gudgin, DevelopMentor
Ernesto Guerrieri, Inso
Dave Hollander, Hewlett-Packard Company (co-chair)
Nelson Hung, Corel
Jane Hunter, Distributed Systems Technology Centre (DSTC Pty Ltd)
Michael Hyman, Microsoft
Renato Iannella, Distributed Systems Technology Centre (DSTC Pty Ltd)
Mario Jeckle, DaimlerChrysler
Rick Jelliffe, Academia Sinica
Marcel Jemio, Data Interchange Standards Association
Simon Johnston, Rational Software
Dianne Kennedy, Graphic Communications Association
Janet Koenig, Sun Microsystems
Setrag Khoshafian, Technology Deployment International (TDI)
Melanie Kudela, Uniform Code Council
Ara Kullukian, Technology Deployment International (TDI)
Andrew Layman, Microsoft
Dmitry Lenkov, Hewlett-Packard Company
Bob Lojek, Mozquito Technologies
John McCarthy, Lawrence Berkeley National Laboratory
Matthew MacKenzie, XML Global
Murata Makoto, Xerox
Eve Maler, Sun Microsystems
Murray Maloney, Muzmo Communication, acting for Commerce One
Lisa Martin, IBM
Jim Melton, Oracle Corp
Adrian Michel, Commerce One
Alex Milowski, Invited Expert
Don Mullen, TIBCO Extensibility
Chris Olds, Wall Data
Frank Olken, Lawrence Berkeley National Laboratory
Paul Pedersen, Mark Logic Corporation
Shriram Revankar, Xerox
Mark Reinhold, Sun Microsystems
Jonathan Robie, Software AG
Cliff Schmidt, Microsoft
John C. Schneider, MITRE
Eric Sedlar, Oracle Corp.
Lew Shannon, NCR
William Shea, Merrill Lynch
Jerry L. Smith, Defense Information Systems Agency (DISA)
John Stanton, Defense Information Systems Agency (DISA)
Tony Stewart, Rivcom
Bob Streich, Calico Commerce
William K. Stumbo, Xerox
Ralph Swick, W3C
John Tebbutt, NIST
Ross Thompson, Contivo
Matt Timmermans, Microstar
Jim Trezzo, Oracle Corp.
Steph Tryphonas, Microstar
Mark Tucker, Health Level Seven
Asir S. Vedamuthu, webMethods, Inc
Scott Vorthmann, TIBCO Extensibility
Priscilla Walmsley, XMLSolutions
Norm Walsh, Sun Microsystems
Cherry Washington, Defense Information Systems Agency (DISA)
Aki Yoshida, SAP AG

31	M = January, March, May, July, August, October, or December
30	M = April, June, September, or November
29	M = February AND (modulo(Y, 400) = 0 OR (modulo(Y, 100) != 0) AND modulo(Y, 4) = 0)
28	Otherwise
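
The four rows above read as a rule for the number of days in month M of year Y. A minimal Python sketch of that rule follows; the function name `days_in_month` and the use of the integers 1 to 12 for months are assumptions made for illustration.

```python
def days_in_month(year, month):
    """Return the number of days in the given month (1-12) of the given year."""
    if month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    if month in (4, 6, 9, 11):
        return 30
    if month != 2:
        raise ValueError("month must be in the range 1 to 12")
    # February: 29 days when Y is divisible by 400, or by 4 but not by 100.
    if year % 400 == 0 or (year % 100 != 0 and year % 4 == 0):
        return 29
    return 28
```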