<?xml version='1.0'?>
<!-- Id: datatypes.xml,v 1.7.2.141 2005/02/22 07:13:38 cmsmcq Exp  -->
<?xml-stylesheet type='text/xsl' href='xmlschema_nodiffs.xsl'?>
<!DOCTYPE spec SYSTEM "local.dtd" [
<!ENTITY suffix "">
<!ENTITY hellip "&#x2026;" ><!--=ellipsis (horizontal)-->
<!ENTITY isin   "&#x2208;" ><!--/in R: =set membership-->
<!ENTITY owners.Diff '<phrase diff="del" dg="wdd">parent&apos;s</phrase><phrase diff="add" dg="wdd">owner&apos;s</phrase>'>
]>
<spec w3c-doctype="wd" status="final">
<header>
<title>XML Schema 1.1 Part 2: Datatypes</title>
<w3c-designation>wd-20050224</w3c-designation>
<w3c-doctype>W3C Working Draft</w3c-doctype>
<pubdate>
<day>24</day>
<month>February</month>
<year>2005<!--* Id: datatypes.xml,v 1.7.2.141 2005/02/22 07:13:38 cmsmcq Exp  *--></year>
</pubdate>
<publoc> 
<loc href="http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/">http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/</loc> 
</publoc>
<altlocs>
<loc href="http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/datatypes.xml">XML</loc>
<loc href="http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/datatypes.diff-1.0.html">XHTML with changes since version 1.0 marked</loc>
<loc href="http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/datatypes.diff-wd.html">XHTML with changes since previous Working Draft marked</loc>
<loc href="http://www.w3.org/2001/XMLSchema.xsd">Independent copy of the schema for schema documents</loc>
<loc href="http://www.w3.org/2001/XMLSchema-datatypes.xsd">A schema for built-in datatypes only, in a separate namespace</loc>
<loc href="http://www.w3.org/2001/XMLSchema.dtd">Independent copy of the DTD for schema documents</loc>
<loc href="http://www.w3.org/2003/03/Translations/byTechnology?technology=xmlschema">List of translations</loc>
</altlocs>
<latestloc>
<loc href="http://www.w3.org/TR/xmlschema11-2/">http://www.w3.org/TR/xmlschema11-2/</loc>
</latestloc>
<prevlocs>
<loc href="http://www.w3.org/TR/2004/WD-xmlschema11-2-20040716/">http://www.w3.org/TR/2004/WD-xmlschema11-2-20040716/</loc>
</prevlocs>
<authlist>
<author>
<name>David Peterson</name>
<affiliation>invited expert (SGML<emph>Works!</emph>)</affiliation>
<email href="mailto:davep@iit.edu">davep@iit.edu</email>
</author>
<author role="1.0">
<name>Paul V. Biron</name>
<affiliation>Kaiser Permanente, for Health Level Seven</affiliation>
<email href="mailto:Paul.V.Biron@kp.org">Paul.V.Biron@kp.org</email>
</author>
<author>
<name>Ashok Malhotra</name>
<affiliation>Oracle Corporation</affiliation>
<email href="mailto:ashokmalhotra@alum.mit.edu">ashokmalhotra@alum.mit.edu</email>
</author>
<author diff="add">
<name>C. M. Sperberg-McQueen</name>
<affiliation>World Wide Web Consortium</affiliation>
<email href="mailto:cmsmcq@w3.org">cmsmcq@w3.org</email>
</author>
</authlist>
<status>
<p><emph>This section describes the status of this document at the
time of its publication. Other documents may supersede this document.
A list of current W3C publications and the latest revision of this
technical report can be found in the <loc href="http://www.w3.org/TR/">W3C technical reports index</loc> at
http://www.w3.org/TR/.</emph></p>
<p>This is a 
Public Working Draft of XML Schema 1.1.  It is here made
available for review by W3C members and the public.  It is intended to
give an indication of the W3C XML Schema Working Group's intentions
for this new version of the XML Schema language and our progress in
achieving them.  It attempts to be complete in indicating
<emph>what</emph> will change from version 1.0, but does
<emph>not</emph> specify in all cases <emph>how</emph> things will
change.</p>
<p>Readers primarily interested in the changes since version 1.0
should start with the <specref ref="changes"/> appendix, which summarizes
both the changes already made and those in prospect, with links to
the relevant sections of this draft.  Accompanying versions of this
document display in color
all changes to normative text since version 1.0 and since the
previous Working Draft.</p>
<p>This draft was published on 24&#x20;February&#x20;2005.
The major changes are:</p>
<ulist>
<item>
<p>A new primitive decimal type has been defined, which retains
information about the precision of the value.  This type is
aligned with the floating-point decimal types which will be
part of the next edition of IEEE 754.</p>
</item>
<item>
<p>In order to align this specification with those being prepared
by the XSL and XML Query Working Groups, a new datatype named
<dtref ref="anyAtomicType"/> has been introduced.</p>
</item>
<item>
<p>The conceptual model of the date- and time-related types has
been defined more formally.</p>
</item>
<item>
<p>Two subtypes of <dtref ref="duration"/> 
(<dtref ref="yearMonthDuration"/> and
<dtref ref="dayTimeDuration"/>) have been introduced, each of which is
totally ordered.</p>
</item>
<item>
<p>A more formal treatment of the fundamental facets of the primitive
datatypes has been adopted.</p>
</item>
<item>
<p>More formal definitions of the lexical space of most types have
been provided, with detailed descriptions of the mappings from lexical
representation to value and from value to canonical representation.</p>
</item>
<!--* 
<item>
<p>Canonical representations have been defined for the <dtref
ref="float"/> and <dtref ref="double"/> types.</p>
</item>
<item>
<p>The units of length have been specified for all primitive
datatypes.</p>
</item>
*-->
</ulist>

<p>Please send comments on this Working Draft to 
<loc href="mailto:www-xml-schema-comments@w3.org">www-xml-schema-comments@w3.org</loc> 
(<loc href="http://lists.w3.org/Archives/Public/www-xml-schema-comments/">archive</loc>).</p>
<p>Publication as a Working Draft does not imply endorsement by the
W3C Membership. This is a draft document and may be updated, replaced
or obsoleted by other documents at any time. It is inappropriate to
cite this document as other than work in progress.</p>

<p>
This document has been produced by the 
<loc href="http://www.w3.org/XML/Schema">W3C XML Schema Working Group</loc>
as part of the W3C <loc href="http://www.w3.org/XML/Activity">XML
Activity</loc>. The goals of the XML Schema language version 1.1 are
discussed in the <loc href="http://www.w3.org/TR/2003/WD-xmlschema-11-req-20030121/">Requirements 
for XML Schema 1.1</loc> document. The authors of this document are
the members of the XML Schema Working Group.  Different parts of this
specification have different editors.
</p>
<p>Patent disclosures relevant to this specification may
be found on the Working Group's <loc role="disclosure" href="http://www.w3.org/2004/01/pp-impl/19482/status">Patent
disclosure page</loc> in conformance with the <loc href="http://www.w3.org/Consortium/Patent-Policy-20040205/">W3C Patent
Policy</loc> of 5 February 2004.  An individual who has actual
knowledge of a patent which the individual believes contains Essential
Claim(s) with respect to this specification should disclose the
information in accordance with <loc href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 
6 of the W3C Patent Policy</loc>.</p>
      
<!--* <p>In accordance with 
<loc href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Exclusion">section 
4 of the W3C Patent Policy</loc>, Working Group participants have 150
days from the title page date of this document to exclude essential
claims from the W3C RF licensing requirements with respect to this
document series. Exclusions are with respect to the exclusion
reference document, defined by the <loc href="http://www.w3.org/Consortium/Patent-Policy-20040205/">W3C Patent
Policy</loc> to be the latest version of a document in this series
that is published no later than 90 days after the title page date of
this document.</p> *-->

<p>The English version of this specification is the only normative
version. Information about translations of this document is available
at <loc href="http://www.w3.org/2003/03/Translations/byTechnology?technology=xmlschema"
>http://www.w3.org/2003/03/Translations/byTechnology?technology=xmlschema</loc>.</p>

</status>

<abstract>
<p>
<emph>XML Schema: Datatypes</emph> is part 2 of the specification of the XML
Schema language. It defines facilities for defining datatypes to be used
in XML Schemas as well as other XML specifications.
The datatype language, which is itself represented in
XML<phrase diff="del"> 1.0</phrase>, provides a superset of the capabilities found in XML<phrase diff="del"> 1.0</phrase>
document type definitions (DTDs) for specifying datatypes on elements
and attributes.
<issue id="RQ-152i" role="1.1">
  <p><loc href="&reqs;#xml1.1" target="reqs">RQ-152 (xml1.1)</loc></p>
  <p>How should this specification be aligned with XML 1.1?  The changes in
character set and name characters, and the question of what determines which
ones to use, must be addressed.</p>
 </issue></p>
</abstract>
<langusage>
<language id="EN">English</language>
      <language id="ebnf">Extended Backus-Naur Form (formal grammar)</language>
</langusage>
<revisiondesc>
<slist>
<sitem id="junk">diff group junk:&nbsp; a few homeless targets; should probably ALWAYS BE SHOW unless nothing is, or it is empty</sitem>
<sitem id="fa1">diff group fa1:&nbsp; RQ-24 facets proposal, changes made BEFORE
the publication of the first public working draft.  APPROVED SOME TELECON 2004-10</sitem>
<sitem id="fa1.z">diff group fa1.z:&nbsp; RQ-24 facets proposal, changes made AFTER the publication
of the first public working draft. APPROVED SOME TELECON 2004-10</sitem>
<sitem id="cvs1">diff group cvs1:&nbsp; Constructed Values Appendix (div1)</sitem>
<sitem id="cvs1_pwd">diff group cvs1_pwd:  Constructed Values Appendix as a whole (to
avoid nested like-named diffs)</sitem>
<sitem id="num1">diff group num1:&nbsp; Numerical Values Appendix (div2); requires cvs1</sitem>
<sitem id="numap1">diff group numap1:&nbsp; in-text productions, etc., first cut; requires funbase, nu1, num1</sitem>
<sitem id="funbase">diff group funbase:&nbsp; The functions appendix in its entirety.  ALWAYS ACCEPT OR SHOW</sitem>
<sitem id="nu1">diff group nu1:&nbsp; basic numerical functions; requires funbase, num1, cvs1</sitem>
<sitem id="du0">diff group du0:&nbsp; first Ph 2 for duration; requires numap, nu1, num1, funbase. NOT YET MARKED;  APPROVED pre-FPWD</sitem>
<sitem id="du1">diff group du1:&nbsp; second set of revs for duration (compare du2)</sitem>
<sitem id="du2">diff group du2:&nbsp; second set of revs for dayTimeDuration and yearMonthDuration (compare du1)</sitem>
<sitem id="dt1">diff group dt1:&nbsp; RQ-13 date/time rewrite, first part Ph 2 (d/t app and gDay); requires funbase, nu1, num1; APPROVED 2004-08-27 FTF</sitem>
<sitem id="dt2">diff group dt2:&nbsp; RQ-13 date/time rewrite, second part Ph 2 (time and others); requires dt1, funbase, nu1, num1</sitem>
<sitem id="dtr">diff group dtr:&nbsp; date/time nonnormative description (INCLUDES 2 NORMATIVE TABLES); requires dt1</sitem>
<sitem id="dt3">diff group dt3:&nbsp; RQ-13 date/time rewrite, third part Ph 2 (time and others); requires dt1, dt2, funbase, nu1, num1</sitem>
<sitem id="dt2-3">diff group dt2-3:&nbsp; RQ-13 date/time rewrite, third part Ph 2 (time and others); marks an item added in dt2 and then deleted in dt3 as del. Accept ("post"), except reject ("pre") if dt2 is accept and dt3 is reject, and show ("colour") if dt2 is accept and dt3 is show.</sitem>
<sitem id="dt4">diff group dt4:&nbsp; RQ-13 date/time rewrite, fourth part Ph 2 (time and others); requires dt1, dt2, dt3, funbase, nu1, num1</sitem>
<sitem id="pd1">diff group pd1:&nbsp; RQ-31 precisionDecimal first cut for approval; co-requires pre, pd2, pd3; requires pdf</sitem>
<sitem id="pdo">diff group pdo:&nbsp; RQ-31 precisionDecimal first cut, 
deletion of old decimal; co-requires pre, pd1, pd3; requires pdf.
2005-01-20: WG chooses two-primitive approach, rejects this change.
2005-01-26: MSM removes this diff group to reduce cruft in the document.
</sitem>
<sitem id="pd2">diff group pd2:&nbsp; RQ-31 precisionDecimal first cut, 
addition of new aPDecimal; co-requires pre, pd1, pd2; requires pdf.
2005-01-20: WG chooses two-primitive approach, rejects this change.
2005-01-26: MSM removes this diff group to reduce cruft in the document.
</sitem>
<sitem id="pre">diff group pre:&nbsp; Precision Appendix; co-requires pd1, 
requires num1 and cvs1.
Final wording approved (with changes) 2005-02-04.</sitem>
<sitem id="pdf">diff group pdf:&nbsp; numerical functions just for 
precisionDecimal (RQ-31); requires num1 (??).
Final wording approved (with changes) 2005-02-04.</sitem>
<sitem id="pdf_m">diff group pdf_m:&nbsp; numerical functions for 
precisionDecimal (RQ-31) in two-primitive form.
Final wording approved (with changes) 2005-02-04.</sitem>
<sitem id="pdf_u">diff group pdf_u:&nbsp; numerical functions for precisionDecimal (RQ-31) 
in single-primitive form.  Removed 2005-01-26 after WG chose two-primitive form.</sitem>
<sitem id="aat">diff group aat:&nbsp; anyAtomicType (RQ-???); may require fa1 ??    
APPROVED with changes FTF 2004-11-10.
Changes decided by WG entered (as aatf), 2005-01-25.
Draft final wording approved (with changes) 2005-02-04.
</sitem>
<sitem id="aat1">diff group aat1:&nbsp; anyAtomicType (RQ-???); requires aat</sitem>
<sitem id="trm1">diff group trm1:&nbsp; terminological cleanup begun with tightening meaning of derived (RQ-120); </sitem>
<sitem id="rq31facets">diff group rq31facets: with MSM's proposed changes related to facets of
precision decimal.  This takes a single-primitive ('unitarian') view of
precision decimal and legacy decimal (here under the name aPdecimal).
Compatible with both rq31m and rq31u.</sitem>
<sitem id="rq31u">diff group rq31u: with changes for a one-primitive ('unitarian')
version of precision decimal.  Incompatible with
rq31m, which takes the manichean view.
Assumes: pd1, pre, pdf, num1, pdo (which deletes old decimal), and
pd2 (which inserts new aPDecimal).
The WG chose the Manichean decimal proposal over the Unitarian one,
2005-01-20.  Diffs for group rq31u were removed 2005-01-26.
</sitem>
<sitem id="rq31m">diff group rq31m: with changes for a two-primitive ('manichean')
version of precision decimal.  Incompatible with: 
rq31u, which takes the unitarian view,
pdo, which deletes old decimal,
pd2, which inserts new aPDecimal.
Assumes: pd1, pre, pdf, num1.
Final wording approved (with changes) 2005-02-04.
</sitem>
<sitem id="fa1-fix">diff group fa1-fix: MSM's proposed changes for fixing
problems (missing term definitions, in particular) caused by the fact
that fa1 was incomplete and left the document in an unstable
state.</sitem>
<sitem id="iff">diff group iff: with an editorial proposal (2005-01-01) for
being more consistent about the use of conditionals and
biconditionals.  When terms are being defined (whether or not marked
as termdefs) or necessary and sufficient conditions for some state are
being given (e.g. in constraint notes, which define terms like 'facet
valid with respect to X'), this diff group proposes to use 'if' only
for conditions which are sufficient but not necessary; if the
conditions are both sufficient and necessary, then use 'if and only
if'.</sitem>
<sitem id="pdf_tweak">diff group pdf_tweak: for proposed improvements to diff
group pdf (all gone away now, and then come back again).
Final wording approved (with changes) 2005-02-04. </sitem>
<sitem id="review">diff group review: for marking stuff that is really intended
only for editorial review (usually to be used on ednotes).</sitem>
<sitem id="wdd">diff group wdd: for working-draft deviations:  changes
between the publication of the first public WD in July and the
advent of thorough and permanent change markup.  (Diff group wdd
begun 9 January 2005, but diff not completed.  It was looking like
another three hours work.)  I.e. wdd should mark all and only those
differences between TR/2004/WD-xmlschema11-2-20040716/datatypes.xml
and xse/datatypes/datatypes.xml which are not already marked.  When
we run the result through the dg.xsl filter with wdd set to reject,
the result should be (modulo whitespace and other non-significant
differences) substantively the same as the public WD.
</sitem>
<sitem id="dpno">diff group dpno: change proposals transferred
into this file from the experimental fork datatypes.newOrg.xml.
At the moment, the quasi-systematic changes of ID have not been
reproduced.</sitem>
<sitem id="fpwd-rescinded-add">diff group fpwd-rescinded-add: marks some paragraphs added in the first public working draft but
since deleted again.</sitem>
<sitem id="fpwd-rescinded-del">diff group fpwd-rescinded-del:
marks some paragraphs marked as deleted in the first public working draft but
since restored.</sitem>
<sitem id="aatf">diff group aatf: anyAtomicType (RQ-141).  Changes decided on
by WG at Redwood Shores ftf 2004-11-10.
Draft final wording approved (with changes) 2005-02-04.</sitem>
<sitem id="aatj">diff group aatj: anyAtomicType (RQ-141).  Proposal for change,
submitted to WG at Brisbane, January 2005 (hence the 'j').
Final wording approved (with changes) 2005-02-04.</sitem>
<sitem id="aatg">diff group aatg: anyAtomicType (RQ-141).  Changes to
correct errors found in review of aatf, including changes agreed
by WG in telcon of 2005-02-04 when the RQ-141 proposal was 
approved.</sitem>
<sitem id="vrd">diff group vrd: make validation rules declarative.  
Not yet complete.  Stems from rq31m edits:  first cut at editing
the upper and lower bounds facets included reformulation of the
validation rules to talk about numeric value.  When the order
relation for numeric values and pDecimal values was defined, however,
it became clear that the validation rules didn't need that change,
and the remaining change (making them declarative) didn't really
have anything to do with anyAtomicType.</sitem>
<sitem id="fpwd">diff group fpwd: used to mark things that changed
between 1.0 2E and the first public working draft of July 2004.
(N.B. issues elements and editorial notes are not consistently
marked as added.  They may consistently be unmarked.)</sitem>
<sitem id="rq001">diff group rq001: marks a phase-2 proposal to resolve
requirement RQ-001, adopted by the WG on 2 March 2004.</sitem>
<sitem id="rq31fix">diff group rq31fix: marks some wording changes
intended to address problems identified by Dave Peterson,
Sandy Gao, and Noah Mendelsohn after the draft final wording 
for RQ-31 went to the WG.</sitem>
<sitem id="ep01">Micro-component-related changes</sitem>
<sitem id="wd2hax">Last-minute hacks to make the Working Draft
of February 2005 be valid and produce valid clean HTML.</sitem>
</slist>
</revisiondesc>
</header>
<body>


<div1 role="1.0" id="Intro">
<head>Introduction</head>

<issue id="RQ-21i" role="1.1">
<p><loc href="&reqs;#bnf" target="reqs">RQ-21 (regex/BNF for all primitive types)</loc></p>
<p>The current plan is that every datatype defined herein will have EBNF productions that at least approximately define its lexical space,
and a nonnormative regex derived from the EBNF, for users who wish to copy it directly.</p>
</issue>

<issue id="RQ-24-2i" role="1.1">
<p><loc href="&reqs;#fundamentals" target="reqs">RQ-24 (systematic facets: canonical representations for all datatypes)</loc></p>
<p>It is not possible for all datatypes to have canonical representations of all values without violating the rules of derivation
or adding special-purpose &cfacet;s which the WG does not deem appropriate.&nbsp; The WG has not yet decided how to deal with
datatypes whose lexical and/or canonical mappings are context sensitive.</p>
</issue>

<issue id="RQ-148i" role="1.1">
<p><loc href="&reqs;#Truncation-not-defined" target="reqs">RQ-148 (clarify use of "truncation")</loc></p>
<p>The word will probably be removed.</p>
</issue>

<issue id="RQ-120i" role="1.1">
<p><loc href="&reqs;#term-derived" target="reqs">RQ-120 (consistent use of "derived")</loc></p>
<p>"Derivations" other than "derivations by restriction" will be renamed "constructions".</p>
</issue>



<issue id="RQ-24-4i" role="1.1">
<p><loc href="&reqs;#fundamentals" target="reqs">RQ-24 (systematic facets: assignment of datatype to nodes without components)</loc></p>
</issue>
    <div2 id="intro1.1" diff="add" dg="fpwd">
   <head>Introduction to Version 1.1</head>
     <p>The Working Group has two main goals for this version of W3C XML Schema:</p>
     <ulist>
<item><p>Significant improvements in simplicity of design and clarity of
   exposition <emph>without</emph> loss of backward <emph>or</emph> forward compatibility;

 </p></item>
<item><p>Provision of support for versioning of XML languages defined using
   the XML Schema specification, including the XML transfer syntax for
   schemas itself.</p></item>
</ulist>
<p>These goals are slightly in tension with one another.  The following
summarizes the Working Group's strategic guidelines for changes
between versions 1.0 and 1.1:</p>
<olist>
<item><p>Add support for versioning (acknowledging that this <emph>may</emph>
    be slightly disruptive to the XML transfer syntax at the margins)</p></item>
<item><p>Allow bug fixes (unless in specific cases we decide that the fix
    is too disruptive for a point release)</p></item>
<item><p>Allow editorial changes</p></item>
<item><p>Allow design cleanup to change behavior in edge cases</p></item>
<item><p>Allow relatively non-disruptive changes to type hierarchy (to
    better support current and forthcoming international standards and
W3C recommendations)</p></item>
<item><p>Allow design cleanup to change component structure (changes
    to functionality restricted to edge cases)</p></item>
<item><p>Do not allow any significant changes in functionality</p></item>
<item><p>Do not allow any changes to XML transfer syntax except those
    required by version control hooks and bug fixes</p></item>
</olist>
<p>The overall aim as regards compatibility is that:</p>

<ulist>
<item><p>All schema documents conformant to version 1.0 of this
    specification should also conform to version 1.1, and should have
    the same validation behaviour across 1.0 and 1.1 implementations
    (except possibly in edge cases and in the details of the resulting
    PSVI);</p></item>
<item><p>The vast majority of schema documents conformant to version 1.1 of
    this specification should also conform to version 1.0, leaving
    aside any incompatibilities arising from support for versioning,
    and when they are conformant to version 1.0 (or are made
    conformant by the removal of versioning information), should have
    the same validation behaviour across 1.0 and 1.1 implementations
    (again except possibly in edge cases and in the details of the
    resulting PSVI).
 </p></item>
</ulist>
    </div2>
      <div2 role="1.0" id="purpose">
<head>Purpose</head>
<p>
The <bibref ref="XML"/> specification defines limited
facilities for applying datatypes to document content in that documents
may contain or refer to DTDs that assign types to elements and attributes.
However, document authors, including authors of traditional
<emph>documents</emph> and those transporting <emph>data</emph> in XML,
often require a higher degree of type checking to ensure robustness in
document understanding and data interchange.
</p>
<p>
The table below offers two typical examples of XML instances
in which datatypes are implicit: the instance on the left
represents a billing invoice, the instance on the
right a memo or perhaps an email message in XML.
</p>
<table class="dtdemo" border="1">
<thead>
<tr>
<th>Data oriented</th>
<th>Document oriented</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<eg><![CDATA[<invoice>
  <orderDate>1999-01-21</orderDate>
  <shipDate>1999-01-25</shipDate>
  <billingAddress>
   <name>Ashok Malhotra</name>
   <street>123 Microsoft Ave.</street>
   <city>Hawthorne</city>
   <state>NY</state>
   <zip>10532-0000</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>]]></eg>
</td>
<td>
<eg><![CDATA[<memo importance='high'
      date='1999-03-23'>
  <from>Paul V. Biron</from>
  <to>Ashok Malhotra</to>
  <subject>Latest draft</subject>
  <body>
    We need to discuss the latest
    draft <emph>immediately</emph>.
    Either email me at <email>
    mailto:paul.v.biron@kp.org</email>
    or call <phone>555-9876</phone>
  </body>
</memo>]]></eg>
</td>
</tr>
</tbody>
</table>
<p>
The invoice contains several dates and telephone numbers, the postal
abbreviation for a state
(which comes from an enumerated list of sanctioned values), and a ZIP code
(which takes a definable regular form).&nbsp; The memo contains many
of the same types of information: a date, telephone number, email address
and an "importance" value (from an enumerated
list, such as "low", "medium" or "high").&nbsp; Applications which process
invoices and memos need to raise exceptions if something that was
supposed to be a date or telephone number does not conform to the rules
for valid dates or telephone numbers.
</p>
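<p>
As a non-normative sketch of how such constraints might be written with
the facilities this specification defines, the fragment below declares a
date-typed element, an enumerated type for state abbreviations, and a
pattern-restricted type for ZIP codes.&nbsp; The type names
stateCode and zipCode, the three listed state values, and the particular
pattern are assumptions made for illustration only; they are not part of
this specification.
</p>
<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- hypothetical type: only a few of the sanctioned values are shown -->
  <xs:simpleType name="stateCode">
    <xs:restriction base="xs:string">
      <xs:enumeration value="NY"/>
      <xs:enumeration value="CA"/>
      <xs:enumeration value="TX"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- hypothetical type: five digits, optionally followed by
       a hyphen and four more digits -->
  <xs:simpleType name="zipCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}(-[0-9]{4})?"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="orderDate" type="xs:date"/>
  <xs:element name="state" type="stateCode"/>
  <xs:element name="zip" type="zipCode"/>

</xs:schema>]]></eg>
<p>
Under these assumed declarations, the value 10532-0000 from the invoice
above would satisfy zipCode, while a four-digit value such as 1053 would not.
</p>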
<p>
In both cases, validity constraints exist on the content of the
instances that are not expressible in XML DTDs.&nbsp; The limited datatyping
facilities in XML have prevented validating XML processors from supplying
the rigorous type checking required in these situations.&nbsp; The result
has been that individual applications writers have had to implement type
checking in an ad hoc manner.&nbsp; This specification addresses
the need of both document authors and applications writers for a robust,
extensible datatype system for XML which could be incorporated into
XML processors.&nbsp; As discussed below, these datatypes could be used in other
XML-related standards as well.
</p>
</div2>
<div2 role="1.0" id="requirements">
<head>Requirements</head>
<p>
The <bibref ref="schema-requirements"/> document spells out
concrete requirements to be fulfilled by this specification,
which state that the XML Schema Language must:
</p>
<olist>
<item>
<p>
provide for primitive data typing, including byte, date,
integer, sequence, SQL and Java primitive datatypes, etc.;
</p>
</item>
<item>
<p>
define a type system that is adequate for import/export
from database systems (e.g., relational, object, OLAP);
</p>
</item>
<item>
<p>
distinguish requirements relating to lexical data representation
vs. those governing an underlying information set;
</p>
</item>
<item>
<p>
allow creation of user-defined datatypes, such as
datatypes that are derived from existing datatypes and which
may constrain certain of their properties (e.g., range,
precision, length, format).
</p>
</item>
</olist>
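<p>
The last of these requirements can be illustrated, non-normatively, by
user-defined datatypes that are derived from existing datatypes and
constrain range, precision, and length respectively.&nbsp; The type names
and facet values in this sketch are hypothetical.
</p>
<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- range: integers from 0 to 100 inclusive -->
  <xs:simpleType name="percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- precision: at most 8 digits in total,
       at most 2 of them after the decimal point -->
  <xs:simpleType name="amount">
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="8"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- length: strings of exactly 8 characters -->
  <xs:simpleType name="partNumber">
    <xs:restriction base="xs:string">
      <xs:length value="8"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>]]></eg>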
</div2>
<div2 role="1.0" id="scope">
<head>Scope</head>
<p>
This portion of the XML Schema Language discusses datatypes that can be
used in an XML Schema.&nbsp; These datatypes can be specified for element
content that would be specified as
<xspecref href="&xmlspec;#dt-chardata">#PCDATA</xspecref> and attribute
values of <xspecref href="&xmlspec;#sec-attribute-types">various
types </xspecref> in a DTD.&nbsp; It is the intention of this specification
that it be usable outside of the context of XML Schemas for a wide range
of other XML-related activities such as <bibref ref="XSL"/> and
<bibref ref="RDFSchema"/>.
</p>
</div2>
<div2 role="1.0" id="terminology">
<head>Terminology</head>
<p>
The terminology used to describe XML Schema Datatypes is defined in the
body of this specification. The terms defined in the following list are
used in building those definitions and in describing the actions of a
datatype processor:
</p>
<glist>
<gitem>
<label>
<termdef id="dt-compatibility" term="for compatibility">
for compatibility</termdef>
</label>
<def>
<p>
A feature of this specification included solely to ensure that schemas
which use this feature remain compatible with <bibref ref="XML"/>.
</p>
</def>
</gitem>
<gitem>
<label>
<termdef id="dt-may" term="may"><term>may</term></termdef>
</label>
<def>
<p>
Conforming documents and processors are permitted to but need
not behave as described.
</p>
</def>
</gitem>
<gitem>
<label>
<termdef id="dt-match" term="match"><term>match</term></termdef>
</label>
<def>
<p>
(Of strings or names:) Two strings or names being compared must be
identical. Characters with multiple possible representations in ISO/IEC 10646 (e.g.
characters with both precomposed and base+diacritic forms) match only if they have
the same representation in both strings. No case folding is performed. (Of strings and
rules in the grammar:) A string matches a grammatical production 
if <phrase diff="add" dg="iff">and only if</phrase> 
it belongs to the
language generated by that production.
</p>
</def>
</gitem>
<gitem>
<label>
 <termdef id="dt-must" term="must"><term>must</term></termdef>
</label>
<def>
<p>
Conforming documents and processors are required to behave as
described; otherwise they are in <termref def="dt-error">error</termref>.
</p>
</def>
</gitem>
<gitem>
<label>
<termdef id="dt-error" term="error"><term>error</term></termdef>
</label>
<def>
<p>
A violation of the rules of this specification; results are undefined.
Conforming software <termref def="dt-may"/> detect and report an
<term>error</term> and <termref def="dt-may"/> recover from it.
</p>
</def>
</gitem>
</glist>
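<p>
As a non-normative illustration of the definition of
<termref def="dt-match">match</termref> for strings, consider the three
values below, written with character references so that the underlying
characters are visible.&nbsp; The element name is hypothetical.
</p>
<eg><![CDATA[<name>caf&#xE9;</name>    <!-- precomposed U+00E9 -->
<name>cafe&#x301;</name>  <!-- "e" followed by combining acute accent U+0301 -->
<name>Caf&#xE9;</name>    <!-- same as the first except for the case of "C" -->]]></eg>
<p>
The first value does not match the second, because the accented character
has a different representation in each string, and it does not match the
third, because no case folding is performed.
</p>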
</div2>

<div2 role="1.0" id="constraints-and-contributions">
<head>Constraints and Contributions</head>
<p>
This specification provides three different kinds of normative
statements about schema components, their representations in XML and
their contribution to the schema-validation of information items:
</p>
<glist>
<gitem>
<label>
<termdef id="dt-cos" term="Constraint on Schemas">
<term>Constraint on Schemas</term>
</termdef>
</label>
<def>
<p>
Constraints on the schema components themselves, i.e. conditions
components <termref def="dt-must"/> satisfy to be components at all.
Largely to be found in <specref ref="datatype-components"/>.
</p>
</def>
</gitem>
<gitem>
<label>
<termdef id="dt-src" term="Schema Representation Constraint">
<term>Schema Representation Constraint</term>
</termdef>
</label>
<def>
<p>
Constraints on the representation of schema components in XML.&nbsp; Some but
not all of these are expressed in <specref ref="schema"/> and
<specref ref="dtd-for-datatypeDefs"/>.
</p>
</def>
</gitem>
<gitem>
<label>
<termdef id="dt-cvc" term="Validation Rule">
<term>Validation Rule</term>
</termdef>
</label>
<def>
<p>
Constraints expressed by schema components which information
items <termref def="dt-must"/> satisfy to be schema-valid.&nbsp; Largely
to be found in <specref ref="datatype-components"/>.
</p>
</def>
</gitem>
</glist>
</div2>
</div1>

<div1 id="typesystem">
<head><phrase diff="del" dg="fa1">Type</phrase><phrase diff="add" dg="fa1">Datatype</phrase> System</head>

<!--ednote><edtext>I don't want to use the word <mention>type</mention> without some prefix or adjective.&emsp;&mdash;DP</edtext></ednote-->

<p>This section describes the conceptual framework behind the 
<phrase diff="add" dg="fa1">data</phrase>type system
defined in this specification.&nbsp; The framework has been influenced by the
<bibref ref="ISO11404"/> standard on language-independent datatypes as
well as the datatypes for <bibref ref="SQL"/> and for programming
languages such as Java.</p>

<!--ednote><edtext>Our datatypes are <emph>not</emph> <unusual>computer representations</unusual>.&nbsp; Our value spaces are the
abstract concepts; appropriate computer representations are determined by the implementers.</edtext></ednote-->

<p>The datatypes discussed in this specification are <phrase diff="del" dg="fa1">computer
representations of</phrase><phrase diff="add" dg="fa1">for the most part</phrase> well known abstract concepts such as
<emph>integer</emph> and <emph>date</emph>. It is not the place of this
specification to <phrase diff="add" dg="fa1">thoroughly </phrase>define these abstract concepts; many other publications
provide excellent definitions.<phrase diff="add" dg="fa1">  However, this specification will attempt to
describe the abstract concepts well enough that they can be readily recognized
and distinguished from other abstractions with which they may be confused.</phrase></p>

<note diff="add" dg="fa1">
<p>Only those operations and relations needed for schema processing are defined in this
specification. Applications using these datatypes are expected to implement
appropriate additional functions and/or relations to make the datatypes more generally
useful.&nbsp; For example, the description herein of the <dtref ref="float"/> datatype
does not define addition or multiplication, much less all of the operations defined for
that datatype in <bibref ref="ieee754"/> on which it is based.</p>
</note>

<div2 id="datatype">
<head>Datatype</head>
<!--* !!! newOrg assigns the id 'datatypes' to the section that contains
    * the following paragraphs.  At the moment, I have not followed that 
    * change.  -msm
    *-->
<p diff="del" dg="fa1">
<termdef id="del-dt-datatype" term="datatype">In this specification,
a <term>datatype</term> is a 3-tuple, consisting of
a) a set of distinct values, called its <termref def="dt-value-space"/>,
b) a set of lexical representations, called its
<termref def="dt-lexical-space"/>, and c) a set of <termref def="del-dt-facet"/>s
that characterize properties of the <termref def="dt-value-space"/>,
individual values or lexical items.
</termdef>
</p>



<p diff="add" dg="fa1"><termdef term="datatype" id="dt-datatype">In this specification, 
a <term>datatype</term> <phrase diff="del" dg="wdd">is a thing with four</phrase><phrase diff="add" dg="wdd">has three</phrase> properties</termdef>:

<ulist><item>
<p>A <termref def="dt-value-space"></termref>, which is 
<phrase diff="del" dg="wdd">simply </phrase>a set<phrase diff="add" dg="wdd"> of values</phrase>.
<phrase diff="del" dg="wdd">What the members of this set are called 
(beyond being generically called <quote>values</quote>)
is influenced by the set of value-space operations and relations used therewith.</phrase></p>
</item>
<item>
<p>A <termref def="dt-lexical-space"></termref>, which is <phrase diff="del" dg="wdd">the domain of the
<termref def="dt-lexical-mapping"></termref>.&nbsp; <phrase role="UNSURE">Some
<termref def="dt-lexical-mapping">lexical mappings</termref> are context sensitive,
so that the <termref def="dt-lexical-space"></termref> depends on the context in which the
lexical representation occurs.</phrase></phrase><phrase diff="add" dg="wdd">a set of &string;s used to denote the values</phrase>.</p>
</item>
<item>
<p>A small collection of <emph>functions, relations, and procedures</emph> associated with the datatype.&nbsp; Included
are equality and order relations on the <termref def="dt-value-space"></termref>, and a
<termref def="dt-lexical-mapping"></termref>, which is a function on the <termref def="dt-lexical-space"></termref>
onto the <termref def="dt-value-space"></termref>.</p>
</item>
<item diff="del" dg="wdd">
<!--* 2005-02-21, MSM changes the remaining two occurrences of 
    * <compref ref="dc-defn"/> to <compref ref="std"/>
    * so that the diffed display against 1.0 will work properly.
    * The target of the link, of course, is slightly different.
    *-->
<p>A <compref ref="std"/>, which serves to define and/or identify the datatype.</p>
</item>
</ulist>
</p>

<!--* !!! N.B. in the WD, the following note was in the penultimate list item, 
    * not after the list. We don't have good transposition markup, so I am 
    * leaving the movement unmarked.  -MSM *-->
<!--* <ednote diff="add" dg="wdd"><edtext>Do we want to delete the following Note?</edtext></ednote> *-->

<note>
<p>This specification defines only the operations and relations needed for schema processing.&nbsp; The
terminology used to describe and name the datatypes is chosen to guide users and implementers
in expanding each datatype to be generally useful, i.e., in recognizing the <quote>real world</quote>
datatypes and their variants for which the datatypes defined herein are
meant to be used in data interchange.</p>
</note>

<p>Along with the <termref def="dt-lexical-mapping"></termref> it is often useful
to have an inverse which provides a standard <termref def="dt-lexical-representation"></termref> for
each value.&nbsp; Such a <termref def="dt-canonical-mapping"></termref> is not required for schema
processing, but is described herein for the benefit of users of this specification, and other
specifications which might find it useful to reference these descriptions normatively.</p>

</div2>

<div2 id="value-space"><head>Value space</head>

<p diff="del" dg="fa1"><termdef id="del-dt-value-space" term="value space">A <term>value
space</term> is the set of values for a given datatype.
Each value in the <term>value space</term> of a datatype is denoted by
one or more literals in its <termref def="dt-lexical-space"/>.
</termdef></p>

<p diff="add" dg="fa1"><termdef term="value space" id="dt-value-space">The <term>value space</term> <emph>of 
a datatype</emph> is the set of values for that datatype.</termdef>&nbsp; Associated
with each value space are selected operations and 
relations necessary to permit proper schema processing.&nbsp; Each value in the value space 
of a datatype is denoted by one or more character strings in its 
<termref def="dt-lexical-space"></termref>, according 
to <termref role="the" def="dt-lexical-mapping">the lexical mapping</termref>.&nbsp; (If
the mapping is restricted during a derivation in such a way 
that a value has no denotation, that value is dropped from the value space.)</p>

<p diff="add" dg="fa1">The value spaces of datatypes are abstractions,
and are defined in <specref ref="built-in-datatypes"/> 
<!--* n.b. newOrg deletes 'built-in-datatypes' and inserts a
    * section with ID builtinSTDs, changing some but not all
    * pointers to built-in-datatypes to the new ID.
    * For the moment, I've left all of them at 'built-in-datatypes'. -msm
    *-->
to the extent needed to clarify
them for readers.&nbsp; For example, in defining the numerical
datatypes, we assume some general numerical concepts such as number
and integer are known.&nbsp; In many cases we provide references to
other documents providing more complete definitions.</p>

<note diff="add" dg="fa1">
<p><emph>The value spaces and the values therein are abstractions.</emph>&nbsp; This specification does not 
prescribe any particular internal representations that must be used when implementing these datatypes.&nbsp; 
In some cases, there are references to other specifications which do prescribe specific internal 
representations; these specific internal representations must be used to comply with those other 
specifications, but need not be used to comply with this specification.</p>

<p>In addition, other applications are expected to define additional appropriate
operations and/or relations on these value spaces (e.g., addition and multiplication
on the various numerical datatypes&apos; value spaces), and are permitted where
appropriate to even redefine the operations and relations defined within this
specification, provided that <emph>for schema processing the relations and operations
used are those defined herein</emph>.</p>
</note>

<!--ednote><edtext>Could we do away with the following paragraph?&nbsp; Does it really add anything?</edtext></ednote-->

<p>The <termref def="dt-value-space"/> of a <phrase diff="del" dg="fa1">given </phrase>datatype can
be defined in one of the following ways:
<ulist>
<item><p>defined<phrase diff="add" dg="fa1"> elsewhere</phrase> axiomatically from fundamental notions
(intensional definition)
[see <termref def="dt-primitive"/>]</p>
</item>
<item><p>enumerated outright<phrase diff="add" dg="fa1"> from values of an already defined
datatype</phrase> (extensional definition)
[see <termref def="dt-enumeration"/>]</p>
</item>
<item><p>defined by restricting the <termref def="dt-value-space"/> of
an already defined datatype to a particular subset with a given set
of properties [see <termref def="dt-derived"/>]</p>
</item>
<item><p>defined as a combination of values from one or more already defined
<termref def="dt-value-space"/>(s) by a specific construction procedure
[see <termref def="dt-list"/> and <termref def="dt-union"/>]</p>
</item></ulist></p>
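
<note>
<p>As a non-normative sketch of the last three approaches listed above, the following
schema fragment defines one datatype by enumeration, one by restriction, one by list
construction, and one by union.&nbsp; The type names are hypothetical; the fragment assumes
that the prefix <code>xs</code> is bound to the XML Schema namespace and that the enclosing
schema has no target namespace.</p>
<eg><![CDATA[
<!-- Extensional definition: values enumerated from an already defined datatype -->
<xs:simpleType name="trafficLight">
  <xs:restriction base="xs:token">
    <xs:enumeration value="red"/>
    <xs:enumeration value="amber"/>
    <xs:enumeration value="green"/>
  </xs:restriction>
</xs:simpleType>

<!-- Restriction: a subset of the integer value space with a given property -->
<xs:simpleType name="percentage">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

<!-- Construction by list: finite sequences of percentage values -->
<xs:simpleType name="percentageList">
  <xs:list itemType="percentage"/>
</xs:simpleType>

<!-- Construction by union: a value is either a percentage or a trafficLight -->
<xs:simpleType name="percentageOrLight">
  <xs:union memberTypes="percentage trafficLight"/>
</xs:simpleType>
]]></eg>
</note>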

<p diff="del" dg="fa1">
<termref def="dt-value-space"/>s have certain properties.&nbsp; For example,
they always have the property of <termref def="dt-cardinality"/>,
some definition of <emph>equality</emph>
and might be <termref def="dt-ordered"/>, by which individual
values within the <termref def="dt-value-space"/> can be compared to
one another.&nbsp; The properties of <termref def="dt-value-space"/>s that
are recognized by this specification are defined in
<specref ref="del-fundamental-facets"/>.
</p>

<p diff="add" dg="fa1">The relations of <emph>identity</emph>, <emph>equality</emph>, and <emph>order</emph> are 
required for each value space.&nbsp; A very few datatypes have other relations or operations prescribed for the purposes of this 
specification.</p>

<div3 diff="add" dg="fa1" id="identity">
<head> Identity</head>

<!--* <ednote diff="add" dg="wdd"><edtext>IIRC, someone in the WG pointed out a third situation where identity is used, but I can't find any reference.</edtext></ednote> *-->
<p>The identity relation is always defined. Every value space inherently has an 
identity relation. Two things are 
<emph>identical</emph> 
if <phrase diff="add" dg="iff">and only if</phrase> 
they are actually the same thing: i.e., if there is no way whatever to 
tell them apart.&nbsp; The identity relation is used when making restrictions by <emph>enumeration</emph>, and when checking
identity constraints.&nbsp; These are the only uses of <emph>identity</emph> for schema processing.</p>

<note>
<p>This does not preclude implementing datatypes by using more than one 
<emph>internal</emph> representation for a given value, provided no mechanism inherent in 
the datatype implementation (i.e., other than bit-string-preserving &quot;casting&quot; of 
the datum to a different datatype) will distinguish between the two representations.</p>
</note>

<p>In the identity relation defined herein, values
from different <termref def="dt-primitive"/> datatypes&apos; <termref def="dt-value-space">value
spaces</termref> are made artificially distinct if they
might otherwise be considered identical.&nbsp; For example, there is a
number <emph>two</emph> in the <dtref ref="decimal"/>
datatype and a number <emph>two</emph> in the <dtref ref="float"/>
datatype.&nbsp; In the identity relation defined herein, these
two values are considered distinct.&nbsp; Other applications
making use of these datatypes may choose to consider values such as these identical, but for the
view of <termref def="dt-primitive"/> datatypes&apos; <termref def="dt-value-space">value
spaces</termref> used herein, they are distinct.</p>

<p><emph>WARNING:</emph>&nbsp; Care must be taken when identifying values across distinct primitive
datatypes.&nbsp; It turns out that, for example, 0.1 and 0.10000000009 are effectively identical in
<dtref ref="float"/> but not in <dtref ref="decimal"/>.&nbsp; (Neither 0.1 nor 0.10000000009 is in
the <dtref ref="float"/> value space, but <termref role="the" def="dt-lexical-mapping">the lexical mapping</termref>
of <dtref ref="float"/> maps both <string>0.1</string> and <string>0.10000000009</string> to
the same number (0.100000001490116119384765625) that <emph>is</emph> in the <dtref ref="float"/> value space.)</p>

</div3>

<div3 diff="add" dg="fa1" id="equality"><head>Equality</head>

<p>Each <termref def="dt-primitive"></termref> datatype has a prescribed equality relation for its value 
space.&nbsp; The equality relation for most datatypes is the identity relation.&nbsp; In the few cases
where it is not, it has been carefully defined so as to be a <emph>congruence relation</emph> for most
other operations of interest to the datatype.&nbsp; (This means simply that if two values are equal
and one is substituted for the other as an argument to any of the operations, the results will always
also be equal.&nbsp; 
For example, identity is <emph>by definition</emph> a congruence relation for all other operations
of interest.)&nbsp; Equality is always a congruence for the order relation.</p>

<p>On the other hand, equality need not cover the entire value space of the 
datatype (though it usually does).</p>

<p>The equality relation is used in conjunction with
order when making restrictions involving order.&nbsp; This is the only use of
<emph>equality</emph> for schema processing.</p>

<note>
<p>In the prior version of
this specification (1.0), equality was always identity.&nbsp; This has been changed
to permit the datatypes defined herein to more closely match the <unusual>real
world</unusual> datatypes for which  they are intended to be used as transmission formats.</p>

<p>For example, the <dtref ref="float"/> datatype has an equality which is not the 
identity (&nbsp;&minus;0&nbsp;=&nbsp;+0&nbsp;, but they are not identical&mdash;although
they <emph>were</emph> identical in the 1.0 version of this specification), and whose
domain excludes one value, NaN, so that&nbsp; NaN&nbsp;&ne;&nbsp;NaN&nbsp;.</p>

<p>For another example, the <dtref ref="dateTime"/> datatype previously lost any timezone
information in the <termref def="dt-lexical-representation"></termref> as the value was
converted to <phrase diff="del" dg="dt2">timezone
Z</phrase><phrase diff="add" dg="dt2"><termref def="dt-utc"></termref></phrase>;
now the timezone is retained and two values representing the
same <unusual>moment in time</unusual> but with different remembered timezones are now
<emph>equal</emph> but not <emph>identical</emph>.</p>
</note>
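
<note>
<p>As a non-normative illustration of the difference between equality and identity,
consider the two hypothetical types below.&nbsp; The sketch assumes the rules as stated in this
section: restriction by enumeration is checked using identity, while restrictions
involving order are checked using order together with equality.&nbsp; The type names are
invented for this example, and the prefix <code>xs</code> is assumed to be bound to the
XML Schema namespace.</p>
<eg><![CDATA[
<!-- Checked with identity: only the value that retains timezone Z is admitted -->
<xs:simpleType name="noonByEnumeration">
  <xs:restriction base="xs:dateTime">
    <xs:enumeration value="2005-02-24T12:00:00Z"/>
  </xs:restriction>
</xs:simpleType>

<!-- Checked with order and equality: any value equal to the bound is admitted -->
<xs:simpleType name="noonByRange">
  <xs:restriction base="xs:dateTime">
    <xs:minInclusive value="2005-02-24T12:00:00Z"/>
    <xs:maxInclusive value="2005-02-24T12:00:00Z"/>
  </xs:restriction>
</xs:simpleType>
]]></eg>
<p>On this reading, the literal <code>2005-02-24T07:00:00-05:00</code> denotes a value that is
equal, but not identical, to the value denoted by <code>2005-02-24T12:00:00Z</code>; it would
therefore be accepted by <code>noonByRange</code> but not by <code>noonByEnumeration</code>.</p>
</note>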

<p>In the equality relation defined herein, values
from different primitive datatypes&apos; value spaces are made artificially unequal even if they might
otherwise be considered equal.&nbsp; For example, there is a number <emph>two</emph> in
the <dtref ref="decimal"/>
datatype and a number <emph>two</emph> in the <dtref ref="float"/> datatype.&nbsp; In the equality
relation defined herein, these two values are considered unequal.&nbsp; Other
applications making use of these datatypes
may choose to consider values such as these equal (and must do so if they choose to consider
them identical); nonetheless, in the equality relation defined herein, they are unequal.</p>

<p>For the purposes of this specification, there is one equality relation for all values
of all datatypes (the union of the various datatypes&apos; individual equalities, if one
considers relations to be sets of ordered pairs).&nbsp; The <emph>equality</emph> relation is denoted 
by <mention>=</mention> and its negation by <mention>&ne;</mention>, each used as a<phrase diff="del" dg="wdd">n</phrase> binary
infix predicate:&nbsp; <var>x</var>&nbsp;=&nbsp;<var>y</var>&nbsp;
and&nbsp; <var>x</var>&nbsp;&ne;&nbsp;<var>y</var>&nbsp;.&nbsp; On 
the other hand, <emph>identity</emph> relationships are always described in words.</p>

</div3>

<div3 diff="add" dg="fa1" id="order"><head>Order</head>

<p>Each datatype has an order relation prescribed.  This order may be a <emph>partial</emph>
order, which means that there may be values in the <termref def="dt-value-space"></termref>
which are neither equal, less-than, nor greater-than.&nbsp; Such value pairs are
<emph>incomparable</emph>.&nbsp; In many cases, the prescribed order is the <unusual>null
order</unusual>:&nbsp; the ultimate partial order, in which no pairs are less-than or
greater-than; they are all equal or <termref def="dt-incomparable"></termref>. 
<termdef term="incomparable" id="dt-incomparable" diff="add" dg="wdd">Two
values that are neither equal, less-than, nor greater-than are 
<term>incomparable</term>.
<phrase diff="add" dg="fa1-fix">Two
values that are not <termref def="dt-incomparable"/> are 
<term>comparable</term>.</phrase></termdef>
The order relation is used in
conjunction with equality when making restrictions involving order.&nbsp; This is the
only use of <emph>order</emph> for schema processing.</p>

<p>In this specification, this less-than order relation is denoted by 
<mention>&lt;</mention> (and its inverse by <mention>&gt;</mention>), the weak order by <mention>&le;</mention> 
(and its inverse by <mention>&ge;</mention>), and the resulting 

<termref def="dt-incomparable"></termref> relation by <mention>&lt;&gt;</mention>, each used as a<phrase diff="del" dg="wdd">n</phrase> binary infix predicate:&nbsp;  
<var>x</var>&nbsp;&lt;&nbsp;<var>y</var>&nbsp;,&nbsp; <var>x</var>&nbsp;&le;&nbsp;<var>y</var>&nbsp;,&nbsp; 
<var>x</var>&nbsp;&gt;&nbsp;<var>y</var>&nbsp;,&nbsp; <var>x</var>&nbsp;&ge;&nbsp;<var>y</var>&nbsp;, 
and&nbsp; <var>x</var>&nbsp;&inc;&nbsp;<var>y</var>&nbsp;.</p>

<note>
<p>The weak order <unusual>less-than-or-equal</unusual> means <unusual>less-than</unusual> or
<unusual>equal</unusual>
<emph>and one can tell which</emph>.&nbsp; For example, the <dtref ref="duration"/> P1M
(one month) is <emph>not</emph> less-than-or-equal P31D (thirty-one
days) because P1M is not less than P31D, nor is P1M equal to P31D.&nbsp; Instead,
P1M is <termref def="dt-incomparable"></termref> with P31D.&nbsp; The formal definition of order for <dtref ref="duration"/>
(<specref ref="duration"/>) ensures that this is true.</p>
</note>
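
<note>
<p>The effect of incomparability on an order-based restriction can be sketched as
follows.&nbsp; This is a non-normative illustration: the type name is hypothetical, the prefix
<code>xs</code> is assumed to be bound to the XML Schema namespace, and the upper-bound
facet is assumed to admit exactly those values that are less than or equal to the bound.</p>
<eg><![CDATA[
<xs:simpleType name="atMostOneMonth">
  <xs:restriction base="xs:duration">
    <!-- admits a value v only if v is less than or equal to P1M -->
    <xs:maxInclusive value="P1M"/>
  </xs:restriction>
</xs:simpleType>
]]></eg>
<p>Under these assumptions P1M itself is accepted, because it is equal to the bound, whereas
P31D would be rejected: it is neither less than nor equal to P1M, but
<termref def="dt-incomparable"></termref> with it.</p>
</note>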

<p>The value spaces of primitive datatypes are abstractions, which may have values in common.&nbsp; In
the order relation defined herein, these value spaces are made artificially <termref def="dt-incomparable"></termref>.&nbsp; For example,
the numbers two and three are values in both the <phrase diff="del" dg="wdd">decimal</phrase><phrase diff="add" dg="wdd">&pD;</phrase> datatype and the float datatype.&nbsp; In the
order relation defined herein, two in the decimal datatype and three in the float datatype are
incomparable values.&nbsp; Other applications making use of these datatypes may choose to consider 
values such as these comparable.</p>

<p>While it is not an error to attempt to compare values from the
value spaces of two different primitive datatypes, they will always be <termref def="dt-incomparable"></termref> and therefore
unequal:&nbsp; If <var>x</var> and <var>y</var> are in the value spaces of different primitive
datatypes then&nbsp; <var>x</var>&nbsp;&inc;&nbsp;<var>y</var>&nbsp; (and
hence&nbsp; <var>x</var>&nbsp;&ne;&nbsp;<var>y</var>&nbsp;).</p>

</div3>
</div2>

<div2 diff="del" dg="fa1"><head>Lexical space</head>

<p>In addition to its <termref def="dt-value-space"></termref>, each datatype also
has a lexical space.
</p>
<p><termdef term="lexical space" id="del-dt-lexical-space">A
<term>lexical space</term> is the set of valid <emph>literals</emph>
for a datatype.
</termdef></p>
<p>
For example, "100" and "1.0E2" are two different literals from the
<termref def="dt-lexical-space"/> of <dtref ref="float"/> which both
denote the same value. The type system defined in this specification
provides a mechanism for schema designers to control the set of values
and the corresponding set of acceptable literals of those values for
a datatype.
</p>
<note>
<p>
The literals in the <termref def="dt-lexical-space"/>s defined in this specification
have the following characteristics:
</p>
<glist>
<gitem>
<label>
Interoperability:
</label>
<def>
<p>
The number of literals for each value has been kept small; for many
datatypes there is a one-to-one mapping between literals and values.
This makes it easy to exchange the values between different systems.
In many cases, conversion from locale-dependent representations will
be required on both the originator and the recipient side, both for
computer processing and for interaction with humans.
</p>
</def>
</gitem>
<gitem>
<label>
Basic readability:
</label>
<def>
<p>
Textual, rather than binary, literals are used.
This makes hand editing, debugging, and similar activities possible.
</p>
</def>
</gitem>
<gitem>
<label>
Ease of parsing and serializing:
</label>
<def>
<p>
Where possible, literals correspond to those found in common
programming languages and libraries.
</p>
</def>
</gitem>
</glist>
</note><div3 id="del-canonical-lexical-representation">
<head>Canonical Lexical Representation</head>
<p>
While the datatypes defined in this specification have, for the most part,
a single lexical representation i.e. each value in the datatype's
<termref def="dt-value-space"/> is denoted by a single literal in its
<termref def="dt-lexical-space"/>, this is not always the case.&nbsp; The
example in the previous section showed two literals for the datatype
<dtref ref="float"/> which denote the same value.&nbsp; Similarly, there
<termref def="dt-may"/> be
several literals for one of the date or time datatypes that denote the
same value using different timezone indicators.
</p>
<p>
<termdef term="canonical lexical representation" id="del-dt-canonical-representation">A
<term>canonical lexical representation</term>
is a set of literals from among the valid set of literals
for a datatype such that there is a one-to-one mapping between literals
in the <term>canonical lexical representation</term> and
values in the <termref def="dt-value-space"/>.
</termdef>
</p>
</div3></div2>

<div2 id="lexical-space" diff="add" dg="fa1"><head>The Lexical Space and Lexical Mapping</head>

<!--
<p><termdef term="lexical mapping" id="dt-lexical-mapping">A 
<term>lexical mapping</term> for a datatype is a function whose domain is a set of character 
strings and whose range is a subset of the set of values of that datatype.</termdef>  Lexical 
mappings are designated <emph>active</emph> or <emph>inactive</emph>.&nbsp; Two lexical mappings 
active at the same time must have disjoint domains, or at least must agree on the intersection of their domains; this assures that 
<termref role="the" def="dt-lexical-mapping">the (combined) lexical mapping</termref> is a 
function:  it does not map one lexical representation to more than one value.</p>

<p><termdef term="lexical representation" id="dt-lexical-representation">The 
members of the domain of a lexical mapping are <term>lexical representations</term> (under 
that mapping) of the values to which they are mapped.</termdef></p>

<p><termdef term="the lexical mapping" id="dt-the-lexical-mapping"><term><emph>The</emph> 
lexical mapping</term> of a datatype is the union of all active lexical mappings 
for that datatype.</termdef>&nbsp;  The union of the active lexical mappings will necessarily have as 
its range the <termref def="dt-value-space"></termref>.&nbsp; This
assures that each value has at least 
one <termref def="dt-lexical-representation"></termref>.</p>

<p><termdef term="lexical space" id="dt-lexical-space">The
<term>lexical space</term> of a datatype is the  domain of <termref role="the" def="dt-lexical-mapping">the lexical mapping</termref> 
for that datatype.</termdef>&nbsp;  A datatype may have more than
one<termref def="dt-lexical-mapping"></termref>, and more than 
one may be active, subject to the constraints given above.</p>

<p>Should a datatype have <termref def="dt-lexical-mapping">lexical mappings</termref> whose domains overlap 
and which do not give the same value for character strings in the overlap, then there must be a 
fixed algorithm (possibly dependent on facet values) which selects which lexical mappings are active 
(subject to the constraints above); otherwise there <emph>may</emph> be such an algorithm and 
facet(s).&nbsp; In the absence of such an algorithm all of the datatype's mappings are active.</p>
-->

<!--* <ednote><edtext>Some things in this section and elsewhere will need to be rewritten once we decide just how
to deal with context-dependent lexical mappings and lexical spaces.</edtext></ednote> *-->

<p><termdef term="lexical mapping" id="dt-lexical-mapping">The
<term>lexical mapping</term> for a datatype is a prescribed function whose domain is a prescribed set of character 
strings (the <termref def="dt-lexical-space"></termref>) and whose range is the
<termref def="dt-value-space"></termref> of that datatype.</termdef></p>

<p><termdef term="lexical space" id="dt-lexical-space">The
<term>lexical space</term> of a datatype is the prescribed domain of 
<termref role="the" def="dt-lexical-mapping">the lexical mapping</termref> 
for that datatype.</termdef><!-- &nbsp;  A datatype may have more than
one<termref def="dt-lexical-mapping"></termref>, and more than 
one may be active, subject to the constraints given above. --></p>

<p><termdef term="lexical representation" id="dt-lexical-representation">The 
members of the <termref def="dt-lexical-space"></termref> are <term>lexical 
representations</term> of the values to which they are mapped.</termdef></p>

<p>Should a derivation be made using a derivation mechanism that 
removes <termref def="dt-lexical-representation">lexical representations</termref> from
the <termref def="dt-lexical-space"></termref> to the extent that one or more values cease 
to have any <termref def="dt-lexical-representation"></termref>, then those values are
dropped from the <termref def="dt-value-space"></termref>.</p>

<note>
<p>This could happen by means of a <compref ref="f-p"/> facet<!-- or a
<phrase role="UNSURE"><compref ref="NOTATION-facets"/></phrase> facet-->.</p>
</note>
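
<note>
<p>A minimal, non-normative sketch of such a derivation follows.&nbsp; The type name is
hypothetical and the prefix <code>xs</code> is assumed to be bound to the XML Schema
namespace.</p>
<eg><![CDATA[
<xs:simpleType name="singleDigit">
  <xs:restriction base="xs:integer">
    <!-- keeps only lexical representations consisting of exactly one digit -->
    <xs:pattern value="[0-9]"/>
  </xs:restriction>
</xs:simpleType>
]]></eg>
<p>Only the ten literals <code>0</code> through <code>9</code> remain in the lexical space.&nbsp;
Values such as 10 or &minus;1 no longer have any lexical representation and are therefore dropped
from the value space, while the value 5 is retained even though its other representations
(for example <code>+5</code> and <code>05</code>) have been removed.</p>
</note>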

<p>Conversely, should a derivation remove values then their 
<termref def="dt-lexical-representation">lexical representations</termref> are dropped
from the <termref def="dt-lexical-space"></termref> unless there is a facet value whose 
impact is defined to cause the otherwise-dropped <termref def="dt-lexical-representation"></termref>
to be mapped to another value instead.</p>

<note>
<p>There are currently no facets with such an impact.&nbsp; There may be 
in the future.</p>
</note>

<p>For example, &apos;100&apos; and &apos;1.0E2&apos; are two different 
<termref def="dt-lexical-representation">lexical 
representations</termref> from the <dtref ref="float"/> datatype 
which both denote the same value.&nbsp; The datatype 
system defined in this specification provides mechanisms for schema designers
to control the <termref def="dt-value-space"></termref> and the corresponding set of acceptable 
<termref def="dt-lexical-representation">lexical 
representations</termref> of those values for a datatype.</p>
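
<note>
<p>As a non-normative sketch of these two kinds of control, the first hypothetical type
below restricts the value space and the second further restricts the acceptable lexical
representations.&nbsp; The type names are invented for this example; the prefix <code>xs</code>
is assumed to be bound to the XML Schema namespace and the enclosing schema is assumed
to have no target namespace.</p>
<eg><![CDATA[
<!-- Value space reduced to the single float value one hundred -->
<xs:simpleType name="exactlyOneHundred">
  <xs:restriction base="xs:float">
    <xs:enumeration value="100"/>
  </xs:restriction>
</xs:simpleType>

<!-- Same value, but only plain digit strings remain as lexical representations -->
<xs:simpleType name="oneHundredPlain">
  <xs:restriction base="exactlyOneHundred">
    <xs:pattern value="[0-9]+"/>
  </xs:restriction>
</xs:simpleType>
]]></eg>
<p>Because <code>100</code> and <code>1.0E2</code> denote the same value, both would be accepted by
<code>exactlyOneHundred</code>; only <code>100</code> (or a variant such as <code>0100</code>)
would be accepted by <code>oneHundredPlain</code>.</p>
</note>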

<div3 id="canonical-lexical-representation"><head>Canonical Mapping</head>

<issue id="RQ-129i" role="1.1">
<p><loc href="&reqs;#eliminate-canonical" target="reqs">RQ-129 (remove dependency on canonical representations)</loc></p>
<p>The dependencies are in Part 1; they will be resolved there.&nbsp; Text in this Part will reflect that canonical representations
are provided for the benefit of other users, including other specifications that might want to reference these datatypes.</p>
</issue>

<issue id="RQ-126i" role="1.1">
<p><loc href="&reqs;#restrict-can-forms" target="reqs">RQ-126 (restricting away canonical representations)</loc></p>
<p>Given the "pattern" &cfacet;, restricting away canonical representations cannot be prohibited without undue processing
expense.&nbsp; A warning will be inserted, and RQ-129 will ensure that loss of canonical representations will not affect schema processing.</p>
</issue>

<p>While the datatypes defined in this specification generally have
a single <termref def="dt-lexical-representation"></termref> for each value (i.e., each value in the datatype's
<termref def="dt-value-space"></termref> is denoted by a single
<termref def="dt-lexical-representation">representation</termref> in its
<termref def="dt-lexical-space"></termref>), this is not always the case.&nbsp; The
example in the previous section shows two <termref def="dt-lexical-representation">lexical
representations</termref> from the <dtref ref="float"/>
datatype which denote the same value.</p>

<p><termdef id="dt-canonical-mapping" term="canonical mapping">The 
<term>canonical mapping</term> is a prescribed subset of the inverse of a
<termref def="dt-lexical-mapping"></termref> which is 
one-to-one and whose domain (where possible) is the entire range of the
<termref def="dt-lexical-mapping"></termref> (the
<termref def="dt-value-space"></termref>).</termdef>&nbsp; Thus a 
<termref def="dt-canonical-mapping"></termref> selects one
<termref def="dt-lexical-representation"></termref> for each
value in the <termref def="dt-value-space"></termref>.<!-- &nbsp; <phrase role="UNSURE">Most lexical mappings have
an associated canonical mapping; the 
exceptions are a few lexical mappings that are context dependent.</phrase>&nbsp; If
two <termref def="dt-canonical-mapping">canonical mappings</termref> 
with intersecting domains, for a given <termref def="dt-lexical-mapping"></termref>, are
associated with a datatype, then 
there will be a fixed algorithm (possibly dependent on facet values) associated with the 
datatype which resolves any ambiguity of <termref def="dt-canonical-mapping"></termref> in the intersection. --></p>

<p><termdef term="canonical representation" id="dt-canonical-representation">The 
<term>canonical representation</term> of a value in the
<termref def="dt-value-space"></termref> of a datatype is the 
<termref def="dt-lexical-representation"></termref> associated with that value
by the datatype&apos;s <termref def="dt-canonical-mapping"></termref></termdef>.</p>

<!-- <p><termdef id="dt-the-canonical-mapping" term="the canonical mapping"><term><emph>The</emph> 
canonical mapping</term> of a datatype is essentially the union of the
<termref def="dt-canonical-mapping">canonical mappings</termref> 
associated with the active <termref def="dt-lexical-mapping">lexical mappings</termref>,
with values (if any) in the pairwise intersection 
of the domains of those mappings selected according to a fixed algorithm (possibly
having facet values as parameters) associated with the datatype.</termdef></p>
 -->
<p><termref role="the" def="dt-canonical-mapping">Canonical mappings</termref> are not
available for datatypes whose <termref def="dt-lexical-mapping">lexical 
mappings</termref> are context dependent (i.e., mappings for which the value
of a <termref def="dt-lexical-representation"></termref> 
depends on the context in which it occurs, or for which a character string 
may or may not be a valid <termref def="dt-lexical-representation"></termref>
similarly depending on its context).</p><note><p><termref def="dt-canonical-representation">Canonical 
representations</termref> are provided where feasible for the use of other applications; they are not 
required for schema processing itself.&nbsp; <emph>A conforming schema processor implementation is 
not required to implement <termref def="dt-canonical-mapping">canonical mappings</termref>.</emph></p></note>

</div3>
</div2>

<div2 id="del.facets" diff="del" dg="fa1.z">
<!--* !!! this section was not deleted in the first public working draft.  I think
    * this means fa1 should be split into pre-WD and post-WD bits.  This is a post-WD
    * bit of fa1. -msm *-->
<head>Facets</head>

<issue id="del-RQ-24-1i" role="1.1">
<p><loc href="&reqs;#fundamentals" target="reqs">RQ-24 (systematic approach to facets)</loc></p>
<p>This decision is not yet written up herein:&nbsp; The four informational facets, each of which has only one property,
will be lumped into one facet having four properties.&nbsp; This will represent a further technical change to the
facet structure, but will not result in any additional or lost information in a schema.</p>
</issue>

<p>
<termdef id="del-dt-facet" term="facet">A <term>facet</term> is a single
defining aspect of a <termref def="dt-value-space"/>.&nbsp; Generally
speaking, each facet characterizes a <termref def="dt-value-space"/>
along independent axes or dimensions.</termdef>
</p>
<p diff="del" dg="fpwd-rescinded-del"><!--* !!! This para marked as 
   deleted in first public WD. -msm *-->
The facets of a datatype serve to distinguish those aspects of
one datatype which <emph>differ</emph> from other datatypes.
Rather than being defined solely in terms of a prose description,
the datatypes in this specification are defined in terms of
the <emph>synthesis</emph> of facet values which together determine the
<termref def="dt-value-space"/> and properties of the datatype.
</p>
<p diff="del" dg="fpwd-rescinded-del"><!--* !!! This para marked 
   as deleted in first public WD. -msm *-->
Facets are of two types: <emph>fundamental</emph> facets that define
the datatype and <emph>non-fundamental</emph> or <emph>constraining
</emph> facets that constrain the permitted values of a datatype.
</p>

<!--* !!! the following paragraph was marked 'add' in the first public WD
    * and subsequently (re-)deleted.
    * We'd need stronger diff markup than we currently have to make
    * that history clear.  So for the moment I content myself with giving
    * it a unique dg identifier. -->
<p diff="add" dg="fpwd-rescinded-add"><termdef term="facet" id="dt-facet"><term>Facets</term> are designated and named
values that either provide information about an aspect of the datatype (<termref def="dt-fundamental-facet">information
facets</termref>) or control some aspect of the datatype
(<termref def="dt-constraining-facet">&cfacet;s</termref>).</termdef>&nbsp; For example, each datatype has a
<compref ref="dc-cardinality"/> facet whose 
value generally tells something about the finiteness of the datatype, and each datatype has 
a <compref ref="dc-whiteSpace"/>  facet whose value controls the &quot;normalization&quot; that the 
raw data-character string in the XML document undergoes prior to being treated as a potential 
member of the <termref def="dt-lexical-space"></termref>.</p>
				
<!--* !!! the following paragraph was marked 'add' in the first public WD
    * and subsequently (re-)deleted. *-->
<p diff="add" dg="fpwd-rescinded-add">
Facets are of two kinds:&nbsp;
<termdef term="information facet" id="dt-fundamental-facet_rescinded"><term>information facets</term> provide the
application with some information about the datatype</termdef>, and 
<termdef term="&cfacet;" id="dt-constraining-facet_rescinded"><term>&cfacet;s</term> are facets whose values may be set or changed
during derivation (subject to facet-specific controls) 
and which control various aspects of the derived datatype</termdef>.&nbsp; For example, <compref ref="dc-cardinality"/> 
is an information facet and <compref ref="dc-whiteSpace"/> is a &cfacet;.&nbsp; The various information 
facets are described in <specref ref="rf-fund-facets"/> and &cfacet;s in 
<specref ref="rf-facets"/>.</p>

<!--ednote><edtext>We may require that information facets be tracked,
in which case we will change the following note accordingly.&nbsp; Similarly if we don't add the
new &cfacet;s for precisionDecimal or whatever else might need them.</edtext></ednote-->

<!--* !!! the following note was marked 'add' in the first public WD
    * and subsequently (re-)deleted. *-->
<note diff="add" dg="fpwd-rescinded-add">
<p> In the 1.0 version of this specification, information facets were called 
&quot;fundamental facets&quot;<!-- and &cfacet;s were called &quot;constraining 
facets&quot;-->.&nbsp; Information facets are not required for schema processing,
but some applications use them.<!--&nbsp; More &cfacet;s have been added which do 
not constrain the value space of derived datatypes (and the whitespace facet never did).--></p>
</note>


<div3 id="del-fundamental-facets" diff="del" dg="fa1">
<head>Fundamental facets</head>
<p>
<termdef id="del-dt-fundamental-facet" term="fundamental facet">
A <term>fundamental facet</term> is an abstract property which
serves to semantically characterize the values in a
<termref def="dt-value-space"/>.
</termdef>
</p>
<p>
All <term>fundamental facets</term> are fully described in
<specref ref="rf-fund-facets"/>.
</p>
</div3>

<div3 id="del-non-fundamental" diff="del" dg="fa1">
<head>Constraining or Non-fundamental facets</head>
<p>
<termdef id="del-dt-constraining-facet" term="constraining facet">A
<term>constraining facet</term> is an optional property that can be
applied to a datatype to constrain its <termref def="dt-value-space"/>.
</termdef>
</p>
<p>
Constraining the <termref def="dt-value-space"/> consequently constrains
the <termref def="dt-lexical-space"/>.&nbsp; Adding
<termref def="dt-constraining-facet"/>s to a <termref def="dt-basetype"/>
is described in <specref ref="derivation-by-restriction"/>.
</p>
<p>
All <term>constraining facets</term> are fully described in
<specref ref="rf-facets"/>.
</p>

</div3>
</div2>

 <!-- ****************************** END NEW 1.1 MATERIAL (DATATYPES/FACETS) ********************************* -->

<div2 role="1.0" id="datatype-dichotomies" dg="trm1" diff="del">
<head>Datatype dichotomies</head>
<p>
It is useful to categorize the datatypes defined in this specification
along various dimensions, forming a set of characterization dichotomies.
</p>
<div3 role="1.0" id="atomic-vs-list">
<head>Atomic vs. list vs. union datatypes</head>
<p>
The first distinction to be made is that between
<termref def="dt-atomic"/>, <termref def="dt-list"/> and <termref def="dt-union"/>
datatypes.
</p>
<ulist>
<item>
<p><termdef id="dt-atomic" term="atomic"><term>Atomic</term> datatypes
are those having values which are regarded by this specification as
being indivisible.<phrase diff="add" dg="aat">&nbsp; <term>Atomic</term> 
datatypes are <!--* <phrase diff="del" dg="aatj">those derived 
from <dtref ref="anyAtomicType"/></phrase> *-->
<dtref ref="anyAtomicType"/> and all
datatypes derived from it.</phrase></termdef></p></item>
<item>
<p><termdef id="del-dt-list" term="list"><term>List</term>
datatypes are those having values each of which consists of a
finite-length (possibly empty) sequence of values of an
<termref def="dt-atomic"/> datatype.
<phrase dg="aat1" diff="add">&nbsp; <term>List</term> datatypes are those which are explicitly constructed as lists, or are derived from another <term>list</term> datatype.</phrase>
</termdef>
</p>
</item>
<item>
<p>
<termdef id="del-dt-union" term="union"><term>Union</term>
datatypes are those whose <termref def="dt-value-space"/>s and
<termref def="dt-lexical-space"/>s are the union of
the <termref def="dt-value-space"/>s and
<termref def="dt-lexical-space"/>s of one or more other datatypes.<phrase dg="aat1" diff="add">&nbsp; <term>Union</term> datatypes are those which are explicitly constructed as unions, or are derived from another <term>union</term> datatype.</phrase>
</termdef>
</p>
</item>
</ulist>
<p>
For example, a single token which <termref def="dt-match">matches</termref>
<xspecref href="&xmlspec;#NT-Nmtoken">Nmtoken</xspecref> from
<bibref ref="XML"/> could be the value of an <termref def="dt-atomic"/>
datatype (<dtref ref="NMTOKEN"/>); while a sequence of such tokens
could be the value of a <termref def="dt-list"/> datatype
(<dtref ref="NMTOKENS"/>).
</p>

<div4 role="1.0" id="atomic">
<head>Atomic datatypes</head>
<p>
<phrase diff="del" dg="aatf">
<termref def="dt-atomic"/> datatypes can be either
<termref def="dt-primitive"/> or <termref def="dt-derived"/>.&nbsp; The
<termref def="dt-value-space"/> of an <termref def="dt-atomic"/> datatype
is a set of "atomic" values which, for the purposes of this specification,
are not further decomposable.&nbsp; 
</phrase>
<phrase diff="add" dg="aatf">An <termref def="dt-atomic"/> datatype
has a <termref def="dt-value-space"/> consisting of a set of
<unusual>atomic</unusual> values which for purposes of this specification
are not further decomposable.&nbsp;</phrase> 
The <termref def="dt-lexical-space"/> of
an <termref def="dt-atomic"/> datatype is a set of <emph>literals</emph>
whose internal structure is specific to the datatype in question.
<phrase diff="add" dg="aatf">There is one <unusual>special</unusual>
atomic type (<dtref ref="anyAtomicType"/>) and a number of
<termref def="dt-primitive"/> atomic types, which have
<dtref ref="anyAtomicType"/> as their base type.
All other atomic types are derived by restriction either from 
one of the primitive atomic types or from another ordinary atomic 
type.  No user-defined type may have <dtref ref="anyAtomicType"/> 
as its base type.</phrase>
</p>
</div4>

<div4 role="1.0" id="list-datatypes">
<head>List datatypes</head>
<!-- question: are lists ordered? answer should be NO...the sequence
within a single value is ordered, but the value space of a list type
is not ordered
-->
<p>
Several type systems (such as the one described in
<bibref ref="ISO11404"/>) treat <termref def="dt-list"/> datatypes as
special cases of the more general notions of aggregate or collection
datatypes.
</p>
<p>
<termref def="dt-list"/> datatypes are always <termref def="dt-derived"/>.
The <termref def="dt-value-space"/> of a <termref def="dt-list"/>
datatype is a set of finite-length sequences of 
<!--* WG suppresses this 'ordinary', 2005-02-04 *-->
<!--* <phrase diff="add" dg="aatf">ordinary </phrase> *-->
<termref def="dt-atomic"/>
values. The <termref def="dt-lexical-space"/> of a
<termref def="dt-list"/> datatype is a set of literals whose internal
structure is a space-separated
sequence of literals of the
<termref def="dt-atomic"/> datatype of the items in the
<termref def="dt-list"/>.
</p>
<p>
<termdef id="dt-itemType" term="itemType">
The <termref def="dt-atomic"/> or <termref def="dt-union"/>
datatype that participates in the definition of a <termref def="dt-list"/> datatype
is known as the <term>itemType</term> of that <termref def="dt-list"/> datatype.
</termdef>
</p>
<note role="example">
<eg><![CDATA[
<simpleType name='sizes'>
  <list itemType='decimal'/>
</simpleType>
]]></eg>
<eg><![CDATA[
<cerealSizes xsi:type='sizes'> 8 10.5 12 </cerealSizes>
]]></eg>
</note>
<p>
A <termref def="dt-list"/> datatype can be <termref def="dt-derived"/>
from an <phrase diff="add" dg="aatf">ordinary </phrase><termref def="dt-atomic"/> 
datatype whose <termref def="dt-lexical-space"/> allows space
(such as <dtref ref="string"/>
or <dtref ref="anyURI"/>) or a
<termref def="dt-union"/> datatype any of whose 
<propref comp="std" prop="member type definitions"/>'s
<termref def="dt-lexical-space"/> allows space.
In such a case, regardless of the input, list items
will be separated at space boundaries.
</p>

<note role="example">
<eg><![CDATA[
<simpleType name='listOfString'>
  <list itemType='string'/>
</simpleType>
]]></eg>
<eg>
&lt;someElement xsi:type='listOfString'&gt;
this is not list item 1
this is not list item 2
this is not list item 3
&lt;/someElement&gt;
</eg>
<p>
In the above example, the value of the <emph>someElement</emph> element
is not a <termref def="dt-list"/> of <termref def="dt-length"/> 3;
rather, it is a <termref def="dt-list"/> of <termref def="dt-length"/>
18.
</p>
</note>
<!--
     somehow need to get the <has-facets> concept for abstract lists
	 into builtin.xsd, so that the following can be auto-generated
  -->
<p>
When a datatype is <termref def="dt-derived"/> from a
<termref def="dt-list"/> datatype, the following
<termref def="dt-constraining-facet"/>s apply:
</p>
<ulist>
<item><p><termref def="dt-length"/></p></item>
<item><p><termref def="dt-maxLength"/></p></item>
<item><p><termref def="dt-minLength"/></p></item>
<item><p><termref def="dt-enumeration"/></p></item>
<item><p><termref def="dt-pattern"/></p></item>
<item><p><termref def="dt-whiteSpace"/></p></item>
</ulist>
<p>
For each of <termref def="dt-length"/>, <termref def="dt-maxLength"/>
and <termref def="dt-minLength"/>, the <emph>unit of length</emph> is
measured in number of list items.&nbsp; The value of <termref def="dt-whiteSpace"/>
is fixed to the value <emph>collapse</emph>.
</p>
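<note role="example">
<p>
As an illustration of length being counted in list items rather than in
characters, the following sketch restricts the <emph>sizes</emph> list type from
the earlier example to exactly three items.&nbsp; The type name
<emph>threeSizes</emph> is hypothetical and is used here only for illustration.
</p>
<eg><![CDATA[
<simpleType name='threeSizes'>
  <restriction base='sizes'>
    <length value='3'/>
  </restriction>
</simpleType>
]]></eg>
<eg><![CDATA[
<cerealSizes xsi:type='threeSizes'> 8 10.5 12 </cerealSizes>
]]></eg>
<p>
Under these assumptions the instance is valid: after white space is collapsed
it contains three items, even though its lexical form is longer than three
characters.
</p>
</note>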
<p>

For <termref def="dt-list"/> datatypes the <termref def="dt-lexical-space"/>

is composed of space-separated
literals of its <termref def="dt-itemType"/>.&nbsp; Hence, any
<termref def="dt-pattern"/> specified when a new datatype is
<termref def="dt-derived"/> from a <termref def="dt-list"/> datatype is matched against
each literal of the <termref def="dt-list"/> datatype and
not against the literals of the datatype that serves as its
<termref def="dt-itemType"/>.

</p>
<note role="example">
<eg>
<![CDATA[<xs:simpleType name='myList'>
	<xs:list itemType='xs:integer'/>
</xs:simpleType>
<xs:simpleType name='myRestrictedList'>
	<xs:restriction base='myList'>
		<xs:pattern value='123 (\d+\s)*456'/>
	</xs:restriction>
</xs:simpleType>
<someElement xsi:type='myRestrictedList'>123 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 456</someElement>
<someElement xsi:type='myRestrictedList'>123 987 567 456</someElement>
]]>
</eg>
</note>
<p>
The <dtref ref="canonical-lexical-representation"/> for the
<termref def="dt-list"/> datatype is defined as the lexical form in which
each item in the <termref def="dt-list"/> has the canonical lexical
representation of its  <termref def="dt-itemType"/>.
</p>
</div4>

<div4 role="1.0" id="union-datatypes">
<head>Union datatypes</head>
<p>
The <termref def="dt-value-space"/> and <termref def="dt-lexical-space"/>
of a <termref def="dt-union"/> datatype are the union of the
<termref def="dt-value-space"/>s and <termref def="dt-lexical-space"/>s of
its <termref def="dt-memberTypes"/>.
<termref def="dt-union"/> datatypes are always <termref def="dt-derived"/>.
Currently, there are no <termref def="dt-built-in"/>&nbsp;<termref def="dt-union"/>
datatypes.
</p>
<note role="example">
<p>
A prototypical example of a <termref def="dt-union"/> type is the
<xspecref href="&xsdl;#p-max_occurs">maxOccurs attribute</xspecref> on the
<xspecref href="&xsdl;#element-element">element element</xspecref>
in XML Schema itself: it is a union of nonNegativeInteger
and an enumeration with the single member, the string "unbounded", as shown below.
</p>
<eg><![CDATA[
  <attributeGroup name="occurs">
    <attribute name="minOccurs" type="nonNegativeInteger"]]>
    	use="optional"<![CDATA[ default="1"/>
    <attribute name="maxOccurs"]]> use="optional" default="1"<![CDATA[>
      <simpleType>
        <union>
          <simpleType>
            <restriction base='nonNegativeInteger'/>
          </simpleType>
          <simpleType>
            <restriction base='string'>
              <enumeration value='unbounded'/>
            </restriction>
          </simpleType>
        </union>
      </simpleType>
    </attribute>
  </attributeGroup>
]]></eg>
</note>
<p>
Any number (greater than 1) of <phrase diff="add" dg="aatf">ordinary
</phrase><termref def="dt-atomic"/> or <termref def="dt-list"/>
<termref def="dt-datatype"/>s can participate in a <termref def="dt-union"/> type.
</p>
<p>
<termdef id="dt-memberTypes" term="memberTypes">
The datatypes that participate in the
definition of a <termref def="dt-union"/> datatype are known as the
<term>memberTypes</term> of that <termref def="dt-union"/> datatype.
</termdef>
</p>
<p>
The order in which the <termref def="dt-memberTypes"/> are specified in the
definition (that is, the order of the &lt;simpleType&gt; children of the &lt;union&gt;
element, or the order of the <dtref ref="QName"/>s in the <emph>memberTypes</emph>
attribute) is significant.
During validation, an element or attribute's value is validated against the
<termref def="dt-memberTypes"/> in the order in which they appear in the
definition until a match is found.&nbsp; The evaluation order can be overridden
with the use of <xspecref href="&xsdl;#xsi_type">xsi:type</xspecref>.
</p>
<note>
<p>
For example, given the definition below, the first instance of the &lt;size&gt; element
validates correctly as an <specref ref="integer"/>, the second and third as
<specref ref="string"/>.
</p>
<eg><![CDATA[
  <xsd:element name='size'>
    <xsd:simpleType>
      <xsd:union>
        <xsd:simpleType>
          <xsd:restriction base='integer'/>
        </xsd:simpleType>
        <xsd:simpleType>
          <xsd:restriction base='string'/>
        </xsd:simpleType>
      </xsd:union>
    </xsd:simpleType>
  </xsd:element>
]]></eg>
<eg><![CDATA[
  <size>1</size>
  <size>large</size>
  <size xsi:type='xsd:string'>1</size>
]]></eg></note>
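<note role="example">
<p>
As a sketch of the attribute form mentioned above, a union of the same two
member types could be written with the <emph>memberTypes</emph> attribute
instead of &lt;simpleType&gt; children.&nbsp; The type name
<emph>sizeOrWord</emph> is hypothetical.&nbsp; The order of the names in the
attribute is significant in the same way: <emph>integer</emph> is tried
before <emph>string</emph>.
</p>
<eg><![CDATA[
  <xsd:simpleType name='sizeOrWord'>
    <xsd:union memberTypes='xsd:integer xsd:string'/>
  </xsd:simpleType>
]]></eg>
</note>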
<p> The <dtref ref="canonical-lexical-representation"/> for a
<termref def="dt-union"/> datatype is defined as the lexical form in which
the values have the canonical lexical representation
of the appropriate  <termref def="dt-memberTypes"/>.</p>
<note>
<p>
A datatype which is <termref def="dt-atomic"/> in this specification
need not be an <unusual>atomic</unusual> datatype in any programming language used to
implement this specification.&nbsp; Likewise, a datatype which is a
<termref def="dt-list"/> in this specification need not be a "list"
datatype in any programming language used to implement this specification.
Furthermore, a datatype which is a <termref def="dt-union"/> in this
specification need not be a "union" datatype in any programming
language used to implement this specification.
</p>
</note>
</div4>
</div3>
<!--* !!! this was not marked as deleted in WD of July 2004.
    * When fa1 is split, this is post-wd
    *-->
<div3 role="1.0" id="primitive-vs-derived">
<head>Primitive vs. <phrase diff="del" dg="fa1.z">derived datatypes</phrase><phrase diff="add" dg="fa1.z">Constructed Datatypes</phrase></head>
<p>
Next, we distinguish between <termref def="dt-primitive"/><phrase diff="add" dg="fa1.z">, 
<termref def="dt-constructed"/>,</phrase> and
<termref def="dt-derived"/> datatypes.
</p>
<ulist>
<item>
<p><termdef id="dt-primitive" term="primitive"><term>Primitive</term>
datatypes are those that are not defined in terms of other datatypes;
they exist <emph>ab initio</emph>.</termdef></p>
</item>
<item>
<p diff="del" dg="fa1.z">
<termdef id="quondam-dt-derived" term="derived"><term>Derived</term>
datatypes are those that are defined in terms of other datatypes.
</termdef>
</p>
<p diff="add" dg="fa1.z"><termdef id="dt-constructed" term="constructed"><term>Constructed</term>
datatypes are those that are defined in terms of other datatypes.</termdef></p>
</item>
</ulist>
<p>
For example, in this specification, <dtref ref="float"/> is a well-defined
mathematical
<!-- find example other than float -->
concept that cannot be defined in terms of other datatypes, while
a <dtref ref="integer"/> is a special case of the more general datatype
<dtref ref="decimal"/>.
</p>
<issue id="diff-RQ-141i" role="1.1" diff="del" dg="aat">
<p><loc href="&reqs;#anyAtomicType" target="reqs">RQ-141 (add abstract
anyAtomicType)</loc> <loc href="&reqs;#fundamentals" target="reqs">RQ-24 (systematic facets: status and value space of
anySimpleType)</loc></p>
<p>A new <term>special</term> datatype will be introduced as a child
of anySimpleType and the base type of all primitive atomic datatypes.</p>
</issue>
<p>
<termdef id="dt-anySimpleType" term="anySimpleType" role="local">
The <phrase diff="del" dg="aatf">simple ur-type definition</phrase><phrase diff="add" dg="aatf">definition of <dtref ref="anySimpleType"/></phrase>
is a special restriction of 
<phrase diff="del" dg="aatf">the <xtermref href="&xsdl;#key-urType">ur-type definition</xtermref>
whose name is <term>anySimpleType</term> in the XML Schema namespace</phrase><phrase diff="add" dg="aatf"><dtref ref="anyType"/></phrase>.
<phrase diff="del" dg="aatg"><term>anySimpleType</term> can be
considered as the <termref def="dt-basetype"/> of all <termref def="dt-primitive"/>
datatypes.</phrase>
<term>anySimpleType</term> is considered to have an unconstrained lexical space and a
<termref def="dt-value-space"/> consisting of the union of the
<termref def="dt-value-space"/>s of all the
<termref def="dt-primitive"/>
datatypes and the set of all lists of all members of the
<termref def="dt-value-space"/>s of all the
<termref def="dt-primitive"/> datatypes.
</termdef>
</p>
<p>
The datatypes defined by this specification fall into both
the <termref def="dt-primitive"/> and <termref def="dt-derived"/>
categories.&nbsp; It is felt that a judiciously chosen set of
<termref def="dt-primitive"/> datatypes will serve the widest
possible audience by providing a set of convenient datatypes that
can be used as is, as well as providing a rich enough base from
which the variety of datatypes needed by schema designers can be
<termref def="dt-derived"/>.
</p>
<p>
In the example above, <dtref ref="integer"/> is <termref def="dt-derived"/>
from <dtref ref="decimal"/>.
</p>
<note>
<p>
A datatype which is <termref def="dt-primitive"/> in this specification
need not be a "primitive" datatype in any programming language used to
implement this specification.&nbsp; Likewise, a datatype which is
<termref def="dt-derived"/> in this specification need not be a
"derived" datatype in any programming language used to implement
this specification.
</p>
</note>
<p>
As described in more detail in <specref ref="xr-defn"/>,
each <termref def="dt-user-derived"/> datatype <termref def="dt-must"/>
be defined in terms of another datatype in one of three ways: 1) by assigning
<termref def="dt-constraining-facet"/>s which serve to <emph>restrict</emph> the
<termref def="dt-value-space"/> of the <termref def="dt-user-derived"/>
datatype to a subset of that of the <termref def="dt-basetype"/>; 2) by creating
a <termref def="dt-list"/> datatype whose <termref def="dt-value-space"/>
consists of finite-length sequences of values of its
<termref def="dt-itemType"/>; or 3) by creating a <termref def="dt-union"/>
datatype whose <termref def="dt-value-space"/> consists of the union of the
<termref def="dt-value-space"/>s of its <termref def="dt-memberTypes"/>.
</p>

<div4 role="1.0" id="restriction">
<head>Derived by restriction</head>
<p>
<!-- add the exception for pattern -->
<termdef id="dt-restriction" term="restriction">A datatype is said to be
<termref def="dt-derived"/> by <term>restriction</term> from another datatype
when values for zero or more <termref def="dt-constraining-facet"/>s are specified
that serve to constrain its <termref def="dt-value-space"/> and/or its
<termref def="dt-lexical-space"/> to a subset of those of its
<termref def="dt-basetype"/>.
</termdef>
</p>
<p>
<termdef id="dt-basetype" term="base type">Every
datatype that is <termref def="dt-derived"/> by <termref def="dt-restriction"/>
is defined in terms of an existing datatype, referred to as its
<term>base type</term>. <term>Base type</term>s can be either
<termref def="dt-primitive"/> or <termref def="dt-derived"/>.
</termdef>
</p>
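<note role="example">
<p>
A minimal sketch of derivation by restriction, assuming a hypothetical
<termref def="dt-user-derived"/> datatype named <emph>percent</emph>: two
<termref def="dt-constraining-facet"/>s limit the
<termref def="dt-value-space"/> to a subset of that of the
<termref def="dt-basetype"/>, <dtref ref="integer"/>.
</p>
<eg><![CDATA[
<simpleType name='percent'>
  <restriction base='integer'>
    <minInclusive value='0'/>
    <maxInclusive value='100'/>
  </restriction>
</simpleType>
]]></eg>
</note>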
</div4>

<div4 role="1.0" id="list">
<head>Derived by list</head>
<p>
A <termref def="dt-list"/> datatype can be <termref def="dt-derived"/>
from another datatype (its <termref def="dt-itemType"/>) by creating
a <termref def="dt-value-space"/> that consists of a finite-length sequence
of values of its <termref def="dt-itemType"/>.
</p>
</div4>

<div4 role="1.0" id="union">
<head>Derived by union</head>
<p>
One datatype can be <termref def="dt-derived"/> from one or more
datatypes by <termref def="dt-union"/>ing their <termref def="dt-value-space"/>s
and, consequently, their <termref def="dt-lexical-space"/>s.
</p>
</div4>
</div3>
<div3 role="1.0" id="built-in-vs-user-derived">
<head>Built-in vs. user-derived datatypes</head>
<ulist>
<item>
<p>
<termdef id="dt-built-in" term="built-in"><term>Built-in</term>
datatypes are those which are defined in this specification,
and can be either <termref def="dt-primitive"/> or
<termref def="dt-derived"/>;
</termdef>
</p>
</item>
<item>
<p>
<termdef id="dt-user-derived" term="user-derived">
<term>User-derived</term> datatypes are those <termref def="dt-derived"/>
datatypes that are defined by individual schema designers.
</termdef>
</p>
</item>
</ulist>
<p>
Conceptually there is no difference between the
<termref def="dt-built-in"/>&nbsp;<termref def="dt-derived"/> datatypes
included in this specification and the <termref def="dt-user-derived"/>
datatypes which will be created by individual schema designers.
The <termref def="dt-built-in"/>&nbsp;<termref def="dt-derived"/> datatypes
are those which are believed to be so common that if they were not
defined in this specification many schema designers would end up
"reinventing" them.&nbsp; Furthermore, including these
<termref def="dt-derived"/> datatypes in this specification serves to
demonstrate the mechanics and utility of the datatype generation
facilities of this specification.
</p>
<note>
<p>
A datatype which is <termref def="dt-built-in"/> in this specification
need not be a "built-in" datatype in any programming language used
to implement this specification.&nbsp; Likewise, a datatype which is
<termref def="dt-user-derived"/> in this specification need not
be a "user-derived" datatype in any programming language used to
implement this specification.
</p>
</note>
</div3>
</div2>

<div2 id="dtAndSch" diff="add" dg="trm1">
<head>Datatypes and Schemas</head>

<p>Datatypes as defined above exist, in the abstract, independently of
whether they have any relation to schemas as defined in this
specification.&nbsp; Datatypes are tied to schemas either by explicit
description in this specification, or by user mechanisms prescribed in
this specification for use in user-created schemas.</p>
<p>The user-usable mechanism prescribed by this specification is the
ability to add additional <compref ref="std"/>s to schemas.&nbsp;
<compref ref="std"/>s and their use within schemas are described in
<specref ref="dc-defn"/>.&nbsp; A <compref ref="std"/> selects a
particular datatype and gives it a name and a place in the
schema&apos;s <phrase role="UNSURE">datatype hierarchy</phrase>, which
is a structuring of all the datatypes associated with a schema.</p>

<div3 id="dtDerivHier">
<head>The Datatype Derivation Hierarchy</head>

<p>Datatypes associated with a schema are organized in a hierarchy
that exactly parallels the datatypes&apos; defining (or selecting)
<compref ref="std"/>s in the  schema&apos;s corresponding <phrase role="UNSURE">schema type hierarchy</phrase>, as described in <specref ref="dc-defn"/>.&nbsp; <termdef id="dt-immediately-derived" term="immediately derived">A datatype is <term>immediately
derived</term> from another 
if <phrase diff="add" dg="iff">and only if</phrase> 
it is immediately below the other
(i.e., away from the root) in the derivation
hierarchy.</termdef>&nbsp; <termdef id="dt-base-type" term="base
type">A datatype is the <term>base type</term> of another 
if <phrase diff="add" dg="iff">and only if</phrase> 
the other
is immediately derived from it.</termdef>&nbsp; <termdef id="add_trm1-dt-derived" term="derived">A datatype is <term>derived</term> from
another 
if <phrase diff="add" dg="iff">and only if</phrase> 
there is a chain of <termref def="dt-immediately-derived"></termref> datatypes beginning with it
and ending with the other.</termdef>&nbsp; It is often easiest to
determine a datatype&apos;s location in the hierarchy by examining the
corresponding <compref ref="std"/> in the <phrase role="UNSURE">schema
type hierarchy</phrase>.</p><p>At the root of the hierarchy are two
special datatypes, <dtref ref="anySimpleType"/> and <dtref ref="anyAtomicType"/>.&nbsp; <dtref ref="anySimpleType"/> is the real
root; <dtref ref="anyAtomicType"/> is <termref def="dt-immediately-derived"></termref> from <dtref ref="anySimpleType"/>.</p><p>All other (<unusual>ordinary</unusual>)
datatypes are <termref def="dt-derived"></termref> from these two
special datatypes.&nbsp; The most important class of datatypes
<termref def="dt-immediately-derived"></termref> from these two are
the primitive datatypes, all of which are described in <specref ref="built-in-primitive-datatypes"/>.&nbsp; Starting with the
primitive datatypes, all other schema-usable datatypes are either
<phrase role="UNSURE">facet-derived</phrase>, <phrase role="UNSURE">constructed as lists</phrase>, or <phrase role="UNSURE">constructed as unions</phrase>.</p>

</div3>

<div3>
<head>Atomic, List, and Union Datatypes</head>

<p>Ordinary datatypes may be characterized as <emph>atomic</emph>,
<emph>list</emph>, or <emph>union</emph> datatypes.</p>
<p><termdef id="add_trm1_dt-atomic" term="atomic">An <term>atomic</term>
datatype is one which is <termref def="dt-derived"></termref> from
<dtref ref="anyAtomicType"/>.</termdef>&nbsp; Since only (and all)
primitive datatypes are <termref def="dt-immediately-derived"></termref> from <dtref ref="anyAtomicType"/>, all other atomic datatypes are <termref def="dt-derived"></termref> from primitives.</p>
<p><termdef id="dt-list" term="list">A <term>list</term>
datatype is one that is constructed to have lists of values from some
other datatype, or any datatype subsequently <termref def="dt-derived"></termref> from a <term>list</term> datatype.
</termdef>&nbsp; <termdef id="dt-item-type" term="item type">The other
datatype from which a list datatype is constructed is the list
datatype&apos;s <term>item type</term>.</termdef>&nbsp; Datatypes that
are <phrase role="UNSURE">constructed as lists</phrase> are
<emph><termref def="dt-immediately-derived"></termref></emph>  from
<dtref ref="anySimpleType"/>, so all list datatypes are <emph><termref def="dt-derived"></termref></emph> from <dtref ref="anySimpleType"/>.</p>
<p><termdef id="dt-union" term="union">A <term>union</term>
datatype is one that is constructed to have the union of the values from some
other datatypes, or any datatype subsequently <termref def="dt-derived"></termref> from a <term>union</term> datatype.
</termdef>&nbsp; <termdef id="dt-union-type" term="union type">The
other datatypes from which a union datatype is constructed are the
union datatype&apos;s <term>member types</term>.</termdef>&nbsp;
Datatypes that are <phrase role="UNSURE">constructed as
unions</phrase> are <emph><termref def="dt-immediately-derived"></termref></emph>  from <dtref ref="anySimpleType"/>, so all union datatypes are <emph><termref def="dt-derived"></termref></emph> from <dtref ref="anySimpleType"/>.&nbsp; </p>
<p>All datatypes in the  <phrase role="UNSURE">datatype
hierarchy</phrase> of a schema that are not <emph><termref def="dt-immediately-derived"></termref></emph>  from <dtref ref="anySimpleType"/> or <dtref ref="anyAtomicType"/> are <phrase role="UNSURE">facet-derived</phrase> from their <termref def="dt-base-type">base types</termref>.&nbsp; The mechanisms of
construction and <phrase role="UNSURE">facet-derivation</phrase> are
described in <specref ref="dc-defn"/>.</p>

</div3><div3>
<head>Placing a Datatype in the Hierarchy</head>

<p>Special and primitive datatypes are placed in the hierarchy by explicit
rules in this specification.&nbsp; As mentioned above,
<dtref ref="anySimpleType"/> is at the root of the hierarchy,
<dtref ref="anyAtomicType"/> is <termref def="dt-immediately-derived"></termref>
from <dtref ref="anySimpleType"/>, and all primitive datatypes (and only
primitive datatypes) are <termref def="dt-immediately-derived"></termref> from
<dtref ref="anyAtomicType"/>.</p>
<p>A constructed datatype (<termref def="dt-list"></termref> or
<termref def="dt-union"></termref>) is always
<termref def="dt-immediately-derived"></termref> from
<dtref ref="anySimpleType"/>.&nbsp; A <phrase role="UNSURE">facet-derived</phrase>
datatype is always <termref def="dt-immediately-derived"></termref> from its
<termref def="dt-base-type"></termref>.&nbsp; These are the only ways a datatype
that is neither special nor primitive can be placed in a schema&apos;s
<phrase role="UNSURE">datatype hierarchy</phrase>.</p>
<note><p>The special, primitive, and other ordinary datatypes described in this
specification are present in every schema&apos;s
<phrase role="UNSURE">datatype hierarchy</phrase>.&nbsp; Any others depend on
the schema.</p></note>
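<note role="example">
<p>
The placement rules above can be sketched with three user-defined types.&nbsp;
The type names (<emph>smallInt</emph>, <emph>smallIntList</emph>,
<emph>smallIntOrName</emph>) are hypothetical, and the comments state where each
type would presumably sit in the hierarchy under the rules as described in this
section.
</p>
<eg><![CDATA[
<!-- facet-derived: immediately derived from its base type, xs:integer -->
<xs:simpleType name='smallInt'>
  <xs:restriction base='xs:integer'>
    <xs:maxInclusive value='99'/>
  </xs:restriction>
</xs:simpleType>

<!-- constructed as a list (item type smallInt):
     immediately derived from xs:anySimpleType -->
<xs:simpleType name='smallIntList'>
  <xs:list itemType='smallInt'/>
</xs:simpleType>

<!-- constructed as a union (member types smallInt and xs:NCName):
     immediately derived from xs:anySimpleType -->
<xs:simpleType name='smallIntOrName'>
  <xs:union memberTypes='smallInt xs:NCName'/>
</xs:simpleType>
]]></eg>
</note>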

</div3>

</div2>
</div1>

<div1 role="1.0" id="built-in-datatypes">
<!--* !!! n.b. newOrg gives this section the id builtinSTDs.
    * For now, I have left this ID unchanged. -msm
    *-->
<head><phrase diff="del" dg="dpno">Built-in datatypes</phrase><phrase diff="add" dg="dpno">Built-in <compref ref="std"/>s and their Datatypes</phrase></head>

<!--* <ednote diff="add" dg="wdd"><edtext>The graphic will be redrawn to show anyAtomicType and any other appropriate changes.</edtext></ednote> *-->

<!--* !!! temporary / experimental change from type-hierarchy.gif to 
    * type-hierarchy.png.  Revert when appropriate.
    *-->
<!--* 
<graphic source="type-hierarchy.gif" alt="Diagram of built-in type hierarchy" map="typeImage"/> 
*-->
<graphic map="built-in-datatype-hierarchy-image-map" id="type-hierarchy-diagram" source="type-hierarchy.png" alt="Diagram of built-in type hierarchy"/>
<!--
	thanx to Asir S Vedamuthu for creating this image map
  -->
<!--*
  <imagemap source="image-map.html" id="typeImage"/>
  <imagemap source="image-map_fullsize.html" id="typeImage"/>
*-->
<imagemap source="built-in-datatype-hierarchy.html" id="built-in-datatype-hierarchy-image-map"/>

<p>
      Each built-in datatype in this specification (both
      <termref def="dt-primitive"/> and
      <termref def="dt-derived"/>) can be uniquely addressed via a
      URI Reference constructed as follows:
</p>
<olist>
<item><p>the base URI is the URI of the XML Schema namespace</p></item>
<item><p>the fragment identifier is the name of the datatype</p></item>
</olist>
<p>
      For example, to address the <dtref ref="int"/> datatype, the URI is:
</p>
<ulist>
      <item><p><code>http://www.w3.org/2001/XMLSchema#int</code></p></item>
</ulist>
<p>
      Additionally, each facet definition element can be uniquely
      addressed via a URI constructed as follows:
</p>
<olist>
<item><p>the base URI is the URI of the XML Schema namespace</p></item>
<item><p>the fragment identifier is the name of the facet</p></item>
</olist>
<p>
      For example, to address the maxInclusive facet, the URI is:
</p>
<ulist>
      <item><p><code>http://www.w3.org/2001/XMLSchema#maxInclusive</code></p></item>
</ulist>
<p>
      Additionally, each facet usage in a built-in datatype definition
      can be uniquely addressed via a URI constructed as follows:
</p>
<olist>
<item><p>the base URI is the URI of the XML Schema namespace</p></item>
<item><p>the fragment identifier is the name of the datatype, followed
	by a period (".") followed by the name of the facet</p></item>
</olist>
<p>
      For example, to address the usage of the maxInclusive facet in
      the definition of int, the URI is:
</p>
<ulist>
      <item><p><code>http://www.w3.org/2001/XMLSchema#int.maxInclusive</code></p></item>
</ulist>
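<note>
<p>The three construction rules above can be sketched in Python as follows.
The sketch is illustrative only: the function names are hypothetical and are
not defined by this specification. Only the namespace URI and the three
fragment-identifier patterns are taken from the text.</p>
<eg><![CDATA[
```python
# Namespace URI as given in the text; everything else is a hypothetical helper.
XSD_NAMESPACE = "http://www.w3.org/2001/XMLSchema"


def datatype_uri(datatype: str) -> str:
    """URI reference for a built-in datatype: namespace + '#' + datatype name."""
    return f"{XSD_NAMESPACE}#{datatype}"


def facet_uri(facet: str) -> str:
    """URI reference for a facet definition element: namespace + '#' + facet name."""
    return f"{XSD_NAMESPACE}#{facet}"


def facet_usage_uri(datatype: str, facet: str) -> str:
    """URI reference for a facet usage: namespace + '#' + datatype + '.' + facet."""
    return f"{XSD_NAMESPACE}#{datatype}.{facet}"


print(datatype_uri("int"))
print(facet_uri("maxInclusive"))
print(facet_usage_uri("int", "maxInclusive"))
```
]]></eg>
<p>The three calls reproduce the three example URIs given above.</p>
</note>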

<div2 role="1.0" id="namespaces">
<head>Namespace considerations</head>
<p>
The <termref def="dt-built-in"/> datatypes defined by this specification
are designed to be used with the &schema-language; as well as other
XML specifications.
To facilitate usage within the &schema-language;, the <termref def="dt-built-in"/>
datatypes in this specification have the namespace name:
</p>
<ulist>
<item><p>http://www.w3.org/2001/XMLSchema</p></item>
</ulist>
<p>
To facilitate usage in specifications other than the &schema-language;,
such as those that do not want to know anything about aspects of the
&schema-language; other than the datatypes, each <termref def="dt-built-in"/>
datatype is also defined in the namespace whose URI is:
</p>
<ulist>
<item><p>http://www.w3.org/2001/XMLSchema-datatypes</p></item>
</ulist>
<p>
This applies to both
<termref def="dt-built-in"/>&nbsp;<termref def="dt-primitive"/> and
<termref def="dt-built-in"/>&nbsp;<termref def="dt-derived"/> datatypes.
</p>
<p>
Each <termref def="dt-user-derived"/> datatype is also associated with a
unique namespace.&nbsp; However, <termref def="dt-user-derived"/> datatypes
do not come from the namespace defined by this specification; rather,
they come from the namespace of the schema in which they are defined
(see <xspecref href="&xsdl;#declare-schema">XML Representation of
Schemas</xspecref> in <bibref ref="structural-schemas"/>).
</p>
</div2>


<div2 id="special-datatypes" dg="aat1" diff="add">
<!--* !!! n.b. newOrg gives this section the ID 'specialSTDs'.
    * Since 'STD' has strong and irrelevant connotations for me,
    * stemming from my youth, I have for now left the old ID here
    * and elsewhere.  -msm
    *-->
<head>Special <phrase diff="del" dg="dpno">Datatypes</phrase><phrase diff="add" dg="dpno">Simple Type Definitions</phrase></head>

<p diff="del" dg="dpno">Special datatypes</p> 

<p diff="add" dg="dpno">There are two <emph>special</emph> <compref ref="std"/>s.&nbsp; (All others are <emph>ordinary</emph>.)&nbsp; The
special <compref ref="std"/>s are, unlike the ordinary ones, more
important as <compref ref="std"/>s than as datatypes.</p>

<div3 id="anySimpleType">
<head>anySimpleType</head>
<p>xxx</p>

</div3>

<div3 id="anyAtomicType">
<head>anyAtomicType</head>
<p>xxx</p>

</div3>

</div2>

<div2 role="1.0" id="built-in-primitive-datatypes">
<!--* !!! N.B. newOrg uses the id 'primStdsAndDts'.  For the 
    * moment, I'll stick with the old ID. -msm 2005-01-09 
    *-->
<head>Primitive <phrase diff="add" dg="dpno"><compref ref="std"/>s 
and D</phrase><phrase diff="del" dg="dpno">d</phrase>atatypes</head>
<p>
The <termref def="dt-primitive"/> datatypes defined by this specification
are described below.&nbsp; For each datatype, the
<termref def="dt-value-space"/> and <termref def="dt-lexical-space"/>
are defined, the <termref def="dt-constraining-facet"/>s that apply
to the datatype are listed, and any datatypes <termref def="dt-derived"/>
from this datatype are specified.
</p>
<p>
<termref def="dt-primitive"/> datatypes can only be added by revisions
to this specification.
</p>

<div3 role="1.0" id="string">
<!--* !!! newOrg replaces 'string' in the following head with a dtref.
    * Similarly the 'term' in the termdef (which it deletes).
    * I'm leaving it alone for now; a single change where
    * this change is applied to all datatypes is better than a
    * piecemeal change.
    *-->
<head>string</head>
<p>
<termdef id="dt-string" term="string" role="local">The <term>string</term> datatype
represents character strings in XML.&nbsp; The <termref def="dt-value-space"/>
of <term>string</term> is the set of finite-length sequences of
<xtermref href="&xmlspec;#dt-character">character</xtermref>s (as defined in
<bibref ref="XML"/>) that <termref def="dt-match"/> the
<xnt href="&xmlspec;#NT-Char">Char</xnt> production from <bibref ref="XML"/>.
A <xtermref href="&xmlspec;#dt-character">character</xtermref> is an atomic unit of
communication; it is not further specified except to note that every
<xtermref href="&xmlspec;#dt-character">character</xtermref> has a corresponding
Universal Character Set code point, which is an integer.
</termdef>
</p>
<note>
<p>
Many human languages have writing systems that require
child elements for control of aspects such as bidirectional formatting or
ruby annotation (see <bibref ref="ruby"/> and Section 8.2.4
<xspecref href="&html4;struct/dirlang.html#h-8.2.4">Overriding the
bidirectional algorithm: the BDO element</xspecref> of <bibref ref="html4"/>).
Thus, <term>string</term>, as a simple type that can contain only
characters but not child elements, is often not suitable for representing text.
In such situations, a complex type that allows mixed content should be considered.
For more information, see Section 5.5
<xspecref href="http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/#textType">Any Element, Any Attribute</xspecref>
of <bibref ref="schema-primer"/>.
</p>
</note>
<note>
<p>
As noted in <compref ref="ff-o"/>, the fact that this specification does
not specify an 
<phrase diff="del" dg="dpno"><phrase diff="del" dg="fa1-fix"><termref def="dt-order-relation"/></phrase><phrase diff="add" dg="fa1-fix">order relation</phrase> for 
<termref def="dt-string"/></phrase><phrase diff="add" dg="dpno">order for <dtref ref="string"/></phrase>
does not preclude other applications from treating 
<phrase diff="del" dg="dpno">strings</phrase><phrase diff="add" dg="dpno"><dtref ref="string"/></phrase> as being ordered.
</p>
</note>

<div4 role="1.0" id="string-facets">
<head>Constraining <phrase diff="del" dg="dpno">f</phrase><phrase diff="add" dg="dpno">F</phrase>acets</head>
<facets/>
</div4>

<div4 role="1.0" id="string-derived-types">
<head><phrase diff="del" dg="dpno">Derived datatypes</phrase><phrase diff="add" dg="dpno">Constructed and <termref def="dt-immediately-derived">Immediately Derived</termref> <compref ref="std"/>s</phrase></head>
<!--* Blecch!  Is there some reason this heading has to be unreadable?
    * Why not 'Related types'? *-->
<subtypes/>
</div4>
</div3>

<div3 role="1.0" id="boolean">
<head>boolean</head>
<p>
<termdef id="dt-boolean" term="boolean" role="local"><term>boolean</term> has the
<termref def="dt-value-space"/> required to support the mathematical
concept of binary-valued logic: {true, false}.</termdef>
</p>

<div4 role="1.0" id="boolean-lexical-representation">
<head>Lexical representation</head>
<p>
An instance of a datatype that is defined as <termref def="dt-boolean"/>
can have any of the following legal literals: {true, false, 1, 0}.
</p>
</div4>

<div4 role="1.0" id="boolean-canonical-representation">
<head>Canonical representation</head>
<p>
The canonical representation for <term>boolean</term> is the set of
literals {true, false}.
</p>
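<note>
<p>A minimal sketch of the two mappings just described, assuming the literal
has already been whitespace-normalized. The function names are hypothetical;
only the four legal literals and the two canonical literals come from the
text.</p>
<eg><![CDATA[
```python
# The four legal literals and the values they denote.
_BOOLEAN_LITERALS = {"true": True, "1": True, "false": False, "0": False}


def boolean_lexical_map(literal: str) -> bool:
    """Map a legal boolean literal to its value; reject anything else."""
    try:
        return _BOOLEAN_LITERALS[literal]
    except KeyError:
        raise ValueError(f"not a boolean literal: {literal!r}") from None


def boolean_canonical_map(value: bool) -> str:
    """Return the canonical literal, which is always 'true' or 'false'."""
    return "true" if value else "false"
```
]]></eg>
<p>Under this sketch, <code>1</code> and <code>true</code> denote the same
value, and both are written canonically as <code>true</code>.</p>
</note>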
</div4>

<div4 role="1.0" id="boolean-facets">
<head>Constraining facets</head>
<facets/>
</div4>
</div3>

<div3 role="1.0" id="decimal">
<head>&Odec;</head>

<issue id="RQ-150i" role="1.1">
<p><loc href="&reqs;#composition" target="reqs">RQ-150 (minimum number of digits for decimal)</loc></p>
<p>The minimum number of digits implementations are required to support
will be lowered to 16 digits; a health warning will be added to note 
that implementations of derived datatypes may support more digits of
precision than the base decimal type does, but that they are not required
to do so.</p>
</issue>

<p>
<termdef id="dt-decimal" term="&odec_;" role="local"><term>&odec;</term>
represents the subset of the real numbers that can be represented by finite-length decimal numerals.
<!--* ah, how about "the subset of the real numbers which can
be represented by finite-length decimal numerals" ? *-->
The <termref def="dt-value-space"/> of <term>&odec;</term>
is the set of numbers that can be obtained by 
<phrase diff="del" dg="rq31m">multiplying</phrase><phrase diff="add" dg="rq31m">dividing</phrase> 
an integer by a non-<phrase diff="del" dg="rq31m">positive</phrase><phrase diff="add" dg="rq31m">negative</phrase>
power of ten, i.e., expressible as 
<phrase diff="del" dg="rq31m"><emph role="eq">i &times; 10^-n</emph></phrase><phrase diff="add" dg="rq31m"><var>i</var>&nbsp;/&nbsp;10<sup><var>n</var></sup></phrase>
where <var>i</var> and <var>n</var> are integers
and 
<phrase diff="del" dg="rq31m"><emph role="eq">n &gt;= 0</emph></phrase><phrase diff="add" dg="rq31m"><var>n</var>&nbsp;&ge;&nbsp;0</phrase>.
Precision is not reflected in this value space;
the number 2.0 is not distinct from the number 2.00.
<phrase diff="add" dg="rq31m">(The datatype <dtref ref="&pD;"/> may be used
for values in which precision is significant.)</phrase>
The <phrase diff="del" dg="fa1-fix"><termref def="dt-order-relation"/></phrase><phrase diff="add" dg="fa1-fix">order relation</phrase> on <term>&odec;</term>
is the order relation on real numbers, restricted
to this subset.
</termdef>
</p>
<note><p>All <termref def="dt-minimally-conforming"/> processors
<termref def="dt-must"/> support &odec; numbers with a minimum of
<phrase diff="del" dg="rq31m">18</phrase><phrase diff="add" dg="rq31m">16</phrase> decimal digits 
(i.e., <phrase diff="del" dg="rq31m">with a 
<termref def="dt-totalDigits"/> of 18</phrase><phrase diff="add" dg="rq31m">they must support all values which would be
allowed by a simple type definition which set
<compref ref="f-td"/> to 16</phrase>).&nbsp; However, <termref def="dt-minimally-conforming"/> processors <termref def="dt-may"/> set
an application-defined limit on the maximum number of decimal digits
they are prepared to support, in which case that application-defined
maximum number <termref def="dt-must"/> be clearly documented.</p>
</note>

<div4 role="1.0" id="decimal-lexical-representation">
<head>Lexical representation</head>
<p>
<term>&odec;</term> has a lexical representation
consisting of a finite-length sequence of decimal digits (#x30-#x39) separated
by a period as a decimal indicator.

An optional leading sign is allowed.
If the sign is omitted, "+" is assumed.&nbsp; Leading and trailing zeroes are optional.
If the fractional part is zero, the period and following zero(es) can
be omitted.
For example: <code>-1.23, 12678967.543233, +100000.00, 210</code>.
</p>
<p diff="add" dg="rq31m"><defset>
<head>The <dtref ref="&odec;"/> Lexical Representation</head>
<prod id="nt-&odec;Rep"><lhs>&odec;LexicalRep</lhs>
<rhs><nt def="nt-decNuml"/>&nbsp;| <nt def="nt-noDecNuml"/></rhs></prod>
</defset></p>

<p diff="add" dg="rq31m">The lexical space of &odec; is the set of
lexical representations which match the grammar given above, or
(equivalently) the regular expression
<string>(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))</string>.
</p>

<p diff="add" dg="rq31m">
The mapping from lexical representations to values is the usual
one for decimal numerals; it is given formally in:
<defsetsum ref="defs-&odec;Lexmap"/>
</p>
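<note>
<p>The lexical mapping can be sketched as follows. This is an illustration
under stated assumptions, not the formal definition: the pattern allows the
optional leading sign described in the prose above and treats the period as
a literal character, the names <code>DECIMAL_LITERAL</code> and
<code>decimal_lexical_map</code> are hypothetical, and the value is modelled
as an exact fraction <var>i</var>&nbsp;/&nbsp;10<sup><var>n</var></sup>.</p>
<eg><![CDATA[
```python
import re
from fractions import Fraction

# Optional sign, then digits with an optional fractional part,
# or a period followed by at least one digit.
DECIMAL_LITERAL = re.compile(r"(\+|-)?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def decimal_lexical_map(literal: str) -> Fraction:
    """Map a decimal literal to the exact value i / 10**n."""
    if DECIMAL_LITERAL.fullmatch(literal) is None:
        raise ValueError(f"not a decimal literal: {literal!r}")
    sign = -1 if literal.startswith("-") else 1
    digits = literal.lstrip("+-")
    whole, _, fraction = digits.partition(".")
    i = int(whole + fraction)   # all digits, read as one integer
    n = len(fraction)           # number of digits after the period
    return Fraction(sign * i, 10 ** n)
```
]]></eg>
<p>Because precision is not part of the value, <code>2.0</code> and
<code>2.00</code> map to the same value under this sketch.</p>
</note>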
</div4>

<div4 role="1.0" id="decimal-canonical-representation">
<head>Canonical representation</head>
<p>
The canonical representation for <term>&odec;</term> is defined by
prohibiting certain options from the
<specref ref="decimal-lexical-representation"/>.&nbsp; Specifically, the preceding
optional "+" sign is prohibited.&nbsp; The decimal point is required. Leading and
trailing zeroes are prohibited subject to the following: there must be at least
one digit on each side of the decimal point, and that digit may be a zero.
</p>
<p diff="add" dg="rq31m">
The mapping from values to canonical representations 
is given formally in:
<defsetsum ref="defs-&odec;Canmap"/>
</p>
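<note>
<p>The canonical-form rules above can be sketched as follows. The sketch
assumes the value is held as a finite Python <code>decimal.Decimal</code>;
the function name <code>decimal_canonical_map</code> is hypothetical and
the code is not the formal mapping.</p>
<eg><![CDATA[
```python
from decimal import Decimal


def decimal_canonical_map(value: Decimal) -> str:
    """Write a finite decimal value in canonical form."""
    if not value.is_finite():
        raise ValueError("decimal has no infinities or NaN")
    text = format(value, "f")            # positional notation, no exponent
    sign = "-" if text.startswith("-") else ""
    whole, _, fraction = text.lstrip("-").partition(".")
    whole = whole.lstrip("0") or "0"     # no leading zeroes, but keep one digit
    fraction = fraction.rstrip("0") or "0"  # no trailing zeroes, but keep one digit
    if whole == "0" and fraction == "0":
        sign = ""                        # zero is written without a sign
    return f"{sign}{whole}.{fraction}"
```
]]></eg>
<p>For instance, the literals <code>+100000.00</code> and <code>210</code>
would be written canonically as <code>100000.0</code> and
<code>210.0</code>.</p>
</note>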
</div4>

<div4 role="1.0" id="decimal-facets">
<head>Constraining facets</head>
<facets/>
</div4>

<div4 role="1.0" id="decimal-derived-types">
<head><phrase diff="del" dg="rq31m">Derived datatypes</phrase><phrase diff="add" dg="rq31m">Datatypes based on &odec;</phrase></head>
<subtypes/>
</div4>
</div3>

<div3 id="&pD;" diff="add" dg="pd1"><head>&pD;</head>

<!--* <ednote diff="del" dg="rq31fix">
<edtext>For technical reasons rooted in the editorial
production system, the old primitive decimal type and the two new
decimal types must all have distinct names.  In the current form of
this proposal, the old decimal type is called &ldquo;decimal&rdquo;
(or in some places &ldquo;&odec_;&rdquo;), the new decimal type which
carries information about precision is called &ldquo;&pD;&rdquo;, and
the new decimal type which corresponds most closely to &odec_; is
called &ldquo;&dec;&rdquo;.  
Eventually the editorial system will be changed to allow more than
one of these types to have the same name, but that is not likely for
the foreseeable future.  So the reader should bear in mind that the
names of the types given here are not the final names.
</edtext></ednote> *-->

<!-- satisfied issues disappear -->
<!--* <issue id="RQ-31i" role="1.1" dg="rq31fix" diff="del">
<p><loc href="&reqs;#trailing-zeroes" target="reqs">RQ-31
(precisionDecimal)</loc></p>
<p>This draft describes a new type named (for now) 
&ldquo;precisionDecimal&rdquo;,
which is intended to satisfy requirement RQ-31.&nbsp; It is possible that
this new type will replace the old decimal type.</p>
</issue> *-->

<!--* <issue id="RQ-30i" role="1.1" dg="rq31fix" diff="del">
<p><loc href="&reqs;#negative-scale" target="reqs">RQ-30
(negative fractionDigits for decimal)</loc></p>
<p>The <dtref ref="&pD;"/> type allows negative values for the fractionDigits
facet.</p>
</issue> *-->

<!--* <issue id="RQ-28i" role="1.1" dg="rq31fix" diff="del">
<p><loc href="&reqs;#scientific-notn" target="reqs">RQ-28 (scientific notation for decimal)</loc></p>
<p>The <dtref ref="&pD;"/> type allows exponential notation.</p>
</issue> *-->

<p><termdef id="dt-precisionDecimal" term="&pD;">The <term>&pD;</term>
datatype represents <phrase diff="del" dg="rq31fix">decimal numbers, together 
with their (arithmetic) precision</phrase><phrase diff="add" dg="rq31fix">the
numeric value and (arithmetic) precision of decimal numbers which retain
precision</phrase>; it also 
includes special values for positive and negative infinity and 
<unusual>not a number</unusual>, and it differentiates
between <unusual>positive zero</unusual> and <unusual>negative
zero</unusual>.</termdef>&nbsp; The special values are
introduced to make the datatype correspond closely to 
<phrase diff="del" dg="rq31fix"><phrase role="UNSURE">decimal datatypes 
whose definition is planned for the
next revision of IEEE/ANSI 754</phrase></phrase><phrase diff="add" dg="rq31fix">the
floating-point decimal datatypes described by the forthcoming
revision of IEEE/ANSI 754</phrase>.</p>

<p>Precision is sometimes given in absolute, sometimes in relative
terms.  <termdef id="dt-arithmetic-precision" term="arithmetic
precision">The <term>arithmetic precision</term> of a value is
expressed in absolute quantitative terms,
<phrase diff="add" dg="rq31fix">by </phrase>indicating
how many digits to the right of the decimal point are significant.</termdef>
<quote>5</quote> has an arithmetic precision of 0, and 
<quote>5.01</quote> an arithmetic precision of 2.
</p>
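<note>
<p>For plain decimal numerals the definition above amounts to counting the
digits after the decimal point, as in the following sketch. The function
name is hypothetical, and numerals in scientific notation are deliberately
not handled.</p>
<eg><![CDATA[
```python
def arithmetic_precision(numeral: str) -> int:
    """Count the digits to the right of the decimal point of a plain numeral."""
    _, _, fraction = numeral.partition(".")
    return len(fraction)
```
]]></eg>
<p>Under this sketch <code>2.0</code> and <code>2.00</code> have arithmetic
precisions 1 and 2 respectively, although their numerical values are
equal.</p>
</note>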

<div4><head>Value Space</head>

<defset><head alt="Properties of &pD; Values">Properties of
<dtref ref="&pD;"/> Values</head>

<vpropdef><name id="vp-pd-numVal">numericalValue</name>
<limits>a &decimal;, <pt>positiveInfinity</pt>,
<pt>negativeInfinity</pt> or <pt>notANumber</pt></limits></vpropdef>

<vpropdef><name id="vp-pd-precision">arithmeticPrecision</name>
<limits>an &integer; or <pt>absent</pt>;
<pt>absent</pt> if and only if <pfref ref="vp-pd-numVal"/> is a <dtref ref="constant"/>.</limits></vpropdef>

<vpropdef><name id="vp-pd-sign">sign</name>
<limits><pt>positive</pt>, <pt>negative</pt>, or <pt>absent</pt>;
must be <pt>positive</pt> if <pfref ref="vp-pd-numVal"/>
is positive or <pt>positiveInfinity</pt>, must be <pt>negative</pt>
if <pfref ref="vp-pd-numVal"/> is negative or <pt>negativeInfinity</pt>,
must be <pt>absent</pt> if and only if <pfref ref="vp-pd-numVal"/> is <pt>notANumber</pt></limits></vpropdef>
</defset>

<note><p>The <pfref ref="vp-pd-sign"/> property is redundant except when <pfref ref="vp-pd-numVal"/>
is zero; in other cases, the <pfref ref="vp-pd-sign"/> value is fully determined by the
<pfref ref="vp-pd-numVal"/> value.<phrase diff="del" dg="rq31fix">&nbsp; 
Code optimization may well make it desirable to separate out the 
<pfref ref="vp-pd-sign"/> and the absolute value of the 
<pfref ref="vp-pd-numVal"/>, which will make implementation easier, 
but the verbal descriptions of such things as equality
and order somewhat more complicated.</phrase></p></note>

<note><p>As explained below, the lexical
representation of the <dtref ref="&pD;"/> value object whose <pfref ref="vp-pd-numVal"/>
is <pt>notANumber</pt> is <string>NaN</string>.&nbsp; Accordingly, in English text we
use <mention>NaN</mention> to refer to that value.&nbsp; Similarly we use <mention>INF</mention>
and <mention>&minus;INF</mention> to refer to the two value objects whose <pfref ref="vp-pd-numVal"/>
is <pt>positiveInfinity</pt> and <pt>negativeInfinity</pt>.&nbsp; These three value objects
are also informally called <quote>not-a-number</quote>, <quote>positive infinity</quote>,
and <quote>negative infinity</quote>.
The latter two together are called
<quote>the infinities</quote>.</p></note>

<p>Equality and order for <dtref ref="&pD;"/> are defined as follows:
<ulist>
<item>
<p>Two numerical <dtref ref="&pD;"/> values
are ordered (or equal) as their
<pfref ref="vp-pd-numVal"/> values are ordered (or equal).&nbsp; 
(This means 
<phrase diff="del" dg="rq31fix">the</phrase><phrase diff="add" dg="rq31fix">that</phrase> 
two zeros with <phrase diff="del" dg="rq31fix">a given 
<pfref ref="vp-pd-precision"/> but</phrase> 
different <pfref ref="vp-pd-sign"/><phrase diff="add" dg="rq31fix">s</phrase> 
are <emph>equal</emph>;
negative zeros are <emph>not</emph> ordered less than positive zeros.)</p></item>
<item diff="del" dg="rq31fix">
<p>A numerical value <var>n</var>
is less than, equal to, or greater than
and a <dtref ref="&pD;"/> value <var>v</var> other than INF, -INF, or NaN
as <var>n</var> is less than, equal to, or greater than
the <pfref ref="vp-pd-numVal"/> of <var>v</var>.
(This comparison is necessary when comparing <dtref ref="&pD;"/>
values to upper and lower bounds.)</p></item>
<item>
<p>INF is equal only to itself, and is greater than
&minus;INF and all numerical <dtref ref="&pD;"/> values.</p></item>
<item>
<p>&minus;INF is equal only to itself, and is less than
INF and all numerical <dtref ref="&pD;"/> values.</p></item>
<item><p>NaN is incomparable with all values, <emph>including
itself</emph>.</p></item>

</ulist>
</p>
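<note>
<p>The equality and order rules listed above can be sketched as a single
comparison function. This is an illustration only: the class
<code>PrecisionDecimal</code>, its field names, and the function
<code>compare</code> are hypothetical, numerical values are modelled as
exact fractions, and the special values are modelled as the strings shown
in the code.</p>
<eg><![CDATA[
```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional, Union


@dataclass(frozen=True)
class PrecisionDecimal:
    # A Fraction, or one of "positiveInfinity", "negativeInfinity", "notANumber".
    numerical_value: Union[Fraction, str]
    arithmetic_precision: Optional[int]   # None stands for "absent"
    sign: Optional[str]                   # "positive", "negative", or None


def compare(a: PrecisionDecimal, b: PrecisionDecimal) -> Optional[int]:
    """Return -1, 0, or 1, or None when the two values are incomparable."""
    x, y = a.numerical_value, b.numerical_value
    if "notANumber" in (x, y):
        return None                       # NaN is incomparable, even with itself
    if x == y:
        return 0                          # equal numbers, or the same infinity
    if x == "negativeInfinity" or y == "positiveInfinity":
        return -1
    if x == "positiveInfinity" or y == "negativeInfinity":
        return 1
    return -1 if x < y else 1             # both numerical: compare the numbers
```
]]></eg>
<p>Only the numerical value takes part in the comparison, so two zeros with
different signs, or two values that differ only in precision, compare as
equal.</p>
</note>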
</div4>

<div4><head>Lexical Mapping</head>

<p><dtref ref="&pD;"/>'s lexical space is the set of all 
decimal numerals with or without a decimal
point, numerals in scientific (exponential) notation, and
the character strings <string>INF</string>,
<string>+INF</string>, <string>-INF</string>,
and <string>NaN</string>.&nbsp; 
The <compref ref="f-lm"/> 
facet can remove any one or two of the three subsets of 
numerals, with corresponding reductions in
the value space.&nbsp; Using this facet
rather than <compref ref="f-p"/> will change the canonical
mapping to ensure that the resulting datatype will still have canonical
representations of all its values.

<defset role="prod"><head>Lexical Space</head>
<prod id="nt-precDecRep">
<lhs>p<phrase diff="del" dg="rq31fix">recision</phrase>DecimalRep</lhs>
<rhs><nt def="nt-noDecNuml"/>&nbsp;| <nt def="nt-decNuml"/>&nbsp;|
<nt def="nt-sciNuml"/>&nbsp;| <nt def="nt-numSpecReps"/></rhs>
</prod>
</defset>
</p>

<p diff="add" dg="rq31fix">The lexical mapping and canonical mapping 
for <dtref ref="&pD;"/> are the following functions:

<defsetsum ref="defs-precDecLexmap"/>
<defsetsum ref="defs-precDecCanmap"/>
</p>
</div4>

<div4>
<head>Simple Type Definition for &pD;</head>
<p>The <compref ref="std"/> of <dtref ref="&pD;"/> is present in every
schema.&nbsp; It has the following properties:</p>

<schemaComp id="pD-def">
<head alt="Simple Type Definition of &pD;"><compref ref="std"/> of 
<dtref ref="&pD;"/></head>

<pvlist>
<pvpair comp="std" prop="name"><string>&pD;</string></pvpair>
<pvpair comp="std" prop="target namespace"><string>http://www.w3.org/2001/XMLSchema</string></pvpair>
<pvpair comp="std" prop="base type definition">The
<dtref ref="anyAtomicType" role="def"/></pvpair>
<pvpair comp="std" prop="final">The empty set</pvpair>
<pvpair comp="std" prop="variety"><pt>atomic</pt></pvpair>
<pvpair comp="std" prop="primitive type definition"><dtref ref="&pD;"/></pvpair>
<pvpair comp="std" prop="facets">{<ulist>
<item><p>a <compref ref="f-w"/> facet with 
<propref comp="f-w" prop="value"/> = <pt>collapse</pt> and
<propref comp="f-w" prop="fixed"/> = <pt>true</pt>
</p>
</item>
<item>
<p>a <compref ref="f-lm"/> facet with the value 
<pt>{nodecimal, decimal, scientific}</pt></p>
</item>
</ulist>}
</pvpair>
<pvpair comp="std" prop="fundamental facets"><p>{<ulist>
<item><p>an <compref ref="ff-o"/> facet
with <propref comp="ff-o" prop="value"/> = <pt>partial</pt></p></item>
<item><p>a <compref ref="ff-b"/> facet
with <propref comp="ff-b" prop="value"/> = <pt>false</pt></p></item>
<item><p>a <compref ref="ff-c"/> facet
with <propref comp="ff-c" prop="value"/> = <pt>countable</pt></p></item>
<item><p>a <compref ref="ff-n"/> facet
with <propref comp="ff-n" prop="value"/> = <pt>true</pt></p></item>
</ulist>}</p>
</pvpair>
<pvpair comp="std" prop="scope"><pt>global</pt></pvpair>
<pvpair comp="std" prop="item type definition"><pt>absent</pt></pvpair>
<pvpair comp="std" prop="member type definitions"><pt>absent</pt></pvpair>
<pvpair comp="std" prop="annotations">The empty sequence</pvpair>
</pvlist>
</schemaComp>
</div4>

<div4>
<head>&CFacet;s</head>
<facets/>
</div4>
</div3>

<div3 role="1.0" id="float">
<head>float</head>

<issue id="RQ-1i" role="1.1">
<p><loc href="&reqs;#canonical-float" target="reqs">RQ-1 (canonical representation of float, double)</loc></p>
<p>The description of canonical representations for float and double needs to be cleaned up.</p>
</issue>

<issue id="RQ-140i" role="1.1">
<p><loc href="&reqs;#negative-positive-zero" target="reqs">RQ-140 (positive and negative zero in float and double)</loc></p>
<p>Two zeros will be provided, similar to those in precisionDecimal.</p>
</issue>

<p>
<termdef id="dt-float" term="float" role="local"><term>float</term>
is patterned after the IEEE single-precision 32-bit floating point type
<bibref ref="ieee754"/>.&nbsp; The basic <termref def="dt-value-space"/> of
<term>float</term> consists of the values
<emph role="eq">m &times; 2^e</emph>, where <emph role="eq">m</emph>
is an integer whose absolute value is less than
<emph role="eq">2^24</emph>, and <emph role="eq">e</emph> is an integer
between -149 and 104, inclusive.&nbsp; In addition to the basic
<termref def="dt-value-space"/> described above, the
<termref def="dt-value-space"/> of <term>float</term> also contains the
following
three
<emph>special values</emph>:
positive and negative infinity and not-a-number
(NaN).
The <phrase diff="del" dg="fa1-fix"><termref def="dt-order-relation"/></phrase><phrase diff="add" dg="fa1-fix">order relation</phrase> on <term>float</term>
is: <emph role="eq">x &lt; y iff y - x</emph> is positive
for x and y in the value space.
Positive infinity is greater than all other non-NaN values.
NaN equals itself but is incomparable with (neither greater than nor less than)
any other value in the <termref def="dt-value-space"/>.

</termdef>
</p>

<note>
<p>

"Equality" in this Recommendation is defined to be "identity" (i.e., values that
are identical in the <termref def="dt-value-space"/> are equal and vice versa).
Identity must be used for the few operations that are defined in this Recommendation.
Applications using any of the datatypes defined in this Recommendation may use different
definitions of equality for computational purposes; <bibref ref="ieee754"/>-based computation systems
are examples. Nothing in this Recommendation should be construed as requiring that
such applications use identity as their equality relationship when computing.

</p>

<p>

Any value incomparable with the value used for the four bounding facets
(<termref def="dt-minInclusive"/>, <termref def="dt-maxInclusive"/>,
<termref def="dt-minExclusive"/>, and <termref def="dt-maxExclusive"/>) will be
excluded from the resulting restricted <termref def="dt-value-space"/>. In particular,
when "NaN" is used as a facet value for a bounding facet, since no other
<term>float</term> values are 
<termref def="dt-incomparable">comparable</termref> with it, 
the result is a <termref def="dt-value-space"/>
either having NaN as its only member (the inclusive cases) or that is empty
(the exclusive cases). If any other value is used for a bounding facet,
NaN will be excluded from the resulting restricted <termref def="dt-value-space"/>;
to add NaN back in requires union with the NaN-only space.

</p>

<p>

This datatype differs from that of <bibref ref="ieee754"/> in that there is only one
NaN and only one zero. This makes the equality and ordering of values in the data
space differ from that of <bibref ref="ieee754"/> only in that for schema purposes NaN = NaN.

</p>
</note>
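<note>
<p>The effect of NaN on the bounding facets, as described in the preceding
note, can be sketched for the two upper-bound facets as follows. The
function names are hypothetical, Python floats stand in for
<code>float</code> values, and the comparison deliberately follows the
rules of this section (NaN equals itself and is otherwise incomparable)
rather than Python's own NaN semantics.</p>
<eg><![CDATA[
```python
import math
from typing import List


def schema_le(a: float, b: float) -> bool:
    """a <= b under the rules above: NaN equals only itself."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a <= b


def schema_lt(a: float, b: float) -> bool:
    """a < b under the rules above: NaN is never less than anything."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a < b


def apply_max_bound(values: List[float], bound: float, inclusive: bool) -> List[float]:
    """Keep the values admitted by maxInclusive (or maxExclusive) = bound."""
    keep = schema_le if inclusive else schema_lt
    return [v for v in values if keep(v, bound)]
```
]]></eg>
<p>With NaN as the bound, the inclusive case keeps only NaN and the
exclusive case keeps nothing; with any other bound, NaN is dropped.</p>
</note>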

<p>
A literal in the <termref def="dt-lexical-space"/> representing a
decimal number <emph role="eq">d</emph> maps to the normalized value
in the <termref def="dt-value-space"/> of <term>float</term> that is
closest to <emph role="eq">d</emph> in the sense defined by
<bibref ref="clinger1990"/>; if <emph role="eq">d</emph> is
exactly halfway between two such values then the even value is chosen.
</p>
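<note>
<p>One way to approximate this mapping is sketched below. It is an
illustration under stated assumptions, not the mapping itself: the function
name <code>nearest_float32</code> is hypothetical, Python's
<code>float()</code> accepts some strings that are not legal
<code>float</code> literals, and the literal is first rounded to a 64-bit
double and then to 32 bits, which in rare cases can differ from rounding
the decimal number directly.</p>
<eg><![CDATA[
```python
import struct


def nearest_float32(literal: str) -> float:
    """Round a decimal literal to a nearby IEEE single-precision value."""
    as_double = float(literal)                 # first rounding: to 64 bits
    packed = struct.pack("<f", as_double)      # second rounding: to 32 bits
    return struct.unpack("<f", packed)[0]      # widen again, exactly
```
]]></eg>
<p>For example, the literal <code>0.1</code> maps to
13421773&nbsp;&times;&nbsp;2^-27, which has the form
<emph role="eq">m &times; 2^e</emph> with the absolute value of
<emph role="eq">m</emph> less than <emph role="eq">2^24</emph>.</p>
</note>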

<div4 role="1.0" id="float-lexical-representation">
<head>Lexical representation</head>
<p>
<term>float</term> values have a lexical representation
consisting of a mantissa followed, optionally, by the character
<string>E</string> or <string>e</string>, 
followed by an exponent.&nbsp; The exponent <termref def="dt-must"/>
be an <dtref ref="integer"/>.&nbsp; The mantissa must be a 
<dtref ref="decimal"/> number.&nbsp; The representations
for exponent and mantissa must follow the lexical rules for
<dtref ref="integer"/> and <dtref ref="decimal"/>.&nbsp; If the 
<string>E</string> or <string>e</string> and
the following exponent are omitted, an exponent value of 0 is assumed.
</p>
<p>
The <emph>special values</emph>
positive
and negative infinity and not-a-number have lexical representations
<code>INF</code>, <code>-INF</code> and
<code>NaN</code>, respectively.
Lexical representations for zero may take a positive or negative sign.
</p>
<p>
For example, <code>-1E4, 1267.43233E12, 12.78e-2, 12, -0, 0</code>
and <code>INF</code> are all legal literals for <term>float</term>.
</p>
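<note>
<p>The lexical rules above can be approximated by a regular expression, as
in the following sketch. The names <code>FLOAT_LITERAL</code> and
<code>is_float_literal</code> are hypothetical, and the pattern is an
informal reading of the prose (a decimal mantissa, an optional integer
exponent, and the three special literals), not a normative grammar.</p>
<eg><![CDATA[
```python
import re

FLOAT_LITERAL = re.compile(
    # mantissa (decimal rules), optionally followed by E or e and an integer exponent
    r"(\+|-)?([0-9]+(\.[0-9]*)?|\.[0-9]+)([Ee](\+|-)?[0-9]+)?"
    # or one of the special values
    r"|-?INF|NaN"
)


def is_float_literal(text: str) -> bool:
    """Report whether the whole string matches the sketched lexical form."""
    return FLOAT_LITERAL.fullmatch(text) is not None
```
]]></eg>
<p>All of the example literals given above are accepted by this pattern; a
string such as <code>1E</code>, which has an exponent marker but no
exponent, is not.</p>
</note>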
</div4>

<div4 role="1.0" id="float-canonical-representation">
<head>Canonical representation</head>
<p diff="del" dg="rq001">
The canonical representation for <term>float</term> is defined by
prohibiting certain options from the
<specref ref="float-lexical-representation"/>.&nbsp; Specifically, the exponent
must be indicated by "E".&nbsp; Leading zeroes and the preceding optional "+" sign
are prohibited in the exponent.
If the exponent is zero, it must be indicated by "E0".
For the mantissa, the preceding optional "+" sign is prohibited
and the decimal point is required.
Leading and trailing zeroes are prohibited subject to the following:
number representations must
be normalized such that there is a single digit
which is non-zero
to the left of the decimal point and at least a single digit to the
right of the decimal point
unless the value being represented is zero. The canonical
representation for zero is 0.0E0.
</p>

<!--* 2005-02-03 MSM sighs at the use of "mantissa" in the following
paragraph, which he believes is not quite right. The dictionary of
mathematics at mathworld.wolfram.com says

    For a real number x, the mantissa is defined as the positive
    fractional part x-\left\lfloor{x}\right\rfloor ={\tt frac(x)}, 
    where \left\lfloor{x}\right\rfloor denotes the floor function. 

But it's what the WG approved.  If an editorial proposal is made
to change it, change the other occurrences, too (e.g. in double).
*-->
<p diff="add" dg="rq001">
NaN has the canonical form <string>NaN</string>.&nbsp; Infinity and
negative infinity have the canonical forms <string>INF</string> and
<string>-INF</string> respectively.&nbsp; Besides these special
values, the general form of the canonical form for float
is a mantissa, which is a decimal, followed by <string>E</string>
followed by an exponent which is an integer.&nbsp; Leading zeroes and
the preceding optional <string>+</string> sign are prohibited in the
exponent.&nbsp; If the exponent is zero it must be indicated by
<string>E0</string>.&nbsp; For the mantissa, the preceding optional
<string>+</string> sign is prohibited and the decimal point is
required.&nbsp; Leading and trailing zeroes are prohibited subject to
the following:  number representations must be normalized such that
there is a single digit which is non-zero to the left of the decimal
point and at least a single digit to the right of the decimal point
unless the value being represented is zero.  The canonical form of
positive zero is <string>0.0E0</string>.&nbsp; The canonical form of negative zero
is <string>-0.0E0</string>.&nbsp; Beyond the one required digit after the decimal point
in the mantissa, there must be as many, but only as many, additional
digits as are needed to uniquely distinguish the value from all other
values for the datatype after rounding.
</p>
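<note>
<p>
As an informal illustration only, the canonical-form rules above can be
sketched in Python.&nbsp; The function names <code>to_float32</code> and
<code>canonical_float</code> are hypothetical and are not defined by this
specification.&nbsp; The sketch assumes that packing a value into a 32-bit
IEEE field adequately models the float value space, and it ignores any
double-rounding effect that may arise because a decimal string is first
parsed as a 64-bit value.
</p>
<eg xml:space="preserve">
```python
import math
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    try:
        # "!f" is the standard-size (4-byte) IEEE 754 binary32 format.
        return struct.unpack("!f", struct.pack("!f", x))[0]
    except OverflowError:
        # Magnitude too large for 32 bits: treat as infinity (an assumption).
        return math.copysign(math.inf, x)


def canonical_float(value):
    """Return a canonical-form string for value (hypothetical helper)."""
    v = to_float32(value)
    if math.isnan(v):
        return "NaN"
    if math.isinf(v):
        return "INF" if v > 0 else "-INF"
    if v == 0.0:
        # copysign exposes the sign bit of a zero.
        return "-0.0E0" if math.copysign(1.0, v) == -1.0 else "0.0E0"
    # Use the fewest significant digits that still identify v uniquely;
    # nine significant digits are always enough for a 32-bit float.
    for digits in range(1, 10):
        text = "%.*e" % (digits - 1, v)
        if to_float32(float(text)) == v:
            break
    mantissa, exponent = text.split("e")
    if "." not in mantissa:
        mantissa += ".0"  # at least one digit after the decimal point
    # int() drops the "+" sign and any leading zeroes of the exponent.
    return mantissa + "E" + str(int(exponent))
```
</eg>
<p>
Under these assumptions, <code>canonical_float(100.0)</code> yields
<code>1.0E2</code>, <code>canonical_float(-12.5)</code> yields
<code>-1.25E1</code>, and <code>canonical_float(0.1)</code> yields
<code>1.0E-1</code>.
</p>
</note>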
</div4>

<div4 role="1.0" id="float-facets">
<head>Constraining facets</head>
<facets/>
</div4>
</div3>

<div3 role="1.0" id="double">
<head>double</head>
<p>
<termdef id="dt-double" term="double" role="local">The <term>double</term>
datatype
is patterned after the

IEEE double-precision 64-bit floating point
type <bibref ref="ieee754"/>.&nbsp; The basic <termref def="dt-value-space"/>
of <term>double</term> consists of the values
<emph role="eq">m &times; 2^e</emph>, where <emph role="eq">m</emph>
is an integer whose absolute value is less than
<emph role="eq">2^53</emph>, and <emph role="eq">e</emph> is an
integer between -1074 and 971, inclusive.&nbsp; In addition to the basic
<termref def="dt-value-space"/> described above, the
<termref def="dt-value-space"/> of <term>double</term> also contains
the following
three
<emph>special values</emph>:

positive and negative infinity and not-a-number
(NaN).
The <phrase diff="del" dg="fa1-fix"><termref def="dt-order-relation"/></phrase><phrase diff="add" dg="fa1-fix">order relation</phrase> on <term>double</term>
is: <emph role="eq">x &lt; y iff y - x</emph> is positive
for x and y in the value space.
Positive infinity is greater than all other non-NaN values.
NaN equals itself but is incomparable with (neither greater than nor less than)
any other value in the <termref def="dt-value-space"/>.

</termdef>
</p>
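<note>
<p>
As an informal illustration only, the <code>m * 2**e</code> form of a
finite value can be recovered in Python, whose built-in
<code>float</code> is assumed here to be an IEEE 64-bit value.&nbsp; The
function name <code>decompose</code> is hypothetical, and reducing
<code>m</code> to an odd integer is a choice made by this sketch, not a
requirement of this specification.
</p>
<eg xml:space="preserve">
```python
import math


def decompose(x):
    """Return integers (m, e) with x == m * 2**e and m odd or zero."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("special values have no m * 2**e form")
    # as_integer_ratio is exact; the denominator is always a power of two.
    numerator, denominator = x.as_integer_ratio()
    m = numerator
    e = -(denominator.bit_length() - 1)
    # Move factors of two from m into the exponent so the pair is unique.
    while m != 0 and m % 2 == 0:
        m //= 2
        e += 1
    return m, e
```
</eg>
<p>
For instance, <code>decompose(6.0)</code> returns <code>(3, 1)</code>
and <code>decompose(0.5)</code> returns <code>(1, -1)</code>.&nbsp; The
sketch does not distinguish positive from negative zero.
</p>
</note>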

<note>
<p>

"Equality" in this Recommendation is defined to be "identity" (i.e., values that
are identical in the <termref def="dt-value-space"/> are equal and vice versa).
Identity must be used for the few operations that are defined in this Recommendation.
Applications using any of the datatypes defined in this Recommendation may use different
definitions of equality for computational purposes; <bibref ref="ieee754"/>-based computation systems
are examples. Nothing in this Recommendation should be construed as requiring that
such applications use identity as their equality relationship when computing.

</p>
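<p>
As an informal illustration only, the difference between identity-based
equality and an IEEE-style comparison for NaN can be sketched in
Python.&nbsp; The function name <code>compare</code> and its return
strings are hypothetical; signed zeros are simply compared with Python's
<code>==</code> operator, which is an assumption of the sketch.
</p>
<eg xml:space="preserve">
```python
import math


def compare(x, y):
    """Return 'eq', 'lt', 'gt' or 'incomparable' for two float values."""
    if math.isnan(x) or math.isnan(y):
        # NaN is identical to itself, and incomparable with anything else.
        if math.isnan(x) and math.isnan(y):
            return "eq"
        return "incomparable"
    if x == y:
        return "eq"
    return "gt" if x > y else "lt"


nan = float("nan")
print(compare(nan, nan))           # identity-based equality
print(nan == nan)                  # IEEE-style comparison in Python
print(compare(nan, 1.0))
print(compare(float("inf"), 1e308))
```
</eg>
<p>
The first call reports <code>eq</code>, whereas Python's own comparison
of NaN with itself is false; a computation system remains free to use the
latter behavior.
</p>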

<p>

Any value incomparable with the value used for the four bounding facets
(<termref def="dt-minInclusive"/>, <termref def="dt-maxInclusive"/>,
<termref def="dt-minExclusive"/>, and <termref def="dt-maxExclusive"/>) will be
excluded from the resulting restricted <termref def="dt-value-s