W3C

XML Schema Part 0: Primer

W3C Working Draft, 7 April 2000

This version:
http://www.w3.org/TR/2000/WD-xmlschema-0-20000407/
Latest version:
http://www.w3.org/TR/xmlschema-0/
Previous version:
http://www.w3.org/TR/2000/WD-xmlschema-0-20000225/
Editor:
David C. Fallside (IBM) fallside@us.ibm.com

Abstract

XML Schema Part 0: Primer is a non-normative document intended to provide an easily readable description of the XML Schema facilities and is oriented towards quickly understanding how to create schemas using the XML Schema language. XML Schema Part 1: Structures and XML Schema Part 2: Datatypes provide the complete normative description of the XML Schema definition language, and the primer describes the language features through numerous examples which are complemented by extensive references to the normative texts.

Status of this Document

The XML Schema Part 0: Primer is a part of the W3C XML Activity.

This is a public working draft of XML Schema 1.0 for review by the public and by members of the World Wide Web Consortium. The XML Schema Working Group has agreed to its publication. Note that some sections of this draft may not be up-to-date with the XML Schema language described in Parts 1 and 2 of the XML Schema specification. Known discrepancies are noted in the text.

The Working Group does not anticipate further substantial changes to the syntax described here, although this is still a working draft, and is subject to change based on experience and on comment by the public, and other W3C working groups.

A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".

Table of contents

1 Introduction

2 Basic Concepts: The Purchase Order
2.1 The Purchase Order Schema
2.2 Complex Type Definitions, Element & Attribute Declarations
2.3 Simple Types
2.4 Anonymous Type Definitions
2.5 Element Content
2.5.1 Complex Types from Simple Types
2.5.2 Empty Content
2.5.3 Mixed Content
2.5.4 Default Content
2.6 Annotations
2.7 Building Content Models
2.8 Attribute Groups
2.9 Null Values

3 Advanced Concepts I: Namespaces, Schemas & Qualification
3.1 Target Namespaces & Unqualified Locals
3.2 Qualified Locals
3.3 Global vs. Local Declarations
3.4 Undeclared Target Namespaces

4 Advanced Concepts II: The International Purchase Order
4.1 A Schema in Multiple Documents
4.2 Deriving Types by Extension
4.3 Using Derived Types in Instance Documents
4.4 Deriving Complex Types by Restriction
4.5 Equivalence Classes
4.6 Abstract Elements and Types
4.7 Preventing the Creation and Use of Derived Types

5 Advanced Concepts III: The Quarterly Report
5.1 Specifying Uniqueness
5.2 Defining Keys and their References
5.3 XML Schema Constraints vs. XML 1.0 ID Attributes
5.4 Importing Types
5.5 Any Element, Any Attribute
5.6 schemaLocation
5.7 Conformance

Appendices

A. Acknowledgements
B. Simple Types & Their Facets
C. Regular Expressions
D. Index
E. Document History


1 Introduction

This document, XML Schema Part 0: Primer, provides an easily approachable description of the XML Schema definition language, and should be used alongside the formal descriptions of the language contained in Parts 1 and 2 of the XML Schema specification. The intended audience of this document includes application developers whose programs read and write schema documents, and schema authors who need to know about the features of the language, especially features that provide functionality above and beyond what is provided by DTDs. The text assumes that you have a basic understanding of XML 1.0 and XML-Namespaces. Each major section of the primer introduces new features of the language, and describes the features in the context of concrete examples.

Section 2 covers the basic mechanisms of XML Schema. It describes how to declare the elements and attributes that appear in XML documents, the distinctions between simple and complex types, defining complex types, the use of simple types for element and attribute values, schema annotation, a simple mechanism for re-using element and attribute definitions, and null values.

Section 3, the first advanced section in the primer, explains the basics of how namespaces are used in XML and schema documents. This section is important for understanding many of the topics that appear in the other advanced sections.

Section 4, the second advanced section in the primer, describes mechanisms for deriving types from existing types, and for controlling these derivations. The section also describes mechanisms for merging together fragments of a schema from multiple sources, and for element substitution.

Section 5 covers more advanced features, including a mechanism for specifying uniqueness among attributes and elements, a mechanism for using types across namespaces, a mechanism for extending types based on namespaces, and a description of how documents are checked for conformance.

In addition to the sections just described, the primer contains a number of appendices that provide detailed reference information on simple types and an associated regular expression language.

The primer is a non-normative document, which means that it does not provide a definitive (from the W3C's point of view) specification of the XML Schema language. The examples and other explanatory material in this document are provided to help you understand XML Schema, but they may not always provide definitive answers. In such cases, you will need to refer to the XML Schema specification, and to help you do this, we provide many links pointing to the relevant parts of the specification. More specifically, XML Schema items mentioned in the primer text are linked to an index of element names and attributes, and a summary table of datatypes, both in the primer. The table and the index contain links to the relevant sections of XML Schema parts 1 and 2.

2 Basic Concepts: The Purchase Order

The purpose of a schema is to define a class of XML documents, and so the term "instance document" is often used to describe an XML document that conforms to a particular schema. In fact, neither instances nor schemas need to exist as documents per se -- they may exist as streams of bytes sent between applications, as fields in a database record, or as collections of XML Infoset "Information Items" -- but to simplify the primer, we have chosen to always refer to instances and schemas as if they are files.

Let us start by considering an instance document in a file called po.xml. It describes a purchase order generated by a home products ordering and billing application:

The Purchase Order, po.xml
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
    </shipTo>
    <billTo country="US">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
    </billTo>
    <comment>Hurry, my lawn is going wild!</comment>
    <items>
        <item partNum="872-AA">
            <productName>Lawnmower</productName>
            <quantity>1</quantity>
            <price>148.95</price>
            <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
            <productName>Baby Monitor</productName>
            <quantity>1</quantity>
            <price>39.98</price>
            <shipDate>1999-05-21</shipDate>
        </item>
    </items>
</purchaseOrder>

The purchase order consists of a main element, purchaseOrder, and the subelements shipTo, billTo, and items. These subelements in turn contain other subelements, and so on, until a subelement such as price contains a number rather than any subelements. Elements that contain subelements or carry attributes are said to have complex types, whereas elements that contain numbers (and strings, and dates, etc) but do not contain any subelements are said to have simple types. Some elements have attributes; attributes always have simple types.

The complex types in the instance document, and some of the simple types, are defined in the schema for purchase orders. The other simple types are defined as part of XML Schema's repertoire of built-in simple types.

Before going on to examine the purchase order schema, we digress briefly to mention the association between the instance document and the purchase order schema. As you can see by inspecting the instance document, the purchase order schema is not mentioned. An instance is not actually required to reference a schema, and although many will, we have chosen to keep this first section simple, and to assume that any processor of the instance document can obtain the purchase order schema without any information from the instance document. In later sections, we will introduce explicit mechanisms for associating instances and schemas.

2.1 The Purchase Order Schema

The purchase order schema is contained in the file po.xsd:

The Purchase Order Schema, po.xsd
<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">

 <xsd:annotation>
  <xsd:documentation>
   Purchase order schema for Example.com.
   Copyright 2000 Example.com. All rights reserved.
  </xsd:documentation>
 </xsd:annotation>

 <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

 <xsd:element name="comment" type="xsd:string"/>

 <xsd:complexType name="PurchaseOrderType">
  <xsd:element name="shipTo" type="Address"/>
  <xsd:element name="billTo" type="Address"/>
  <xsd:element ref="comment" minOccurs="0"/>
  <xsd:element name="items"  type="Items"/>
  <xsd:attribute name="orderDate" type="xsd:date"/>
 </xsd:complexType>

 <xsd:complexType name="Address">
  <xsd:element name="name"   type="xsd:string"/>
  <xsd:element name="street" type="xsd:string"/>
  <xsd:element name="city"   type="xsd:string"/>
  <xsd:element name="state"  type="xsd:string"/>
  <xsd:element name="zip"    type="xsd:decimal"/>
  <xsd:attribute name="country" type="xsd:NMTOKEN"
                 use="fixed" value="US"/>
 </xsd:complexType>

 <xsd:complexType name="Items">
  <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
   <xsd:complexType>
    <xsd:element name="productName" type="xsd:string"/>
    <xsd:element name="quantity">
     <xsd:simpleType base="xsd:positiveInteger">
      <xsd:maxExclusive value="100"/>
     </xsd:simpleType>
    </xsd:element>
    <xsd:element name="price"    type="xsd:decimal"/>
    <xsd:element ref="comment"   minOccurs="0"/>
    <xsd:element name="shipDate" type="xsd:date" minOccurs='0'/>
    <xsd:attribute name="partNum" type="Sku"/>
   </xsd:complexType>
  </xsd:element>
 </xsd:complexType>

 <xsd:simpleType name="Sku" base="xsd:string">
  <xsd:pattern value="\d{3}-[A-Z]{2}"/>
 </xsd:simpleType>

</xsd:schema>

The purchase order schema consists of a schema element and a variety of subelements, most notably element, complexType, and simpleType which determine the appearance of elements and their content in instance documents.

Each of the elements in the schema has a prefix xsd: which is associated with the XML Schema namespace through the declaration, xmlns:xsd="http://www.w3.org/1999/XMLSchema", that appears in the schema element. The prefix xsd: is used by convention to denote the XML Schema namespace, although any prefix can be used. The same prefix, and hence the same association, also appears on the names of built-in simple types, e.g. xsd:string. The purpose of the association is to identify the elements and simple types as belonging to the vocabulary of the XML Schema language rather than the vocabulary of the schema author. For the sake of clarity in the text, we just mention the names of elements and simple types (e.g. simpleType), and omit the prefix.

2.2 Complex Type Definitions, Element & Attribute Declarations

In XML Schema, there is a basic difference between complex types which allow elements in their content and may carry attributes, and simple types which cannot have element content and cannot carry attributes. There is also a major distinction between definitions which create new types (both simple and complex), and declarations which enable the appearance in document instances of elements or attributes with specific names and types (both simple and complex). In this section, we focus on defining complex types and declaring the elements and attributes that appear within them.

New complex types are defined using the complexType element and such definitions typically contain a set of element declarations, element references, and attribute declarations. The declarations are not themselves types, but rather an association between a name and constraints which govern the appearance of that name in documents governed by the associated schema. Elements are declared using the element element, and attributes are declared using the attribute element. For example, Address is defined as a complex type, and within the definition of Address we see five element declarations and one attribute declaration:

Defining the Address Type
<xsd:complexType name="Address" >
    <xsd:element name="name"   type="xsd:string" />
    <xsd:element name="street" type="xsd:string" />
    <xsd:element name="city"   type="xsd:string" />
    <xsd:element name="state"  type="xsd:string" />
    <xsd:element name="zip"    type="xsd:decimal" />
    <xsd:attribute name="country" type="xsd:NMTOKEN"
                   use="fixed" value="US"/>
</xsd:complexType>

The consequence of this definition is that any element appearing in an instance whose type is declared to be Address (e.g. shipTo in po.xml) must consist of five elements and one attribute. These elements must be called name, street, city, state and zip as specified by the values of the declarations' name attributes. The first four of these elements will each contain a string, and the fifth will contain a decimal number. The element whose type is declared to be Address may appear with an attribute called country which must contain the string US.

The Address definition contains only declarations involving simple types: string, decimal and NMTOKEN. In contrast, the PurchaseOrderType definition contains element declarations involving complex types, e.g. Address. Note, however, that both kinds of declaration use the same type attribute to identify the type, regardless of whether the type is simple or complex.

Defining PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
    <xsd:element   name="shipTo"    type="Address" />
    <xsd:element   name="billTo"    type="Address" />
    <xsd:element   ref="comment"    minOccurs="0" />
    <xsd:element   name="items"     type="Items" />
    <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>

In defining PurchaseOrderType, two of the element declarations, for shipTo and billTo, associate different element names with the same complex type, namely Address. The consequence of this definition is that any element appearing in an instance (e.g. po.xml) whose type is declared to be PurchaseOrderType must consist of elements named shipTo and billTo, each containing the five subelements (name, street, city, state and zip) that were declared as part of Address. The shipTo and billTo elements may also carry the country attribute that was declared as part of Address.

The PurchaseOrderType definition contains an orderDate attribute declaration which, like the country attribute declaration, identifies a simple type. In fact, all attribute declarations must reference simple types because, unlike element declarations, attributes cannot contain other elements or other attributes.

The element declarations we have described so far have each associated a name with an existing type definition. Sometimes it is preferable to use an existing element rather than declare a new element, for example:

<xsd:element ref="comment" minOccurs="0" />

This declaration references an existing element, comment, that was declared elsewhere in the purchase order schema. In general, the value of the ref attribute must reference a global element, i.e. one that has been declared under schema rather than as part of a complex type definition. The consequence of this declaration is that an element called comment may appear in an instance document, and its content must be consistent with that element's type, in this case, string.

Both elements and attributes may be declared globally. comment is one example of a global element which we reference from an element declaration contained in the PurchaseOrderType definition. We could similarly declare attributes under schema, and reference them using the ref attribute from attribute declarations contained in type definitions.
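
As a minimal sketch of the last point, the orderDate attribute could be declared under schema and then referenced from PurchaseOrderType. This arrangement is an assumption made for illustration and is not how po.xsd is written; the reference simply mirrors the element ref syntax shown above.

A Globally Declared Attribute and a Reference to It (illustrative)
<!-- declared directly under schema, so the attribute is global -->
<xsd:attribute name="orderDate" type="xsd:date"/>

<!-- inside the type definition, ref points at the global declaration -->
<xsd:complexType name="PurchaseOrderType">
    <xsd:element   name="shipTo" type="Address"/>
    <xsd:element   name="billTo" type="Address"/>
    <xsd:element   ref="comment" minOccurs="0"/>
    <xsd:element   name="items"  type="Items"/>
    <xsd:attribute ref="orderDate"/>
</xsd:complexType>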

The comment element is optional within PurchaseOrderType because the value of the minOccurs attribute in its declaration is 0. An element is required to appear when the value of minOccurs is 1. The maximum number of times an element may appear is determined by the value of a maxOccurs attribute in its declaration. This may be a positive integer value such as 41, or the term unbounded to indicate there is no maximum number of occurrences. The default value for minOccurs is 1, but there is no default value for maxOccurs per se: When an element is declared without a maxOccurs attribute, the maximum number of the element's occurrences is equal to the value of the minOccurs attribute. If this value is also omitted, the element must appear exactly once.
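
The declarations below sketch how these rules combine. They are illustrative variations on the comment reference and are not part of po.xsd; the numbers 2 and 3 are arbitrary, and the comments restate the rules as described in the paragraph above.

Occurrence Constraints on Elements (illustrative)
<!-- may be absent, may appear up to three times -->
<xsd:element ref="comment" minOccurs="0" maxOccurs="3"/>

<!-- no maxOccurs: the maximum equals minOccurs, so exactly two occurrences -->
<xsd:element ref="comment" minOccurs="2"/>

<!-- minOccurs defaults to 1: one or more occurrences -->
<xsd:element ref="comment" maxOccurs="unbounded"/>

<!-- neither attribute given: exactly one occurrence -->
<xsd:element ref="comment"/>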

Attributes may appear once or not at all (the default), and so the syntax for specifying occurrences of attributes is different from the syntax for elements. In particular, a use attribute is used in an attribute declaration to indicate whether the attribute is required or optional, and if optional whether the attribute's value is fixed or whether there is a default. A second attribute, value, provides any value that is called for. To illustrate, po.xsd contains a declaration for the country attribute, which is declared with use and value values of fixed and US respectively. This declaration means that the appearance of a country attribute is optional, although its value must be US if it does appear, and if it does not appear, a schema processor will create a country attribute with this value.
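
For comparison, here is a sketch of three alternative declarations that differ only in their use and value attributes. The first two are hypothetical variations that do not appear in po.xsd; the third repeats the country declaration from po.xsd.

Occurrence Constraints on Attributes (illustrative)
<!-- must appear, with any value that is a legal date -->
<xsd:attribute name="orderDate" type="xsd:date" use="required"/>

<!-- may be omitted; if omitted, its value is taken to be US -->
<xsd:attribute name="country" type="xsd:NMTOKEN" use="default" value="US"/>

<!-- may be omitted; if present, its value must be US -->
<xsd:attribute name="country" type="xsd:NMTOKEN" use="fixed" value="US"/>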

The values of the attributes used in element and attribute declarations to constrain the occurrences of elements and attributes are summarised in Table 1.

Table 1. Occurrence Constraints for Elements and Attributes
Elements: (minOccurs, maxOccurs) fixed, default | Attributes: use, value | Notes
(1, 1) -, - | required, - | element/attribute must appear once, it may have any value
(1, 1) 37, - | required, 37 | element/attribute must appear once, its value must be 37
(2, unbounded) 37, - | n/a | element must appear twice or more, its value must be 37; in general, minOccurs and maxOccurs' values may be positive integers, and maxOccurs' value may also be "unbounded"
(0, 1) -, - | optional | element/attribute may appear once, it may have any value
(0, 1) 37, - | fixed, 37 | element/attribute may appear once, if it does appear its value must be 37
(0, 1) -, 37 | default, 37 | element/attribute may appear once; if it does not appear its value is 37, otherwise its value is that given
(0, 2) -, 37 | n/a | element may appear once, twice, or not at all; if it does not appear its value is 37, otherwise its value is that given; in general, minOccurs and maxOccurs' values may be positive integers, and maxOccurs' value may also be "unbounded"
(0, 0) -, - | prohibited, - | element/attribute must not appear

So far, we have described how to define new complex types (e.g. PurchaseOrderType), and declare elements (e.g. purchaseOrder) and attributes (e.g. orderDate). These activities generally involve naming, and the question naturally arises: What happens if two things are given the same name? The answer depends upon the two things in question, although in general the more similar the two things are, the more likely there is to be a conflict.

Here are some examples to illustrate when identical names cause problems. If the two things are both types, say you define a complex type called US-States and a simple type called US-States, there is a conflict. If the two things are a type and an element or attribute, say you define a complex type called Address and declare an element called Address, there is no conflict. If the two things are elements within different types (i.e. not global elements), say you declare one element called name as part of the Address type and a second element called name as part of the Item type, there is no conflict. (Such elements are sometimes called local element declarations.) Finally, if the two things are both types, one defined by you and the other by XML Schema, say you define a simple type called decimal, there is no conflict. The reason for the apparent contradiction in the last example is that the two types belong to different namespaces. We'll explore the use of schemas and namespaces in a later section.

2.3 Simple Types

The purchase order schema declares several elements and attributes that have simple types. Some of these simple types, such as string and decimal, are built in to XML Schema, while others are derived from the built-in types. For example, the partNum attribute has a type called Sku that is derived from string. Both built-in simple types and their derivations can be used in all element and attribute declarations. Table 2 lists all the simple types built in to XML Schema, along with an example of each type.

Table 2. Simple Types Built-In to XML Schema
Simple Type | Example(s) | Notes
string | Confirm this is electric |
boolean | true, false, 1, 0 |
float | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to single-precision 32-bit floating point, NaN is "not a number"
double | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to double-precision 64-bit floating point
decimal | -1.23, 0, 123.4, 1000.00 |
timeInstant | 1999-05-31T13:20:00.000-05:00 | May 31st 1999 at 1.20pm Eastern Standard Time which is 5 hours behind Co-Ordinated Universal Time
timePeriod | 1999-05-31T13:20 |
month | 1999-05 | May 1999
year | 1999 | 1999
century | 19 | the 1900's
recurringDate | --05-31 | every May 31st
recurringDay | ----31 | every 31st day
timeDuration | P1Y2M3DT10H30M12.3S | 1 year, 2 months, 3 days, 10 hours, 30 minutes, 12.3 seconds
recurringDuration | --05-31T13:20:00 | May 31st every year at 1.20pm Co-Ordinated Universal Time, format similar to timeInstant
binary | 100010 |
uriReference | http://www.example.com/, http://www.example.com/doc.html#ID5 |
ID | | XML 1.0 ID attribute type
IDREF | | XML 1.0 IDREF attribute type
ENTITY | | XML 1.0 ENTITY attribute type
NOTATION | | XML 1.0 NOTATION attribute type
language | en-GB, en-US, fr | valid values for xml:lang as defined in XML 1.0
IDREFS | | XML 1.0 IDREFS attribute type
ENTITIES | | XML 1.0 ENTITIES attribute type
NMTOKEN | US | XML 1.0 NMTOKEN attribute type
NMTOKENS | US UK | XML 1.0 NMTOKENS attribute type
Name | shipTo | XML 1.0 Name type
QName | po:Address | XML Namespace QName
NCName | Address | XML Namespace NCName, i.e. a QName without the prefix and colon
integer | -126789, -1, 0, 1, 126789 |
nonPositiveInteger | -126789, -1, 0 |
negativeInteger | -126789, -1 |
long | -1, 12678967543233 |
int | -1, 126789675 |
short | -1, 12678 |
byte | -1, 126 |
nonNegativeInteger | 0, 1, 126789 |
unsignedLong | 0, 12678967543233 |
unsignedInt | 0, 1267896754 |
unsignedShort | 0, 12678 |
unsignedByte | 0, 126 |
positiveInteger | 1, 126789 |
date | 1999-05-31, ---05 | 5th day of every month
time | 13:20:00.000, 13:20:00.000-05:00 |

Note that to retain compatibility between XML Schema and XML 1.0 DTDs, the simple types ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS should only be used in attributes.

New simple types are defined by derivation from existing simple types (built-in and derived) through a technique called restriction. A new type must have a name different from the existing type, and the new type may constrain the legal range of values obtained from the existing type. We use the simpleType element to define a new simple type, and we can constrain its values by applying one or more "facets". A complete listing of facets is provided in Appendix B.

Suppose we wish to create a new type of integer called myInteger whose range of values is between 1 and 99 (inclusive). We base our definition on the built-in simple type integer, whose range of values also includes integers less than 1 and greater than 99. To define myInteger, we limit the range of the integer base type by employing two facets, minInclusive and maxInclusive:

Defining myInteger, Range 1-99
<xsd:simpleType name="myInteger" base="xsd:integer">
  <xsd:minInclusive value="1"/>
  <xsd:maxInclusive value="99"/>
</xsd:simpleType>

The example shows one particular combination of a base type and two facets used to define myInteger, but a look at the list of built-in simple types and their facets should suggest other viable combinations.
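
As one further sketch of such a combination, the type below restricts decimal using the same two facets. The name Percentage and its bounds are assumptions chosen for illustration; the type is not part of po.xsd.

Defining Percentage, Range 0-100 (illustrative)
<!-- hypothetical type: a decimal between 0 and 100, both inclusive -->
<xsd:simpleType name="Percentage" base="xsd:decimal">
  <xsd:minInclusive value="0"/>
  <xsd:maxInclusive value="100"/>
</xsd:simpleType>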

The purchase order schema contains another, more elaborate, example of a simple type definition. A new simple type called Sku (shorthand for a product number) is derived from the simple type string. Furthermore, we constrain the values of Sku using a facet called pattern in conjunction with the regular expression "\d{3}-[A-Z]{2}" that is read "three digits followed by a hyphen followed by two upper-case letters":

Defining the Simple Type "Sku"
<xsd:simpleType name="Sku" base="xsd:string">
  <xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:simpleType>

This regular expression language is described more fully in Appendix C.
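
To make the pattern concrete, the sketch below declares a hypothetical element, partNumber, of type Sku and contrasts a few values against the pattern. The element name and the non-conforming values are made up for illustration; the value 872-AA is taken from po.xml.

Values Checked Against the Sku Pattern (illustrative)
<!-- hypothetical declaration, not part of po.xsd -->
<xsd:element name="partNumber" type="Sku"/>

<partNumber>872-AA</partNumber>   <!-- conforms: three digits, a hyphen, two upper-case letters -->
<partNumber>872-aa</partNumber>   <!-- does not conform: the letters are lower-case -->
<partNumber>87-AA</partNumber>    <!-- does not conform: only two digits -->
<partNumber>872AA</partNumber>    <!-- does not conform: the hyphen is missing -->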

XML Schema defines fourteen facets, which are listed in full in Appendix B. Among these, the enumeration facet is one of the most useful, and it can be used to constrain the values of almost every simple type, except the boolean type. The enumeration facet limits a simple type to a set of distinct values. For example, we can use the enumeration facet to define a new simple type called US-State, derived from string, whose value must be one of the standard US state abbreviations:

Using the Enumeration Facet
<xsd:simpleType name="US-State" base="xsd:string">
  <xsd:enumeration value="AK"/>
  <xsd:enumeration value="AL"/>
  <xsd:enumeration value="AR"/>
  <!-- and so on ... -->
</xsd:simpleType>

US-State would be a good replacement for the string type currently used in the state element declaration. By making this replacement, the legal values of a state element, i.e. the state subelements of billTo and shipTo, would be limited to one of AK, AL, AR, etc. Note that the enumeration values specified for a particular type must be unique.
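
A sketch of that replacement follows. Only the type attribute of the state declaration in Address changes; the instance values are illustrative, and the first conforms only because AK is among the enumerated values.

Using US-State in the state Declaration (illustrative)
<!-- in the Address type, replacing type="xsd:string" -->
<xsd:element name="state" type="US-State"/>

<state>AK</state>       <!-- conforms: AK is one of the enumerated values -->
<state>Alaska</state>   <!-- does not conform: not one of the enumerated values -->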

The majority of simple types described in Table 2 are so-called atomic types, for example, decimal and NMTOKEN. The values of atomic types are indivisible from XML Schema's point of view. In contrast, XML Schema has three built-in list types whose values are sequences of atomic values. For example, NMTOKENS is a list type, and an element of this type would contain a white-space delimited list of NMTOKEN values, such as "US UK FR". The three built-in list types are NMTOKENS, IDREFS, and ENTITIES.

In addition to using the built-in list types, you can create new list types by derivation from existing atomic types. (You cannot create list types from complex types.) For example, to create a list of myInteger values:

<xsd:simpleType name='ListOfMyIntType' base='myInteger' derivedBy='xsd:list'/>

And an element in an instance document whose content conforms to ListOfMyIntType is:

<listOfMyInt>47 25 99 3 25 1</listOfMyInt>

Several facets can be applied in the derivation of a new list type: length, minLength, maxLength, and enumeration. For example, to create a list of exactly six US states, we can derive a new list type from the US-State base type, defined above, and constrain the number of items to six:

List Type for Six US States
<xsd:simpleType name="SixUS-States" base="US-State" derivedBy="xsd:list">
  <xsd:length value="6"/>
</xsd:simpleType>

Elements declared as having this type must have six items, and each of the six items is one of the (atomic) values of the enumerated type US-State, for example:

<sixStates>PA NY CA NY LA AK</sixStates>
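
The other length-related facets can be used in the same way. As a sketch, the hypothetical type below accepts between one and three states; the type name OneToThreeUS-States and the element name someStates are assumptions, not part of the purchase order schema.

List Type for One to Three US States (illustrative)
<xsd:simpleType name="OneToThreeUS-States" base="US-State" derivedBy="xsd:list">
  <xsd:minLength value="1"/>
  <xsd:maxLength value="3"/>
</xsd:simpleType>

<someStates>PA NY</someStates>         <!-- two items: conforms -->
<someStates>PA NY CA AK</someStates>   <!-- four items: does not conform -->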

Note that it is possible to derive a list type from the atomic type string. However, a string may contain white space, and white space delimits the items in a list type, so you should be careful using fixed length list types whose base type is string. For example, suppose a list type is defined with a length facet equal to 3, and base type string, then the following 3 item list is legal:

Asia Europe Africa

But the following 3 "item" list is illegal:

Asia Europe South America

Even though "South America" may exist as a single string outside of the list, when it is included in the list, the whitespace between South and America effectively creates a fourth item, and so the latter example will not conform to the 3-item list type.
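
Written out as a sketch, the list type in this example might look as follows. The type name ThreeContinents and the element name continents are hypothetical.

A Fixed-Length List of Strings (illustrative)
<xsd:simpleType name="ThreeContinents" base="xsd:string" derivedBy="xsd:list">
  <xsd:length value="3"/>
</xsd:simpleType>

<continents>Asia Europe Africa</continents>          <!-- three items: conforms -->
<continents>Asia Europe South America</continents>   <!-- four items: does not conform -->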

2.4 Anonymous Type Definitions

Schemas can be constructed by defining sets of named types such as PurchaseOrderType and then declaring elements such as purchaseOrder that reference the types using the type= construction. This style of schema construction is straightforward but it can be unwieldy, especially if you define many types that are referenced only once and contain very few constraints. In these cases, a type can be more succinctly defined as an anonymous type which saves the overhead of having to be named and explicitly referenced.

The definition of the type Items in po.xsd contains two element declarations that use anonymous types (item and quantity). In general, you can identify anonymous types by the lack of a "type=" in the element (or attribute) declaration, and the declaration containing an un-named type definition:

Two Anonymous Type Definitions
<xsd:complexType name="Items">
  <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
    <xsd:complexType>
      <xsd:element name="productName" type="xsd:string"/>
      <xsd:element name="quantity">
        <xsd:simpleType base="xsd:positiveInteger">
          <xsd:maxExclusive value="100"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="price" type="xsd:decimal"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="shipDate" type="xsd:date" minOccurs='0'/>
      <xsd:attribute name="partNum" type="Sku"/>
    </xsd:complexType>
  </xsd:element>
</xsd:complexType>

The item element has an anonymous complex type consisting of the elements productName, quantity, price, comment, and shipDate, and an attribute called partNum. The quantity element has an anonymous simple type derived from positiveInteger whose value ranges between 1 and 99.
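
For contrast, here is a sketch of how the quantity declaration would look if its type were named rather than anonymous. The name QuantityType is hypothetical; po.xsd uses the anonymous form shown above.

A Named Equivalent of the Anonymous quantity Type (illustrative)
<!-- defined under schema, so that it can be referenced by name -->
<xsd:simpleType name="QuantityType" base="xsd:positiveInteger">
  <xsd:maxExclusive value="100"/>
</xsd:simpleType>

<!-- inside the item type, the declaration now needs a type attribute -->
<xsd:element name="quantity" type="QuantityType"/>

The named form costs an extra name and an explicit reference, which is the overhead the anonymous form avoids when a type is used only once.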

2.5 Element Content

The purchase order schema has many examples of elements containing other elements (e.g. items), elements having attributes and containing other elements (e.g. shipTo), and elements containing only a simple type of value (e.g. price). However, we have not seen an element having attributes but containing only a simple type of value, nor have we seen an element that contains other elements mixed with character content, nor have we seen an element that has no content at all. In this section we'll examine these variations in the content models of elements.

2.5.1 Complex Types from Simple Types

Let us first consider how to declare an element that has an attribute and contains a simple value. In an instance document, such an element might appear as:

<internationalPrice currency='EU'>423.46</internationalPrice>

The purchase order schema declares a price element that is a starting point:

<xsd:element name="price" type="xsd:decimal"/>

Now, how do we add an attribute to this element? As we have said before, simple types cannot have attributes, and decimal is a simple type. Therefore, we must define a complex type to carry the attribute declaration. We also want the content to be simple type decimal. So our original question becomes: How do we define a complex type that is based on the simple type decimal? The answer is to derive a new complex type from the simple type decimal:

Deriving a Complex Type from a Simple Type
<xsd:element name='internationalPrice'>
    <xsd:complexType base='xsd:decimal' derivedBy='extension'>
        <xsd:attribute name='currency' type='xsd:string' />
    </xsd:complexType>
</xsd:element>

We use the complexType element to define the new (anonymous) type, and we refer to decimal in the base attribute to indicate it is the simple type from which we are deriving the new type. We add a currency attribute using a standard attribute declaration, and because we want to add this attribute to the simple type, we must signal our intent by stating derivedBy='extension'. (We cover type derivation in detail in Section 4.) The internationalPrice element declared in this way will appear in an instance as shown in the example above.
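
The same derivation can also be given a name so that several elements can share it. The following is a minimal sketch, assuming the draft syntax used above; the type name InternationalPrice and the element name internationalShippingCost are hypothetical and do not appear in the purchase order schema:

A Named Complex Type Derived from a Simple Type (illustrative)
<xsd:complexType name='InternationalPrice' base='xsd:decimal' derivedBy='extension'>
    <xsd:attribute name='currency' type='xsd:string' />
</xsd:complexType>

<xsd:element name='internationalPrice'        type='InternationalPrice'/>
<xsd:element name='internationalShippingCost' type='InternationalPrice'/>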

2.5.2 Empty Content

Now suppose that we want the price element to convey both the unit of currency and the price as attribute values rather than as separate attribute and content values. For example:

<internationalPrice currency='EU' value='423.46' />

Such an element has no content at all; we say that its content model is empty:

An Empty Complex Type
<xsd:element name='internationalPrice'>
    <xsd:complexType content='empty'>
        <xsd:attribute name='currency' type='xsd:string' />
        <xsd:attribute name='value'    type='xsd:decimal' />
    </xsd:complexType>
</xsd:element>
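
To make the consequence concrete, the following instance fragments sketch what an empty content model would accept and reject, assuming the declaration above:

Instances Checked Against the Empty Complex Type (illustrative)
<!-- accepted: no content, both values carried as attributes -->
<internationalPrice currency='EU' value='423.46' />
<internationalPrice currency='EU' value='423.46'></internationalPrice>

<!-- rejected: character content is not allowed in an empty content model -->
<internationalPrice currency='EU' value='423.46'>423.46</internationalPrice>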

2.5.3 Mixed Content

The construction of the purchase order schema may be characterized as elements containing subelements, and the deepest subelements contain character data. XML Schema also provides for the construction of schemas where character data can appear alongside subelements, and character data is not confined to the deepest subelements. The latter style of construction is enabled through the mixed value of the content attribute.

To illustrate, consider the following snippet from a customer letter that uses some of the same elements as the purchase order:

Snippet of Customer Letter
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby
Monitor</productName> shipped from our warehouse on
<shipDate>1999-05-21</shipDate>. ....
</letterBody>

Notice the text appearing between elements and their child elements. Specifically, text appears between the elements salutation, quantity, productName and shipDate which are all children of letterBody, and text appears around the element name which is the child of a child of letterBody. The following snippet of a schema declares letterBody:

Snippet of Schema for Customer Letter
<xsd:element name='letterBody'>
    <xsd:complexType content='mixed'>
        <xsd:element name='salutation'>
            <xsd:complexType content='mixed'>
                <xsd:element name='name' type='xsd:string'/>
            </xsd:complexType>
        </xsd:element>
        <xsd:element name='quantity' type='xsd:positiveInteger'/>
        <xsd:element name='productName' type='xsd:string'/>
        <xsd:element name='shipDate' type='xsd:date' minOccurs='0'/>
        <!-- etc -->
    </xsd:complexType>
</xsd:element>

Note that the mixed model in XML Schema differs fundamentally from the mixed model in XML 1.0. Under the XML Schema mixed model, the order and number of child elements appearing in an instance must agree with the order and number of child elements specified in the model. In contrast, under the XML 1.0 mixed model, the order and number of child elements appearing in an instance cannot be constrained. In sum, XML Schema provides full schema validation of mixed models in contrast to the partial schema validation provided by XML 1.0.
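
As a hedged illustration of this difference, consider a variant of the letter in which productName precedes quantity. Assuming the letterBody declaration above, the child elements no longer follow the declared order, so the fragment would not be schema valid, even though an XML 1.0 mixed content declaration could not rule it out:

Customer Letter with Child Elements Out of Order (illustrative)
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your <productName>Baby Monitor</productName>, quantity
<quantity>1</quantity>, shipped from our warehouse on
<shipDate>1999-05-21</shipDate>. ....
</letterBody>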

2.5.4 Default Content

In previous sections, we have defined new complex types without reference to the content attribute, and so it is reasonable to ask what content model was used in those definitions. The default content model for a complex type is called elementOnly, i.e. the complex type may contain elements and attributes. In general, the content acceptable by mixed and elementOnly models is the same, except mixed models also accept character data appearing before, after and between elements.

elementOnly is the content model that applies when we derive complex types from other complex types, but when we derive a complex type from a simple type (as we did in Section 2.5.1), the content model is called textOnly. In fact, we can define a complex type in terms of textOnly:

A textOnly Complex Type
<xsd:element name='internationalPrice'>
    <xsd:complexType content='textOnly'>
        <xsd:attribute name='currency' type='xsd:string' />
    </xsd:complexType>
</xsd:element>

The content of the anonymous type defined in this way is unconstrained, so the element value may be 423.46, but legitimately it may be any other sequence of characters as well. In general it is probably better to avoid such unconstrained type definitions in favour of constrained type definitions such as decimal and string.
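
The difference can be sketched with two instance fragments. Both are assumed to be checked against the textOnly declaration above; the second value is invented for illustration:

Instances Checked Against the textOnly Complex Type (illustrative)
<!-- accepted by the textOnly type, and also by the type derived from decimal in Section 2.5.1 -->
<internationalPrice currency='EU'>423.46</internationalPrice>

<!-- accepted by the textOnly type, but not by the type derived from decimal -->
<internationalPrice currency='EU'>about four hundred</internationalPrice>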

2.6 Annotations

XML Schema provides three elements for annotating schemas for the benefit of both human readers and applications. In the purchase order schema, we put a basic schema description and copyright information inside the documentation element, which is the recommended location for human readable material.

The appInfo element, which we did not use in the purchase order schema, can be used to provide information for tools, stylesheets and other applications. An interesting example using appInfo is a schema that describes the simple types in XML Schema Part 2: Datatypes. Information describing this schema, e.g. which facets are applicable to particular simple types, is represented inside appInfo elements, and an application used this information to automatically generate text for the XML Schema Part 2 document.

Both documentation and appInfo appear as subelements of annotation, which may itself appear at the beginning of most schema constructions. To illustrate, the following example shows annotation elements appearing at the beginning of an element declaration and a complex type definition:

Annotations in Element Declaration & Complex Type Definition
<xsd:element name='internationalPrice'>
    <xsd:annotation>
        <xsd:documentation>element declared with anonymous type</xsd:documentation>
    </xsd:annotation>
    <xsd:complexType content='empty'>
        <xsd:annotation>
            <xsd:documentation>empty anonymous type with 2 attributes</xsd:documentation>
        </xsd:annotation>
        <xsd:attribute name='currency' type='xsd:string' />
        <xsd:attribute name='value'    type='xsd:decimal' />
    </xsd:complexType>
</xsd:element>

The annotation element may also appear at the beginning of other schema constructions such as those indicated by the elements schema, simpleType, and attribute.
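
Since the purchase order schema does not use appInfo, the following is only a sketch of how it might sit beside documentation in an attribute declaration. The toolHint element and its format attribute are hypothetical; XML Schema does not define them, and a real application would supply its own vocabulary inside appInfo:

Annotation with documentation and appInfo (illustrative)
<xsd:attribute name='currency' type='xsd:string'>
    <xsd:annotation>
        <xsd:documentation>unit of currency for the price</xsd:documentation>
        <xsd:appInfo>
            <!-- hypothetical application-specific hint, not part of XML Schema -->
            <toolHint format='upper-case'/>
        </xsd:appInfo>
    </xsd:annotation>
</xsd:attribute>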

2.7 Building Content Models

The definitions of complex types in the purchase order schema all declare sequences of elements that must appear in the instance document. The occurrence of individual elements declared in the so-called content models of these types may be optional, as indicated by a 0 value for the attribute minOccurs (e.g. in comment), or otherwise constrained depending upon the values of minOccurs and maxOccurs. XML Schema also provides constraints that apply to groups of elements appearing in a content model. Note that the constraints do not apply to attributes. These constraints mirror those available in XML 1.0 plus some additional constraints.

XML Schema enables a group of elements to be defined and named, so that the elements can be used to build up the content models of complex types (thus mimicking common usage of parameter entities in XML 1.0). Un-named groups of elements can also be defined, and along with elements in named groups, they can be constrained to appear in the same order (sequence) as they are declared. Alternatively, they can be constrained so that only one of the elements may appear in an instance.

To illustrate, we modify the PurchaseOrderType definition from the purchase order schema using two groups so that purchase orders may contain either separate shipping and billing addresses, or a single address for those cases in which the shipper and biller are co-located:

Nested Choice and Sequence Groups
<xsd:complexType name="PurchaseOrderType">
  <xsd:choice>
    <xsd:group   ref="shipAndBill" />
    <xsd:element name="singleAddress" type="Address" />
  </xsd:choice>
  <xsd:element   ref="comment"    minOccurs="0" />
  <xsd:element   name="items"     type="Items" />
  <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>

<xsd:group name="shipAndBill">
  <xsd:sequence>
    <xsd:element name="shipTo" type="Address" />
    <xsd:element name="billTo" type="Address" />
  </xsd:sequence>
</xsd:group>

A choice group element allows only one of its children to appear in an instance. One child is an inner group element that references the named group shipAndBill consisting of the element sequence shipTo, billTo, and the second child is a singleAddress. Hence, in an instance document, the purchaseOrder element must contain either a singleAddress element or a shipTo element followed by a billTo element. Note that the sequence element used in the definition of shipAndBill is not strictly necessary because the content model of a named group is a sequence by default.
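
The two alternatives can be sketched as instance fragments. Both assume the modified PurchaseOrderType above; the date and the elided address and items content are illustrative:

Instances of the Two Alternatives (illustrative)
<!-- first alternative: the shipAndBill group -->
<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US"><!-- etc --></shipTo>
    <billTo country="US"><!-- etc --></billTo>
    <items><!-- etc --></items>
</purchaseOrder>

<!-- second alternative: a single address -->
<purchaseOrder orderDate="1999-10-20">
    <singleAddress country="US"><!-- etc --></singleAddress>
    <items><!-- etc --></items>
</purchaseOrder>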

There exists a third option for constraining elements in a group: All the elements in the group may appear once or not at all, and they may appear in any order. The all group (which provides a simplified version of the SGML &-Connector) is limited to the top-level of any content model. Moreover, the group's children must all be individual elements (no groups), and any element in the content model may appear no more than once, i.e. the permissible values of minOccurs and maxOccurs are 0 and 1. For example, to allow the child elements of purchaseOrder to appear in any order, we could redefine PurchaseOrderType as:

An 'All' Group
<xsd:complexType name="PurchaseOrderType">
  <xsd:all>
    <xsd:element name="shipTo" type="Address"/>
    <xsd:element name="billTo" type="Address"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items"  type="Items" />
  </xsd:all>
  <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>

By this definition, a comment element may optionally appear within purchaseOrder, and it may appear before or after any shipTo, billTo and items elements, but it can appear only once. Moreover, the stipulations of an all group do not allow us to declare an element such as comment outside the group as a means of enabling it to appear more than once. XML Schema stipulates that an all group must appear as the sole child at the top of a content model. In other words, the following is illegal:

Illegal Example with an 'All' Group
<xsd:complexType name="PurchaseOrderType">
  <xsd:all>
    <xsd:element name="shipTo" type="Address"/>
    <xsd:element name="billTo" type="Address"/>
    <xsd:element name="items"  type="Items" />
  </xsd:all>
  <xsd:element   ref="comment"    minOccurs="0" maxOccurs="unbounded"/>
  <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>
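
By contrast, the legal definition shown earlier in this section accepts its children in any order. As a sketch, assuming that definition, the following fragment places items first and shipTo last; the elided content is illustrative:

Instance Conforming to the Legal 'All' Group (illustrative)
<purchaseOrder orderDate="1999-10-20">
    <items><!-- etc --></items>
    <comment>Hurry, my lawn is going wild!</comment>
    <billTo country="US"><!-- etc --></billTo>
    <shipTo country="US"><!-- etc --></shipTo>
</purchaseOrder>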

Finally, named and un-named groups that appear in content models (represented by group and choice, sequence, all respectively) may carry minOccurs and maxOccurs attributes. By combining and nesting the various groups provided by XML Schema, and by setting the values of minOccurs and maxOccurs, it is possible to represent any content model expressible with an XML 1.0 DTD. Furthermore, the all group provides additional expressive power.
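
For example, the XML 1.0 content model (shipTo | billTo)* could be sketched as a choice group that carries occurrence constraints. The type name AddressList is hypothetical and is used only for illustration:

A Repeating Choice Group (illustrative)
<xsd:complexType name="AddressList">
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:element name="shipTo" type="Address" />
    <xsd:element name="billTo" type="Address" />
  </xsd:choice>
</xsd:complexType>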

2.8 Attribute Groups

Suppose we want to provide more information about the items in a purchase order, by adding attributes to the item element that indicate whether or not the item is in stock, its weight, and the preferred shipping method. One way to add these attributes is to add more attribute declarations to the item element's (anonymous) type definition:

Adding Attributes to the Inline Type Definition
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
   <xsd:complexType>
    <xsd:element   name="productName" type="xsd:string"/>
    <xsd:element   name="quantity">
     <xsd:simpleType base="xsd:positiveInteger">
      <xsd:maxExclusive value="100"/>
     </xsd:simpleType>
    </xsd:element>
    <xsd:element   name="price"    type="xsd:decimal"/>
    <xsd:element   ref="comment"   minOccurs="0"/>
    <xsd:element   name="shipDate" type="xsd:date" minOccurs='0'/>
    <xsd:attribute name="partNum"  type="Sku"/>
    <xsd:attribute name="weight"   type="xsd:decimal"/>
    <xsd:attribute name="shipBy">
     <xsd:simpleType base="xsd:string">
       <xsd:enumeration value="air"/>
       <xsd:enumeration value="land"/>
       <xsd:enumeration value="any"/>
     </xsd:simpleType>
    </xsd:attribute>
   </xsd:complexType>
</xsd:element>

Alternatively, we can create a named attribute group containing these attributes and reference this group by name in the item element declaration:

Adding Attributes Using an Attribute Group
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
   <xsd:complexType>
    <xsd:element name="productName" type="xsd:string"/>
    <xsd:element name="quantity">
     <xsd:simpleType base="xsd:positiveInteger">
      <xsd:maxExclusive value="100"/>
     </xsd:simpleType>
    </xsd:element>
    <xsd:element name="price"    type="xsd:decimal"/>
    <xsd:element ref="comment"   minOccurs="0"/>
    <xsd:element name="shipDate" type="xsd:date" minOccurs='0'/>
    <xsd:attributeGroup ref="ItemDelivery"/>
   </xsd:complexType>
</xsd:element>
<xsd:attributeGroup name="ItemDelivery">
  <xsd:attribute name="partNum" type="Sku"/>
  <xsd:attribute name="weight"  type="xsd:decimal"/>
  <xsd:attribute name="shipBy">
    <xsd:simpleType base="xsd:string">
      <xsd:enumeration value="air"/>
      <xsd:enumeration value="land"/>
      <xsd:enumeration value="any"/>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:attributeGroup>

Using an attribute group in this way can improve the readability of schemas, and facilitates updating schemas because an attribute group can be defined and edited in one place and referenced in multiple definitions and declarations. These characteristics of attribute groups make them similar to parameter entities in XML 1.0. Note that both attribute declarations and attribute group references must appear at the end of complex type definitions.
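
To sketch the reuse described above, a second declaration can reference the same group. The element name backOrderedItem and its expectedDate child are hypothetical:

Referencing an Attribute Group from a Second Declaration (illustrative)
<xsd:element name="backOrderedItem">
   <xsd:complexType>
    <xsd:element name="productName"  type="xsd:string"/>
    <xsd:element name="expectedDate" type="xsd:date"/>
    <xsd:attributeGroup ref="ItemDelivery"/>
   </xsd:complexType>
</xsd:element>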

2.9 Null Values

One of the purchase order items listed in po.xml, the Lawnmower, does not have a shipDate element. Within the context of our scenario, the schema author may have intended such absences to indicate items not yet shipped. But in general, the absence of an element does not have any particular meaning: It may indicate that the information is unknown, or not applicable, or the element may be absent for some other reason. Sometimes it is desirable to represent an unshipped item, unknown information, or inapplicable information explicitly with an element, rather than by an absent element. For example, it may be desirable to represent a "null" value being sent to or from a relational database with an element that is present. Such cases can be represented using XML Schema's null mechanism which enables an element to appear with or without a non-null value.

XML Schema's null mechanism involves an "out of band" null signal. In other words, there is no actual null value that appears as element content; instead, there is an attribute to indicate that the element content is null. To illustrate, we can modify the shipDate element declaration so that nulls can be signalled:

<xsd:element name="shipDate" type="xsd:date" nullable="true"/>

And to explicitly represent that shipDate has a null value in the instance document, we set the null attribute (from the XML Schema namespace for instances) to true:

<shipDate xsi:null="true"></shipDate>

The null attribute is defined as part of the XML Schema namespace for instances (http://www.w3.org/1999/XMLSchema-instance), and so it must appear in the instance document with a prefix (xsi:) associated with that namespace. (As with the xsd: prefix, the xsi: prefix is used by convention only). Note that the null mechanism applies only to element values, and not to attribute values. An element with xsi:null="true" may not have any element content but it may still carry attributes.
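
Putting the pieces together, an item with an explicitly null shipDate might look like the following sketch. It assumes the nullable declaration of shipDate above; the xsi: namespace declaration is placed on item only to keep the fragment self-contained, and the part number, quantity, and price are illustrative values:

An Item with a Null shipDate (illustrative)
<item partNum="872-AA"
      xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
    <productName>Lawnmower</productName>
    <quantity>1</quantity>
    <price>148.95</price>
    <shipDate xsi:null="true"></shipDate>
</item>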

3. Advanced Concepts I: Namespaces, Schemas & Qualification

A schema can be viewed as a collection (vocabulary) of type definitions and element declarations whose names belong to a particular namespace called a target namespace. The target namespace enables us to distinguish between definitions and declarations from different vocabularies. For example, target namespaces would enable us to distinguish between the declaration for element in the XML Schema language vocabulary, and a declaration for element in a hypothetical chemistry language vocabulary. The former is part of the http://www.w3.org/1999/XMLSchema target namespace, and the latter is part of another target namespace.

When we want to check that an instance document conforms to one or more schemas (through a process called schema validation), we need to identify which element and attribute declarations and type definitions in the schemas should be used to check which elements and attributes in the instance document. The target namespace plays an important role in the identification process. We examine the role of the target namespace in the next section.

The schema author also has several options that affect how the identities of elements and attributes are represented in instance documents. More specifically, the author can decide whether or not the appearance of locally declared elements and attributes in an instance must be qualified by a namespace, using either an explicit prefix or implicitly by default. The schema author's choice regarding qualification of local elements and attributes has a number of implications regarding the structures of schemas and instance documents, and we examine some of these implications in the following sections.

3.1 Target Namespaces & Unqualified Locals

In a new version of the purchase order schema (po1.xsd), we explicitly declare a target namespace, and specify that both locally defined elements and locally defined attributes must be unqualified. The target namespace in po1.xsd is http://www.example.com/PO1, as indicated by the value of the targetNamespace attribute.

Qualification of local elements and attributes can be globally specified by a pair of attributes, elementFormDefault and attributeFormDefault, on the schema element, or can be specified separately for each local declaration using the form attribute. All such attributes' values may each be set to unqualified or qualified, to indicate whether or not locally declared elements and attributes must be unqualified.

In po1.xsd we globally specify the qualification of elements and attributes by setting the values of both elementFormDefault and attributeFormDefault to unqualified. Strictly speaking, this is unnecessary because these are the default values of the two attributes, but we do so to highlight the contrast between this case and others we describe in subsequent sections.

Purchase Order Schema with Target Namespace, po1.xsd
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="unqualified"
        attributeFormDefault="unqualified">

 <element name="purchaseOrder" type="po:PurchaseOrderType"/>
 <element name="comment"       type="string"/>

 <complexType name="PurchaseOrderType">
  <element name="shipTo" type="po:Address"/>
  <element name="billTo" type="po:Address"/>
  <element ref="po:comment" minOccurs="0"/>
  <!-- etc -->
 </complexType>

 <complexType name="Address">
  <element name="name"   type="string"/>
  <element name="street" type="string"/>
  <!-- etc -->
 </complexType>

 <!-- etc -->

</schema>

To see how the target namespace of this schema is populated, we'll examine in turn each of the type definitions and element declarations. Starting from the end of the schema, we first define a type called Address that consists of the elements name, street, etc. One consequence of this type definition is that the Address type is included in the schema's target namespace. We next define a type called PurchaseOrderType that consists of the elements shipTo, billTo, comment, etc. PurchaseOrderType is also included in the schema's target namespace. Notice that the references in its three element declarations are prefixed, i.e. the type references po:Address and po:Address and the element reference po:comment, and the prefix is associated with the namespace http://www.example.com/PO1. This is the same namespace as the schema's target namespace, and so a processor of this schema will know to look within this schema for the definition of the type Address and the declaration of the element comment. It is also possible to refer to types in another schema with a different target namespace, hence enabling re-use of definitions and declarations between schemas.

At the beginning of the schema po1.xsd, we declare the elements purchaseOrder and comment. They are included in the schema's target namespace. The purchaseOrder element's type is prefixed, for the same reason that Address is prefixed. In contrast, the comment element's type, string, is not prefixed. The po1.xsd schema contains a default namespace declaration and so unprefixed types such as string, and unprefixed elements such as element and complexType, are associated with the default namespace, http://www.w3.org/1999/XMLSchema. In fact, this is the target namespace of XML Schema itself, and so a processor of po1.xsd will know to look within the schema of XML Schema (otherwise known as the "schema for schemas") for the definition of the type string and the declaration of the element called element.

Let us now examine how the target namespace of the schema affects a conforming instance document:

A Purchase Order with Unqualified Locals, po1.xml
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <!-- etc -->
    </shipTo>
    <billTo country="US">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <!-- etc -->
    </billTo>
    <apo:comment>Hurry, my lawn is going wild!</apo:comment>
    <!-- etc -->
</apo:purchaseOrder>

The instance document declares one namespace, http://www.example.com/PO1, and associates it with the prefix apo:. This prefix is used to qualify two elements in the document, namely purchaseOrder and comment. The namespace is the same as the target namespace of the schema in po1.xsd, and so a processor of the instance document will know to look in that schema for the declarations of purchaseOrder and comment. In fact, target namespaces are so named because of the sense in which there exists a target namespace for the elements purchaseOrder and comment. Target namespaces in the schema therefore control the validation of corresponding namespaces in the instance.

The prefix apo: is applied to the global elements purchaseOrder and comment. Furthermore, elementFormDefault and attributeFormDefault require that the prefix is not applied to any of the locally declared elements such as shipTo, billTo, name and street, and it is not applied to any of the attributes (which were all declared locally). The purchaseOrder and comment elements are global because they are declared in the context of the schema as a whole rather than within the context of a particular type. For example, the declaration of purchaseOrder appears as a child of the schema element in po1.xsd, whereas the declaration of shipTo appears as a child of the complexType element that defines PurchaseOrderType.

When local elements and attributes are not required to be qualified, an instance author may require more or less knowledge about the details of the schema to create schema valid instance documents. More specifically, if the author can be sure that only the root element (such as purchaseOrder) is global, then it is a simple matter to qualify only the root element. Alternatively, the author may know that all the elements are declared globally, and so all the elements in the instance document can be prefixed, perhaps taking advantage of a default namespace declaration. (We examine this approach in Section 3.3). On the other hand, if there is no uniform pattern of global and local declarations, the author will need detailed knowledge of the schema to correctly prefix global elements (and attributes).

3.2 Qualified Locals

Elements and attributes can be independently required to be qualified, although we'll start by describing qualification of local elements. To specify that all locally declared elements in a schema must be qualified, we set the value of elementFormDefault to qualified:

Modifications to po1.xsd for Qualified Locals
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">

 <element name="purchaseOrder" type="po:PurchaseOrderType"/>
 <element name="comment"       type="string"/>

 <complexType name="PurchaseOrderType">
  <!-- etc -->
 </complexType>

 <!-- etc -->

</schema>

And in this conforming instance document, we qualify all the elements explicitly:

A Purchase Order with Explicitly Qualified Locals
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   orderDate="1999-10-20">
    <apo:shipTo country="US">
        <apo:name>Alice Smith</apo:name>
        <apo:street>123 Maple Street</apo:street>
        <!-- etc -->
    </apo:shipTo>
    <apo:billTo country="US">
        <apo:name>Robert Smith</apo:name>
        <apo:street>8 Oak Avenue</apo:street>
        <!-- etc -->
    </apo:billTo>
    <apo:comment>Hurry, my lawn is going wild!</apo:comment>
    <!-- etc -->
</apo:purchaseOrder>

Alternatively, we can replace the explicit qualification of every element with implicit qualification provided by a default namespace, as shown here in po2.xml:

A Purchase Order with Default Qualified Locals, po2.xml
<?xml version="1.0"?>
<purchaseOrder xmlns="http://www.example.com/PO1"
                   orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <!-- etc -->
    </shipTo>
    <billTo country="US">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <!-- etc -->
    </billTo>
    <comment>Hurry, my lawn is going wild!</comment>
    <!-- etc -->
</purchaseOrder>

In po2.xml, all the elements in the instance belong to the same namespace, and the namespace statement declares a default namespace that applies to all the elements in the instance. Hence, it is unnecessary to explicitly prefix any of the elements. As another illustration of using qualified elements, the schemas in Section 5 all require qualified elements.

Qualification of attributes is very similar to the qualification of elements. Attributes that must be qualified, either because they are declared globally or because the attributeFormDefault attribute is set to qualified, appear prefixed in instance documents. One example of a qualified attribute is the xsi:null attribute that was introduced in Section 2.9. In fact, attributes that are required to be qualified must be explicitly prefixed because the XML-Namespaces specification does not provide a mechanism for defaulting the namespaces of attributes. Attributes that are not required to be qualified appear in instance documents without prefixes, which is the typical case.

The qualification mechanism we have described so far has controlled all local element and attribute declarations within a particular target namespace. It is also possible to control qualification on a declaration by declaration basis using the form attribute. For example, to require that the locally declared attribute publicKey is qualified in instances, we declare it in the following way:

Requiring Qualification of Single Attribute
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
 <!-- etc -->
 <element name="secure">
  <complexType>
    <!-- element declarations -->
    <attribute name="publicKey" type="binary" form="qualified"/>
  </complexType>
 </element>
</schema>

Notice that the value of the form attribute overrides the value of the attributeFormDefault attribute for the publicKey attribute only. Also, the form attribute can be applied to an element declaration in the same manner. An instance document that conforms to the schema is:

Instance with a Qualified Attribute
<?xml version="1.0"?>
<purchaseOrder xmlns="http://www.example.com/PO1"
                   xmlns:po="http://www.example.com/PO1"
                   orderDate="1999-10-20">
    <!-- etc -->
    <secure po:publicKey="11110000111110100010">
        <!-- etc -->
    </secure>
</purchaseOrder>
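
The same override can be sketched for an element declaration. In the fragment below, which assumes a schema whose elementFormDefault is unqualified, only the locally declared street element must be qualified in instances; the choice of street is purely illustrative:

Requiring Qualification of Single Element (illustrative)
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="unqualified"
        attributeFormDefault="unqualified">
 <!-- etc -->
 <complexType name="Address">
  <element name="name"   type="string"/>
  <element name="street" type="string" form="qualified"/>
  <!-- etc -->
 </complexType>
</schema>

Instance Fragment with a Single Qualified Local Element (illustrative)
<shipTo country="US" xmlns:apo="http://www.example.com/PO1">
    <name>Alice Smith</name>
    <apo:street>123 Maple Street</apo:street>
    <!-- etc -->
</shipTo>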

3.3 Global vs. Local Declarations

Another authoring style, when all the element names are unique within a namespace, is to create a schema in which all elements are global. This is similar in effect to the use of <!ELEMENT> in a DTD. In the example below, we have modified po1.xsd such that all the elements are declared globally. Notice that we have omitted the elementFormDefault and attributeFormDefault attributes in this example to emphasise that their values are irrelevant when there are only global element and attribute declarations.

Modified version of po1.xsd using only global element declarations
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1">

 <element name="purchaseOrder" type="po:PurchaseOrderType"/>

 <element name="shipTo"  type="po:Address"/>
 <element name="billTo"  type="po:Address"/>
 <element name="comment" type="string"/>
 
 <element name="name" type="string"/>
 <element name="street" type="string"/>

 <complexType name="PurchaseOrderType">
  <element ref="po:shipTo"/>
  <element ref="po:billTo"/>
  <element ref="po:comment" minOccurs="0"/>
  <!-- etc -->
 </complexType>

 <complexType name="Address">
  <element ref="po:name"/>
  <element ref="po:street"/>
  <!-- etc -->
 </complexType>

 <!-- etc -->

</schema>

This "global" version of po1.xsd will validate the instance document po2.xml which, as we described previously, is also schema valid against the "qualified" version of po1.xsd. In other words, both schema approaches can validate the same, namespace defaulted, document. Thus, in one respect the two schema approaches are similar, although in another important respect they are very different. Specifically, when all elements are declared globally, it is not possible to take advantage of local names. For example, you can only declare one global element called "title". However, you can locally declare one element called "title" that has a string type and is a subelement of "book", and within the same schema (target namespace) you can locally declare a second element also called "title" whose type is an enumeration of the values "Mr", "Mrs" and "Ms".
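
The "title" case can be sketched as follows. The sketch assumes the draft syntax used elsewhere in this section; the target namespace URI and the element name person are hypothetical, and only book and title come from the description above:

Two Local Declarations of title in One Target Namespace (illustrative)
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.example.com/titles">

 <element name="book">
  <complexType>
   <!-- local title: any string -->
   <element name="title" type="string"/>
  </complexType>
 </element>

 <element name="person">
  <complexType>
   <!-- local title: restricted to an enumeration -->
   <element name="title">
    <simpleType base="string">
     <enumeration value="Mr"/>
     <enumeration value="Mrs"/>
     <enumeration value="Ms"/>
    </simpleType>
   </element>
  </complexType>
 </element>

</schema>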

3.4 Undeclared Target Namespaces

In Section 2 we explained the basics of XML Schema using a schema that did not declare a target namespace and an instance document that did not declare a namespace. So the question naturally arises: What is the target namespace in these examples and how is it referenced?

In the purchase order schema, po.xsd, we did not declare a target namespace for the schema, nor did we declare a prefix (like po: above) associated with the schema's target namespace with which we could refer to types and elements defined and declared within the schema. The consequence of not declaring a target namespace in a schema is that the definitions and declarations from that schema, such as Address and purchaseOrder, are referenced without namespace qualification. In other words there is no explicit namespace prefix applied to the references nor is there any implicit namespace applied to the reference by default. So for example, the purchaseOrder element is declared using the type reference PurchaseOrderType. In contrast, all the XML Schema elements and types used in po.xsd are explicitly qualified with the prefix xsd: that is associated with the XML Schema namespace.

Element declarations from a schema with no target namespace validate unqualified elements in the instance document. That is, they validate elements for which no namespace qualification is provided, either by an explicit prefix or by a default namespace declaration (xmlns). So, to validate a traditional XML 1.0 document that does not use namespaces at all, you must provide a schema with no target namespace. Of course, there are many XML 1.0 documents that do not use namespaces, so there will be many schema documents written without target namespaces; you must be sure to give your processor a schema document that corresponds to the vocabulary you wish to validate.
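
As a minimal sketch of this situation, the schema below declares no target namespace, so its declarations validate elements that carry no namespace qualification. The memo vocabulary is hypothetical and is used only for illustration; the xsd: prefix convention follows po.xsd.

```xml
<!-- no targetNamespace attribute: Memo and memo belong to no namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">

 <!-- the type reference Memo is written without any prefix -->
 <xsd:element name="memo" type="Memo"/>

 <xsd:complexType name="Memo">
  <xsd:element name="subject" type="xsd:string"/>
  <xsd:element name="body"    type="xsd:string"/>
 </xsd:complexType>

</xsd:schema>
```

An instance document for this schema would then be a traditional XML 1.0 document with no namespace declarations at all:

```xml
<?xml version="1.0"?>
<memo>
 <subject>Shipping delay</subject>
 <body>The order will ship one week late.</body>
</memo>
```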

4. Advanced Concepts II: The International Purchase Order

The purchase order schema described in Section 2 was contained in a single document, and most of the schema constructions, such as element declarations and type definitions, were constructed from scratch. In reality, schema authors will want to compose schemas from constructions located in multiple documents, and create new types based on existing types. In this section, we examine mechanisms that enable such compositions and creations.

4.1 A Schema in Multiple Documents

As schemas become larger, it is often desirable to divide their content among several schema documents for purposes such as ease of maintenance, access control, and readability. For these reasons, we have taken the schema constructs concerning addresses out of po.xsd, and put them in a new file called address.xsd. The modified purchase order schema file is called ipo.xsd:

The International Purchase Order Schema, ipo.xsd
<schema targetNamespace="http://www.example.com/IPO"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:ipo="http://www.example.com/IPO">

 <annotation>
  <documentation>
   International Purchase order schema for Example.com
   Copyright 2000 Example.com. All rights reserved.
  </documentation> 
 </annotation>

 <!-- include address constructs -->
 <include
  schemaLocation="http://www.example.com/schemas/address.xsd"/>

 <element name="purchaseOrder" type="ipo:PurchaseOrderType"/>

 <element name="comment" type="string"/>

 <complexType name="PurchaseOrderType">
  <element name="shipTo"     type="ipo:Address"/>
  <element name="billTo"     type="ipo:Address"/>
  <element ref="ipo:comment" minOccurs="0"/>
  <element name="items"      type="ipo:Items"/>
  <attribute name="orderDate" type="date"/>
 </complexType>

 <complexType name="Items">
  <element name="item" minOccurs="0" maxOccurs="unbounded">
   <complexType>
    <element name="productName" type="string"/>
    <element name="quantity">
     <simpleType base="positiveInteger">
      <maxExclusive value="100"/>
     </simpleType>
    </element>
    <element name="price"      type="decimal"/>
    <element ref="ipo:comment" minOccurs="0"/>
    <element name="shipDate"   type="date" minOccurs='0'/>
    <attribute name="partNum"  type="ipo:Sku"/>
   </complexType>
  </element>
 </complexType>

 <simpleType name="Sku" base="string">
  <pattern value="\d{3}-[A-Z]{2}"/>
 </simpleType>

</schema>

The file containing the address constructs is:

Addresses for International Purchase Order schema, address.xsd
<schema targetNamespace="http://www.example.com/IPO"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:ipo="http://www.example.com/IPO">

 <annotation>
  <documentation>
   Addresses for International Purchase order schema
   Copyright 2000 Example.com. All rights reserved.
  </documentation> 
 </annotation>

 <complexType name="Address">
  <element name="name"   type="string"/>
  <element name="street" type="string"/>
  <element name="city"   type="string"/>
 </complexType>

 <complexType name="US-Address" base="ipo:Address"
              derivedBy="extension">
  <element name="state" type="ipo:US-State"/>
  <element name="zip"   type="positiveInteger"/>
 </complexType>

 <complexType name="UK-Address" base="ipo:Address"
              derivedBy="extension">
  <element   name="postcode" type="ipo:UK-Postcode"/>
  <attribute name="export-code" type="positiveInteger"
             use="fixed" value="1"/>
 </complexType>

 <!-- other Address derivations for more countries --> 

 <simpleType name="US-State" base="string">
  <enumeration value="AK"/>
  <enumeration value="AL"/>
  <enumeration value="AR"/>
  <!-- and so on ... -->
 </simpleType>

 <!-- simple type definition for UK-Postcode -->

</schema>

The various purchase order and address constructions are now contained in two schema files, ipo.xsd and address.xsd. To include these constructions as part of the international purchase order schema, in other words to include them in the international purchase order's namespace, ipo.xsd contains the include element:

<include schemaLocation="http://www.example.com/schemas/address.xsd"/>

The effect of this include element is to bring in the definitions and declarations contained in address.xsd, and make them available as part of the international purchase order schema target namespace. The one important caveat to using include is that the target namespace of the included constructions must be the same as the target namespace of the including schema, in this case http://www.example.com/IPO.

In this example, we have shown only one including document and one included document. In practice it is possible to include more than one document using multiple include elements, and documents can include documents that themselves include other documents. However, nesting is legal only if all the included parts of the schema are declared with the same target namespace.
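
For instance, an including document with two include elements might be sketched as follows. The file items.xsd is hypothetical; it stands for any further schema document that declares the same target namespace, http://www.example.com/IPO.

```xml
<schema targetNamespace="http://www.example.com/IPO"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:ipo="http://www.example.com/IPO">

 <!-- each included document must declare the same target namespace -->
 <include
  schemaLocation="http://www.example.com/schemas/address.xsd"/>

 <!-- hypothetical second included document -->
 <include
  schemaLocation="http://www.example.com/schemas/items.xsd"/>

 <element name="purchaseOrder" type="ipo:PurchaseOrderType"/>

 <!-- remaining declarations and definitions as in ipo.xsd -->

</schema>
```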

Instance documents that conform to a schema whose definitions span multiple schema documents need only reference the 'topmost' document and the common namespace; it is the responsibility of the processor to gather together all the definitions specified in the various included documents. So in our example, the instance document ipo.xml (see Section 4.3) references only the common target namespace, http://www.example.com/IPO, and the one schema file http://www.example.com/schemas/ipo.xsd. The processor is responsible for obtaining the schema file address.xsd.
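
One way such a reference to the topmost schema document could be written is sketched below. It assumes that the xsi:schemaLocation attribute of the XML Schema instance namespace is used to pair the namespace with the schema file; the listing of ipo.xml in Section 4.3 does not show this attribute, so treat the sketch as illustrative only.

```xml
<?xml version="1.0"?>
<ipo:purchaseOrder
  xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
  xmlns:ipo="http://www.example.com/IPO"
  xsi:schemaLocation="http://www.example.com/IPO
                      http://www.example.com/schemas/ipo.xsd"
  orderDate="1999-12-01">

    <!-- shipTo, billTo and items as in ipo.xml -->

</ipo:purchaseOrder>
```

Note that only ipo.xsd is named. The instance document makes no mention of address.xsd.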

In Section 5.4 we describe how schemas can be used to validate content from more than one namespace.

4.2 Deriving Types by Extension

To create our address constructs, we start by creating a complex type called Address in the usual way (see address.xsd). The Address type contains the basic elements of an address: a name, a street and a city. From this starting point we derive two new complex types that contain all the elements of the original type plus additional elements that are specific to addresses in the US and the UK. The technique we use here to derive new (complex) address types by extending an existing type is the same technique we used in Section 2.5.1, except that our base type here is a complex type whereas our base type in the previous section was a simple type.

We create the two new complex types, US-Address and UK-Address, using the complexType element along with values for base and derivedBy attributes. When a complex type is derived by extension, its effective content model is the content model of the base type plus the content model specified in the type derivation. Furthermore, the two content models are treated as two children of a sequential group. In the case of UK-Address, the content model of UK-Address is the content model of Address plus the declarations for a postcode element and an export-code attribute. This is like defining the UK-Address from scratch as follows:

Example
 <complexType name="UK-Address">
   <sequence>
      <!-- content model of Address -->
      <element name="name" type="string"/>
      <element name="street" type="string"/>
      <element name="city" type="string"/>

      <!-- appended element declaration -->
      <element name="postcode" type="ipo:UK-Postcode"/>
   </sequence>

   <!-- appended attribute declaration, outside the sequential group -->
   <attribute name="export-code" type="positiveInteger"
              use="fixed" value="1"/>
 </complexType>
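
The same expansion can be sketched for US-Address, whose derivation in address.xsd appends the state and zip elements to the content model of Address. As with the UK-Address example, this is a fragment that assumes the namespace declarations of address.xsd, and it is shown only to illustrate the effective content model.

```xml
 <complexType name="US-Address">
   <sequence>
      <!-- content model of Address -->
      <element name="name" type="string"/>
      <element name="street" type="string"/>
      <element name="city" type="string"/>

      <!-- appended declarations -->
      <element name="state" type="ipo:US-State"/>
      <element name="zip" type="positiveInteger"/>
   </sequence>
 </complexType>
```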

4.3 Using Derived Types in Instance Documents

In our example scenario, purchase orders are generated in response to customer orders which may involve shipping and billing addresses in different countries. The international purchase order, ipo.xml below, illustrates one such case where goods are shipped to England and the bill is sent to a US address. Clearly it is very useful if the schema for international purchase orders does not have to spell out every possible combination of international addresses for billing and shipping, and even more so if we can add new complex types of international address simply by creating new derivations of Address.

XML Schema allows us to define the billTo and shipTo elements as Address types (see ipo.xsd) but to use instances of international addresses in place of instances of Address. In other words, an instance document whose content conforms to the UK-Address type will be valid if that content appears within the document at a location where an Address is expected (assuming the UK-Address content itself is valid). To make this feature of XML Schema work, and to identify exactly which derived type is intended, the derived type must be identified in the instance document. The type is identified using the xsi:type attribute, which is part of the XML Schema instance namespace. In the example, ipo.xml, use of the UK-Address and US-Address derived types is identified through the values assigned to the xsi:type attributes.

An International Purchase Order, ipo.xml
<?xml version="1.0"?>
<ipo:purchaseOrder
  xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'
  xmlns:ipo="http://www.example.com/IPO"
  orderDate="1999-12-01">

    <shipTo export-code="1" xsi:type="ipo:UK-Address">
        <name>Helen Zoe</name>
        <street>47 Eden Street</street>
        <city>Cambridge</city>
        <postcode>CB1 1JR</postcode>
    </shipTo>

    <billTo xsi:type="ipo:US-Address">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
    </billTo>

    <items>
        <item partNum="833-AA">
            <productName>Lapis necklace</productName>
            <quantity>1</quantity>
            <price>99.95</price>
            <ipo:comment>Want this for the holidays!</ipo:comment>
            <shipDate>1999-12-05</shipDate>
        </item>
    </items>
</ipo:purchaseOrder>

In Section 4.7 we'll see how to prevent derived types from being used in this sort of substitution.
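
For contrast, the fragment below sketches a shipTo element that uses the base type directly. Because shipTo is declared in ipo.xsd with the type ipo:Address, no xsi:type attribute is needed when the content consists of just a name, a street and a city. The fragment is illustrative and is not part of ipo.xml.

```xml
<shipTo>
    <name>Helen Zoe</name>
    <street>47 Eden Street</street>
    <city>Cambridge</city>
</shipTo>
```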

4.4 Deriving Complex Types by Restriction

In addition to deriving new complex types by extending content models, it is also possible to derive new types by restricting the content models of existing types. Restriction of complex types is conceptually the same as restriction of simple types, except that the restriction of complex types involves a type's declarations rather than the acceptable range of a simple type's values. A complex type derived by restriction is very similar to its base type, except that its declarations are more limited than the corresponding declarations in the base type. In fact, the values represented by the new type are a subset of the values represented by the base type (as is also the case with restriction of simple types). In other words, an application prepared for the values of the base type would not be surprised by the values of the restricted type.

For example, suppose we want to update our definition of the list of items in an international purchase order so that it must contain at least one item on order. The schema shown in ipo.xsd allows an items element to appear without any child item elements. To