W3C

XML Schema Part 1: Structures

W3C Working Draft 17 December 1999

This version:
http://www.w3.org/TR/1999/WD-xmlschema-1-19991217/
(in XML (with its own DTD, XSL stylesheet (Nov REC version) and IE5 stylesheet (XSL as supported by version 5 of Microsoft's Internet Explorer)) and HTML, with separate provision of the schema and DTD for schemas described herein)
Latest version:
http://www.w3.org/TR/xmlschema-1/
Previous versions:
http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/
http://www.w3.org/TR/1999/WD-xmlschema-1-19990924/
http://www.w3.org/1999/05/06-xmlschema-1/
Editors:
Henry S. Thompson (University of Edinburgh) <ht@cogsci.ed.ac.uk>
David Beech (Oracle Corp.) <dbeech@us.oracle.com>
Murray Maloney (Commerce One) <murray@muzmo.com>
Noah Mendelsohn (Lotus Development Corporation) <Noah_Mendelsohn@lotus.com>

Copyright ©1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.


Abstract

XML Schema: Structures is part 1 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs).

Status of this document

This is a public working draft of XML Schema 1.0 for review by the public and by members of the World Wide Web Consortium.

It has been reviewed by the XML Schema Working Group, and the Working Group has agreed to its publication. The WG believes this draft to be 'feature-complete': the functionality included here is substantially complete and is expected to be stable. We do not expect to add major new functionality, or to make major changes to the functionality described in this draft. Some sections of the draft (in particular those on conformance), and some aspects of the design (in particular details of the transfer syntax for schemas), on the other hand, are still rough and are expected to be revised.

The WG expects to spend January, 2000, working out details, clarifying points of uncertainty that arise in the review of this draft, cleaning up inconsistencies, reviewing the design of the concrete transfer syntax, and making editorial improvements.

Following that period of review and polishing, it is the WG's intent to issue a Last Call for Review by other W3C working groups sometime during February, 2000, and to submit this specification in March, 2000, for publication as a Candidate Recommendation. This schedule may vary, depending on the comments of the public and of other W3C working groups on this draft. Such comments are instrumental in the WG's deliberations, and we encourage readers to review the draft and send comments to www-xml-schema-comments@w3.org (archive).

Although the Working Group does not anticipate further substantial changes to the functionality described here, this is still a working draft, subject to change based on experience and on comment by the public and other W3C working groups. The present version should be implemented only by those interested in providing a check on its design or by those preparing for an implementation of the Candidate Recommendation. The Schema WG will not allow early implementation to constrain its ability to make changes to this specification prior to final release.

A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".

Table of contents

1 Introduction
    1.1 Documentation Conventions
    1.2 Purpose
    1.3 Relationship To Other Work
    1.4 Terminology
2 Conceptual Framework
    2.1 Kinds of XML Documents
    2.2 On schemas, constraints and contributions
    2.3 Schemas, Types and Elements
    2.4 Schemas and their component parts
    2.5 Names and Symbol Spaces
    2.6 Referencing Schema Components
    2.7 Association of components with a target namespace
        2.7.1 Association of definitions with a target namespace
        2.7.2 Providing a target namespace for definitions and declarations
    2.8 Abstract and Concrete Syntax
3 Schema Definitions and Declarations
    3.1 The Schema
    3.2 The Document and its Root
    3.3 References to Schema Constructs
    3.4 Types, Elements and Attributes
        3.4.1 Simple Type Definition
        3.4.2 Complex Type Definition
        3.4.3 Attribute Declaration
        3.4.4 Attribute Group Definition
        3.4.5 Element Content Model
        3.4.6 Rich Content Models
        3.4.7 Mixed Content
        3.4.8 Named Model Group
        3.4.9 Element Declaration
    3.5 Wildcards
    3.6 Deriving Type Definitions
        3.6.1 Deriving type definitions by extension
        3.6.2 Deriving type definitions by restriction
        3.6.3 Controlling derivation
        3.6.4 Reinterpreting Content Models
        3.6.5 Element Equivalence Classes
        3.6.6 The ur-type
        3.6.7 Graveyard for stale syntax, here to avoid breaking IDREFs elsewhere *
    3.7 Unique, key and key reference constraints
    3.8 Notations
        3.8.1 Notation Declaration
4 Schema Access and Composition
    4.1 Layer 1: Summary of the schema-validation core
    4.2 Layer 2: Schema definitions in XML
        4.2.1 Assembling a schema for a single namespace from multiple schema definition documents
        4.2.2 References to schema components across namespaces
    4.3 Layer 3: Web-interoperability
        4.3.1 Standards for representation and retrieval of schema definitions on the Web
        4.3.2 How schema definitions are located on the Web
5 Annotating schemas
6 Conformance *
    6.1 Schema Validity *
    6.2 Detailed validity constraints and definitions *
        6.2.1 The Schema *
        6.2.2 References to Schema Constructs *
        6.2.3 Types, Elements and Attributes *
        6.2.4 Type Refinement *
        6.2.5 Import Restrictions *
        6.2.6 Schema Inclusion *
        6.2.7 Schema Validity *
    6.3 Responsibilities of Schema-aware processors *
    6.4 Lexical representation *
    6.5 Information set *

Appendices

A (normative) Schema for Schemas
B (normative) DTD for Schemas
C Glossary (normative) *
D References (normative) *
E Acknowledgments (non-normative)
F Sample Schema (non-normative)
G Tabulation of changes
H Open Issues

1 Introduction

This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.

Chapter 2 presents a Conceptual Framework (§2) for XML Schema: Structures, including an introduction to schema constraints, types, schema composition, and symbol spaces. The abstract and concrete syntax of XML Schema: Structures are introduced, along with other terminology used throughout the specification.

Chapter 3 Schema Definitions and Declarations (§3) reconstructs the core functionality of XML 1.0, plus a number of extensions, in line with our stated requirements [XML Schema Requirements]. This chapter discusses the declaration and use of simple and complex types, elements, content models, attributes, attribute groups, model groups and inheritance.

Chapter 4 presents Schema Access and Composition (§4), including the validation of namespace qualified instance documents, import and inclusion of declarations and definitions, access to schemas, and the foundations of schema-validity.

Chapter 5 describes provision for including documentation in the definition of a schema.

Chapter 6 discusses Conformance * (§6), including the rules by which instance documents are validated, and responsibilities of schema-aware processors.

The normative addenda include a (normative) DTD for Schemas (§B) and a (normative) Schema for Schemas (§A), which is an XML Schema schema for XML Schema: Structures, a Glossary (normative) * (§C) [not yet written] and References (normative) * (§D). Non-normative appendixes include a Sample Schema (non-normative) (§F) and Acknowledgments (non-normative) (§E).

1.1 Documentation Conventions

This Working Draft document was produced using an [XML] DTD and an [XSLT] stylesheet.

The following highlighting is used to present technical material in this document:

[Definition:]  A term is something we use a lot.

Sample Abstract Syntax Production
left   ::=   right1 right2
Example
A non-normative example illustrating use of the schema language, or a related instance.
<schema name="http://www.muzmo.com/XMLSchema/1.0/mySchema" >
And an explanation of the example.

The following highlighting is used for non-normative commentary in this document:

Issue (dummy): A recorded issue.

Ed. Note: Notes from the editors to themselves or the Working Group.

NOTE: General comments directed to all readers.

1.2 Purpose

The purpose of XML Schema: Structures is to provide an inventory of XML markup constructs with which to write schemas.

The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.

Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism will allow a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations.

1.3 Relationship To Other Work

The definition of XML Schema: Structures is a part of the W3C XML Activity. It is in various ways related to other ongoing parts of that Activity and other W3C WGs.

XML Schema: Datatypes
XML Schema: Structures has a dependency on the mechanisms for defining simple types provided in its companion [XML Schemas: Datatypes], published simultaneously with this document. Together these two documents constitute the XML Schema Recommendation.
Document Object Model
XML Schema: Structures has not yet identified requirements or dependencies.
HTML
XML Schema: Structures has been requested to support modularization of (X)HTML.
Internationalization Working Group
See http://www.w3.org/XML/Group/1999/03/xml-schema-i18n-notes (W3C Member only)
RDF Schema
XML Schema: Structures has not yet documented requirements or dependencies. See [Cambridge Communiqué] for a clarification of the relationship between the two, which includes requirements arising from web architecture considerations.
WAI
XML Schema: Structures has a requirement to support accessibility.
XML Information Set
XML Schema: Structures has significant dependencies on [XML-Infoset].

XML Schema: Structures defines its own Information Set Contributions which are compatible with [XML-Infoset] although not defined as such therein.
XML Linking WG
Unique, key and key reference constraints (§3.7) uses XPath expressions, as defined in [XPath].
XML Syntax
XML Schema: Structures must interoperate with XML 1.0 and subsequent revisions.
XSL WG
The XSL Working Group has requested XML Schema: Structures to support dimensions and aggregate datatypes: not discharged in this WD.

1.4 Terminology

The terminology used to describe XML Schema: Structures is defined in the body of this specification. The terms defined in the following list are used in building those definitions and in describing the actions of XML Schema: Structures processors:

[Definition:]  may
Conforming documents and processors are permitted to but need not behave as described.
[Definition:]  must
Conforming documents and processors are required to behave as described; otherwise they are in error.
[Definition:]  error
A violation of the rules of this specification; results are undefined. Conforming software may detect and report an error and may recover from it.
[Definition:]  fatal error
An error which a conforming processor must detect and report to the application.
[Definition:]  match
(Of strings or names:) Two strings or names being compared must be character for character the same.
[Definition:]  identical
(Of URIs) identical, according to the rules for identity in [XML-Namespaces].

2 Conceptual Framework

This specification uses a number of terms that are common to many of the fields of endeavor that have influenced the development of XML Schema. Unfortunately, it is often the case that these terms do not have the same definitions in all of those fields. This section attempts to provide definitions of terms as they are used to describe the conceptual framework, and the remainder of the specification.

2.1 Kinds of XML Documents

Since XML schemas are themselves specified as XML documents or elements within documents, it is useful to clarify the relationships between certain kinds of XML documents and elements:

[Definition:]  Instance
An XML element information item which conforms to some schema. See [XML-Infoset] for a discussion of information items: in brief, [Definition:]  an element information item is the component of an infoset which corresponds to an element. From it other information items are accessible, including attributes, namespace declarations and content. See Layer 3: Web-interoperability (§4.3) and Schema Validity * (§6.1) for the means by which an instance identifies the schema(s) to which it conforms. Note we will often speak loosely about an (XML) instance document, but this is just shorthand for the element information item associated with the document element of an XML document. Similarly, we will often speak of elements when we mean element information item.
[Definition:]  XML Schema
An XML element information item which, along with its descendants, satisfies all the Constraints on Schemas in this specification. An XML Schema establishes a set of rules for constraining the structure and articulating the information set of XML document instances.

Note that it is possible to specify a schema to which schemas themselves must conform, and this is given in (normative) Schema for Schemas (§A). An XML 1.0 DTD to which schemas must conform is also provided in (normative) DTD for Schemas (§B).

2.2 On schemas, constraints and contributions

The [XML] specification describes two kinds of constraints on XML documents: well-formedness and validity constraints. Informally, the well-formedness constraints are those imposed by the definition of XML itself (such as the rules for the use of the < and > characters and the rules for proper nesting of elements), while validity constraints are the further constraints on document structure provided by a particular DTD.

Three kinds of normative statements about the impact of XML Schema: Structures components on instances are distinguished in this specification:

[Definition:]  Constraint on Schemas
Constraints on the form and content of schemas themselves, above and beyond those expressed in (normative) Schema for Schemas (§A);
[Definition:]  Schema-Validity Constraint
Constraints on the form and content of instances, which the instances must satisfy to be schema-valid;
[Definition:]  Schema Information Set Contribution
Augmentations to post-validation information sets which follow as a consequence of schema-validation.
NOTE: Schema Information Set Contributions are not as new as might at first appear: XML 1.0 validation augments the XML 1.0 information set in similar ways, e.g. by providing values for attributes not present in instances, and by implicitly exploiting type information for normalization or access, e.g. consider the effect of NMTOKENS on attribute whitespace, and the semantics of ID and IDREF. By including Schema Information Set Contributions, we are trying to make explicit something XML 1.0 left implicit.
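As a rough illustration of an information set contribution of the kind the NOTE describes, the following Python sketch supplies attribute values that are absent from an instance. The ATTRIBUTE_DEFAULTS table and the contribute_defaults helper are hypothetical names invented for this illustration, and the default value shown is an assumption; neither XML 1.0 nor this specification defines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical declared defaults, keyed by element name, then attribute name.
ATTRIBUTE_DEFAULTS = {"shipTo": {"type": "US"}}

def contribute_defaults(element):
    """Return the attributes of `element` augmented with declared defaults."""
    augmented = dict(ATTRIBUTE_DEFAULTS.get(element.tag, {}))
    augmented.update(element.attrib)  # values present in the instance win
    return augmented

# The instance omits the attribute; the augmented view supplies it.
instance = ET.fromstring("<shipTo><name>Alice Smith</name></shipTo>")
print(contribute_defaults(instance))
```

The instance itself is left untouched: the defaulted value exists only in the augmented view, which is the sense in which validation adds to the information set rather than altering the document.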

XML Schema: Structures not only reconstructs the DTD constraints of XML 1.0 using XML instance syntax, it also adds the ability to define new kinds of constraints. For example, although the author of an XML 1.0 DTD may declare an element type as containing character data, elements, or mixed content, there is no mechanism with which to constrain the contents of elements to only character data of a particular form, such as only numeral sequences representing integers in a specified range.

This specification supports the expression of just such constraints by including in the mechanism for the declaration of elements the option of specifying that its contents must consist of a valid string expression of a particular datatype. A number of other mechanisms are added which improve the expressive power, usability and maintainability of schemas as a means to defining the structure of XML documents.

2.3 Schemas, Types and Elements

The purpose of a schema is to identify a set of components for use in XML documents and to provide the rules for their correct combination.

The schema language itself defines an XML form for itself in terms of elements and attributes. We will describe these, and show how they are used. But first, a quick example of an XML document.

Example
<?xml version='1.0'?>
<PurchaseOrder orderDate="1999-05-20" xmlns="http://www.myco.com/MYPO">
    <shipTo type="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
    </shipTo>
    <shipDate>1999-05-25</shipDate>
    <comment>Get these things to me in a hurry, my lawn is going wild!</comment>
    <Items>
        <Item pno="333-333">
            <productName>Lawnmower, model BUZZ-1</productName>
            <quantity>1</quantity>
            <price>148.95</price>
            <comment>Please confirm this is the electric model</comment>
        </Item>
        <Item pno="444-444">
            <productName>Baby Monitor, model SNOOZE-2</productName>
            <quantity>1</quantity>
            <price>39.98</price>
        </Item>
    </Items>
</PurchaseOrder>

The purchase order consists of a main element with several subordinate elements. Most of the subelements have simple atomic types such as string or date, drawn from the repertoire of built-in simple types defined in [XML Schemas: Datatypes], but some are complex. We use the type element when declaring elements which allow elements in their content and/or may carry attributes. For example, we can define a type called Address as follows:

Example
<type name="Address" >
    <element name="name"   type="string" />
    <element name="street" type="string" />
    <element name="city"   type="string" />
    <element name="state"  type="string" />
    <element name="zip"    type="integer" />
    <attribute name="type" type="string" />
</type>
The consequence of this definition is that an element whose type is declared to be Address must consist of five elements and may have one attribute. Though each has a distinct name, four of the elements and the attribute will simply contain a string in a document instance while one will contain an integer.
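The consequence described above can be sketched as a small check written in Python. This is an illustration only, under several assumptions: the five elements are taken to appear in the declared order, namespaces are ignored, and the integer test accepts unsigned digits only. The names ADDRESS_CHILDREN, ADDRESS_ATTRIBUTES and conforms_to_address are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed reading of the Address type: five children in declared order,
# plus one optional attribute.
ADDRESS_CHILDREN = ["name", "street", "city", "state", "zip"]
ADDRESS_ATTRIBUTES = {"type"}

def conforms_to_address(element):
    """Check child names in order, the permitted attribute, and the integer zip."""
    if [child.tag for child in element] != ADDRESS_CHILDREN:
        return False
    if not set(element.attrib) <= ADDRESS_ATTRIBUTES:
        return False
    zip_text = (element.findtext("zip") or "").strip()
    # Simplification: unsigned ASCII digits only.
    return zip_text.isascii() and zip_text.isdigit()

ship_to = ET.fromstring(
    '<shipTo type="US"><name>Alice Smith</name>'
    '<street>123 Maple Street</street><city>Mill Valley</city>'
    '<state>CA</state><zip>90952</zip></shipTo>'
)
print(conforms_to_address(ship_to))
```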

If we're going to use the same element in a number of places, we can declare it once and refer to it by name elsewhere:

Example
<element name="comment" type="string" />
This declaration restricts the comment element to text content and no attributes.

We can define a PurchaseOrderType for our PurchaseOrder element, referring to the definitions of Address and comment as above, as:

Example
<type name="PurchaseOrderType">
    <element name="shipTo"    type="po:Address" />
    <element name="shipDate"  type="date" />
    <element ref="po:comment" minOccurs="0" />
    <element name="Items"     type="po:Items" />
    <attribute name="orderDate" type="date" />
</type>
The shipDate element daughter of PurchaseOrderType is declared above as having a simple type, as in the Address example above. The comment daughter is declared by reference to a global element declaration. Since this definition is in the namespace being defined, and apparently the default namespace is being used for the schema elements themselves (e.g. element, attribute), we use a prefix (po) on this reference which would have to be declared with the same URI as the target namespace URI for the containing schema. Similarly, the shipTo and Items daughters are declared as having complex types which must be defined elsewhere in the current schema. The comment daughter and the orderDate attribute are optional, the others are obligatory.
Issue (type-decl-syntax): Further integration of the concrete syntax for type definitions is desirable, e.g. by using 'type' for both simple and complex types, but the details of a consistent and clear way to do this have not yet been agreed.

Since an element declaration's type can identify either a simple or a complex type, and there are separate symbol spaces for these two, the possibility of ambiguity arises. This is resolved in favour of the complex type, e.g. even if a simple type called Address existed (either builtin or user-defined), the above declaration for shipTo would refer to the user-defined complex type of that name.
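The precedence rule just stated can be pictured as a two-step lookup, sketched here in Python. The contents of the two sets, including a simple type called Address, are assumed purely to exercise the clash, and resolve_type is a hypothetical helper rather than anything this specification defines.

```python
# Two separate symbol spaces for type names. The simple type "Address"
# is hypothetical and exists only to demonstrate the shadowing rule.
SIMPLE_TYPES = {"string", "integer", "date", "decimal", "Address"}
COMPLEX_TYPES = {"Address", "PurchaseOrderType", "Items"}

def resolve_type(name):
    """Return (symbol space, name), preferring the complex type on a clash."""
    if name in COMPLEX_TYPES:
        return ("complex", name)
    if name in SIMPLE_TYPES:
        return ("simple", name)
    raise KeyError(f"no type definition named {name!r}")

print(resolve_type("Address"))  # the complex type shadows the simple one
print(resolve_type("date"))
```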

Issue (note-two-sses): The separation of the simple and complex type name symbol spaces is primarily motivated by the decision to allow unqualified reference to the ab initio and built-in simple types. Should this decision be reversed, as was suggested in the report of the simplification Task Force, then the unification of the two symbol spaces could proceed with minimal negative impact. The potential for error which arises from unexpected shadowing of an old simple type by a new complex type would be removed.

[Definition:]  A definition creates a new type; [Definition:]  a declaration enables the appearance in a document instance of an element or attribute with a specific name and type. In the schema, we see both the definition of several types, and also several elements and attributes declared as usages of these types. For example, Address is defined to be a type, while within the definition of Address we see five declarations of elements and one attribute declaration. These declarations are not themselves types, but rather an association between a name and constraints which govern the appearance of that name in documents governed by the containing schema.

In the case of attribute declarations, the constraints are on the allowed value, always by reference to a simple type:

Example
<attribute name="orderDate" type="date" />

In the case of element declarations, the constraints are on the allowed content and attributes, by reference to a complex or a simple type (in which case no attributes are allowed):

Example
<element name="shipTo" type="po:Address" />
<element name="comment" type="string" />
Because Address is defined in the schema to have certain elements as its content and to allow a certain attribute, any shipTo element appearing in an instance must include those elements and may have that attribute, while any comment element may not have any attributes but may have any text content.

As well as naming a type in an attribute or element declaration, we can embed the type definition immediately within the element declaration:

Example
<type name="Items">
 <element name="Item" minOccurs="0" maxOccurs="*">
  <type>
   <element name="productName" type="string" />
   <element name="quantity">
    <datatype source="integer">
     <minExclusive value="0"/>
    </datatype>
   </element>
   <element name="price" type="decimal" />
   <element ref="po:comment" minOccurs="0" />
   <attribute name="pno" type="string"/>
  </type>
 </element>
</type>
Here not only is the type of the Item element given in line, but also the simple type of its quantity daughter (the built-in simple type integer) is qualified inline by adding a subrange constraint.
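A minimal Python sketch of what the inline subrange constraint and the occurrence bounds on Item amount to is given below. It assumes an instance without namespaces; quantity_is_valid and items_quantities_valid are hypothetical helper names, and Python's int() is somewhat more lenient than the lexical form of integer defined in [XML Schemas: Datatypes].

```python
import xml.etree.ElementTree as ET

def quantity_is_valid(text):
    """minExclusive value="0": the quantity must be an integer greater than 0."""
    try:
        # int() tolerates surrounding whitespace; a simplification.
        return int(text) > 0
    except (TypeError, ValueError):
        return False

def items_quantities_valid(items):
    """minOccurs="0" maxOccurs="*": any number of Item children is allowed;
    each one must carry a quantity that passes the subrange check."""
    return all(quantity_is_valid(item.findtext("quantity"))
               for item in items.findall("Item"))

items = ET.fromstring(
    "<Items><Item><quantity>1</quantity></Item>"
    "<Item><quantity>3</quantity></Item></Items>"
)
print(items_quantities_valid(items))
```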

Taken together the examples above constitute a complete schema for the initial PurchaseOrder example instance. They are drawn together in a single complete schema in Sample Schema (non-normative) (§F).

2.4 Schemas and their component parts

[Definition:]  Schemas are composed of schema components: a set of type definitions, attribute group definitions, model group definitions, element declarations, and attribute declarations. Note that it is the abstract idea of a component we are talking about here, along with its name: an XML element such as <element> is a standardized representation for a component, not the component itself.

The next chapter Schema Definitions and Declarations (§3) sets out the XML Schema: Structures approach to schemas, with formal definitions of their component parts and presentations of standardized representations for each of them. Here we informally summarize the key constructs used in defining schemas. A 'Yes' in the 'Name appears in instances?' column indicates that the name will appear in instances -- other names are for schema use only.

XML Schema: Structures Feature | Purpose | Named? | Name appears in instances?
The Schema (§3.1) | A wrapper element containing all the definitions and declarations comprising a schema. | Yes | No
Simple Type Definition (§3.4.1) | A simple atomic type (content constraint), such as 'integer', that applies to character data in an instance document, whether it appears as an attribute value or the contents of an element. The mechanisms for defining simple types are set out elsewhere, in XML Schemas: Datatypes. | Yes | No
Complex Type Definition (§3.4.2) | A complete set of constraints for elements in instance documents, applying to both contents and attributes. | Yes | No
Element Declaration (§3.4.9) | An association between a name for an element and a type. An element declaration for 'A' is comparable to a DTD declaration <!ELEMENT A .....>. | Yes (local or global) | Yes
Attribute Declaration (§3.4.3) | An association between a name for an attribute and a simple type. The association is local to its surrounding type. | Yes (local) | Yes
Content type | Either a simple type or a content model. A content type applies to the contents of elements in an instance document (but not their attribute values). It provides a unifying abstraction for the constraints which apply to the contents of elements, but introduces no additional features. | No | No
Element Content Model (§3.4.5) | A constraint that applies to the contents of elements in an instance document. May include specifications of grouping and sequencing. | No | No
Attribute Group Definition (§3.4.4) | An association between a name and a reusable collection of attribute declarations. | Yes | No
Deriving Type Definitions (§3.6) | One type may be defined as based on another type, acquiring content type and/or attributes therefrom. | Yes | No
References to schema components across namespaces (§4.2.2) | Integrates definitions and/or declarations from elsewhere into the schema being defined, as if they had been defined locally. | No | No
Unique, key and key reference constraints (§3.7) | Provide more powerful uniqueness and intra-document reference mechanisms. | Yes | No

2.5 Names and Symbol Spaces

As indicated in the third column of the table above, most of the components listed have names, which provide for references within the schema, and sometimes from one schema to another. For example, an attribute declaration can refer to a named type, such as 'integer'. A content model can refer to an element, and so on.

If all such names were assigned from the same 'pool', then it would be impossible to have e.g. a type named 'integer' and an element with the name 'integer' in the same schema. [Definition:]  Accordingly we introduce the idea of a symbol space (avoiding 'name space' to avoid confusion with the term defined in [XML-Namespaces]). A symbol space is similar to the non-normative concept of namespace partition introduced in [XML-Namespaces].

There is a single distinct symbol space within a given schema for each of the abstractions named above other than 'attribute' and 'element': within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. In particular note that the same name can refer to both a type and an element, without conflict or necessary relation between the two.

Attributes and local element declarations are special, in that every type defines its own attribute symbol space and local element symbol space, which are distinct from each other. In addition, top-level elements (whose declarations are not contained within a type definition) reside in their own symbol space.
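One way to picture symbol spaces is as separate name tables inside a single schema, sketched below in Python. The Schema class, its methods and the labels used for the spaces are all hypothetical; the sketch shows only that uniqueness is enforced per symbol space and not across the schema as a whole.

```python
class Schema:
    """Toy registry holding one name table per symbol space."""

    def __init__(self):
        self._spaces = {}

    def add(self, space, name, component):
        """Register `component` under `name`, unique within `space` only."""
        names = self._spaces.setdefault(space, {})
        if name in names:
            raise ValueError(f"duplicate name {name!r} in symbol space {space!r}")
        names[name] = component

    def lookup(self, space, name):
        return self._spaces[space][name]

schema = Schema()
schema.add("type", "integer", "a type definition")
schema.add("element", "integer", "a top-level element declaration")  # no clash
# A per-type attribute symbol space can be modelled by a compound key.
schema.add(("attribute", "Address"), "type", "an attribute declaration")
print(schema.lookup("element", "integer"))
```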

2.6 Referencing Schema Components

The names of schema components such as type definitions and element declarations are not of type ID, as explained above: they are not unique within a schema, just within a symbol space. This means that simple fragment identifiers will not work to reference schema components.

In the long run we expect to provide some mechanism suitable for referencing the semantic components of schemas as such. In the meantime, we observe that [XPointer] provides a mechanism which maps well onto our notion of symbol spaces. A fragment identifier of the form #xpointer(schema/element[@name="person"]) will uniquely identify the element declaration with name person, and similar fragment identifiers can obviously be constructed for the other top-level symbol spaces.
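The effect of such a fragment identifier can be approximated with the limited XPath subset in Python's xml.etree.ElementTree, which is not an XPointer implementation. The schema text below is an assumed fragment written only for this sketch; it reuses the name person in two symbol spaces to show that the path selects by kind of component as well as by name.

```python
import xml.etree.ElementTree as ET

# Assumed schema fragment: "person" names both a type and an element.
SCHEMA_TEXT = """<schema>
  <type name="person"/>
  <element name="person" type="person"/>
  <element name="comment" type="string"/>
</schema>"""

schema = ET.fromstring(SCHEMA_TEXT)

# Similar in spirit to #xpointer(schema/element[@name="person"]):
# the path is evaluated relative to the <schema> root element.
matches = schema.findall("element[@name='person']")
print(len(matches), matches[0].tag)
```

Because the element name forms part of the path, the type definition named person is not selected, which is why a bare name on its own would not be enough.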

2.7 Association of components with a target namespace

Every element and attribute declaration is associated with a target namespace URI, or with no namespace. More specifically, each symbol space is associated directly (in the case of global declarations) or indirectly (in the case of local declarations) with a target namespace or with no namespace. So, the name of each global declaration is effectively qualified by the target namespace in which it is defined. Locally scoped element and attribute declarations are named in a symbol space defined by their containing type definition.

Global element and attribute declarations are used to validate instance document constructs in the namespace identified by the URI of the target namespace for the corresponding declaration. Declarations with a null target namespace validate non-namespace qualified instance document constructs.

2.7.1 Association of definitions with a target namespace

The XML namespaces recommendation discusses only instance document syntax for elements and attributes; it therefore provides no direct framework for managing the names of types, attribute groups, and other facilities provided by XML schemas. Nevertheless, we apply the target namespace facility uniformly to all schema components. Specifically, the target namespace qualifies the symbol space for definitions as well as for declarations.

2.7.2 Providing a target namespace for definitions and declarations

The above discussion requires that each global definition and declaration be associated with a target namespace. The standard XML format for schema definitions provides a "targetNamespace" attribute for the <schema> element.

    <schema targetNamespace="someNSURI">
      <!-- every global declaration and definition here
           is in targetNamespace -->
    </schema>

If specified, this supplies the same target namespace for all the definitions and declarations contained within that schema element. If absent, it indicates that all the definitions and declarations have a null target namespace.
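The qualification of global names by the target namespace can be sketched in Python as follows. The global_names helper is hypothetical, the schema text is an assumed fragment, and Python's None is used here to stand for the null target namespace.

```python
import xml.etree.ElementTree as ET

def global_names(schema):
    """Qualify each named top-level component as (namespace, kind, name)."""
    target = schema.get("targetNamespace")  # None models the null namespace
    return {(target, child.tag, child.get("name"))
            for child in schema if child.get("name") is not None}

with_target = ET.fromstring(
    '<schema targetNamespace="http://www.myco.com/MYPO">'
    '<type name="Address"/><element name="comment" type="string"/>'
    '</schema>'
)
without_target = ET.fromstring('<schema><element name="comment" type="string"/></schema>')

print(sorted(global_names(with_target)))
print(global_names(without_target))
```

Keeping the kind of component in the key reflects the point made in 2.7.1: the target namespace qualifies each symbol space, for definitions as well as for declarations.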

2.8 Abstract and Concrete Syntax

XML Schema: Structures is presented here primarily in the form of an [Definition:]   abstract syntax, which provides a formal specification of the information provided for each declaration and definition in the schema language. The abstract syntax is presented using a simplified BNF. Defined terms are to the left. Their components are to the right, with a small amount of meta-syntax: ()s for grouping, | to separate alternatives, ? for optionality, * and + for iteration. Terms in italics are primitives, not expanded here, either because they are defined elsewhere (e.g. URI, defined by [URI]) or because they can only be grounded once a concrete syntax is decided on (e.g. choice).

An abstract syntax production prefixed with a number in brackets (e.g. [3]) is normative; other abstract syntax is either for purposes of explanation, or is a duplicate (for convenience) of a normative definition to be found elsewhere.

The abstract syntax illustrates the expressive power of the language, and the relationships among its component parts. The abstract syntax can be used to evaluate the expressive power of XML Schema: Structures, but not its look and feel. In particular, please note that neither the ordering within or between productions nor the choice of names is significant, and that no particular concrete syntax is constrained by these.

The [Definition:]  concrete syntax of XML Schema: Structures, that is, the exact element and attribute names used in a schema, is a key feature of its proposed design. The concrete syntax is the form in which the schema language is used by schema authors. Though its elements and attributes are often different from the terms of the abstract syntax BNF, the features and expressive power of the two are congruent. The concrete syntax profoundly affects the convenience and usability of the schema language.

We include a preliminary concrete syntax in this draft, via examples, paradigms and in (normative) Schema for Schemas (§A) and (normative) DTD for Schemas (§B). Unlike the previous version, in which the intention was to stay quite close to the abstract syntax, in this version we have begun to take convenience and clarity into account.

3 Schema Definitions and Declarations

Ed. Note: possible changes to the definition of what a schema is, to reflect our discussion of layering.

The principal purpose of XML Schema: Structures is to provide a means for defining schemas that constrain the contents of instances and augment the information sets thereof.

3.1 The Schema

A schema contains some preamble information and a set of definitions and declarations.

Schema top level
[1]   schema   ::=   preamble dds*
[2]   dds   ::=   annotation | datatypeDefn | typeDefn | elementDecl | attributeGroupDefn | notationDecl | include | import | generalConstraint
[3]   preamble   ::=   xmlSchemaRef targetNamespace schemaVersion finalDefault? exactDefault?
[4]   xmlSchemaRef   ::=   URI
[5]   targetNamespace   ::=   URI
[6]   schemaVersion   ::=   string-value
[7]   finalDefault   ::=   extension? restriction?
[8]   exactDefault   ::=   extension? restriction? equivClass?

preamble consists of an xmlSchemaRef specifying the URI for XML Schema: Structures; the targetNamespace specifying the URI of the namespace which this schema is about; and a schemaVersion specification for private version documentation purposes and version management.

finalDefault and exactDefault provide defaults for final and exact respectively for type definitions and element declarations. The default for these properties is empty in both cases.

See Schema Access and Composition (§4) for discussion of schemas, instances and namespaces, and also for import and include.

Example
<!DOCTYPE schema
          PUBLIC '-//W3C//DTD XML Schema Version 1.0//EN'
          'http://www.w3.org/TR/1999/WD-xmlschema-1-19991217/structures.dtd' >

<schema targetNS="http://purl.org/metadata/dublin_core"
        version="M.n"
        xmlns="http://www.w3.org/1999/XMLSchema">

  ...

</schema>
Note that the abstract syntax xmlSchemaRef is realised via a default namespace declaration in the concrete syntax.

Although the schema above is a complete XML document, schema need not be the document element, but can appear within other documents. Indeed there is no requirement that a schema be derived from a (text) document at all: it could be built 'by hand' via e.g. a DOM-conformant API.

The schema's declarations and definitions, discussed in detail in Schema Definitions and Declarations (§3), provide for the creation of new schema components:

Summary of Definitions and Declarations
datatypeDefn   ::=   NCName datatypeSpec
typeDefn   ::=   NCName typeSpec
elementDecl   ::=   NCName elementSpec
attribute   ::=   NCName attributeSpec
attributeGroupDefn   ::=   NCName attributesSpec
notationDecl   ::=   NCName notationSpec
Example
The following illustrates the basic model for declaring or defining all XML Schema: Structures components:
<schema ...>
 <datatype name="myDatatype">
  ...
 </datatype>

 <type name="myType">
  ...
 </type>

 <element name="myElement">
  ...
 </element>

 <attributeGroup name="myAttrGroup">
  ...
 </attributeGroup>

 <group name="myModelGroup">
  ...
 </group>

 <notation name="myNotation" ... />

</schema>
When creating a component, we establish an association between its name and the specification for that component. Each new component therefore creates a new entry in the symbol space for that kind of component.
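The bookkeeping described here can be pictured as one table per kind of component. The Python sketch below is only an illustration: the class name SymbolSpaces, the kind labels and the decision to reject a repeated name within one symbol space are assumptions made for the example, not definitions taken from this draft.

```python
class SymbolSpaces:
    """One name -> specification table per kind of schema component."""

    def __init__(self):
        self.spaces = {}

    def define(self, kind, name, spec):
        space = self.spaces.setdefault(kind, {})
        if name in space:
            # Assumed behaviour for this sketch: a second definition of the
            # same name in the same symbol space is treated as an error.
            raise ValueError(f"{kind} {name!r} is already defined")
        space[name] = spec

    def lookup(self, kind, name):
        return self.spaces[kind][name]


spaces = SymbolSpaces()
spaces.define("type", "myType", "type specification")
# Same name, different kind of component: a separate symbol space, so no clash.
spaces.define("element", "myType", "element specification")

print(spaces.lookup("type", "myType"))
print(spaces.lookup("element", "myType"))
```

Keeping the tables separate is what allows a type and an element to share a name without conflict.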

Ed. Note: make sure that discussion of targetNamespace is up-to-date.

The Unique Definition (§6.2.1) Constraint on Schemas obtains.

Issue (no-evolution): This draft does not deal with the requirement "for addressing the evolution of schemata" (see [XML Schema Requirements]).

3.2 The Document and its Root

NOTE: We have not so far seen any need to reconstruct the XML 1.0 notion of root. For the connection from document instances to schemas, see Layer 3: Web-interoperability (§4.3) and Schema Validity * (§6.1).

3.3 References to Schema Constructs

Uniform means are provided for reference to a broad variety of schema constructs, both within a single schema and to features imported (References to schema components across namespaces (§4.2.2)) from external schemas. The name used to reference any component of XML Schema: Structures from within a schema consists of a QName. In a few cases, some elaboration may be added to a reference: this is made clear as the individual reference forms are introduced below.

Example: Component Names and References
datatypeRef   ::=   datatypeName
datatypeName   ::=   QName
typeRef   ::=   typeName
typeName   ::=   QName
elementRef   ::=   elementName
elementName   ::=   QName
attributeGroupRef   ::=   attributeGroupName
attributeGroupName   ::=   QName
modelGroupRef   ::=   modelGroupName
modelGroupName   ::=   QName
notationRef   ::=   notationName
notationName   ::=   QName

The abstract syntax above characterizes the reference mechanisms used in this specification.

Example
<element name="elem1" type="Address"/>

<element name="elem2" type="XHTML:BLOCKQUOTE"/>

<attribute name="attr1"
              type="xsl:quantity"/>
The first of these is a local reference; the other two refer to components defined in schemas elsewhere, and assume that the prefixes used have been declared and that their namespaces have been declared for import. See References to schema components across namespaces (§4.2.2) for a discussion of importing.
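How such a reference might be taken apart can be sketched in Python as follows. This is a simplification under stated assumptions: the function name resolve_qname, the example.org namespace names and the treatment of every unprefixed name as a reference into the current target namespace are all hypothetical, and built-in datatype names such as integer are not handled specially.

```python
def resolve_qname(qname, prefix_bindings, target_namespace=None):
    """Split a QName into (namespace name, local name).

    prefix_bindings maps declared prefixes to namespace names. An unprefixed
    name is treated here as a local reference, i.e. a reference into the
    current schema's target namespace (an assumption of this sketch).
    """
    prefix, separator, local = qname.partition(":")
    if not separator:
        return (target_namespace, qname)
    if prefix not in prefix_bindings:
        raise KeyError(f"prefix {prefix!r} has not been declared")
    return (prefix_bindings[prefix], local)


# Hypothetical prefix bindings; the namespace names are placeholders.
bindings = {"XHTML": "http://example.org/xhtml", "xsl": "http://example.org/xsl"}

print(resolve_qname("Address", bindings, "http://example.org/this-schema"))
print(resolve_qname("XHTML:BLOCKQUOTE", bindings))
print(resolve_qname("xsl:quantity", bindings))
```

A reference whose prefix has not been declared is rejected rather than guessed at.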

The Consistent Import (§6.2.2) Constraint on Schemas obtains.

The identify definition wrt schema-validity obtains.

The Preorder Priority for Included Definitions (§6.2.6) Constraint on Schemas also obtains.

3.4 Types, Elements and Attributes

Like XML 1.0 DTDs, XML Schema: Structures provides facilities for constraining the contents of elements and the values of attributes, and for augmenting the information set of instances, e.g. with defaulted values and type information. [Definition:]  We refer hereafter to the combination of schema constraints and information set contributions with the abbreviation SC. [Definition:]  We call a set of SCs intended for use in this way a type definition.

Compared to DTDs, XML Schema: Structures provides for a richer set of SCs, and improved capabilities for sharing SCs across sets of elements and attributes.

3.4.1 Simple Type Definition

We start with the simple types whose expression in XML documents consists entirely of character data.

Simple Types
[9]   simpleTypeDefn   ::=   NCName simpleTypeSpec
[10]   simpleTypeSpec   ::=   simpleTypeRef facet* final? abstract?
[11]   facet   ::=   is defined by XML Schemas: Datatypes. It might be a range restriction, min/max constraint, etc.
[12]   simpleTypeRef   ::=   simpleTypeName
[13]   simpleTypeName   ::=   QName

XML Schema: Structures incorporates the simple type specification mechanisms defined by [XML Schemas: Datatypes] in order to express SCs on attribute values and the contents of elements consisting entirely of character data.

The production for facet above serves to indicate where this chapter connects with XML Schemas: Datatypes. The concrete syntax displayed below is copied from [XML Schemas: Datatypes]. All facets are optional and may appear in any order within the datatype element. The simpleTypeRef in the simpleTypeSpec identifies the simple type on which the one being defined is based: infinite regress is avoided because XML Schemas: Datatypes provides a set of built-in ab initio simple types.

The other productions provide for using simple types once they have been defined; see below under typeDefn and attribute.

As explained in References to Schema Constructs (§3.3), the use of QName allows for the referenced definition to be located in some other schema.

An abstract type cannot itself be used as the type of an attribute or element.

A simple type definition can rule itself out as the source of type derivations, by declaring itself final.

Example
<datatype name="posInt" source="integer">
 <minExclusive value="0"/>
</datatype>

<attribute name="foo" type="posInt"/>

<attribute name="baz" type="integer"/>

<attribute name="fontSize" type="xsl:quantity"
           fixed="12pt"/>
The first attribute example references the definition above it. The second references a datatype pre-defined by XML Schemas: Datatypes. The third references a datatype in an (imaginary) XSL schema and fixes its value.
NOTE: See previous note on the type definition issue.

The satisfy-dt definition wrt schema-validity obtains.

The Datatype Info (§6.2.3.1) Schema Information Set Contribution obtains.

3.4.2 Complex Type Definition

We now move on to [Definition:]  the complex types whose expression in XML documents consists of elements with attributes and/or element content.

Types
[14]   complexTypeDefn   ::=   NCName complexTypeSpec
[15]   complexTypeSpec   ::=   contentType attributesSpec? final exact abstract?
[16]   attributesSpec   ::=   ( attribute | attributeGroupRef )* attrWildcard?
[contentType   ::=   simpleTypeRef | contentModel]
[17]   complexTypeRef   ::=   complexTypeName
[18]   complexTypeName   ::=   QName

The complexTypeDefn production and its descendants provide for all the SCs which constitute a complex type definition; the last two productions provide for reference to complex types once defined. But note that the name of a type is not ipso facto the name of elements whose appearance in instances will be associated with the SCs which constitute that type. The connection between an element name and a type is made by an elementDecl, see below.

Alongside attributesSpec for permitted attributes, SCs for contents are specified by a contentType: for elements which may contain only character data, this is a simple type (via a simpleTypeRef) or, for other kinds of elements, a contentModel. An abstract type may not be used as the type in an elementDecl. (See Wildcards (§3.5) for a discussion of attrWildcard. See Deriving Type Definitions (§3.6) for the full details on contentType, of which the above is only a summary, as well as final, exact and more on abstract.)

Example
<type name="length1" source="decimal">
 <restrictions>
  <minInclusive value="0"/>
 </restrictions>
 <attribute name="unit" type="NMTOKEN"/>
</type>

<element name="width" type="length1"/>

  <width unit="cm">2.54</width>

<type name="length2">
 <element name="size">
  <datatype source="decimal">
   <minInclusive value="0"/>
  </datatype>
 </element>
 <element name="unit" type="NMTOKEN"/>
</type>

<element name="depth" type="length2"/>

  <depth>
   <size>2.54</size><unit>cm</unit>
  </depth>
Two approaches to defining a type for length: the first uses character data content constrained by a restricted reference to a built-in datatype, together with one attribute; the second uses two child elements.

Note that both the datatypeRef and the typeRef options in the abstract syntax are realised by the source attribute on the type element. source must refer to a simple type if content is textonly. The contents of the restrictions element will be quite different in the two cases, and if the source refers to a simple type, no content model is appropriate, so none of element, group or any are allowed. The values other than textonly for content express choices recorded in the abstract syntax in the contentModel and richModel productions below.

Careful consideration of the above abstract and concrete syntax reveals that a type need consist of no more than a name, i.e. that <type name="anything"/> is allowed. See the discussion of the ur-type in Deriving Type Definitions (§3.6) for what such a type means.

NOTE: See previous note on the type definition issue.

The AttrGroup Unique (§6.2.3.2) Constraint on Schemas obtains.

The AttrGroup Identified (§6.2.3.2) Constraint on Schemas obtains.

The attr-decl-set definition wrt schema-validity obtains.

The attr-fullname definition wrt schema-validity obtains.

The Attribute Locally Unique (§6.2.3.2) Constraint on Schemas obtains.

The satisfy-as definition wrt schema-validity obtains.

The Type Info (§6.2.3.2) Schema Information Set Contribution obtains.

3.4.3 Attribute Declaration

Attribute declarations associate a name (which will appear as an attribute in start tags in instances) with SCs for the presence and value thereof by referring to a (possibly restricted) simple type. These SCs in turn will be part of the SCs of one or more types. A default or fixed value may be supplied, as well as an indication of whether the attribute is optional or required.

Attributes
[19]   attribute   ::=   NCName attributeSpec
[20]   attributeSpec   ::=    datatypeSpec occurs valueConstraint?
[21]   valueConstraint   ::=   default | fixed
datatypeSpec   ::=   datatypeRef facet*
occurs   ::=   minOccurs maxOccurs
valueConstraint   ::=   default | fixed
datatypeName   ::=   QName
NOTE: A number of productions are repeated here for easy reference.

Attribute declarations provide for:

  • Requiring/preventing the appearance of attributes (the need to prevent an attribute from appearing, by giving a value of 0 to maxOccurs, will be clarified in Deriving Type Definitions (§3.6));
  • Constraining attribute values to express a datatype;
  • Providing default or fixed values for an attribute.
Example
<attribute name="myAttribute"/>

<attribute name="yetAnotherAttribute" type="integer" minOccurs="1"/>

<attribute name="anotherAttribute" default="42">
 <datatype source="integer">
  <minExclusive value="0"/>
 </datatype>
</attribute>

<attribute name="stillAnotherAttribute" type="string" fixed="Hello world!"/>
Four attributes are declared: one with no explicit SCs at all; two declared by reference to the built-in simple datatype integer, one required to be present in instances and one with a default and a subrange qualification; and one with a fixed value.

The type attribute is used when the attribute can use a built-in or pre-declared datatype, i.e. if no facets are part of its datatypeSpec. Otherwise an anonymous datatype is used.
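The interplay of required, defaulted and fixed attributes can be sketched in Python as below. The function name apply_attribute_decls and the dictionary layout are assumptions made for the illustration; datatype checking is omitted, and the sketch leaves an absent fixed attribute absent, since whether a fixed value also acts as a default is not settled here.

```python
def apply_attribute_decls(decls, attrs):
    """Return the attributes after defaulting, or raise ValueError.

    decls: name -> dict with optional keys 'required', 'default', 'fixed'.
    attrs: name -> value as found in the instance's start tag.
    """
    result = dict(attrs)
    for name, decl in decls.items():
        if name not in result:
            if decl.get("required"):
                raise ValueError(f"required attribute {name!r} is missing")
            if "default" in decl:
                result[name] = decl["default"]  # supply the declared default
        elif "fixed" in decl and result[name] != decl["fixed"]:
            raise ValueError(f"attribute {name!r} must be {decl['fixed']!r}")
    return result


# Declarations mirroring the example: one required, one defaulted, one fixed.
decls = {
    "yetAnotherAttribute": {"required": True},
    "anotherAttribute": {"default": "42"},
    "stillAnotherAttribute": {"fixed": "Hello world!"},
}

print(apply_attribute_decls(decls, {"yetAnotherAttribute": "7"}))
```

An attribute that appears with a value other than its fixed value, or a required attribute that is missing, is reported as an error.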

Wherever attribute declarations are used, the surrounding type definition provides its own symbol space for attribute names. E.g. an attribute named title within one type need not have the same datatypeRef as one declared within another type.

The attr-satisfy definition wrt schema-validity obtains.

The default when no datatypeRef is provided is the ur-type, which imposes no constraints at all.

The satisfy-attrs definition wrt schema-validity obtains.

The Attribute Value Default (§6.2.3.3) Schema Information Set Contribution obtains.

Issue (namespace-declare): We've got a problem with namespace declarations: they're not attributes at the infoset level, so they can appear without compromising validity, except if there is a fixed or required declaration, and defaults should have the apparently desired effect. I.e., if a schema declares an attribute whose name is xmlns with a default or fixed value, does it change the infoset? Or if we allow QNames as such to be declared, xmlns:foo.

3.4.4 Attribute Group Definition

A schema can name a group of attributes so that they may be incorporated as a whole into type definitions:

Attribute groups
[22]   attributeGroupDefn   ::=   NCName attributesSpec
attributesSpec   ::=   ( attribute | attributeGroupRef )* attrWildcard?
[23]   attributeGroupRef   ::=   attributeGroupName
[24]   attributeGroupName   ::=   QName

Attribute group definitions provide a construct to replace some uses of parameter entities. See Wildcards (§3.5) for a discussion of attrWildcard.

Example
<attributeGroup name="myAttrGroup">
    <attribute .../>
    ...
</attributeGroup>

<type name="myelement" content="empty">
    <attributeGroup ref="myAttrGroup"/>
</type>
Define and refer to an attribute group. The effect is as if the attribute declarations in the group were present in the type definition.

The concrete syntax above is the first example of a pattern which will recur: The same element, in this case attributeGroup, serves both to define and to incorporate by reference. In the first case the name attribute is required, in the second the ref attribute is required, and the element must be empty. These two are mutually exclusive, and also conditioned by context: the defining form, with a name, must occur at the top level of a schema, whereas the referring form, with a ref, must occur within a complex type definition or an attribute group definition.

Ed. Note: There needs to be some discussion of what happens in case of name conflict between attrs as a result of an attr group ref.

Issue (global-attrs): Somewhere in Chapter 3, we need to introduce a means for declaring global attributes.

3.4.5 Element Content Model

When content of elements is not constrained by reference to a simple type (Simple Type Definition (§3.4.1)), it can be unconstrained, be constrained to have no content, or allow elements in its content, in which case the form of the content is specified in more detail.

Content model
[25]   contentModel   ::=   unconstrained | empty | richModel

A content model constrains the element content of a type specification: it says nothing about attributes.

Content models do not have names, but appear as a part of the definitions of types, which do have names.

The satisfy-cm definition wrt schema-validity obtains.

3.4.6 Rich Content Models

A content model consisting of an elemModel alone specifies child elements only. If the mixed qualifier is present, text may occur as well as elements. In either case the content model consists of a simple grammar governing the allowed types of child elements and the order in which they must appear.

Rich content model
[26]   richModel   ::=   elemModel mixed?
[27]   elemModel   ::=   allGroup | particle
[28]   particle   ::=   ( element | group | wildcard | modelGroupRef ) occurs
[29]   occurs   ::=   minOccurs maxOccurs
[30]   element   ::=   elementRef | elementDecl
[31]   group   ::=   compositor particle particleSeq
[32]   compositor   ::=   sequence | choice
[33]   particleSeq   ::=   particle particleSeq?
[34]   allGroup   ::=   restrictedParticle restrictedParticleSeq
[35]   restrictedParticle   ::=   element | wildcard
[36]   restrictedParticleSeq   ::=   restrictedParticle restrictedParticleSeq?

[Definition:]  The grammar for element-only content is built on content model particles (particle above): elements, groups and wildcards. A particle provides for some number of occurrences in an instance of a single element (via elementRef or elementDecl), a group of elements (via group) or an indirect specification of any of these (via modelGroupRef).

[Definition:]  We say that a particle permits one or more elements or groups if its minOccurs is 0. [Definition:]  We say that a particle requires one or more elements or groups if its minOccurs is greater than 0.

[Definition:]  A group is two or more particles plus a compositor. The compositor for a group specifies whether the group provides for a sequence of, a choice among, a repeated disjunction of, or an unordered collection of all of the elements permitted or required by its particles.

These options reconstruct the XML 1.0 , connector, the XML 1.0 | connector, the repeated disjunction of XML 1.0's Mixed production and the SGML & connector respectively. In the first case (sequence) all the elements permitted or required must appear in the order given in the group; in the second case (choice), exactly one of the permitted or required elements must appear; in the fourth case (all), all the required elements, which are restricted in this case only to unqualified elements with minOccurs=maxOccurs=1, must appear, but may appear in any order. The all compositor may only appear as the top-level compositor of a content model.

The occurs specification governs how many times the material permitted or required by a particle may occur, but note that the components of a group whose compositor is (implicitly) all may not be qualified, and therefore call for exactly one appearance of the element they identify.
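The sequence and choice cases, together with occurs, can be illustrated by a small matcher. The Python sketch below is not the validation procedure of this specification: the tuple encoding of particles, the function names ends and valid, and the use of None for an unbounded maxOccurs are assumptions of the example, and neither the all compositor nor the ambiguity constraint is modelled.

```python
def ends(particle, items, start):
    """Return the set of positions at which a match of `particle`
    beginning at `start` can end.

    A particle is (kind, body, min_occurs, max_occurs); kind is 'element'
    (body is a name), 'sequence' or 'choice' (body is a list of particles).
    max_occurs of None means unbounded.
    """
    kind, body, lo, hi = particle

    def once(pos):
        # Positions reachable by matching the particle exactly one time.
        if kind == "element":
            return {pos + 1} if pos < len(items) and items[pos] == body else set()
        if kind == "sequence":
            positions = {pos}
            for child in body:
                positions = {e for p in positions for e in ends(child, items, p)}
            return positions
        if kind == "choice":
            return {e for child in body for e in ends(child, items, pos)}
        raise ValueError(f"unknown particle kind {kind!r}")

    result, positions, count = set(), {start}, 0
    while positions:
        if count >= lo:
            positions -= result  # revisiting a position adds nothing new
            result |= positions
        if hi is not None and count == hi:
            break
        positions = {e for p in positions for e in once(p)}
        count += 1
    return result


def valid(model, items):
    """True if the whole list of child element names matches the model."""
    return len(items) in ends(model, items, 0)


a = ("element", "a", 1, 1)
b_or_c = ("choice", [("element", "b", 1, 1), ("element", "c", 1, 1)], 0, None)
model = ("sequence", [a, b_or_c], 1, 1)

print(valid(model, ["a", "b", "c", "b"]))
print(valid(model, ["a"]))
print(valid(model, ["b"]))
```

The model used here requires one a followed by any number of b or c elements, so the first two instances are accepted and the third is rejected.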

See Element Declaration (§3.4.9) for further discussion and examples of the appearance of elementDecl as one of the two expansions of element above.

For the interpretation of wildcard in this context, see Wildcards (§3.5).

The satisfy-eo definition wrt schema-validity obtains.

The Element Consistency (§6.2.3.6) Constraint on Schemas obtains.

Constraint on Schemas: Unambiguous Content Model
For compatibility, it is an error if a content model is such that there exist element item sequences within which some item can match more than one occurrence of an elementRef, elementDecl or wildcard in the content model.

3.4.7 Mixed Content

A content model which allows mixed content provides for mixing elements with character data in document instances. The same elemModel mechanism is used for specifying the grammar of the allowed elements, with the changes that the implicit top-level model group has the choice compositor and minOccurs of 0 and maxOccurs of '*', thus ensuring that the default behaviour is the same as that of XML.

Example
<type content="mixed">
 <element ref="name1"/>
 <element ref="name2"/>
 <element ref="name3"/>
</type>
Allows character data mixed with any number of name1, name2 and name3 elements.
Issue (noEmptyReqd): We need to make the elemModel rhs optional, to allow for mixed with no elements specified == our minimum commitment model. This in turn would allow us if we chose to get rid of an explicit empty flag: just specify elementOnly and no model.

We could then get rid of any as well, given other mechanisms for controlled openness we're contemplating.

Note that most of this is actually realised in the current version, with the exception of the observation about empty.

The satisfy-mixed definition wrt schema-validity obtains.

3.4.8 Named Model Group

This reconstructs another common use of parameter entities.

Named model groups
[37]   modelGroupDefn   ::=   NCName modelGroupSpec
[38]   modelGroupSpec   ::=   ( allGroup | group | element | modelGroupRef )
[39]   modelGroupRef   ::=   modelGroupName
[40]   modelGroupName   ::=   QName

Groups defined with the allGroup option may only be referenced from a modelGroupRef which constitutes the only group at the top level of a content model.

Example
<group name="myModelGroup">
 <element ref="myelement"/>
</group>

<element name="myelement">
 <type>
  <group ref="myModelGroup"/>
  <attribute ...>. . .</attribute>
 </type>
</element>

<element name="anotherelement">
 <type>
  <group order="choice">
   <element ref="yetAnotherelement"/>
   <group ref="myModelGroup"/>
  </group>
  <attribute ...>. . .</attribute>
 </type>
</element>
A minimal model group is defined and used by reference, first as the whole content model, then as one alternative in a choice.
Issue (named-model-groups): In its vote on 1999-11-04, the WG agreed that this section was still open for discussion.

3.4.9 Element Declaration

An [Definition:]  element declaration associates an element name with a type, either by reference or by incorporation.

Issue (elt-default): The extension of defaulting to element content is tentative.
Element declaration
[41]   elementDecl   ::=   NCName elementSpec
[42]   elementSpec   ::=   ( typeRef | datatypeRef | complexTypeSpec | simpleTypeSpec ) valueConstraint? generalConstraint* equivClassRef? final exact nullable? abstract?
[43]   equivClassRef   ::=   elementRef
[44]   elementRef   ::=   elementName
[45]   elementName   ::=   QName

An element declaration associates a name with a type. This name will appear in tags in instance documents; the type provides SCs on the form of elements tagged with the given name. An element declaration whose elementSpec is a typeSpec is comparable to an <!ELEMENT ...> declaration in an XML 1.0 DTD.

elementSpec not only allows for element declarations to associate a name with a complex type (by reference or inclusion), but also allows the reference or specification to be for a simple type, with the implication that no attributes are allowed in instances and the text-only content will be constrained appropriately.

elementRef provides for top-level element declarations to be referenced by name from content models.

As noted above, element names are in a separate symbol space from the symbol spaces for the names of types, so there can (but need not) be a complex type or simple type with the same name as a top-level element.

The elt-fullname definition wrt schema-validity obtains.

An elementDecl may appear both at the top level of a schema and within an elemModel. See above (Rich Content Models (§3.4.6) and Mixed Content (§3.4.7)) for where this is allowed. Within an elemModel, it declares a locally-scoped association between an element name and a type. As with attribute names, locally-scoped element names reside in symbol spaces local to the type that defines them. Note however that type and datatype names are always top-level names within a schema, even when associated with locally-scoped element names.

The use above of simpleTypeSpec and complexTypeSpec, which have no provision for names, is intentional: nested types are anonymous.

See below at Unique, key and key reference constraints (§3.7) for generalConstraints.

An element declared as nullable may appear in instances with an attribute whose name is null from the XML Schema instance namespace and value true to distinguish a null content from an empty content. It is an error for element information items marked xsi:null="true" to have any content.

Issue (nullRequiresEmpty): Is it a precondition for being nullable that the element's contentType allow no content? If not, then more needs to be said above, if so, this needs to be spelled out.
Example
<element name="myelement" type="mySimpleType"/>

<element name="et0" type="myComplexType"/>

<element name="et1">
 <type>
  <element ref="et0"/>
  . . .
  <attribute ...>. . .</attribute>
 </type>
</element>

<element name="et2">
 <type content="empty">
  <attribute ...>. . .</attribute>
 </type>
</element>
The first two examples above declare elements by reference to a simple and a complex type respectively. The third and fourth use embedded anonymous complex types, the first of which in turn refers to one of the top-level elements in its content model.
<element name="contextOne">
 <type>
  <element name="myLocalelement" type="myFirstType"/>
  <element ref="globalelement"/>
 </type>
</element>

<element name="contextTwo">
 <type>
  <element name="myLocalelement" type="mySecondType"/>
  <element ref="globalelement"/>
 </type>
</element>
Instances of myLocalelement within contextOne will be constrained by myFirstType, while those within contextTwo will be constrained by mySecondType.
NOTE: The possibility that differing attribute declarations and/or content models would apply to elements with the same name in different contexts is an extension beyond the expressive power of a DTD in XML 1.0.

In the concrete syntax above, the type attribute is used to encode either the typeRef or the datatypeRef option. In the case where there are both a simple type and a complex type of the referenced name in the relevant schema, the ambiguity is resolved in favour of the complex type.

NOTE: See previous note on the ambiguity issue.

Ed. Note: existing section on element declaration should be updated to cover instance syntax.

The Nested May Not Be Global (§6.2.3.7) Constraint on Schemas obtains.

The satisfy-ed definition wrt schema-validity obtains.

The ind-valid definition wrt schema-validity obtains.

The satisfy-etr definition wrt schema-validity obtains.

3.5 Wildcards

In order to exploit the full potential for extensibility offered by XML plus namespaces, more provision for targeted flexibility in content models and attribute declarations is needed than DTDs allow. At a given point in a content model, in addition to what DTDs provide for, we need particles that allow the following:

  1. Any well-formed XML element item: any tag, any namespace, any attributes, any content, as long as it's well-formed;
  2. Any well-formed XML element item, provided it's in some other namespace than the one we're defining a type for;
  3. Any well-formed XML element item, provided it's in a specified namespace;
  4. Any well-formed XML element item, provided it's from the same namespace as the one we're defining a type for.

Of course, by qualifying one of these with a *, we allow for any amount of (localized) flexibility in validation.

Attributes need the same kind of flexibility: a good-citizen schema should probably allow any attributes from the xml: namespace, for instance.

Wildcards
[46]   wildcard   ::=   any | otherNS | allowedNSList | sameNS
[47]   allowedNSList   ::=   myNS? URI*
[48]   attrWildcard   ::=   wildcard

The four alternatives for wildcard correspond to the four kinds of flexibility listed above.

All of the above are subject to the same ambiguity constraints (Unambiguous Content Model (§3.4.6)) as other content model particles: If an instance element could match both an explicit particle and a wildcard, or either of two wildcards, within the content model of a type, that model is in error.

Example
<any/>

<any namespace="##other"/>

<any namespace="http://www.w3.org/1999/Style/Transform/"/>

<any namespace="##targetNamespace"/>

<anyAttribute namespace="http://www.w3.org/XML/1998/namespace"/>
Concrete examples of the four cases listed above, plus one attribute case.
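The namespace test implied by each of the four kinds of wildcard can be sketched in Python as follows. The function name wildcard_allows and its argument conventions are assumptions of the example; in particular it handles a single namespace name only, not the full allowedNSList, and it says nothing about the attributes or content of the matched element.

```python
def wildcard_allows(wildcard, element_ns, target_ns):
    """Decide whether an element's namespace is admitted by a wildcard.

    wildcard is None for a bare <any/>, '##other', '##targetNamespace',
    or a single namespace name (a simplification of allowedNSList).
    """
    if wildcard is None:
        return True  # any well-formed element item
    if wildcard == "##other":
        return element_ns != target_ns
    if wildcard == "##targetNamespace":
        return element_ns == target_ns
    return element_ns == wildcard


# Hypothetical target namespace of the schema being written.
target = "http://example.org/my-schema"
style = "http://www.w3.org/1999/Style/Transform/"

print(wildcard_allows(None, style, target))
print(wildcard_allows("##other", style, target))
print(wildcard_allows(style, target, target))
print(wildcard_allows("##targetNamespace", target, target))
```

Only the third call is refused, because the element there is not in the one namespace the wildcard names.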

3.6 Deriving Type Definitions

This section articulates what has only been hinted at above, namely a considerable increase in the power and expressiveness of schema declarations. It explains a facility which the abstract syntax of the previous section provided for but scarcely explained: the potential for deriving new type definitions on the basis of old ones. [Definition:]  We call such a new definition a derived type definition, and [Definition:]  the old definition it is derived from the source type definition.

We provide two means for deriving type definitions from other type definitions, each of which implies a partial order over the types defined in a schema: A type definition may either restrict or extend another type definition.

3.6.1 Deriving type definitions by extension

A new complex type can be defined by adding additional content model particles at the end of the element-only content model of another complex type definition and/or by adding attribute declarations to any type definition. Members of a type whose definition is derived in this way, i.e. by extension, will always contain members of their source type within them as prefixes.

Extension
complexTypeSpec   ::=   contentType attributesSpec? final exact abstract?
[49]   contentType   ::=   extension | restriction
[50]   extension   ::=   simpleTypeRef | ( complexTypeRef contentModel )

For the time being, the effective content model of a type definition derived by extension from another complex type is composed by appending its contentModel to that of the source definition. It follows from this that the source definition must be complex and element-only if the contentModel is not empty. If it is empty, there is no constraint on the nature of the source definition, which may be simple or complex (thus the simpleTypeRef above). In either case, attributes may be added.

NOTE: Restricting content-model extension to appending simplifies the processing an application must perform to cast instances from the derived type to the source type. We may liberalise this in future versions, at the cost of requiring more complex transformations to effect casting.
Example
<type name="personName">
 <element name="title" minOccurs="0"/>
 <element name="forename" minOccurs="0" maxOccurs="*"/>
 <element name="surname"/>
</type>

<type name="extendedName" source="personName" derivedBy="extension">
 <element name="generation" minOccurs="0"/>
</type>

<element name="addressee" type="extendedName"/>

  <addressee>
   <forename>Albert</forename>
   <forename>Arnold</forename>
   <surname>Gore</surname>
   <generation>Jr</generation>
  </addressee>
A type definition for personal names, and a definition derived by extension which adds a single element; an element declaration referencing the derived definition, and a valid instance thereof.
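
The composition rule for extension, and the prefix property it implies, can be sketched in Python. This is a simplified illustration, not a validator: Particle, extend and matches are hypothetical names, content models are treated as flat sequences of element particles, the defaults of 1 for minOccurs and maxOccurs are assumed, and the greedy matcher is only adequate when adjacent particles have distinct names, as in the example above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Particle:
    name: str
    min_occurs: int = 1              # assumed default
    max_occurs: Optional[int] = 1    # None stands for "*" (unbounded)


def extend(source: Sequence[Particle], added: Sequence[Particle]) -> Tuple[Particle, ...]:
    """Effective content model of an extension: the source model, then the additions."""
    return tuple(source) + tuple(added)


def matches(model: Sequence[Particle], children: Sequence[str]) -> bool:
    """Greedily match child element names against a flat sequence of particles."""
    i = 0
    for p in model:
        count = 0
        while (i < len(children) and children[i] == p.name
               and (p.max_occurs is None or count < p.max_occurs)):
            i += 1
            count += 1
        if count < p.min_occurs:
            return False
    # Every child must have been consumed by some particle.
    return i == len(children)


# The personName and extendedName definitions from the example above.
PERSON_NAME = (
    Particle("title", min_occurs=0),
    Particle("forename", min_occurs=0, max_occurs=None),
    Particle("surname"),
)
EXTENDED_NAME = extend(PERSON_NAME, [Particle("generation", min_occurs=0)])

# Children of the <addressee> instance, in document order.
ADDRESSEE = ["forename", "forename", "surname", "generation"]
```

Under these assumptions, matches(EXTENDED_NAME, ADDRESSEE) holds, matches(PERSON_NAME, ADDRESSEE) does not, and matches(PERSON_NAME, ADDRESSEE[:3]) holds: dropping the appended generation element leaves a member of the source type, which is what makes casting from derived to source type straightforward.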

3.6.2 Deriving type definitions by restriction

A new type can be defined by decreasing the possibilities made available by an existing type definition: narrowing ranges, removing alternatives, etc. Restriction is specified bottom up, via simpleRestrictions or