Copyright ©1999, 2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML Schema: Structures specifies the XML Schema definition language, which offers facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs). This specification depends on XML Schema Part 2: Datatypes.
This is a public working draft of XML Schema 1.0, issued by the XML Schema Working Group, for review by the public and by members and working groups of the World Wide Web Consortium.
This working draft incorporates most Working Group decisions through 2000-09-19. It has been reviewed by the XML Schema Working Group, and the Working Group has agreed to its publication as a working draft, which includes our proposed resolution of most issues raised during Last Call. The Working Group intends to submit this specification for publication as a Candidate Recommendation very soon, but is issuing this interim public draft as it sets out a number of changes to the XML Representation of XML Schemas, and we wished to make these available as quickly as possible. Readers may find Description of changes (non-normative) (§H) helpful in identifying the major changes since the last Public Working Draft.
Note that this revision incorporates several backwards-incompatible changes to the XML representation of schemas. Accordingly, the XML Schema namespace URI has changed, to http://www.w3.org/2000/10/XMLSchema.
Although comments from the public and other W3C working groups are always welcome, and we encourage readers to review the draft and to send comments to www-xml-schema-comments@w3.org, comments on changes other than those to the concrete syntax may be premature, as there are still some changes pending to the prose of the specifications. An archive of the comments received is available.
Although the Working Group does not anticipate further changes to the functionality described here, this is still a working draft, subject to change. The present version should be implemented only by those interested in providing a check on its design or by those preparing for an implementation of the Candidate Recommendation. The Schema WG will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
During the Candidate Recommendation phase, although feedback based on implementation experience is welcome, there are certain aspects of the design presented herein where the Working Group is particularly interested in feedback. These are designated priority feedback aspects of the design, and identified as such in editorial notes throughout this draft.
A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.
Chapter 2 presents a Conceptual Framework (§2) for XML Schemas, including an introduction to the nature of XML Schemas and a formal specification of the XML Schema abstract data model, along with other terminology used throughout this document.
Chapter 3, Schema Component Details (§3), specifies the precise semantics of each component of the abstract model.
Chapter 4, XML Representation of Schemas and Schema Components (§4), presents the XML representation that maps to the abstract model, in the form of a DTD and XML Schema for an XML Schema document type, along with rules and conventions for identifying the components needed for any particular validation.
Chapter 5 presents Schema Component Validity Constraints (§5), which provide detailed constraints on the internal structure of each component of the abstract model.
Chapter 6 presents Schema Access and Composition (§6), including the connection between documents and schemas, the import and inclusion of declarations and definitions and the foundations of schema-validity.
Chapter 7 discusses Validation Processing of schemas and documents (§7), including the overall approach to schema-validation of documents, and responsibilities of schema-aware processors.
The normative appendices include a Schema for Schemas (normative) (§A) for the transfer syntax, a Glossary (normative) (§B) [not yet written] and References (normative) (§C).
The non-normative appendices include the DTD for Schemas (non-normative) (§E).
This document is primarily intended as a language definition reference. As such, although it contains a few examples, it is not designed primarily to serve as a motivating introduction to the design and its features, but rather as a careful and fully explicit definition of that design, suitable for guiding implementations. For those in search of a step-by-step introduction to the design, the non-normative [XML Schema: Primer] is a much better starting point than this document.
The purpose of XML Schema: Structures is to define the nature of XML schemas and their component parts, provide an inventory of XML markup constructs with which to represent schemas, and define the application of schemas to XML documents.
The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using schema components to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content and attributes and their values. Schemas may also provide for the specification of additional document information, such as default values for attributes and elements. Schemas have facilities for self-documentation. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.
Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism allows a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations.
The definition of XML Schema: Structures depends on the following specifications: [URI], [XML-Infoset], [XML-Namespaces], [XPath], and [XML Schemas: Datatypes].
If the XML Base proposal is adopted before we go to REC, we will need to account for any changes it makes to the Infoset in the areas of QName interpretation and value space and the interpretation of all aspects of schemas involving values identified as being of type uriReference, including in particular xsi:schemaLocation, xsi:noNamespaceSchemaLocation and targetNamespace.
The following highlighting and typography is used to present technical material in this document:
Special terms are defined at their point of introduction in the text; hyperlinks connect other uses of the term to the definition. For example, a definition of term might read: [Definition:] A term is something we use a lot. The definition is labeled as such and the term is highlighted typographically. The end of the definition is not specially marked in the displayed or printed text.
Non-normative examples are set off typographically and accompanied by a brief explanation:
Example
<schema targetNamespace="http://www.muzmo.com/XMLSchema/1.0/mySchema">
And an explanation of the example.
References to properties of information items as defined in [XML-Infoset] are notated as links to the relevant section thereof, set off with square brackets, for example [children].
The definition of each kind of schema component consists of a list of its properties and their contents, followed by descriptions of the semantics of the properties:
Schema Component: Example
- {example property}
- Definition of the property.
References to properties of schema components are notated as links to the relevant definition as exemplified above, set off with curly braces, for instance {example property}.
The correspondence between an element information item which is part of the XML representation of a schema and one or more schema components is presented in a tableau which illustrates the element information item(s) involved, followed by a tabulation of the correspondence between properties of the component and properties of the information item. Where context may determine which of several different components may arise, several tabulations, one per component, are given. In the XML representation, bold-face attribute names (e.g. count below) indicate a required attribute information item, and the rest are optional. Where an attribute information item has an enumerated type definition, the values are shown separated by vertical bars, as for size below; if there is a default value, it is shown following a colon. The allowed content of the information item is shown as a grammar fragment, using the Kleene operators ?, * and +. The property correspondences are normative, but the illustration of the XML representation element information items is not.
XML Representation Summary: exampleElement Information Item
Example Schema Component
Property: {example property}
Representation: Description of what the property corresponds to, e.g. the value of the size [attribute]
The following highlighting is used for non-normative commentary in this document:
Issue (dummy): A recorded issue.
Ed. Note: Notes from the editors to themselves or the Working Group, or identification of priority feedback aspects of this draft.
NOTE: General comments directed to all readers.
This chapter gives an overview of XML Schema: Structures at the level of its abstract data model. (Schema Component Details (§3) provides details on this model, and subsequent chapters define a normative representation in XML for the components of the model.) Readers interested primarily in learning to write schema documents may wish to first read [XML Schema: Primer] and then consult XML Representation of Schemas and Schema Components (§4), using the sections below as a guide to the underlying formal structure of the schema language.
An XML Schema consists of components such as type definitions and element declarations. These can be used to assess the validity of well-formed element information items (as defined in [XML-Infoset]), and furthermore may specify augmentations to those items and their descendants. This augmentation makes explicit information which may have been implicit in the original document, such as default values for attributes and elements and the types of element and attribute information items.
The process of schema validation consists of determining whether an element information item satisfies the constraints embodied in the components of an XML Schema, and if so of adding any appropriate augmentations.
This specification builds on [XML] and [XML-Namespaces]. The concepts and definitions used herein regarding XML are framed at the abstract level of information items as defined in [XML-Infoset]. By definition, this use of the infoset provides a priori guarantees of well-formedness (as defined in [XML]) and namespace conformance (as defined in [XML-Namespaces]) for all candidates for schema-validity and for all schema documents.
Just as [XML] and [XML-Namespaces] can be described in terms of information items, XML Schemas can be described in terms of an abstract data model. In defining XML Schemas in terms of an abstract data model, this specification rigorously specifies the information which must be available to a conforming XML Schema processor. The abstract model for schemas is conceptual only, and does not mandate any particular implementation or representation of this information. To facilitate interoperation and sharing of schema information, a normative interchange format for schemas is described in XML Representation of Schemas and Schema Components (§4).
NOTE: We have not so far seen any need to reconstruct the XML 1.0 notion of root. For the connection from document instances to schemas, see Layer 3: Web-interoperability (§6.3) and Errors in Schema Construction and Structure (§7.1).
[Definition:] Schema component is the generic term for the building blocks that comprise the abstract data model of the schema. [Definition:] An XML Schema is a set of schema components. There are 12 kinds of component in all, falling into three groups. The primary components are as follows. They may have names, and (except for some element declarations) may be independently accessed:
- Simple type definitions
- Complex type definitions
- Attribute declarations
- Element declarations
The secondary components are as follows. Like the primary components, they may have names and be independently accessed:
- Attribute group definitions
- Identity-constraint definitions
- Model group definitions
- Notation declarations
Finally, the "helper" components provide small parts of other components; they are not independent of their context and cannot be independently accessed:
- Annotations
- Model groups
- Particles
- Wildcards
During validation, [Definition:] declaration components are associated by (qualified) name to information items being validated.
On the other hand, [Definition:] definition components define internal schema components that can be used in other schema components.
[Definition:] Declarations and definitions may have and be identified by names, which are NCNames as defined by [XML-Namespaces].
[Definition:] Several kinds of component have a target namespace, which is either absent or a namespace URI, also as defined by [XML-Namespaces]. The target namespace serves to identify the namespace within which the association between the component and its name exists. In the case of declarations, this in turn determines the namespace URI of, for example, the element information items it may validate.
NOTE: At the abstract level, there is no requirement that the components of a schema share a target namespace. Any schema for use in schema-validation of documents containing names from more than one namespace will of necessity include components with different target namespaces. This contrasts with the situation at the level of the XML Representation of Schemas and Schema Components (§4), in which each schema document contributes definitions and declarations to a single target namespace.
Schema-validity, defined in detail in Validation Processing of schemas and documents (§7), is a relation between information items and schema components. For example, an attribute information item may be schema-valid with respect to an attribute declaration, a list of element information items may be schema-valid with respect to a content model, and so on. The following sections briefly introduce the kinds of components in the schema abstract data model, other major features of the abstract model, and how they contribute to the overall definition of schema-validity.
The abstract model provides two kinds of type definition component: simple and complex.
[Definition:] This specification uses the phrase type definition in cases where no distinction need be made between simple and complex types.
Type definitions form a hierarchy with a single root. First we describe characteristics of that hierarchy, then provide an introduction to simple and complex type definitions themselves.
[Definition:] Except for a distinguished ur-type definition, every type definition is, by construction, either a restriction or an extension of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.
[Definition:] A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction. The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.
[Definition:] A complex type definition which allows element or attribute content in addition to that allowed by another specified type definition is said to be an extension.
[Definition:] A distinguished ur-type definition is present in each XML Schema, serving as the root of the type definition hierarchy for that schema. The ur-type definition, whose name is anyType, has the unique characteristic that it can function as a complex or a simple type definition, according to context. Specifically, restrictions of the ur-type definition can themselves be either simple or complex type definitions.
[Definition:] A type definition used as the basis for an extension or restriction is known as the base type definition of that definition.
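The hierarchy described above can be pictured as a tree in which every type definition except the ur-type definition points at its base type definition. The following Python fragment is a minimal sketch of that picture only; the class TypeDefinition, the helper functions and the example types price, address and usAddress are hypothetical names introduced for illustration and are not defined by this specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TypeDefinition:
    """Hypothetical stand-in for a type definition component."""
    name: str
    base: Optional["TypeDefinition"] = None   # None only for the ur-type definition
    method: Optional[str] = None              # "restriction" or "extension"


# The distinguished ur-type definition is the single root of the tree.
UR_TYPE = TypeDefinition("anyType")


def ancestors(t: TypeDefinition) -> List[TypeDefinition]:
    """Return the chain of base type definitions from t up to the root."""
    chain = []
    while t.base is not None:
        t = t.base
        chain.append(t)
    return chain


def is_derived_from(t: TypeDefinition, candidate: TypeDefinition) -> bool:
    """True if candidate is t itself or one of t's (transitive) base types."""
    return t == candidate or candidate in ancestors(t)


# Illustrative (assumed) type definitions.
any_simple = TypeDefinition("anySimpleType", UR_TYPE, "restriction")
decimal = TypeDefinition("decimal", any_simple, "restriction")
price = TypeDefinition("price", decimal, "restriction")
address = TypeDefinition("address", UR_TYPE, "restriction")
us_address = TypeDefinition("usAddress", address, "extension")
```

Because each definition records exactly one base, walking the base links from any definition always ends at the ur-type definition, which is what makes the graph a tree.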
A simple type definition is a set of constraints on strings and information about the values they encode, applicable to the normalized value of an attribute information item or of an element information item with no element children. Informally, it applies to attribute values and text-only content of elements.
Each simple type definition, whether built-in (that is, defined in [XML Schemas: Datatypes]) or user-defined, is a restriction of some particular simple base type definition. For the built-in primitive types, this is the simple version of the ur-type definition, whose name is anySimpleType, which is in turn understood to be a restriction of the ur-type definition. Simple types may also be defined whose members are lists of items themselves constrained by some other simple type definition, or whose membership is the union of the memberships of some other simple type definitions. List and union simple type definitions are also understood as restrictions of the simple ur-type definition.
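As an informal illustration of restriction, list and union, a simple type can be approximated as a predicate over strings. The Python sketch below makes that simplifying assumption (real simple type definitions also carry value-space information), and the names restrict, list_of, union_of, percent and size are hypothetical.

```python
from typing import Callable

# Assumption: a simple type is modelled as a predicate over lexical strings.
SimpleType = Callable[[str], bool]


def is_integer(s: str) -> bool:
    """Lexical check for an optionally signed run of ASCII digits."""
    body = s[1:] if s[:1] in ("+", "-") else s
    return body.isascii() and body.isdigit()


def restrict(base: SimpleType, extra: Callable[[str], bool]) -> SimpleType:
    """Restriction: accept only strings the base accepts and that pass `extra`."""
    return lambda s: base(s) and extra(s)


def list_of(item: SimpleType) -> SimpleType:
    """List: whitespace-separated items, each valid against the item type."""
    return lambda s: all(item(tok) for tok in s.split())


def union_of(*members: SimpleType) -> SimpleType:
    """Union: valid if any member type accepts the string."""
    return lambda s: any(m(s) for m in members)


# Illustrative user-defined types.
percent = restrict(is_integer, lambda s: 0 <= int(s) <= 100)
percent_list = list_of(percent)
size = union_of(percent, lambda s: s in {"small", "medium", "large"})
```

Every string accepted by percent is also accepted by is_integer, which mirrors the rule that members of a restricted type are always members of its base type.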
For details on the composition and schema-validation contributions of simple type definitions, see (non-normative) Simple Type Definition Details (§3.13) and [XML Schemas: Datatypes]. The latter also defines an extensive inventory of pre-defined simple types. See (non-normative) XML Representation of Simple Type Definition Schema Components (§4.3.11) for the XML representation of simple type definitions, and Simple Type Definition Constraints (§5.12) for constraints on simple type definition components as such.
A complex type definition is a set of attribute declarations and a content type, applicable to the [attributes] and [children] of an element information item respectively. The content type may require the [children] to contain neither element nor character information items, to be a string which is schema-valid with respect to a particular simple type definition, or to contain a sequence of element information items which is schema-valid with respect to a particular model group, with or without character information items as well.
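Each complex type definition is either
- a restriction of a complex base type definition
or
- an extension of a simple or complex base type definition
or
- a restriction of the ur-type definition.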
A complex type which extends another does so by having additional content model particles at the end of the other definition's content model, or by having additional attribute declarations, or both.
NOTE: This specification allows only appending, and not other kinds of extensions. This decision simplifies application processing required to cast instances from derived to base type. Future versions may allow more kinds of extension, requiring more complex transformations to effect casting.
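The consequence of append-only extension can be sketched in a few lines of Python. This is an illustration under strong simplifying assumptions: a content model is reduced to a flat list of child element names that each occur exactly once, and ComplexType, extend and cast_children_to_base are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ComplexType:
    """Hypothetical, much-simplified complex type: an ordered list of child
    element names plus a set of attribute names."""
    name: str
    content: List[str] = field(default_factory=list)
    attributes: Set[str] = field(default_factory=set)


def extend(base: ComplexType, name: str, extra_content=(), extra_attributes=()) -> ComplexType:
    """Derive by extension: new particles go at the END of the base content model."""
    return ComplexType(
        name,
        base.content + list(extra_content),
        base.attributes | set(extra_attributes),
    )


def cast_children_to_base(children: List[str], base: ComplexType) -> List[str]:
    """Because extension only appends, the base view of an instance is a prefix.
    Assumes each particle occurs exactly once."""
    return children[: len(base.content)]


address = ComplexType("address", ["street", "city"], {"id"})
us_address = extend(address, "usAddress", ["state", "zip"], ["verified"])
```

Under these assumptions, viewing an instance of the derived type as an instance of the base type only requires ignoring the trailing children and the added attributes.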
See Complex Type Definition Details (§3.4) for the composition and schema-validation contributions of complex type definition schema components, XML Representation of Complex Type Definition Schema Components (§4.3.3) for the XML representation of complex type definitions and Complex Type Definition Constraints (§5.11) for constraints on complex type definition components as such.
There are three kinds of declaration component: element, attribute, and notation. Each is described in a section below. Also included is a discussion of element substitution groups, a feature provided in conjunction with element declarations.
An element declaration is an association of a name with a type definition, either simple or complex, an (optional) default value and a set of identity-constraint definitions. The association is either global or scoped to a containing complex type definition. A global element declaration with name 'A' is broadly comparable to a pair of DTD declarations as follows, where the associated type definition fills in the ellipsis:
<!ELEMENT A . . .>
<!ATTLIST A . . .>
Element declarations contribute to schema-validity as part of model group validation, when their defaults and type components are checked against an element information item with a matching name and namespace, and by triggering identity-constraint definition validation.
See Element Declaration Details (§3.3) for the composition and schema-validation contributions of element declaration schema components, XML Representation of Element Declaration Schema Components (§4.3.2) for the XML representation of element declarations and Element Declaration Constraints (§5.2) for constraints on element declaration components as such.
In XML 1.0, the name and content of an element must correspond exactly to the element type referenced in the corresponding content model.
[Definition:] Through the new mechanism of element substitution groups, XML Schemas provides a more powerful model supporting substitution of one named element for another. Any global element declaration can serve as the defining element, or head, for an element substitution group. Other global element declarations, regardless of target namespace, can be designated as members of the substitution group headed by this element. In a suitably enabled content model, a reference to the head validates not just the head itself, but elements corresponding to any member of the substitution group as well.
All such members must have type definitions which are either the same as the head's type definition or restrictions or extensions of it. Therefore, although the names of elements can vary widely as new namespaces and members of the substitution group are defined, the content of member elements is strictly limited according to the type definition of the substitution group head.
Note that element substitution groups are not represented as separate components. They are specified in the property values for element declarations (see Element Declaration (§2.2.2.1)).
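The effect of a substitution group on validation can be sketched as follows. The Python below is illustrative only: ElementDecl, BASE_OF and the example names are hypothetical, transitive membership is not handled, and a member whose type is not derived from the head's type is simply ignored here rather than reported as an error.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple

QName = Tuple[Optional[str], str]   # (target namespace or None, local name)


@dataclass
class ElementDecl:
    """Hypothetical global element declaration."""
    qname: QName
    type_name: str
    substitution_head: Optional[QName] = None   # names the group head, if any


# Illustrative derivation table: type name -> base type name.
BASE_OF: Dict[str, str] = {"shipComment": "comment", "comment": "anyType"}


def derives_from(type_name: str, ancestor: str) -> bool:
    """Walk the base-type chain looking for `ancestor`."""
    while type_name != ancestor:
        if type_name not in BASE_OF:
            return False
        type_name = BASE_OF[type_name]
    return True


def names_validated_by(head: ElementDecl, decls: Iterable[ElementDecl]) -> Set[QName]:
    """Element names accepted where a content model refers to `head`."""
    accepted = {head.qname}
    for d in decls:
        if d.substitution_head == head.qname and derives_from(d.type_name, head.type_name):
            accepted.add(d.qname)
    return accepted


comment = ElementDecl(("http://example.com/po", "comment"), "comment")
ship = ElementDecl(("http://example.com/ship", "shipComment"), "shipComment", comment.qname)
bad = ElementDecl(("http://example.com/ship", "count"), "integer", comment.qname)
```

Note that ship belongs to a different target namespace from its head, yet a reference to comment still accepts it; only the type relationship constrains membership.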
An attribute declaration is an association between a name and a simple type definition, together with occurrence information and (optionally) a default value. The association is either global, or local to its containing complex type definition. Attribute declarations contribute to schema-validity as part of complex type definition validation, when their occurrence, defaults and type components are checked against an attribute information item with a matching name and namespace.
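The checks named above (occurrence, default and type) can be sketched as a single pass over the declarations. This Python fragment is a simplified illustration: namespaces are omitted, the simple type is reduced to a predicate, and AttributeDecl and check_attributes are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AttributeDecl:
    """Hypothetical attribute declaration: name, simple type, occurrence, default."""
    name: str
    is_valid: Callable[[str], bool]   # stand-in for the simple type definition
    required: bool = False
    default: Optional[str] = None


def check_attributes(
    decls: List[AttributeDecl], attrs: Dict[str, str]
) -> Tuple[List[str], Dict[str, str]]:
    """Return (errors, augmented attributes)."""
    errors: List[str] = []
    augmented = dict(attrs)
    declared = {d.name for d in decls}
    for name in attrs:
        if name not in declared:
            errors.append(f"undeclared attribute: {name}")
    for d in decls:
        if d.name in attrs:
            if not d.is_valid(attrs[d.name]):
                errors.append(f"invalid value for {d.name}: {attrs[d.name]!r}")
        elif d.required:
            errors.append(f"missing required attribute: {d.name}")
        elif d.default is not None:
            augmented[d.name] = d.default   # make the implicit default explicit
    return errors, augmented


decls = [
    AttributeDecl("count", str.isdigit, required=True),
    AttributeDecl("size", lambda v: v in {"large", "medium", "small"}, default="medium"),
]
```

For the input {"count": "3"} the sketch reports no errors and returns the attributes with size="medium" added, which is the kind of augmentation by default values described earlier.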
See Attribute Declaration Details (§3.2) for the composition and schema validation contributions of attribute declaration schema components, XML Representation of Attribute Declaration Schema Components (§4.3.1) for the XML representation of attribute declarations and Attribute Declaration Constraints (§5.1) for constraints on attribute declaration components as such.
A notation declaration is an association between a name and an identifier for a notation. For an attribute information item to be schema-valid with respect to a NOTATION simple type definition, its value must have been declared with a notation declaration.
See Notation Declaration Details (§3.11) for the composition and schema validation contributions of notation declaration schema components, XML Representation of Notation Declaration Schema Components (§4.3.9) for the XML representation of notation declarations and Notation Declaration Constraints (§5.8) for constraints on notation declaration components as such.
The model group, particle, and wildcard components contribute to the portion of a complex type definition that controls an element information item's content type.
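A model group is a constraint in the form of a grammar fragment that applies to lists of element information items. It consists of a list of particles, i.e. element declarations, wildcards and model groups. There are three varieties of model group:
- Sequence (the element information items match the particles in sequential order)
- Conjunction (the element information items match the particles, in any order)
- Disjunction (the element information items match one of the particles)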
See Model Group Details (§3.7) for the composition and schema-validation contributions of model group schema components, Complex Type Definition Details (§3.4) for the use of model groups as content models, XML Representation of Model Group Schema Components (§4.3.6) for the XML representation of model groups and Model Group Constraints (§5.7) for constraints on model group components as such.
A particle is a term in the grammar for element content, consisting of either an element declaration, a wildcard or a model group, together with occurrence constraints. Particles contribute to schema-validity as part of complex type validation, when they allow anywhere from zero to many element information items or sequences thereof, depending on their contents and occurrence constraints.
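To make the interaction of terms and occurrence constraints concrete, the following Python sketch matches a list of child element names against a particle. It is an illustration, not the validation procedure of this specification: only sequence and choice groups are covered, wildcards and namespaces are omitted, and Group, Particle and matches are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Union


@dataclass
class Group:
    """Model group: kind is 'sequence' or 'choice'."""
    kind: str
    particles: List["Particle"]


@dataclass
class Particle:
    """A term (element name or model group) plus occurrence constraints."""
    term: Union[str, Group]
    min_occurs: int = 1
    max_occurs: Optional[int] = 1   # None means unbounded


def _term_ends(term, names, i) -> Set[int]:
    """Positions reachable after matching `term` once, starting at index i."""
    if isinstance(term, str):
        return {i + 1} if i < len(names) and names[i] == term else set()
    if term.kind == "choice":
        return set().union(*[_ends(p, names, i) for p in term.particles])
    positions = {i}   # sequence: thread positions through each particle in order
    for p in term.particles:
        positions = set().union(*[_ends(p, names, j) for j in positions])
    return positions


def _ends(p: Particle, names, i) -> Set[int]:
    """Positions reachable after matching particle p, honouring min/max occurs."""
    results: Set[int] = set()
    frontier, count = {i}, 0
    while frontier:
        if count >= p.min_occurs:
            results |= frontier
        if p.max_occurs is not None and count >= p.max_occurs:
            break
        nxt: Set[int] = set()
        for j in frontier:
            ends = _term_ends(p.term, names, j)
            if j in ends:
                # The term can match nothing here, so any repetitions still
                # required can be satisfied without consuming input.
                results.add(j)
            nxt |= {k for k in ends if k > j}   # keep only advancing matches
        frontier, count = nxt, count + 1
    return results


def matches(p: Particle, names: Sequence[str]) -> bool:
    """True if the whole list of child element names satisfies particle p."""
    return len(names) in _ends(p, list(names), 0)


# Illustrative content model, in DTD notation: (title, (para | note)*, footer?)
content_model = Particle(Group("sequence", [
    Particle("title"),
    Particle(Group("choice", [Particle("para"), Particle("note")]), 0, None),
    Particle("footer", 0, 1),
]))
```

The sketch tracks sets of reachable positions rather than a single position, so it does not depend on the content model being deterministic.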
[Definition:] A particle can be used in a complex type definition to express a validity constraint on the [children] of an element information item; such a particle is called a content model.
NOTE: XML Schema: Structures content models are similar to but more expressive than [XML] content models; unlike [XML], XML Schema: Structures applies content models to the validation of both mixed and element-only content.
See Particle Details (§3.8) for the composition and schema-validation contributions of particle schema components, XML Representation of Model Group Schema Components (§4.3.6) for the XML representation of particles and Particle Constraints (§5.10) for constraints on particle components as such.
A wildcard is a special kind of particle which matches element and attribute information items dependent on their namespace URI, independently of their local names.
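The defining property of a wildcard, that it looks only at the namespace URI, can be sketched as below. The three constraint forms shown (any namespace, anything other than one namespace, an explicit set) are assumptions made for illustration, as are the names Wildcard and allows; the normative details are given in Wildcard Details (§3.9).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

QName = Tuple[Optional[str], str]   # (namespace URI or None, local name)


@dataclass(frozen=True)
class Wildcard:
    """Hypothetical wildcard; exactly one of the three forms is expected to be set."""
    any_namespace: bool = False
    not_namespace: Optional[str] = None
    namespaces: FrozenSet[Optional[str]] = frozenset()

    def allows(self, qname: QName) -> bool:
        ns, _local = qname   # the local name is deliberately ignored
        if self.any_namespace:
            return True
        if self.not_namespace is not None:
            return ns != self.not_namespace
        return ns in self.namespaces


XHTML = "http://www.w3.org/1999/xhtml"
only_xhtml = Wildcard(namespaces=frozenset({XHTML}))
not_xhtml = Wildcard(not_namespace=XHTML)
anything = Wildcard(any_namespace=True)
```

Two items with different local names but the same namespace URI always receive the same answer from allows.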
See Wildcard Details (§3.9) for the composition and schema-validation contributions of wildcard schema components, XML Representation of Wildcard Schema Components (§4.3.7) for the XML representation of wildcards and Wildcard Constraints (§5.5) for constraints on wildcard components as such.
An identity-constraint definition is an association between a name and one of several varieties of identity-constraint related to uniqueness and reference. All the varieties use [XPath] expressions to pick out sets of information items relative to particular target element information items which are unique, or a key, or a valid reference, within a specified scope. An element information item is only schema-valid with respect to an element declaration with identity-constraint definitions if those definitions are all satisfied for all the descendants of that element information item which they pick out.
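The three varieties can be approximated with the Python standard library, as in the sketch below. It is illustrative only: ElementTree supports just a subset of XPath, each constraint is limited to a single attribute-valued field, and the function names and the sample order document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def field_values(scope: ET.Element, selector: str, attr: str) -> List[Optional[str]]:
    """Evaluate `selector` relative to the scoping element, then read one
    attribute from each selected element (a stand-in for a field expression)."""
    return [e.get(attr) for e in scope.findall(selector)]


def is_unique(scope: ET.Element, selector: str, attr: str) -> bool:
    """Uniqueness: values that are present must not repeat."""
    present = [v for v in field_values(scope, selector, attr) if v is not None]
    return len(present) == len(set(present))


def is_key(scope: ET.Element, selector: str, attr: str) -> bool:
    """Key: every selected element must have a value, and values must not repeat."""
    values = field_values(scope, selector, attr)
    return None not in values and len(values) == len(set(values))


def is_keyref(scope: ET.Element, selector: str, attr: str,
              key_selector: str, key_attr: str) -> bool:
    """Reference: every value present must match some value of the referenced key."""
    keys = set(field_values(scope, key_selector, key_attr))
    return all(v in keys for v in field_values(scope, selector, attr) if v is not None)


doc = ET.fromstring(
    "<order>"
    "<part id='p1'/><part id='p2'/>"
    "<line ref='p1'/><line ref='p2'/><line ref='p1'/>"
    "</order>"
)
```

Here the scope is the order element: the id attributes of its part children form a key, and each ref on a line child is a valid reference to that key, while the ref values themselves are not unique.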
See Identity-constraint Definition Details (§3.10) for the composition and schema-validation contributions of identity-constraint definition schema components, XML Representation of Identity-constraint Definition Schema Components (§4.3.8) for the XML representation of identity-constraint definitions and Identity-constraint Definition Constraints (§5.3) for constraints on identity-constraint definition components as such.
There are two kinds of convenience definitions available for use in reusing pieces of complex type definitions: model group definitions and attribute group definitions.
A model group definition is an association between a name and a model group, for use in reusing the same model group in several complex type definitions.
See Model Group Definition Details (§3.6) for the composition and schema validation contributions of model group definition schema components, XML Representation of Model Group Definition Schema Components (§4.3.5) for the XML representation of model group definitions and Model Group Definition Constraints (§5.6) for constraints on model group definition components as such.
An attribute group definition is an association between a name and a set of attribute declarations, for use in reusing the same set in several complex type definitions.
See Attribute Group Definition Details (§3.5) for the composition and schema-validation contributions of attribute group definition schema components, XML Representation of Attribute Group Definition Schema Components (§4.3.4) for the XML representation of attribute group definitions and Attribute Group Definition Constraints (§5.4) for constraints on attribute group definition components as such.
An annotation is information for human and/or mechanical consumers. The interpretation of such information is not defined in this specification.
See Annotation Details (§3.12) for the composition and schema-validation contributions of annotation schema components, XML Representation of Annotation Schema Components (§4.3.10) for the XML representation of annotations and Annotation Constraints (§5.9) for constraints on annotation components as such.
The [XML] specification describes two kinds of constraints on XML documents: well-formedness and validity constraints. Informally, the well-formedness constraints are those imposed by the definition of XML itself (such as the rules for the use of the < and > characters and the rules for proper nesting of elements), while validity constraints are the further constraints on document structure provided by a particular DTD.
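The preceding section focussed on schema-validity, that is, the constraints on information items which schema components supply. In fact, however, this specification provides four different kinds of normative statements about schema components, their representations in XML and their contribution to the schema-validation of information items:
- Constraints on Schemas: constraints on the schema components themselves.
- Schema Representation Constraints: constraints on the representation of schema components in XML.
- Validity Contributions: contributions to schema-validity associated with schema components.
- Schema Information Set Contributions: augmentations to the information set which follow from schema-validation.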
The definition of the above constraints sometimes involves many clauses, some as alternatives, some as joint requirements. The presentations below number all clauses: clauses at the same level are either clearly identified as alternatives with words such as either and or, or should be understood as joint.
Schema information set contributions are not new. XML 1.0 validation augments the XML 1.0 information set in similar ways, for example by providing values for attributes not present in instances, and by implicitly exploiting type information for normalization or access. (As an example of the latter case, consider the effect of NMTOKENS on attribute whitespace, and the semantics of ID and IDREF.) By including schema information set contributions, this specification makes explicit some features that XML 1.0 left implicit.
This specification describes three levels of conformance for schema-aware processors. The first is required of all processors. Support for the other two will depend on the application environments for which the processor is intended.
[Definition:] Minimally conforming processors must completely and correctly implement the Constraints on Schemas, Validity Contributions, and Schema Information Set Contributions contained in this specification.
[Definition:] Processors which accept schemas in the form of XML documents as described in XML Representation of Schemas and Schema Components (§4) are additionally said to provide conformance to the XML Representation of Schemas. Such processors must, when processing schema documents, completely and correctly implement all Schema Representation Constraints in this specification, and must adhere exactly to the specifications in XML Representation of Schemas and Schema Components (§4) for mapping the contents of such documents to schema components for use in validation.
NOTE: By separating the conformance requirements relating to the concrete syntax of XML schema documents, this specification admits processors which validate using schemas stored in optimised binary representations, dynamically created schemas represented as programming language data structures, or implementations in which particular schemas are compiled into executable code such as C or Java. Such processors can be said to be minimally conforming but not necessarily in conformance to the XML Representation of Schemas.
[Definition:] Fully conforming processors are network-enabled processors which support both levels of conformance described above, and which must additionally be capable of accessing schema documents from the World Wide Web according to Representation of Schemas on the World Wide Web (§2.7) and How schema definitions are located on the Web (§6.3.2).
NOTE: Although this specification provides just these three standard levels of conformance, it is anticipated that other conventions can be established in the future. For example, the World Wide Web Consortium is considering conventions for packaging on the Web a variety of resources relating to individual documents and namespaces. Should such developments lead to new conventions for representing schemas, or for accessing them on the Web, new levels of conformance can be established and named at that time. There is no need to modify or republish this recommendation to define such additional levels of conformance.
See Schema Access and Composition (§6) for a more detailed explanation of the mechanisms supporting these levels of conformance.
As discussed in XML Schema Abstract Data Model (§2.2), most schema components (may) have names. If all such names were assigned from the same "pool", then it would be impossible to have, for example, a simple type definition and an element declaration both with the name "title" in a given target namespace.
This specification therefore introduces the term [Definition:] symbol space to denote a collection of names, each of which is unique with respect to the others. A symbol space is similar to the non-normative concept of namespace partition introduced in [XML-Namespaces]. There is a single distinct symbol space within a given target namespace for each kind of definition and declaration component identified in XML Schema Abstract Data Model (§2.2), except that within a target namespace, simple type definitions and complex type definitions share a symbol space. Within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. For example, the same name can appear in both a type definition and an element declaration, without conflict or necessary relation between the two.
Locally scoped attribute and element declarations are special with regard to symbol spaces. Every complex type definition defines its own local attribute and element declaration symbol spaces, where these symbol spaces are distinct from each other and from any of the other symbol spaces. So, for example, two complex type definitions having the same target namespace can contain a local attribute declaration for the unqualified name "priority", or contain a local element declaration for the name "address", without conflict or necessary relation between the two.
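The effect of symbol spaces can be illustrated with a minimal sketch. The class and method names below (SymbolSpaces, declare, lookup) and the kind labels are hypothetical and chosen only for illustration; the sketch covers global components only and is not a prescribed implementation.

```python
from typing import Dict, Optional, Tuple


class DuplicateNameError(Exception):
    """Raised when a name is declared twice within one symbol space."""


class SymbolSpaces:
    """Hypothetical registry: one symbol space per (target namespace, kind)."""

    def __init__(self) -> None:
        self._spaces: Dict[Tuple[Optional[str], str], Dict[str, object]] = {}

    @staticmethod
    def _space_kind(kind: str) -> str:
        # Simple and complex type definitions share a single symbol space.
        return "type" if kind in ("simpleType", "complexType") else kind

    def declare(self, namespace: Optional[str], kind: str, name: str,
                component: object) -> None:
        space = self._spaces.setdefault((namespace, self._space_kind(kind)), {})
        if name in space:
            raise DuplicateNameError(f"{name!r} already declared as {kind}")
        space[name] = component

    def lookup(self, namespace: Optional[str], kind: str, name: str):
        space = self._spaces.get((namespace, self._space_kind(kind)), {})
        return space.get(name)


spaces = SymbolSpaces()
ns = "http://example.com/books"  # illustrative target namespace
spaces.declare(ns, "simpleType", "title", "a simple type")
spaces.declare(ns, "element", "title", "an element declaration")  # no conflict
```

Local attribute and element declarations could be handled in the same way by adding the enclosing complex type definition to the key, so that each complex type definition has its own pair of local symbol spaces.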
The XML representation of schema components uses a vocabulary
identified by the namespace URI http://www.w3.org/2000/10/XMLSchema.
XML Schema: Structures also defines several attributes for direct use in XML documents. These attributes are in a different namespace,
which has the namespace URI http://www.w3.org/2000/10/XMLSchema-instance.
For brevity, the text and examples in this specification use the prefix
xsi: to stand for this latter namespace; in practice,
any prefix can be used.
The Simple Type Definition (§2.2.1.2) or Complex Type Definition (§2.2.1.3) used to validate an element is usually
determined by reference to the appropriate schema components.
However, when permitted by those components, an element can
explicitly assert its type using the attribute xsi:type.
The value of this attribute is a QName; see QName Interpretation (§4.2) for
the means by which the QName is
associated with a type definition.
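As an informal illustration of how the QName value of xsi:type might be turned into a namespace URI and local name, the following sketch assumes that the in-scope namespace bindings are available as a mapping from prefix to URI, with the empty string standing for the default namespace. The function name interpret_qname and that mapping are hypothetical; QName Interpretation (§4.2) remains the normative account.

```python
from typing import Dict, Optional, Tuple


def interpret_qname(value: str,
                    in_scope: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Return (namespace URI or None, local name) for a QName string.

    Assumption: in_scope maps prefixes to namespace URIs, and the key ""
    holds the default namespace if one is declared.
    """
    prefix, colon, local = value.strip().partition(":")
    if not colon:
        # Unprefixed: use the default namespace if any, else no namespace.
        return in_scope.get(""), prefix
    if prefix not in in_scope:
        raise ValueError(f"prefix {prefix!r} is not bound")
    return in_scope[prefix], local


bindings = {"po": "http://example.com/po"}  # illustrative binding
resolved = interpret_qname("po:USAddress", bindings)
```

The resulting pair would then be looked up among the type definitions of the schema in use.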
XML Schema: Structures introduces a mechanism for signalling that an element's content is
missing, or "null" in the terminology of databases. An
element has null content if it has the attribute xsi:null with
the value true. An element so labelled must be empty, but can
carry attributes if permitted by the corresponding complex type.
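The rule that an element labelled with xsi:null must be empty can be checked informally as follows. This is a sketch using Python's standard xml.etree.ElementTree module; the helper names and the shipDate element are hypothetical, and the check ignores whether the governing declaration permits xsi:null at all.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2000/10/XMLSchema-instance"


def has_null_content(element: ET.Element) -> bool:
    """True if the element carries xsi:null with the value true."""
    return element.get("{%s}null" % XSI_NS) == "true"


def null_usage_ok(element: ET.Element) -> bool:
    """An element labelled xsi:null="true" must have no text or children."""
    if not has_null_content(element):
        return True
    return len(element) == 0 and not element.text


empty = ET.fromstring(
    '<shipDate xmlns:xsi="%s" xsi:null="true"/>' % XSI_NS)
not_empty = ET.fromstring(
    '<shipDate xmlns:xsi="%s" xsi:null="true">2000-10-24</shipDate>' % XSI_NS)
```

Attributes other than xsi:null are deliberately not inspected here, since the element may carry them if the corresponding complex type permits.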
The xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes can be used in a document to provide
hints as to the physical location of schema documents which may be used for validation.
See How schema definitions are located on the Web (§6.3.2) for details on the use of these attributes.
On the World Wide Web, schemas are conventionally represented as documents of MIME type "text/xml", conforming to the specifications in XML Representation of Schemas and Schema Components (§4). For more information on the representation and use of schema documents on the World Wide Web see Standards for representation of schemas and retrieval of schema documents on the Web (§6.3.1) and How schema definitions are located on the Web (§6.3.2).
The following sections provide full details on the properties and significance of the schema itself and each kind of schema component. For each property, its range, that is, the kinds of values it may have, is defined. This can be understood as defining a schema as a labelled directed graph, where the root is a schema, and every other vertex is a schema component or a literal (string, boolean, number) and every labelled edge a property. The graph is not acyclic: multiple copies of components with the same name in the same symbol space may not exist, so in some cases re-entrant chains of properties must exist. Equality of components for the purposes of this specification is always addressed at the level of names (including target namespaces) within symbol spaces. Any property not identified as optional is required to be present; optional properties which are absent are taken to have absent as their value. Any property identified as having a set, subset or list value may have an empty value unless this is explicitly ruled out: this is not the same as absent. Any property value identified as a superset or subset of some set may be equal to that set, unless a proper superset or subset is explicitly called for. By 'string' in Part 1 of this specification is meant a sequence of ISO 10646 character codes identified as legal XML character codes in [XML].
NOTE: Readers whose primary interest is in the XML representation of schemas may wish to skip this chapter on the first reading, concentrating on XML Representation of Schemas and Schema Components (§4) and [XML Schema: Primer].
Throughout this specification, [Definition:] when we refer to the initial value of some attribute information item, we mean by this the value of the normalized value property of that item. Similarly, when we refer to the initial value of an element information item, we mean the string composed of, in order, the [character code] of each character information item in the [children] of that element information item.
[Definition:] When we refer to the normalized value of an element or attribute information item, we mean an initial value whose whitespace, if any, has been normalized according to the value of the whitespace facet of the simple type definition by which its validity is assessed:
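- preserve
- No normalization is done: the normalized value is the initial value.
- replace
- All occurrences of #x9 (tab), #xA (linefeed) and #xD (carriage return) are replaced with #x20 (space).
- collapse
- Subsequent to the replacements specified above under replace, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.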
These three levels of normalization correspond to the processing mandated in XML 1.0 for element content, CDATA attribute content and tokenized attribute content, respectively. See Attribute Value Normalization in [XML] for the precedent for replace and collapse for attributes. Extending this processing to element content is necessary to ensure a consistent schema validation semantics for simple types, regardless of whether they are applied to attributes or elements. Performing it twice in the case of attributes whose normalized value has already been subject to replacement or collapse on the basis of information in a DTD is necessary to ensure consistent treatment of attributes regardless of the extent to which DTD-based information has been made use of during infoset construction.
NOTE: Even when DTD-based information has been appealed to, and Attribute Value Normalization has taken place, the above definition of normalized value may mean that further normalization takes place, as for instance when character entity references in attribute values result in whitespace characters other than spaces in their initial values.
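The three levels of whitespace normalization, corresponding to the whitespace facet values preserve, replace and collapse, can be sketched as a single function. The function name normalize and its string-valued whitespace parameter are assumptions made for illustration only.

```python
import re


def normalize(initial_value: str, whitespace: str = "preserve") -> str:
    """Normalize an initial value according to a whitespace facet value."""
    if whitespace == "preserve":
        # No normalization: the initial value is returned unchanged.
        return initial_value
    # replace: tab, linefeed and carriage return each become a space.
    replaced = re.sub(r"[\t\n\r]", " ", initial_value)
    if whitespace == "replace":
        return replaced
    if whitespace == "collapse":
        # collapse: applied after replace; runs of spaces become one space,
        # and leading and trailing spaces are deleted.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whitespace value: {whitespace!r}")


sample = "  a\tb \n c "
```

Because the function is idempotent for each level, applying it to an attribute value that a DTD-aware parser has already normalized yields the same result as applying it to the unprocessed value.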
Many properties are identified below as having (sets of) other schema components as values. For the purposes of exposition, the definitions in this section assume that (unless the property is explicitly identified as optional) all such values are in fact present. When schema components are constructed from XML representations involving reference by name to other components, this assumption may be violated if one or more references cannot be resolved. This specification addresses the matter of missing components in a uniform manner, described in Missing Sub-components (§7.3): no mention of handling missing components will be found in the individual component descriptions below.
As the above makes clear, at the level of schema components and schema validation, reference to components by name is normally not involved. In a few cases, however, qualified names appearing in information items being validated must be resolved to schema components by such lookup. The following constraint is appealed to in these cases.
| 1.1 | the {type definitions} if the kind specified is simple or complex type definition; |
| 1.2 | the {attribute declarations} if the kind specified is attribute declaration; |
| 1.3 | the {element declarations} if the kind specified is element declaration; |
| 1.4 | the {attribute group definitions} if the kind specified is attribute group; |
| 1.5 | the {model group definitions} if the kind specified is model group; |
| 1.6 | the {notation declarations} if the kind specified is notation declaration; |
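The selection of a schema property by component kind, as tabulated above, might be sketched as a simple lookup. The representation of a schema as a dictionary of dictionaries keyed by (namespace URI, local name), and the name resolve, are assumptions for illustration; implementations are free to index components differently.

```python
from typing import Dict, Optional, Tuple

# Which schema property to consult for each kind of component.
PROPERTY_FOR_KIND = {
    "simple type definition": "type definitions",
    "complex type definition": "type definitions",
    "attribute declaration": "attribute declarations",
    "element declaration": "element declarations",
    "attribute group": "attribute group definitions",
    "model group": "model group definitions",
    "notation declaration": "notation declarations",
}

Name = Tuple[Optional[str], str]  # (namespace URI or None, local name)


def resolve(schema: Dict[str, Dict[Name, object]], kind: str,
            namespace_uri: Optional[str], local_name: str):
    """Return the component of the given kind with that name, or None."""
    components = schema.get(PROPERTY_FOR_KIND[kind], {})
    return components.get((namespace_uri, local_name))


schema = {
    "type definitions": {("urn:example", "title"): "type:title"},
    "element declarations": {("urn:example", "title"): "element:title"},
}
```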
NOTE: A schema and its components as defined in this chapter are an idealisation of the information a schema-aware processor requires: implementations are not constrained in how they provide it. In particular, no implications about literal embedding versus indirection follow from the use above of language such as "properties . . . having . . . components as values".
At the abstract level, the schema itself is just a container for its components.
Schema Component: Schema
- {type definitions}
- A set of named simple and complex type definitions
- {attribute declarations}
- A set of named global attribute declarations
- {element declarations}
- A set of named global element declarations
- {attribute group definitions}
- A set of named attribute group definitions
- {model group definitions}
- A set of named model group definitions
- {notation declarations}
- A set of notation declarations
- {annotations}
- A set of annotations
See XML Representations of Schemas (§4.1) for the XML representation of schemas and Schema Constraints (§5.13) for constraints on schemas as such.
targetNamespace matches the sibling [namespace
URI] property above (or was absent
but contributed components to that namespace by being included
by a schema document with that targetNamespace as per Assembling a schema for a single target namespace from multiple schema definition documents (§6.2.1)).
The [schema components] property is provided for processors which wish to provide a single access point to some or all of the components of the schema used during validation. Lightweight processors are free to leave it empty.
Attribute declarations provide for:
The attribute declaration schema component has the following properties:
Schema Component: Attribute Declaration
- {name}
- An NCName as defined by [XML-Namespaces].
- {target namespace}
- Either absent or a namespace URI, as defined in [XML-Namespaces].
- {simple type definition}
- A simple type definition.
- {scope}
- Optional. Either global or a complex type definition.
- {value constraint}
- Optional. A pair consisting of a string and, optionally, one of default, fixed.
- {annotation}
- Optional. An annotation
The {name} property must match the local part of the names of attributes being validated.
A {scope} of global identifies attribute declarations available for use in complex type definitions throughout the schema. Locally scoped declarations are available for use only within the complex type definition identified by the {scope} property. This property is also absent in the case of non-global declarations within attribute group definitions: their scope will be determined when they are used in the construction of complex type definitions.
A non-absent value of the {target namespace} property provides for validation of namespace-qualified attribute information items (which must be explicitly prefixed in the character-level form of XML documents). absent values of {target namespace} validate unqualified (unprefixed) items.
The value of the attribute must conform to the supplied {simple type definition}.
{value constraint} reproduces the functions of XML 1.0 default and #FIXED attribute values. fixed indicates that the attribute value must match the supplied constraint string; default specifies that the attribute is to appear unconditionally in the post-schema-validation information set, with the supplied value used whenever the attribute is not actually present.
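The behaviour of {value constraint} for attributes can be sketched informally as follows. The function name is hypothetical, and two simplifications are assumed: values are compared as plain strings, and a fixed constraint supplies its string when the attribute is absent, by analogy with #FIXED in XML 1.0, since the paragraph above does not spell out that case.

```python
from typing import Optional, Tuple


def apply_attribute_constraint(
        present: Optional[str],
        constraint: Optional[Tuple[str, str]]) -> Tuple[bool, Optional[str]]:
    """Return (is valid, value in the post-schema-validation infoset).

    present is the attribute's value, or None if the attribute is absent;
    constraint is None or a (string, "default" | "fixed") pair.
    """
    if constraint is None:
        return True, present
    string, kind = constraint
    if present is None:
        # Assumption: both default and fixed supply the constraint string.
        return True, string
    if kind == "fixed":
        # A present value must match the fixed constraint string.
        return present == string, present
    return True, present
```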
See Annotation Details (§3.12) for the significance of the {annotation} property.
NOTE: A more complete and formal presentation of the semantics of {name}, {target namespace} and default {value constraint} is provided in conjunction with other aspects of complex type validation (see Element Children and Attributes Valid (§3.4)).
[XML-Infoset] distinguishes namespace declarations such as xmlns or xmlns:xsl from
attributes. Accordingly, it is unnecessary and in fact not possible for
schemas to contain attribute declarations corresponding to such
namespace declarations; see xmlns Not Allowed (§5.1). No means is provided in
this specification to supply a
default value for a namespace declaration.
See XML Representation of Attribute Declaration Schema Components (§4.3.1) for the XML representation of attribute declarations and Attribute Declaration Constraints (§5.1) for constraints on attribute declaration components as such.
| 1.1.1 | The [namespace URI] is not absent and the [local name] and [namespace URI] resolve to an attribute declaration, as defined by QName resolution (Instance) (§3); |
| 1.1.2 | The item is schema-valid with respect to that declaration, as defined by Attribute Valid (§3.2) |
| 1.2 | The [namespace URI] is absent, or the [local name] and [namespace URI] do not resolve to an attribute declaration, as defined by QName resolution (Instance) (§3); |
| 1.1 | a [schema normalized value] property, whose value is the normalized value of the item as validated; |
| 1.2.1 | a single [type definition] property, containing an information item isomorphic to the attribute declaration's {simple type definition} component itself, that is, a Simple Type Definition information item with one property per property of the component, with the same name, and value either the same atomic value, or an information item corresponding in the same way to its component value, recursively, as necessary. |
| 1.2.2 | if the [type definition] has {variety} union, then additionally there is a [member type definition] property, containing an information item isomorphic to that member of the {member type definitions} which actually validated the attribute item's [normalized value]. |
| 1.3.1 | four properties as described in Element Validated by Type (§3.3), except that the {simple type definition} is used wherever the actual type definition is called for therein. |
| 1.3.2 | if the [type definition] has {variety} union, then there are three additional properties as described in the parallel case for Element Validated by Type (§3.3), where the actual member type definition is that member of the {member type definitions} which actually validated the attribute item's [normalized value]. |
See below under Element Validated by Type (§3.3) for a discussion of the alternatives given above.
Also, if the declaration has a {value constraint}, the item's [schema default] is set to the declaration's {value constraint} string.
Finally, if an attribute is laxly but not strictly valid, that is, Attribute Valid (§3.2) does not hold but Attribute Valid (Lax) (§3.2) does, the information described above under 1.2.1 or 1.3.1 is provided with respect to the simple ur-type definition.
If an attribute information item's schema-validity as defined by Attribute Valid (§3.2) has not been assessed, but its lax schema-validity as defined by Attribute Valid (Lax) (§3.2) has been assessed, in the post-schema validation infoset the item has a [validation attempted] property with the value partial.
If an attribute information item's schema-validity, as defined by either Attribute Valid (§3.2) or Attribute Valid (Lax) (§3.2), has been assessed, whether successfully or not, then in the post-schema validation infoset the item has a [validation context] property whose value is the lowest containing element information item with a [schema information] property.
Element declarations provide for:
The element declaration schema component has the following properties:
Schema Component: Element Declaration
- {name}
- An NCName as defined by [XML-Namespaces].
- {target namespace}
- Either absent or a namespace URI, as defined in [XML-Namespaces].
- {scope}
- Optional. Either global or a complex type definition.
- {type definition}
- Either a simple type definition or a complex type definition.
- {nullable}
- A boolean
- {value constraint}
- Optional. A pair consisting of a string and one of default, fixed.
- {identity-constraint definitions}
- A set of constraint definitions.
- {substitution group affiliation}
- Optional. A global element declaration.
- {substitution group exclusions}
- A subset of {extension, restriction}.
- {disallowed substitutions}
- A subset of {substitutionGroup, extension, restriction}.
- {abstract}
- A boolean
- {annotation}
- Optional. An annotation
The {name} property must match the local part of the names of element information items being validated.
A {scope} of global identifies element declarations available for use in content models throughout the schema. Locally scoped declarations are available for use only within the complex type identified by the {scope} property. This property is absent in the case of non-global declarations within named model groups: their scope will be determined when they are used in the construction of complex type definitions.
A non-absent value of the {target namespace} property provides for validation of namespace-qualified element information items. absent values of {target namespace} validate unqualified items.
An element information item is schema-valid if it obeys the schema validity constraints of the {type definition}. For such an item, the schema information set contributions from the {type definition} are applied to the corresponding element information item in the post-schema-validation information set.
If {nullable} is true, then an element is also
schema-valid if it
carries the namespace qualified attribute with [local name] null from namespace http://www.w3.org/2000/10/XMLSchema-instance and value true (see xsi:null (§2.6.2)) even if it has
no text or element content despite a {content type} which would
otherwise require content. Formal details of element validation are described in Element Valid (Explicit) (§3.3).
{value constraint} establishes a default or fixed value for an element. If default is specified, and if the element being validated is empty, then the supplied constraint string becomes the [schema normalized value] of the validated element in the post-schema-validation infoset. If fixed is specified, then the element's content must either be empty, in which case fixed behaves as default, or it must match the supplied constraint string.
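The rule just stated for elements can be sketched as a small function. The name and the tuple representation of the constraint are hypothetical, and matching is simplified to plain string equality.

```python
from typing import Optional, Tuple


def apply_element_constraint(
        content: str,
        constraint: Optional[Tuple[str, str]]) -> Tuple[bool, str]:
    """Return (is valid, [schema normalized value]) for an element.

    content is the element's character content ("" when empty);
    constraint is None or a (string, "default" | "fixed") pair.
    """
    if constraint is None:
        return True, content
    string, kind = constraint
    if content == "":
        # Empty element: default applies, and fixed behaves as default.
        return True, string
    if kind == "fixed":
        # Non-empty content must match the fixed constraint string.
        return content == string, content
    return True, content
```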
{identity-constraint definitions} express constraints establishing uniquenesses and reference relationships among the values of related elements and attributes. See Identity-constraint Definition Details (§3.10).
Element declarations are members of the substitution group, if any, identified by {substitution group affiliation}. Membership is transitive but not symmetric; an element declaration is implicitly a member of any group of which its {substitution group affiliation} is a member.
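Transitive but non-symmetric membership can be illustrated by following the chain of affiliations. In this sketch the mapping from an element declaration's name to the name of its {substitution group affiliation} is a hypothetical simplification, as are the element names used.

```python
from typing import Dict, List


def substitution_groups(name: str, affiliation: Dict[str, str]) -> List[str]:
    """Return the heads of every group `name` belongs to, nearest first."""
    heads: List[str] = []
    seen = {name}
    current = affiliation.get(name)
    # Follow the affiliation chain; the seen set guards against cycles.
    while current is not None and current not in seen:
        heads.append(current)
        seen.add(current)
        current = affiliation.get(current)
    return heads


affiliation = {
    "shipComment": "comment",
    "customerComment": "comment",
    "comment": "note",
}
groups = substitution_groups("shipComment", affiliation)
```

Here shipComment is a member of the group headed by comment and, transitively, of the group headed by note, whereas comment is not a member of any group headed by shipComment.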
An empty {substitution group exclusions} allows a declaration to be nominated as the {substitution group affiliation} of other element declarations having the same {type definition} or types derived therefrom. The explicit values of {substitution group exclusions} rule out element declarations having types which are extensions or restrictions respectively of {type definition}. If both values are specified, then the declaration may not be nominated as the {substitution group affiliation} of any other declaration.
The supplied values for {disallowed substitutions} determine whether an element declaration appearing in a content model will be prevented from additionally validating elements (a) with an xsi:type (§2.6.1) that identifies an extension or restriction of the type of the declared element, and/or (b) which are in the same substitution group as the declared element. If {disallowed substitutions} is empty, then all derived types and substitution group members are valid.
Element declarations for which {abstract} is true can appear in content models only when substitution is allowed; such declarations may not themselves ever be used to validate element content.
See XML Representation of Element Declaration Schema Components (§4.3.2) for the XML representation of element declarations and Element Declaration Constraints (§5.2) for constraints on element declaration components as such.
| 1.1 | If {nullable} is false there is no attribute information item among the element information item's [attributes] whose [namespace URI] is identical to http://www.w3.org/2000/10/XMLSchema-instance and whose [local name] is null; |
| 1.2 | If {nullable} is true and there is such an attribute information item and its normalized value is true, then |
If there is an attribute information item among the element information item's [attributes] whose [namespace URI] is identical to
http://www.w3.org/2000/10/XMLSchema-instance and whose [local name] is type, then
| 2.1 | The normalized value of that attribute information item is schema-valid with respect to the built-in QName simple type, as defined by String Valid (§3.13); |
| 2.2 | The local name and namespace URI (as defined in QName Interpretation (§4.2)), of the normalized value of that attribute information item resolve to a type definition, as defined in QName resolution (Instance) (§3) -- [Definition:] call this type definition the item type definition; |
| 2.3 | The item type definition is validly derived from the {type definition} given the {disallowed substitutions}, as defined in Type Derivation OK (Complex) (§5.11) (if it is a complex type definition), or given {list}, as defined in Type Derivation OK (Simple) (§5.12) (if it is a simple type definition). |
If the declaration has a {value constraint}, then provided clause 1.2 has not obtained
| 3.1 | If the element information item has no character information item [children] and the actual type definition is a local type definition, the {value constraint} string is schema-valid with respect to the actual type definition as defined by String Valid (§3.13) (if the actual type definition is a simple type definition) or else by its {content type} (if that is a simple type definition) or else (the actual type definition is a complex type definition whose {content type} is not a simple type definition) the string must be a valid default for the actual type definition as defined in Element Default Valid (Immediate) (§5.2); |
| 3.2 | If the {value constraint} is fixed, the element information item must have no element information item [children], and the string composed of the element information item's character information item [children] in order must be either empty or match the string of the {value constraint}; |
Otherwise (the element information item has character information item [children] or there is no {value constraint}) if the actual type definition is a simple type definition, then
| 4.1.1 | The element information item's [attributes] must be empty,
excepting those whose [namespace URI] is identical to http://www.w3.org/2000/10/XMLSchema-instance and whose [local name] is one of type, null, schemaLocation or noNamespaceSchemaLocation; |
| 4.1.2 | The element information item must have no element information item [children]; |
| 4.1.3 | the string composed of the [character code] of each of the element information item's character information item [children] in order must be schema-valid with respect to the actual type definition as defined by String Valid (§3.13) |
| 4.2.1 | The element information item must be schema-valid with respect to the actual type definition as per Element Children and Attributes Valid (§3.4); |
| 4.2.2 | The element information item must be schema-valid with respect to each of the {identity-constraint definitions} as per Identity-constraint Satisfied (§3.10). |
Ed. Note: Priority Feedback Request
The Working Group solicits feedback from implementors and users on the extent to which the xsi:null feature provides useful functionality and satisfactorily addresses requirements in the area of data interchange.
NOTE: The {name} and {target namespace} properties are not mentioned above because they are checked during particle validation, as per Element Sequence Valid (Particle) (§3.8).
| 1.1 | The [local name] and [namespace URI] resolve to an element declaration, as defined by QName resolution (Instance) (§3); |
| 1.2 | The item is schema-valid with respect to that declaration, as defined by Element Valid (Explicit) (§3.3) |
| 1.1 | The item is strictly schema-valid as defined by Element Valid (Strict) (§3.3) |
| 1.2.1 | The [local name] and [namespace URI] do not resolve to an element declaration, as defined by QName resolution (Instance) (§3); |
| 1.2.2 | All the element information item [children] and [attributes] of the item are laxly schema-valid, as defined by this constraint or Attribute Valid (Lax) (§3.2), respectively. |
| 1.1 | a [schema normalized value] property, whose value is the normalized value of the item as validated (unless Element Default Value (§3.3) above has obtained); |
| 1.2.1 | a single [type definition] property, containing an information item isomorphic to the type definition component itself, that is, a Complex Type Definition information item with one property per property of the component, with the same name, and value either the same atomic value, or an information item corresponding in the same way to its component value, recursively, as necessary. |
| 1.2.2 | if the type definition has a simple type definition {content type}, and that type definition has {variety} union, then additionally there is a [member type definition] property, containing an information item isomorphic to that member of the {member type definitions} which actually validated the element item's character information item content. |
| 1.3.1 | four properties as follows: |
| 1.3.2 | if the type definition has a simple type definition {content type}, and that type definition has {variety} union, then calling [Definition:] that member of the {member type definitions} which actually validated the element item's character information item content the actual member type definition, there are three additional properties: |
The first alternative above is provided for applications such as query processors which need access to the full range of details about how an item was validated, for example the type hierarchy; the second, for lighter-weight processors for which representing the significant parts of the type hierarchy as information items might be a significant burden.
Also, if the declaration has a {value constraint}, the item's [schema default] property is set to that {value constraint}'s string.
Finally, if an element is laxly but not strictly valid, that is, Element Valid (Explicit) (§3.3) and/or Element Valid (Strict) (§3.3) do not hold but Element Valid (Lax) (§3.3) does, the information described above under 1.2.1 or 1.3.1 is provided with respect to the ur-type definition.
| 1 | a single [element declaration] property, containing an information item isomorphic to the declaration component itself, that is, an Element Declaration item with one property per property of the component, with the same name, and value either the same atomic value, or an information item corresponding in the same way to its component value, recursively, as necessary. |
| 2 | a [null] property, with value true if clause 1.2 of Element Valid (Explicit) (§3.3) above obtains, otherwise false. |
| 1.1 | an element information item's schema-validity as defined by Element Valid (Explicit) (§3.3) has been assessed, whether successfully or not; |
| 1.2 | all its element information item children have the value full for their [validation attempted] property, |
If an element information item's schema-validity as defined by Element Valid (Explicit) (§3.3) has not been assessed, or has been but the above clause is not satisfied, but its lax schema-validity as defined by Element Valid (Lax) (§3.3) has been assessed, in the post-schema validation infoset the item has a [validation attempted] property with the value partial.
If an element information item's schema-validity, as defined by either Element Valid (Explicit) (§3.3) or Element Valid (Lax) (§3.3), has been assessed, whether successfully or not, then in the post-schema validation infoset the item has a [validation context] property whose value is the lowest containing element information item with a [schema information] property.
Complex Type Definitions provide for:
A complex type definition schema component has the following properties:
Schema Component: Complex Type Definition
- {name}
- Optional. An NCName as defined by [XML-Namespaces].
- {target namespace}
- Either absent or a namespace URI, as defined in [XML-Namespaces].
- {base type definition}
- Either a simple type definition or a complex type definition.
- {derivation method}
- Either extension or restriction.
- {final}
- A subset of {extension, restriction}.
- {abstract}
- A boolean
- {attribute declarations}
- A set of pairs of a boolean and an attribute declaration.
- {attribute wildcard}
- Optional. A wildcard.
- {content type}
- One of empty, a simple type definition or a pair consisting of a content model (i.e. a Particle (§2.2.3.2)) and one of mixed, element-only.
- {prohibited-substitutions}
- A subset of {extension, restriction}.
- {annotations}
- A set of annotations.
Complex type definitions are identified by their {name} and {target namespace}. Except for anonymous complex type definitions (those with no {name}), since type definitions (i.e. both simple and complex type definitions taken together) must be uniquely identified within an XML Schema, no complex type definition can have the same name as another simple or complex type definition. Complex type {name}s and {target namespace}s are provided for reference from instances (see xsi:type (§2.6.1)), and for use in the XML Representation of Schemas and Schema Components (§4) (specifically in element). See References to schema components across namespaces (§6.2.3) for the use of component identifiers when importing one schema into another.
NOTE: The {name} of a complex type is not ipso facto the [(local) name] of the element information items validated by that definition. The connection between a name and a type definition is described in Element Declaration Details (§3.3).
As described in Type Definition Hierarchy (§2.2.1.1), each complex type is derived from a {base type definition} which is itself either a Simple Type Definition (§2.2.1.2) or a Complex Type Definition (§2.2.1.3). {derivation method} specifies the means of derivation as either extension or restriction (see Type Definition Hierarchy (§2.2.1.1)).
A complex type with an empty specification for {final} can be used as a {base type definition} for other types derived by either extension or restriction; the explicit values extension and restriction prevent further derivations by extension and restriction respectively. If all values are specified, then the complex type is said to be [Definition:] final: no further derivations are possible.
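The role of {final} can be sketched with two small predicates. Representing {final} as a Python set of the strings extension and restriction, and the function names, are assumptions for illustration.

```python
from typing import AbstractSet

ALL_METHODS = frozenset({"extension", "restriction"})


def derivation_allowed(base_final: AbstractSet[str], method: str) -> bool:
    """True if a type may be derived from the base by the given method."""
    if method not in ALL_METHODS:
        raise ValueError(f"unknown derivation method: {method!r}")
    return method not in base_final


def is_final(base_final: AbstractSet[str]) -> bool:
    """True if every derivation method is ruled out for the base type."""
    return ALL_METHODS <= set(base_final)
```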
A complex type for which {abstract} is true must not appear as the {type definition} of an Element Declaration (§2.2.2.1), and must not be referenced from an xsi:type (§2.6.1) attribute in an instance document; such abstract complex types can be used as {base type definition}s, but they are never used directly to validate element content.
{attribute declarations} are a set of [Definition:] attribute use pairs: each is a pair of a boolean and an individual Attribute Declaration (§2.2.2.3) to be used for schema-validating the [attributes] of element information items, where the boolean determines whether the attribute is required or not. See Element Children and Attributes Valid (§3.4) and Attribute Valid (§3.2) for details of attribute validation.
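The required/optional distinction carried by attribute use pairs can be illustrated as follows. Attribute declarations are reduced here to (namespace URI, local name) tuples, which is a hypothetical simplification, and only the presence check is shown; validation of values is defined by Attribute Valid (§3.2).

```python
from typing import Iterable, List, Optional, Set, Tuple

AttrName = Tuple[Optional[str], str]   # (namespace URI or None, local name)
AttributeUse = Tuple[bool, AttrName]   # (required?, attribute declaration)


def missing_required(attribute_uses: Iterable[AttributeUse],
                     present: Set[AttrName]) -> List[AttrName]:
    """Return the required attributes that the element does not carry."""
    return [decl for required, decl in attribute_uses
            if required and decl not in present]


uses = [
    (True, (None, "partNum")),     # required
    (False, (None, "priority")),   # optional
]
```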
{attribute wildcard}s provide a more flexible specification for validation of attributes not explicitly included in {attribute declarations}. Informally, the specific values of {attribute wildcard} are interpreted as follows: