Copyright ©1999, 2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML Schema: Structures specifies the XML Schema definition language, which offers facilities for describing the structure and constraining the contents of XML 1.0 documents, including those which exploit the XML Namespace facility. The schema language, which is itself represented in XML 1.0 and uses namespaces, substantially reconstructs and considerably extends the capabilities found in XML 1.0 document type definitions (DTDs). This specification depends on XML Schema Part 2: Datatypes.
This specification of the XML Schema language is a Candidate Recommendation of the World Wide Web Consortium. This means that the XML Schema Working Group considers the specification to be stable and encourages implementation and comment on the specification during this period. The Candidate Recommendation review period ends on 15 December 2000. Please send review comments before the review period ends to www-xml-schema-comments@w3.org (public mailing list archive). Readers may find Description of changes (non-normative) (§J) helpful in identifying the major changes since the Last Call Public Working Draft.
During the Candidate Recommendation phase, although feedback based on any aspect of implementation experience is welcome, there are certain aspects of the design presented herein for which the Working Group is particularly interested in feedback. These are designated priority feedback aspects of the design, and identified as such in editorial notes throughout this draft.
Should this specification prove very difficult or impossible to implement, the Working Group will return the document to Working Draft status and make necessary changes. Otherwise, the Working Group anticipates asking the W3C Director to advance this document to Proposed Recommendation.
This document has been produced as part of the W3C XML Activity. The authors of this document are the XML Schema WG members. Different parts of this specification have different editors.
A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.
Chapter 2 presents a Conceptual Framework (§2) for XML Schemas, including an introduction to the nature of XML Schemas and an introduction to the XML Schema abstract data model, along with other terminology used throughout this document.
Chapter 3, Schema Component Details (§3), specifies the precise semantics of each component of the abstract model.
Chapter 4, XML Representation of Schemas and Schema Components (§4), describes how to represent schemas as one or more XML documents, with reference to a DTD and XML Schema for an XML Schema document type, along with a detailed mapping between the elements and attribute vocabulary of this representation and the components and properties of the abstract model.
Chapter 5 presents Schema Component Validity Constraints (§5), which provide detailed constraints on the internal structure of each component of the abstract model.
Chapter 6 presents Schema Access and Composition (§6), including the connection between documents and schemas, the import, inclusion and redefinition of declarations and definitions and the foundations of schema-validity assessment.
Chapter 7 discusses Schemas and schema-validity assessment (§7), including the overall approach to schema-validity assessment of documents, and responsibilities of schema-aware processors.
The normative appendices include a Schema for Schemas (normative) (§A) for the XML representation of schemas and References (normative) (§B).
The non-normative appendices include the DTD for Schemas (non-normative) (§F) and a Glossary (non-normative) (§E).
This document is primarily intended as a language definition reference. As such, although it contains a few examples, it is not designed primarily to serve as a motivating introduction to the design and its features, but rather as a careful and fully explicit definition of that design, suitable for guiding implementations. For those in search of a step-by-step introduction to the design, the non-normative [XML Schema: Primer] is a much better starting point than this document.
The purpose of XML Schema: Structures is to define the nature of XML schemas and their component parts, provide an inventory of XML markup constructs with which to represent schemas, and define the application of schemas to XML documents.
The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using schema components to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content and attributes and their values. Schemas may also provide for the specification of additional document information, such as normalization and defaulting of attribute and element values. Schemas have facilities for self-documentation. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.
Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism allows a useful level of constraint checking to be described and implemented for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations.
The definition of XML Schema: Structures depends on the following specifications: [XML-Infoset], [XML-Namespaces], [XPath], and [XML Schemas: Datatypes].
Before this specification is finally completed, we will need to account for any changes [XML Base] makes to the Infoset in the areas of QName interpretation and value space and the interpretation of all aspects of schemas involving values identified as being of type uriReference, including in particular xsi:schemaLocation, xsi:noNamespaceSchemaLocation and targetNamespace. See [XML Schemas: Datatypes] for the details of the uriReference type and all uses of URI references in this specification.
See Required Information Set Items and Properties (normative) (§D) for a tabulation of the information items and properties specified in [XML-Infoset] which this specification requires as a precondition to schema-aware processing.
The following highlighting and typography is used to present technical material in this document:
Special terms are defined at their point of introduction in the text; hyperlinks connect other uses of the term to the definition. For example, a definition of term might read: [Definition:] A term is something we use a lot. The definition is labeled as such and the term is highlighted typographically. The end of the definition is not specially marked in the displayed or printed text.
Non-normative examples are set off typographically and accompanied by a brief explanation:
Example
<schema targetNamespace="http://www.example.com/XMLSchema/1.0/mySchema">
And an explanation of the example.
References to properties of information items as defined in [XML-Infoset] are notated as links to the relevant section thereof, set off with square brackets, for example [children].
The definition of each kind of schema component consists of a list of its properties and their contents, followed by descriptions of the semantics of the properties:
Schema Component: Example
- {example property}
- Definition of the property.
References to properties of schema components are notated as links to the relevant definition as exemplified above, set off with curly braces, for instance {example property}.
The correspondence between an element information item which is part of the XML representation of a schema and one or more schema components is presented in a tableau which illustrates the element information item(s) involved, followed by a tabulation of the correspondence between properties of the component and properties of the information item. Where context may determine which of several different components may arise, several tabulations, one per context, are given. In the XML representation, bold-face attribute names (e.g. count below) indicate a required attribute information item, and the rest are optional. Where an attribute information item has an enumerated type definition, the values are shown separated by vertical bars, as for size below; if there is a default value, it is shown following a colon. The allowed content of the information item is shown as a grammar fragment, using the Kleene operators ?, * and +. The property correspondences are normative, as are the illustrations of the XML representation element information items.
NOTE: The illustrations are derived automatically from the Schema for Schemas (normative) (§A). In the case of apparent conflict, the Schema for Schemas (normative) (§A) takes precedence, as it, together with the Schema Representation Constraints, provides the normative statement of the form of XML representations.
XML Representation Summary: example Element Information Item
<example
  count = integer
  size = (small | medium | large) : medium>
  Content: (all | any*)
</example>
Example Schema Component
Property: {example property}
Representation: Description of what the property corresponds to, e.g. the value of the size [attribute]
The following highlighting is used for non-normative commentary in this document:
Ed. Note: Priority Feedback Request
Identification of priority feedback aspects of this draft.
NOTE: General comments directed to all readers.
This chapter gives an overview of XML Schema: Structures at the level of its abstract data model. (Schema Component Details (§3) provides details on this model, and subsequent chapters define a normative representation in XML for the components of the model.) Readers interested primarily in learning to write schema documents may wish to first read [XML Schema: Primer] and then consult XML Representation of Schemas and Schema Components (§4), using the sections below as a guide to the underlying formal structure of the schema language.
An XML Schema consists of components such as type definitions and element declarations. These can be used to assess the validity of well-formed element information items (as defined in [XML-Infoset]), and furthermore may specify augmentations to those items and their descendants. This augmentation makes explicit information which may have been implicit in the original document, such as normalized and/or default values for attributes and elements and the types of element and attribute information items.
Schema-validity assessment has two aspects:
Throughout this specification, [Definition:] the word valid and its derivatives are used to refer to item 1 above, the determination of local schema-validity.
Throughout this specification, [Definition:] the word assessment is used to refer to the whole rich process of local validation, schema-validity assessment and infoset augmentation.
This specification builds on [XML] and [XML-Namespaces]. The concepts and definitions used herein regarding XML are framed at the abstract level of information items as defined in [XML-Infoset]. By definition, this use of the infoset provides a priori guarantees of well-formedness (as defined in [XML]) and namespace conformance (as defined in [XML-Namespaces]) for all candidates for assessment and for all schema documents.
Just as [XML] and [XML-Namespaces] can be described in terms of information items, XML Schemas can be described in terms of an abstract data model. In defining XML Schemas in terms of an abstract data model, this specification rigorously specifies the information which must be available to a conforming XML Schema processor. The abstract model for schemas is conceptual only, and does not mandate any particular implementation or representation of this information. To facilitate interoperation and sharing of schema information, a normative interchange format for schemas is described in XML Representation of Schemas and Schema Components (§4).
NOTE: We have not so far seen any need to reconstruct the XML 1.0 notion of root. For the connection from document instances to schemas, see Layer 3: Web-interoperability (§6.3) and Schemas and schema-validity assessment (§7).
[Definition:] Schema component is the generic term for the building blocks that comprise the abstract data model of the schema. [Definition:] An XML Schema is a set of schema components. There are 12 kinds of component in all, falling into three groups. The primary components are as follows. They may have names, and (except for some element declarations) may be independently accessed:
The secondary components are as follows. Like the primary components, they may have names and be independently accessed:
Finally, the "helper" components provide small parts of other components; they are not independent of their context and cannot be independently accessed:
During validation, [Definition:] declaration components are associated by (qualified) name to information items being validated.
On the other hand, [Definition:] definition components define internal schema components that can be used in other schema components.
[Definition:] Declarations and definitions may have and be identified by names, which are NCNames as defined by [XML-Namespaces].
[Definition:] Several kinds of component have a target namespace, which is either absent or a namespace URI, also as defined by [XML-Namespaces]. The target namespace serves to identify the namespace within which the association between the component and its name exists. In the case of declarations, this in turn determines the namespace URI of, for example, the element information items it may validate.
NOTE: At the abstract level, there is no requirement that the components of a schema share a target namespace. Any schema for use in assessment of documents containing names from more than one namespace will of necessity include components with different target namespaces. This contrasts with the situation at the level of the XML Representation of Schemas and Schema Components (§4), in which each schema document contributes definitions and declarations to a single target namespace.
Validation, defined in detail in Schema Component Details (§3), is a relation between information items and schema components. For example, an attribute information item may validate with respect to an attribute declaration, a list of element information items may validate with respect to a content model, and so on. The following sections briefly introduce the kinds of components in the schema abstract data model, other major features of the abstract model, and how they contribute to validation.
The abstract model provides two kinds of type definition component: simple and complex.
[Definition:] This specification uses the phrase type definition in cases where no distinction need be made between simple and complex types.
Type definitions form a hierarchy with a single root. First we describe characteristics of that hierarchy, then provide an introduction to simple and complex type definitions themselves.
[Definition:] Except for a distinguished ur-type definition, every type definition is, by construction, either a restriction or an extension of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.
[Definition:] A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction. The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.
[Definition:] A complex type definition which allows element or attribute content in addition to that allowed by another specified type definition is said to be an extension.
[Definition:] A distinguished ur-type definition is present in each XML Schema, serving as the root of the type definition hierarchy for that schema. The ur-type definition, whose name is anyType, has the unique characteristic that it can function as a complex or a simple type definition, according to context. Specifically, restrictions of the ur-type definition can themselves be either simple or complex type definitions.
[Definition:] A type definition used as the basis for an extension or restriction is known as the base type definition of that definition.
A simple type definition is a set of constraints on strings and information about the values they encode, applicable to the normalized value of an attribute information item or of an element information item with no element children. Informally, it applies to attribute values and text-only content of elements.
Each simple type definition, whether built-in (that is, defined in [XML Schemas: Datatypes]) or user-defined, is a restriction of some particular simple base type definition. For the built-in primitive types, this is the simple version of the ur-type definition, whose name is anySimpleType, which is in turn understood to be a restriction of the ur-type definition. Simple types may also be defined whose members are lists of items themselves constrained by some other simple type definition, or whose membership is the union of the memberships of some other simple type definitions. List and union simple type definitions are also understood as restrictions of the simple ur-type definition.
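The three ways of defining a simple type described above (restriction, list and union) might be written as follows. This is a minimal sketch only: it assumes the element vocabulary and namespace of the final 2001 Recommendation (http://www.w3.org/2001/XMLSchema), which may differ in detail from this draft, and the type names percentage, percentageList, unknownToken and percentageOrUnknown are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: narrows the range of the built-in integer type -->
  <xs:simpleType name="percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: members are whitespace-separated sequences of percentage values -->
  <xs:simpleType name="percentageList">
    <xs:list itemType="percentage"/>
  </xs:simpleType>

  <!-- Restriction of token to a single allowed value -->
  <xs:simpleType name="unknownToken">
    <xs:restriction base="xs:token">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: membership is the union of the two member types -->
  <xs:simpleType name="percentageOrUnknown">
    <xs:union memberTypes="percentage unknownToken"/>
  </xs:simpleType>

</xs:schema>
```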
For details on the composition of simple type definitions and the validation semantics associated with them, see (non-normative) Simple Type Definition Details (§3.13) and [XML Schemas: Datatypes]. The latter also defines an extensive inventory of pre-defined simple types. See (non-normative) XML Representation of Simple Type Definition Schema Components (§4.3.11) for the XML representation of simple type definitions, and Simple Type Definition Constraints (§5.12) for constraints on simple type definition components as such.
A complex type definition is a set of attribute declarations and a content type, applicable to the [attributes] and [children] of an element information item respectively. The content type may require the [children] to contain neither element nor character information items, to be a string which belongs to a particular simple type or to contain a sequence of element information items which conforms to a particular model group, with or without character information items as well.
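Each complex type definition is either a restriction of a complex base type definition, or an extension of a simple or complex base type definition, or a restriction of the ur-type definition.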
A complex type which extends another does so by having additional content model particles at the end of the other definition's content model, or by having additional attribute declarations, or both.
NOTE: This specification allows only appending, and not other kinds of extensions. This decision simplifies application processing required to cast instances from derived to base type. Future versions may allow more kinds of extension, requiring more complex transformations to effect casting.
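Extension by appending could look like the sketch below, in which the derived type adds one particle after the base content model and one attribute declaration. The syntax is assumed from the final 2001 Recommendation and may not match this draft exactly; the names address and postalAddress and their children are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: a sequence of two elements -->
  <xs:complexType name="address">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: the effective content is street, city, then postcode,
       and the country attribute is added to the attribute declarations -->
  <xs:complexType name="postalAddress">
    <xs:complexContent>
      <xs:extension base="address">
        <xs:sequence>
          <xs:element name="postcode" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="country" type="xs:string"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Because the added particle comes last, an instance of postalAddress begins with content that is valid for address, which is what makes the cast to the base type simple.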
See Complex Type Definition Details (§3.4) for the composition and validation semantics of complex type definition schema components, XML Representation of Complex Type Definition Schema Components (§4.3.3) for the XML representation of complex type definitions and Complex Type Definition Constraints (§5.11) for constraints on complex type definition components as such.
There are three kinds of declaration component: element, attribute, and notation. Each is described in a section below. Also included is a discussion of element substitution groups, which is a feature provided in conjunction with element declarations.
An element declaration is an association of a name with a type definition, either simple or complex, an (optional) default value and a (possibly empty) set of identity-constraint definitions. The association is either global or scoped to a containing complex type definition. A global element declaration with name 'A' is broadly comparable to a pair of DTD declarations as follows, where the associated type definition fills in the ellipses:
<!ELEMENT A . . .>
<!ATTLIST A . . .>
Element declarations contribute to validation as part of model group validation, when their defaults and type components are checked against an element information item with a matching name and namespace, and by triggering identity-constraint definition validation.
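As a rough illustration of the comparison with DTD declarations, a global declaration for A with a nested local declaration might be written as below. The sketch assumes the syntax of the final 2001 Recommendation; the child element status and the attribute id are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global element declaration: associates the name A with an
       anonymous complex type definition -->
  <xs:element name="A">
    <xs:complexType>
      <!-- Plays the role of the content model in <!ELEMENT A ...> -->
      <xs:sequence>
        <!-- Local declaration, scoped to the containing complex type,
             with a default value -->
        <xs:element name="status" type="xs:string" default="draft"/>
      </xs:sequence>
      <!-- Plays the role of <!ATTLIST A ...> -->
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```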
See Element Declaration Details (§3.3) for the composition and validation semantics of element declaration schema components, XML Representation of Element Declaration Schema Components (§4.3.2) for the XML representation of element declarations and Element Declaration Constraints (§5.2) for constraints on element declaration components as such.
In XML 1.0, the name and content of an element must correspond exactly to the element type referenced in the corresponding content model.
[Definition:] Through the new mechanism of element substitution groups, XML Schemas provides a more powerful model supporting substitution of one named element for another. Any global element declaration can serve as the defining element, or head, for an element substitution group. Other global element declarations, regardless of target namespace, can be designated as members of the substitution group headed by this element. In a suitably enabled content model, a reference to the head validates not just the head itself, but elements corresponding to any member of the substitution group as well.
All such members must have type definitions which are either the same as the head's type definition or restrictions or extensions of it. Therefore, although the names of elements can vary widely as new namespaces and members of the substitution group are defined, the content of member elements is strictly limited according to the type definition of the substitution group head.
Note that element substitution groups are not represented as separate components. They are specified in the property values for element declarations (see Element Declaration (§2.2.2.1)).
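A substitution group might be declared as in the following sketch, where a reference to the head also admits the member. The attribute name substitutionGroup and the namespace are assumed from the final 2001 Recommendation; the element names comment, shipComment and order are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Head of the substitution group -->
  <xs:element name="comment" type="xs:string"/>

  <!-- Member: same type definition as the head -->
  <xs:element name="shipComment" type="xs:string"
              substitutionGroup="comment"/>

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- The reference to the head accepts comment or shipComment -->
        <xs:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that membership is recorded on the member's own declaration, consistent with substitution groups not being separate components.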
An attribute declaration is an association between a name and a simple type definition, together with occurrence information and (optionally) a default value. The association is either global, or local to its containing complex type definition. Attribute declarations contribute to validation as part of complex type definition validation, when their occurrence, defaults and type components are checked against an attribute information item with a matching name and namespace.
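The sketch below shows one global and one local attribute declaration, with occurrence information and a default. It assumes the syntax of the final 2001 Recommendation; the names lang, align and para are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global attribute declaration -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="para">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Use of the global declaration; occurrence: required -->
          <xs:attribute ref="lang" use="required"/>
          <!-- Local declaration; optional, with a default value -->
          <xs:attribute name="align" type="xs:token" default="left"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```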
See Attribute Declaration Details (§3.2) for the composition and validation semantics of attribute declaration schema components, XML Representation of Attribute Declaration Schema Components (§4.3.1) for the XML representation of attribute declarations and Attribute Declaration Constraints (§5.1) for constraints on attribute declaration components as such.
A notation declaration is an association between a name and an identifier for a notation. For an attribute information item to be valid with respect to a NOTATION simple type definition, its value must have been declared with a notation declaration.
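For illustration, notation declarations and an attribute constrained to them might look like this. The sketch assumes the syntax of the final 2001 Recommendation, and the notation names, public identifiers and the picture element are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Notation declarations: a name plus an identifier for the notation -->
  <xs:notation name="jpeg" public="image/jpeg"/>
  <xs:notation name="png" public="image/png"/>

  <!-- Simple type restricting NOTATION to the declared names -->
  <xs:simpleType name="imageFormat">
    <xs:restriction base="xs:NOTATION">
      <xs:enumeration value="jpeg"/>
      <xs:enumeration value="png"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="picture">
    <xs:complexType>
      <!-- The value must name one of the declared notations -->
      <xs:attribute name="format" type="imageFormat" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```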
See Notation Declaration Details (§3.11) for the composition and validation semantics of notation declaration schema components, XML Representation of Notation Declaration Schema Components (§4.3.9) for the XML representation of notation declarations and Notation Declaration Constraints (§5.8) for constraints on notation declaration components as such.
The model group, particle, and wildcard components contribute to the portion of a complex type definition that controls an element information item's content type.
A model group is a constraint in the form of a grammar fragment that applies to lists of element information items. It consists of a list of particles, i.e. element declarations, wildcards and model groups. There are three varieties of model group:
See Model Group Details (§3.7) for the composition and validation semantics of model group schema components, Complex Type Definition Details (§3.4) for the use of model groups as content models, XML Representation of Model Group Schema Components (§4.3.6) for the XML representation of model groups and Model Group Constraints (§5.7) for constraints on model group components as such.
A particle is a term in the grammar for element content, consisting of either an element declaration, a wildcard or a model group, together with occurrence constraints. Particles contribute to validation as part of complex type definition validation, when they allow anywhere from zero to many element information items or sequences thereof, depending on their contents and occurrence constraints.
[Definition:] A particle can be used in a complex type definition to constrain the validation of the [children] of an element information item; such a particle is called a content model.
NOTE: XML Schema: Structures content models are similar to but more expressive than [XML] content models; unlike [XML], XML Schema: Structures applies content models to the validation of both mixed and element-only content.
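A content model built from nested particles could be sketched as follows: the outer sequence is a model group, and each of its members is a particle carrying its own occurrence constraints. The syntax is assumed from the final 2001 Recommendation; the element names section, title, para and note are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="section">
    <!-- mixed="true" allows character data between the child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <!-- Particle: element declaration, exactly one occurrence
             (minOccurs and maxOccurs both default to 1) -->
        <xs:element name="title" type="xs:string"/>
        <!-- Particle: nested model group, zero or more occurrences -->
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="para" type="xs:string"/>
          <xs:element name="note" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```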
See Particle Details (§3.8) for the composition and validation semantics of particle schema components, XML Representation of Model Group Schema Components (§4.3.6) for the XML representation of particles and Particle Constraints (§5.10) for constraints on particle components as such.
A wildcard is a special kind of particle which matches element and attribute information items dependent on their namespace URI, independently of their local names.
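Element and attribute wildcards might be written as in this sketch, which selects items by namespace only. The namespace and processContents attributes are assumed from the final 2001 Recommendation; the target namespace URI and the envelope element are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/doc">

  <xs:element name="envelope">
    <xs:complexType>
      <xs:sequence>
        <!-- Matches any element whose namespace differs from the
             target namespace, whatever its local name -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Matches attributes from any namespace without validating them -->
      <xs:anyAttribute namespace="##any" processContents="skip"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```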
See Wildcard Details (§3.9) for the composition and validation semantics of wildcard schema components, XML Representation of Wildcard Schema Components (§4.3.7) for the XML representation of wildcards and Wildcard Constraints (§5.5) for constraints on wildcard components as such.
An identity-constraint definition is an association between a name and one of several varieties of identity-constraint related to uniqueness and reference. All the varieties use [XPath] expressions to pick out sets of information items relative to particular target element information items which are unique, or a key, or a valid reference, within a specified scope. An element information item is only valid with respect to an element declaration with identity-constraint definitions if those definitions are all satisfied for all the descendants of that element information item which they pick out.
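A key and a reference to it might be expressed as below, with the selector picking out target elements and the field picking out the value that must be unique or must match. The sketch assumes the syntax of the final 2001 Recommendation; all element, attribute and constraint names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="product" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="sku" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="orderLine" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="skuRef" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Key: every product under this catalog has a distinct sku -->
    <xs:key name="productKey">
      <xs:selector xpath="product"/>
      <xs:field xpath="@sku"/>
    </xs:key>

    <!-- Reference: every orderLine skuRef must match some product sku -->
    <xs:keyref name="orderLineRef" refer="productKey">
      <xs:selector xpath="orderLine"/>
      <xs:field xpath="@skuRef"/>
    </xs:keyref>
  </xs:element>

</xs:schema>
```

The scope of both constraints is the catalog element on which they are declared.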
See Identity-constraint Definition Details (§3.10) for the composition and validation semantics of identity-constraint definition schema components, XML Representation of Identity-constraint Definition Schema Components (§4.3.8) for the XML representation of identity-constraint definitions and Identity-constraint Definition Constraints (§5.3) for constraints on identity-constraint definition components as such.
There are two kinds of convenience definitions provided to enable the re-use of pieces of complex type definitions: model group definitions and attribute group definitions.
A model group definition is an association between a name and a model group, enabling re-use of the same model group in several complex type definitions.
See Model Group Definition Details (§3.6) for the composition and validation semantics of model group definition schema components, XML Representation of Model Group Definition Schema Components (§4.3.5) for the XML representation of model group definitions and Model Group Definition Constraints (§5.6) for constraints on model group definition components as such.
An attribute group definition is an association between a name and a set of attribute declarations, enabling re-use of the same set in several complex type definitions.
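Re-use through both kinds of convenience definition might look like the following sketch, in which two complex types share one named model group and one named attribute group. The syntax is assumed from the final 2001 Recommendation; every name shown is hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Model group definition: a name for a re-usable model group -->
  <xs:group name="nameParts">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <!-- Attribute group definition: a name for a set of attribute declarations -->
  <xs:attributeGroup name="auditAttributes">
    <xs:attribute name="createdBy" type="xs:string"/>
    <xs:attribute name="createdOn" type="xs:date"/>
  </xs:attributeGroup>

  <!-- First use: the group is the whole content model -->
  <xs:complexType name="author">
    <xs:group ref="nameParts"/>
    <xs:attributeGroup ref="auditAttributes"/>
  </xs:complexType>

  <!-- Second use: the group is one particle in a larger sequence -->
  <xs:complexType name="reviewer">
    <xs:sequence>
      <xs:group ref="nameParts"/>
      <xs:element name="affiliation" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attributeGroup ref="auditAttributes"/>
  </xs:complexType>

</xs:schema>
```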
See Attribute Group Definition Details (§3.5) for the composition and validation semantics of attribute group definition schema components, XML Representation of Attribute Group Definition Schema Components (§4.3.4) for the XML representation of attribute group definitions and Attribute Group Definition Constraints (§5.4) for constraints on attribute group definition components as such.
An annotation is information for human and/or mechanical consumers. The interpretation of such information is not defined in this specification.
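An annotation carrying both human-readable and machine-oriented information might be written as below. The documentation and appinfo elements are assumed from the final 2001 Recommendation; the price element, the hint element and its namespace are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="price" type="xs:decimal">
    <xs:annotation>
      <!-- For human readers -->
      <xs:documentation xml:lang="en">Unit price, excluding tax.</xs:documentation>
      <!-- For tools; its interpretation is outside the schema language -->
      <xs:appinfo>
        <hint xmlns="http://www.example.com/hypothetical-tool">right-align</hint>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

</xs:schema>
```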
See Annotation Details (§3.12) for the composition and validation semantics of annotation schema components, XML Representation of Annotation Schema Components (§4.3.10) for the XML representation of annotations and Annotation Constraints (§5.9) for constraints on annotation components as such.
The [XML] specification describes two kinds of constraints on XML documents: well-formedness and validity constraints. Informally, the well-formedness constraints are those imposed by the definition of XML itself (such as the rules for the use of the < and > characters and the rules for proper nesting of elements), while validity constraints are the further constraints on document structure provided by a particular DTD.
The preceding section focussed on validation, that is, the constraints on information items which schema components supply. In fact, however, this specification provides four different kinds of normative statements about schema components, their representations in XML and their contribution to the validation of information items:
The definition of the above constraints sometimes involves many clauses, some as alternatives, some as joint requirements. The presentations below number all clauses: clauses at the same level are either clearly identified as alternatives with words such as either and or, or should be understood as joint.
Schema information set contributions are not new. XML 1.0 validation augments the XML 1.0 information set in similar ways, for example by providing values for attributes not present in instances, and by implicitly exploiting type information for normalization or access. (As an example of the latter case, consider the effect of NMTOKENS on attribute white space, and the semantics of ID and IDREF.) By including schema information set contributions, this specification makes explicit some features that XML 1.0 left implicit.
This specification describes three levels of conformance for schema aware processors. The first is required of all processors. Support for the other two will depend on the application environments for which the processor is intended.
[Definition:] Minimally conforming processors must completely and correctly implement the Constraints on Schemas, Validity Rules, and Schema Information Set Contributions contained in this specification.
[Definition:] Processors which accept schemas in the form of XML documents as described in XML Representation of Schemas and Schema Components (§4) are additionally said to provide conformance to the XML Representation of Schemas. Such processors must, when processing schema documents, completely and correctly implement all Schema Representation Constraints in this specification, and must adhere exactly to the specifications in XML Representation of Schemas and Schema Components (§4) for mapping the contents of such documents to schema components for use in validation and assessment.
NOTE: By separating the conformance requirements relating to the concrete syntax of XML schema documents, this specification admits processors which use schemas stored in optimized binary representations, dynamically created schemas represented as programming language data structures, or implementations in which particular schemas are compiled into executable code such as C or Java. Such processors can be said to be minimally conforming but not necessarily in conformance to the XML Representation of Schemas.
[Definition:] Fully conforming processors are network-enabled processors which support both levels of conformance described above, and which must additionally be capable of accessing schema documents from the World Wide Web according to Representation of Schemas on the World Wide Web (§2.7) and How schema definitions are located on the Web (§6.3.2).
NOTE: Although this specification provides just these three standard levels of conformance, it is anticipated that other conventions can be established in the future. For example, the World Wide Web Consortium is considering conventions for packaging on the Web a variety of resources relating to individual documents and namespaces. Should such developments lead to new conventions for representing schemas, or for accessing them on the Web, new levels of conformance can be established and named at that time. There is no need to modify or republish this specification to define such additional levels of conformance.
See Schema Access and Composition (§6) for a more detailed explanation of the mechanisms supporting these levels of conformance.
As discussed in XML Schema Abstract Data Model (§2.2), most schema components (may) have names. If all such names were assigned from the same "pool", then it would be impossible to have, for example, a simple type definition and an element declaration both with the name "title" in a given target namespace.
This specification therefore introduces the term [Definition:] symbol space to denote a collection of names, each of which is unique with respect to the others. A symbol space is similar to the non-normative concept of namespace partition introduced in [XML-Namespaces]. There is a single distinct symbol space within a given target namespace for each kind of definition and declaration component identified in XML Schema Abstract Data Model (§2.2), except that within a target namespace, simple type definitions and complex type definitions share a symbol space. Within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. For example, the same name can appear in both a type definition and an element declaration, without conflict or necessary relation between the two.
Locally scoped attribute and element declarations are special with regard to symbol spaces. Every complex type definition defines its own local attribute and element declaration symbol spaces, where these symbol spaces are distinct from each other and from any of the other symbol spaces. So, for example, two complex type definitions having the same target namespace can contain a local attribute declaration for the unqualified name "priority", or contain a local element declaration for the name "address", without conflict or necessary relation between the two.
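The rules for top-level names can be illustrated with a minimal sketch. It assumes one symbol space per kind of component within each target namespace, with simple and complex type definitions sharing a space, as described above. The class name SymbolSpaces, the kind labels and the example.com namespace are hypothetical and chosen only for illustration; local declarations scoped to a complex type are not modelled.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical mapping from component kind to symbol space: simple and
# complex type definitions share one space; every other kind has its own.
SYMBOL_SPACE_OF_KIND = {
    "simple type definition": "type",
    "complex type definition": "type",
    "attribute declaration": "attribute",
    "element declaration": "element",
    "attribute group definition": "attribute group",
    "model group definition": "model group",
    "notation declaration": "notation",
}


class SymbolSpaces:
    """Tracks the names in use in each (target namespace, symbol space) pair."""

    def __init__(self) -> None:
        self._names: Dict[Tuple[Optional[str], str], Set[str]] = {}

    def declare(self, target_namespace: Optional[str], kind: str, name: str) -> None:
        """Record a name, rejecting a duplicate within the same symbol space."""
        space = SYMBOL_SPACE_OF_KIND[kind]
        names = self._names.setdefault((target_namespace, space), set())
        if name in names:
            raise ValueError(f"duplicate name {name!r} in the {space} symbol space")
        names.add(name)


spaces = SymbolSpaces()
ns = "http://example.com/ns"  # hypothetical target namespace
spaces.declare(ns, "simple type definition", "title")
spaces.declare(ns, "element declaration", "title")  # different space: no conflict
```

Declaring a complex type definition named "title" in the same namespace would raise an error in this sketch, because it would land in the symbol space already occupied by the simple type definition.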
The XML representation of schema components uses a vocabulary
identified by the namespace URI http://www.w3.org/2000/10/XMLSchema. For brevity, the text and examples in this specification use the prefix
xs: to stand for this namespace; in practice,
any prefix can be used.
XML Schema: Structures also defines several attributes for direct use in any XML documents. These attributes are in a different namespace,
which has the namespace URI http://www.w3.org/2000/10/XMLSchema-instance.
For brevity, the text and examples in this specification use the prefix
xsi: to stand for this latter namespace; in practice,
any prefix can be used.
The Simple Type Definition (§2.2.1.2) or Complex Type Definition (§2.2.1.3) used in validation of an element is usually
determined by reference to the appropriate schema components.
An element information item in an instance may, however,
explicitly assert its type using the attribute xsi:type.
The value of this attribute is a QName; see QName Interpretation (§4.2) for
the means by which the QName is
associated with a type definition.
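As an illustration only, the first step in handling an xsi:type value can be sketched as splitting the QName and mapping its prefix to a namespace URI. The function name expand_qname and the dictionary of in-scope prefixes are hypothetical. The sketch assumes that an unprefixed QName takes the default namespace when one is declared; QName Interpretation (§4.2) is the authority on that point.

```python
from typing import Dict, Optional, Tuple


def expand_qname(qname: str, in_scope: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Return (namespace URI or None, local name) for a QName string.

    `in_scope` maps prefixes to namespace URIs; the empty-string key holds
    the default namespace, if one is declared.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # Unprefixed: assumed to use the default namespace, else no namespace.
        return in_scope.get(""), qname
    if prefix not in in_scope:
        raise ValueError(f"undeclared prefix {prefix!r}")
    return in_scope[prefix], local
```

The resulting pair is what would then be looked up among the type definitions of the schema in use.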
XML Schema: Structures introduces a mechanism for signalling that an element's content is
missing, or "null" in the terminology of databases. An
element has null content if it has the attribute xsi:null with
the value true. An element so labelled must be empty, but can
carry attributes if permitted by the corresponding complex type.
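A minimal sketch of detecting null content in an instance follows. It uses Python's xml.etree.ElementTree; the helper names and the shipDate element are hypothetical. For simplicity it compares the attribute value with the exact string "true" and does not perform any white space normalization.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2000/10/XMLSchema-instance"


def has_null_content(element: ET.Element) -> bool:
    """True if the element carries xsi:null with the value "true"."""
    return element.get(f"{{{XSI_NS}}}null") == "true"


def is_empty(element: ET.Element) -> bool:
    """True if the element has no child elements and no character content."""
    return len(element) == 0 and not element.text


# A hypothetical instance element: null content, but still carrying an attribute.
doc = ET.fromstring(
    '<shipDate xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" '
    'xsi:null="true" carrier="post"/>'
)
```

Here has_null_content(doc) and is_empty(doc) both hold, while the carrier attribute remains available through doc.get("carrier").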
The xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes can be used in a document to provide
hints as to the physical location of schema documents which may be used for assessment.
See How schema definitions are located on the Web (§6.3.2) for details on the use of these attributes.
On the World Wide Web, schemas are conventionally represented as documents of MIME type "text/xml", conforming to the specifications in XML Representation of Schemas and Schema Components (§4). For more information on the representation and use of schema documents on the World Wide Web see Standards for representation of schemas and retrieval of schema documents on the Web (§6.3.1) and How schema definitions are located on the Web (§6.3.2).
The following sections provide full details on the properties and significance of the schema itself and each kind of schema component. For each property, its range, that is the kinds of values it may have, is defined. This can be understood as defining a schema as a labelled directed graph, where the root is a schema, and every other vertex is a schema component or a literal (string, boolean, number) and every labelled edge a property. The graph is not acyclic: multiple copies of components with the same name in the same symbol space may not exist, so in some cases re-entrant chains of properties must exist. Equality of components for the purposes of this specification is always addressed at the level of names (including target namespaces) within symbol spaces.
[Definition:] Throughout this specification, the term absent is used as a distinguished value denoting absence.
Any property not identified as optional is required to be present; optional properties which are not present are taken to have absent as their value. Any property identified as having a set, subset or list value may have an empty value unless this is explicitly ruled out: this is not the same as absent. Any property value identified as a superset or subset of some set may be equal to that set, unless a proper superset or subset is explicitly called for. By 'string' in Part 1 of this specification is meant a sequence of ISO 10646 character codes identified as legal XML character codes in [XML].
NOTE: Readers whose primary interest is in the XML representation of schemas may wish to skip this chapter on the first reading, concentrating on XML Representation of Schemas and Schema Components (§4) and [XML Schema: Primer].
Throughout this specification, [Definition:] when we refer to the initial value of some attribute information item, we mean by this the value of the [normalized value] property of that item. Similarly, when we refer to the initial value of an element information item, we mean the string composed of, in order, the [character code] of each character information item in the [children] of that element information item.
The above definition means that comments and processing instructions, even in the midst of text, are ignored for all validation purposes.
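The effect on element content can be sketched as follows, using Python's xml.dom.minidom as a stand-in for an infoset. The function name initial_value and the sample document are hypothetical. Only character data among the element's own children is collected, so comments and processing instructions drop out.

```python
from xml.dom import Node, minidom


def initial_value(element) -> str:
    """Concatenate the character data found among the element's own children."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


doc = minidom.parseString("<size>1<!-- inches -->2<?unit in?>.5</size>")
print(initial_value(doc.documentElement))  # prints "12.5"
```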
[Definition:] When we refer to the normalized value of an element or attribute information item, we mean an initial value whose white space, if any, has been normalized according to the value of the whiteSpace facet of the simple type definition used in its validation:
- preserve
- No normalization is done; the normalized value is the initial value.
- replace
- All occurrences of #x9 (tab), #xA (linefeed) and #xD (carriage return) are replaced with #x20 (space).
- collapse
- Subsequent to the replacements specified above under replace, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.
These three levels of normalization correspond to the processing mandated in XML 1.0 for element content, CDATA attribute content and tokenized attribute content, respectively. See Attribute Value Normalization in [XML] for the precedent for replace and collapse for attributes. Extending this processing to element content is necessary to ensure a consistent validation semantics for simple types, regardless of whether they are applied to attributes or elements. Performing it twice in the case of attributes whose [normalized value] has already been subject to replacement or collapse on the basis of information in a DTD is necessary to ensure consistent treatment of attributes regardless of the extent to which DTD-based information has been made use of during infoset construction.
NOTE: Even when DTD-based information has been appealed to, and Attribute Value Normalization has taken place, the above definition of normalized value may mean further normalization takes place, as for instance when character entity references in attribute values result in white space characters other than spaces in their initial values.
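The three levels of white space normalization can be sketched as a single function. The function name normalize_white_space is hypothetical; the facet values preserve, replace and collapse are those of the whiteSpace facet. This is an illustrative sketch, not a normative algorithm.

```python
import re


def normalize_white_space(initial_value: str, white_space: str) -> str:
    """Normalize an initial value according to a whiteSpace facet value."""
    if white_space == "preserve":
        return initial_value
    if white_space not in ("replace", "collapse"):
        raise ValueError(f"unknown whiteSpace value {white_space!r}")
    # replace: tab, linefeed and carriage return each become a single space.
    replaced = initial_value.translate({0x9: " ", 0xA: " ", 0xD: " "})
    if white_space == "replace":
        return replaced
    # collapse: runs of spaces shrink to one; leading and trailing spaces go.
    return re.sub(" +", " ", replaced).strip(" ")


print(repr(normalize_white_space("\ta  b\n", "replace")))   # prints ' a  b '
print(repr(normalize_white_space("\ta  b\n", "collapse")))  # prints 'a b'
```

Applying the function twice with the same facet value gives the same result as applying it once, which is why repeating the processing on an attribute value already normalized under a DTD is harmless.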
Many properties are identified below as having other schema components or sets of components as values. For the purposes of exposition, the definitions in this section assume that (unless the property is explicitly identified as optional) all such values are in fact present. When schema components are constructed from XML representations involving reference by name to other components, this assumption may be violated if one or more references cannot be resolved. This specification addresses the matter of missing components in a uniform manner, described in Missing Sub-components (§7.3): no mention of handling missing components will be found in the individual component descriptions below.
It is important to recognise that processors cannot conclude that a name will fail to resolve solely on the basis of the schema document in which it occurs. By the time the component corresponding to the XML representation which includes the name is actually needed for validation, an appropriately-named component may have become available: see Schemas and schema-validity assessment (§7) for details.
As the above makes clear, at the level of schema components and validation, reference to components by name is normally not involved. In a few cases, however, qualified names appearing in information items being validated must be resolved to schema components by such lookup. The following constraint is appealed to in these cases.
| 1.1 | the {type definitions} if the kind specified is simple or complex type definition; |
| 1.2 | the {attribute declarations} if the kind specified is attribute declaration; |
| 1.3 | the {element declarations} if the kind specified is element declaration; |
| 1.4 | the {attribute group definitions} if the kind specified is attribute group; |
| 1.5 | the {model group definitions} if the kind specified is model group; |
| 1.6 | the {notation declarations} if the kind specified is notation declaration; |
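The lookup that this table describes can be sketched as a dictionary access keyed by namespace and local name. The representation of a schema as a dictionary of properties, the kind labels and the function name resolve are all assumptions made for illustration; None stands for an absent namespace and for a failed lookup.

```python
from typing import Dict, Optional, Tuple

# (namespace URI, or None for absent; local name)
Name = Tuple[Optional[str], str]
# Hypothetical schema model: property name -> components indexed by name.
Schema = Dict[str, Dict[Name, object]]

# Which schema property is consulted for each kind of component.
PROPERTY_FOR_KIND = {
    "simple type definition": "type definitions",
    "complex type definition": "type definitions",
    "attribute declaration": "attribute declarations",
    "element declaration": "element declarations",
    "attribute group": "attribute group definitions",
    "model group": "model group definitions",
    "notation declaration": "notation declarations",
}


def resolve(schema: Schema, kind: str, namespace: Optional[str], local_name: str):
    """Return the component of the given kind with the given name, or None."""
    components = schema.get(PROPERTY_FOR_KIND[kind], {})
    return components.get((namespace, local_name))
```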
NOTE: A schema and its components as defined in this chapter are an idealisation of the information a schema-aware processor requires: implementations are not constrained in how they provide it. In particular, no implications about literal embedding versus indirection follow from the use above of language such as "properties . . . having . . . components as values".
At the abstract level, the schema itself is just a container for its components.
Schema Component: Schema
- {type definitions}
- A set of named simple and complex type definitions
- {attribute declarations}
- A set of named global attribute declarations
- {element declarations}
- A set of named global element declarations
- {attribute group definitions}
- A set of named attribute group definitions
- {model group definitions}
- A set of named model group definitions
- {notation declarations}
- A set of notation declarations
- {annotations}
- A set of annotations
See XML Representations of Schemas (§4.1) for the XML representation of schemas and Schema Constraints (§5.13) for constraints on schemas as such.
Accordingly, [Definition:] we refer below to an item isomorphic to a component, meaning an information item whose type is equivalent to the component's, with one property per property of the component, with the same name, and value either the same atomic value, or an information item corresponding in the same way to its component value, recursively, as necessary.
In the post-schema-validation infoset a [schema information] property is added to the element information item at which assessment began. Its value is a set of namespace schema information information items, one for each namespace URI which appears as the {target namespace} of any schema component in the schema used for that assessment, and one for absent if any schema component in the schema had no {target namespace}. Each namespace schema information information item has the following properties and values:
targetNamespace matches the sibling [schema namespace] property above (or was absent
but contributed components to that namespace by being included
by a schema document with that targetNamespace as per Assembling a schema for a single target namespace from multiple schema definition documents (§6.2.1)).
The [schema components] property is provided for processors which wish to provide a single access point to the components of the schema which was used during assessment. Lightweight processors are free to leave it empty, but if it is provided, it must contain at a minimum all the top-level (i.e. named) components which actually figured in the assessment, either directly or (because an anonymous component which figured is contained within) indirectly.
Attribute declarations provide for:
The attribute declaration schema component has the following properties:
Schema Component: Attribute Declaration
- {name}
- An NCName as defined by [XML-Namespaces].
- {target namespace}
- Either absent or a namespace URI, as defined in [XML-Namespaces].
- {type definition}
- A simple type definition.
- {scope}
- Optional. Either global or a complex type definition.
- {value constraint}
- Optional. A pair consisting of a string and one of default, fixed.
- {annotation}
- Optional. An annotation
The {name} property must match the local part of the names of attributes being validated.
A {scope} of global identifies attribute declarations available for use in complex type definitions throughout the schema. Locally scoped declarations are available for use only within the complex type definition identified by the {scope} property. This property is absent in the case of declarations within attribute group definitions: their scope will be determined when they are used in the construction of complex type definitions.
A non-absent value of the {target namespace} property provides for validation of namespace-qualified attribute information items (which must be explicitly prefixed in the character-level form of XML documents). Absent values of {target namespace} validate unqualified (unprefixed) items.
The value of the attribute must conform to the supplied {type definition}.
{value constraint} reproduces the functions of XML 1.0 default and #FIXED
attribute values. default specifies that the attribute is to appear unconditionally in
the post-schema-validation infoset, with the supplied value used
whenever the attribute is not actually present; fixed indicates that the attribute value if present must match the supplied
constraint string, and if absent receives the supplied value as for default.
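The behaviour of default and fixed can be sketched as a function from the value found in the instance (if any) to the value seen after validation. The function name and the use of None for a missing attribute or constraint are hypothetical, and the sketch compares plain strings without applying any normalization.

```python
from typing import Optional, Tuple

# (constraint string, "default" or "fixed")
ValueConstraint = Tuple[str, str]


def effective_attribute_value(
    actual: Optional[str], constraint: Optional[ValueConstraint]
) -> Optional[str]:
    """Return the attribute value after the value constraint is applied.

    `actual` is None when the attribute does not appear in the instance.
    """
    if constraint is None:
        return actual
    value, kind = constraint
    if actual is None:
        # Both default and fixed supply the value for a missing attribute.
        return value
    if kind == "fixed" and actual != value:
        raise ValueError(f"value {actual!r} does not match fixed value {value!r}")
    return actual
```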
See Annotation Details (§3.12) for the significance of the {annotation} property.
NOTE: A more complete and formal presentation of the semantics of {name}, {target namespace} and {value constraint} is provided in conjunction with other aspects of complex type validation (see Element Locally Valid (Complex Type) (§3.4)).
[XML-Infoset] distinguishes namespace declarations such as xmlns or xmlns:xsl from
attributes. Accordingly, it is unnecessary and in fact not possible for
schemas to contain attribute declarations corresponding to such
namespace declarations; see xmlns Not Allowed (§5.1). No means is provided in
this specification to supply a
default value for a namespace declaration.
See XML Representation of Attribute Declaration Schema Components (§4.3.1) for the XML representation of attribute declarations and Attribute Declaration Constraints (§5.1) for constraints on attribute declaration components as such.
| 1.1 | The declaration is non-absent (see Missing Sub-components (§7.3) for how this can fail to be the case); |
| 1.2 | its normalized value is locally valid with respect to that {type definition} as per String Valid (§3.13); |
| 1.3 | its normalized value matches the string of the {value constraint}, if it is present and fixed. |
| an [attribute declaration] property, containing an item isomorphic to the declaration component itself. |
| 1.1 | a [schema normalized value] property, whose value is the normalized value of the item as validated; |
| 1.2.1 | a single [type definition] property, containing an item isomorphic to the relevant attribute declaration's {type definition} component. |
| 1.2.2 | if the [type definition] has {variety} union, then additionally there is a [member type definition] property, containing an item isomorphic to that member of the {member type definitions} which actually validated the attribute item's [normalized value]. |
| 1.3.1 | four properties as described in Element Validated by Type (§3.3). |
| 1.3.2 | if the [type definition] has {variety} union, then there are three additional properties as described in the parallel case for Element Validated by Type (§3.3), where the actual member type definition is that member of the {member type definitions} which actually validated the attribute item's [normalized value]. |
See below under Element Validated by Type (§3.3) for a discussion of the alternatives given above.
Also, if the declaration has a {value constraint}, the item's [schema default] is set to the declaration's {value constraint} string.
If the attribute information item was not strictly assessed, then in the post-schema-validation infoset the item has
| 2.1 | a [schema normalized value] property, whose value is the initial value of the item; |
| 2.2 | properties as described above under clauses 1.2.1 or 1.3.1 based on the simple ur-type definition. |
An attribute information item's schema-validity has been assessed if
| 1 |
[Definition:] If the above holds, the attribute information item has been strictly assessed. |
| 1 | If it was strictly assessed, then if it was valid as defined by Attribute Locally Valid (§3.2) then valid, otherwise invalid. |
| 2 | otherwise notKnown |
| 1 | If it was strictly assessed then full |
| 2 | otherwise none |
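The two outcome tables above can be read together as a small decision function. This is a sketch only: the field names validity and validation_attempted are assumed labels for the two properties the tables populate, and the function name is hypothetical.

```python
from typing import NamedTuple


class AssessmentOutcome(NamedTuple):
    validity: str              # "valid", "invalid" or "notKnown"
    validation_attempted: str  # "full" or "none"


def attribute_outcome(strictly_assessed: bool, locally_valid: bool) -> AssessmentOutcome:
    """Combine the two tables: validity is only known after strict assessment."""
    if not strictly_assessed:
        return AssessmentOutcome("notKnown", "none")
    return AssessmentOutcome("valid" if locally_valid else "invalid", "full")
```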
Element declarations provide for:
The element declaration schema component has the following properties:
Schema Component: Element Declaration
- {name}
- An NCName as defined by [XML-Namespaces].
- {target namespace}
- Either absent or a namespace URI, as defined in [XML-Namespaces].
- {type definition}
- Either a simple type definition or a complex type definition.
- {scope}
- Optional. Either global or a complex type definition.
- {value constraint}
- Optional. A pair consisting of a string and one of default, fixed.
- {nullable}
- A boolean
- {identity-constraint definitions}
- A set of constraint definitions.
- {substitution group affiliation}
- Optional. A global element definition.
- {substitution group exclusions}
- A subset of {extension, restriction}.
- {disallowed substitutions}
- A subset of {substitution, extension, restriction}.
- {abstract}
- A boolean
- {annotation}
- Optional. An annotation
The {name} property must match the local part of the names of element information items being validated.
A {scope} of global identifies element declarations available for use in content models throughout the schema. Locally scoped declarations are available for use only within the complex type identified by the {scope} property. This property is absent in the case of declarations within named model groups: their scope will be determined when they are used in the construction of complex type definitions.
A non-absent value of the {target namespace} property provides for validation of namespace-qualified element information items. Absent values of {target namespace} validate unqualified items.
An element information item is valid if it satisfies the {type definition}. For such an item, schema information set contributions appropriate to the {type definition} are added to the corresponding element information item in the post-schema-validation infoset.
If {nullable} is true, then an element may
also be valid if it
carries the namespace-qualified attribute with [local name] null from namespace http://www.w3.org/2000/10/XMLSchema-instance and value true (see xsi:null (§2.6.2)) even if it has
no text or element content despite a {content type} which would
otherwise require content. Formal details of element validation are described in Element Locally Valid (Element) (§3.3).
{value constraint} establishes a default or fixed value for an element. If default is specified, and if the element being validated is empty, then the supplied constraint string becomes the [schema normalized value] of the validated element in the post-schema-validation infoset. If fixed is specified, then the element's content must either be empty, in which case fixed behaves as default, or it must match the supplied constraint string.
{identity-constraint definitions} express constraints establishing uniquenesses and reference relationships among the values of related elements and attributes. See Identity-constraint Definition Details (§3.10).
Element declarations are members of the substitution group, if any, identified by {substitution group affiliation}. Membership is transitive but not symmetric; an element declaration is a member of any group of which its {substitution group affiliation} is a member.
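Transitive membership can be sketched by following the chain of affiliations from a declaration upwards. The element names and the AFFILIATION table are hypothetical data, and declarations are identified by bare name purely to keep the example short.

```python
from typing import Dict, Optional

# Hypothetical data: each element declaration mapped to the declaration named
# by its {substitution group affiliation}, or None when it has none.
AFFILIATION: Dict[str, Optional[str]] = {
    "comment": None,
    "shipComment": "comment",
    "rushShipComment": "shipComment",
}


def is_member(declaration: str, head: str, affiliation: Dict[str, Optional[str]]) -> bool:
    """True if `declaration` belongs to the substitution group headed by `head`."""
    seen = set()
    current = affiliation.get(declaration)
    while current is not None and current not in seen:
        if current == head:
            return True
        seen.add(current)  # guards against a malformed, circular chain
        current = affiliation.get(current)
    return False


print(is_member("rushShipComment", "comment", AFFILIATION))  # prints True
print(is_member("comment", "shipComment", AFFILIATION))      # prints False
```

The first call shows transitivity through shipComment; the second shows that membership does not run in the opposite direction.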
An empty {substitution group exclusions} allows a declaration to be nominated as the {substitution group affiliation} of other element declarations having the same {type definition} or types derived therefrom. The explicit values of {substitution group exclusions} rule out element declarations having types which are extensions or restrictions respectively of {type definition}. If both values are specified, then the declaration may not be nominated as the {substitution group affiliation} of any other declaration.
The supplied values for {disallowed substitutions} determine whether an element declaration appearing in a content model will be prevented from additionally validating (a) elements with an xsi:type (§2.6.1) that identifies an extension or restriction of the type of the declared element, and/or (b) elements which are in the substitution group headed by the declared element. If {disallowed substitutions} is empty, then all derived types and substitution group members are allowed.
Element declarations for which {abstract} is true can appear in content models only when substitution is allowed; such declarations may not themselves ever be used to validate element content.
See XML Representation of Element Declaration Schema Components (§4.3.2) for the XML representation of element declarations and Element Declaration Constraints (§5.2) for constraints on element declaration components as such.
| 1.1 | The declaration is non-absent; |
| 1.2.1 | If {nullable} is false there is no attribute information item among the element information item's [attributes] whose [namespace URI] is identical to http://www.w3.org/2000/10/XMLSchema-instance and whose [local name] is null; |
| 1.2.2 | If {nullable} is true and there is such an attribute information item and its normalized value is true, then |
If there is an attribute information item among the element information item's [attributes] whose [namespace URI] is identical to
http://www.w3.org/2000/10/XMLSchema-instance and whose [local name] is type, then
| 2.1 | The normalized value of that attribute information item is valid with respect to the built-in QName simple type, as defined by String Valid (§3.13); |
| 2.2 | The local name and namespace URI (as defined in QName Interpretation (§4.2)), of the normalized value of that attribute information item resolve to a type definition, as defined in QName resolution (Instance) (§3) -- [Definition:] call this type definition the local type definition; |
| 2.3 | The local type definition is validly derived from the {type definition} given the {disallowed substitutions}, as defined in Type Derivation OK (Complex) (§5.11) (if it is a complex type definition), or as defined in Type Derivation OK (Simple) (§5.12) (if it is a simple type definition). |
If the declaration has a {value constraint}, then provided clause 1.2.2 has not obtained
| 3.1 | If the element information item has an empty normalized value and the actual type definition is a local type definition, either |
| 3.2 | If the {value constraint} is fixed, the element information item must have no element information item [children], and the normalized value must be either empty or match the string of the {value constraint}; |
Otherwise (the element information item has a non-empty normalized value or there is no {value constraint})
| 4.1 | the element information item must be valid with respect to the actual type definition as defined by Element Locally Valid (Type) (§3.3) |
| 4.2 | The element information item must be valid with respect to each of the {identity-constraint definitions} as per Identity-constraint Satisfied (§3.10). |
Ed. Note: Priority Feedback Request
The Working Group solicits feedback from implementors and users on the extent to which the xsi:null feature provides useful functionality and satisfactorily addresses requirements in the area of data interchange.
NOTE: The {name} and {target namespace} properties are not mentioned above because they are checked during particle validation, as per Element Sequence Locally Valid (Particle) (§3.8).
| 1.1 | a single [element declaration] property, containing an item isomorphic to the declaration component itself. |
| 1.2 | a [null] property, with value true if clause 1.2.2 of Element Locally Valid (Element) (§3.3) above obtains, otherwise false. |
| 1.1 | The type definition is non-absent; |
| 1.2 | If the type definition is a simple type definition, then |
| 1.1 | a [schema normalized value] property, whose value is the normalized value of the item as validated (provided clause 1.2.1.3 of Element Locally Valid (Type) (§3.3) has applied, and Element Default Value (§3.3) above has not applied); |
| 1.2.1 | a single [type definition] property, containing an item isomorphic to the type definition component itself. |
| 1.2.2 | if the type definition has a simple type definition {content type}, and that type definition has {variety} union, then additionally there is a [member type definition] property, containing an item isomorphic to that member of the {member type definitions} which actually validated the element item's normalized value. |
| 1.3.1 | four properties as follows: |
| 1.3.2 | if the type definition has a simple type definition {content type}, and that type definition has {variety} union, then calling [Definition:] that member of the {member type definitions} which actually validated the element item's normalized value the actual member type definition, there are three additional properties: |
The first alternative above is provided for applications such as query processors which need access to the full range of details about an item's assessment, for example the type hierarchy; the second, for lighter-weight processors for which representing the significant parts of the type hierarchy as information items might be a significant burden.
Also, if the declaration has a {value constraint}, the item's [schema default] property is set to that {value constraint}'s string.
Note that if an element is laxly assessed, the information described above under 1.2.1 or 1.3.1 is provided with respect to the ur-type definition.
During validation, associations between element and attribute information items among the [children] and [attributes] on the one hand, and element and attribute declarations on the other, are established as a side-effect. [Definition:] We call such declarations context-dependent declarations.
So an element information item's schema-validity has been assessed if
| 1.1 |
|