This specification defines several XML processor profiles, each of which defines how any given XML document should be processed, both operationally and in terms of what information must be made available to applications. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require as well as what information they require.
This is a public
This is a Last Call Working Draft for review by W3C members and other interested parties. It contains one significant addition to previous drafts, a discussion of validation, as well as extensive editorial changes made in response to reviewers' comments on our previous draft. This specification does not require new implementations, since many existing XML processors already implement one or more of the profiles defined below. It therefore remains the Working Group's intention that no Candidate Recommendation version will be published, and that the next step for this specification will be Proposed Recommendation. Interested parties should take note and comment accordingly.
The effective deadline for comments is 29 February 2012. Please send comments on this draft to the public mailing list
As this specification is intended for use by other specifications which themselves define one or more XML languages, the Working Group particularly welcomes input from other Working Groups that are responsible for such specifications.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the
Few specifications are implemented in their entirety, in exactly the same way, by every implementor. Many specifications contain optional features or areas of acknowledged variation, and some implementors choose to ignore required features that aren't needed by the community they serve, choosing to trade conformance for other benefits.
In the case of XML, there is not only optionality in the XML Recommendation itself, but also a whole family of additional specifications which an implementor may choose to support or ignore. In principle, there is an enormous number of possible variations. In practice, dependencies between the specifications limit the number of possible variations, and implementors aren't motivated to implement completely arbitrary selections.
The Infoset gave the community a vocabulary for discussing the items produced by a parser. This specification gives the community a vocabulary for describing common sets of higher level features by describing profiles, collecting specific sets of features drawn from the family of specifications, and providing names for them.
One goal of this work is to help establish a lower bound on the number and nature of features supported. The ability to communicate by sending XML documents back and forth is predicated on the notion that we have the same understanding of those documents. While we might wish for the richest possible understanding, that's not likely to be supported by the widest range of implementations. Establishing a few basic profiles, we hope, provides a foundation on which other specifications can build.
The XML specification
This specification addresses this issue by defining several XML processor profiles, each of which defines how any given XML document should be processed, both operationally and in terms of what information must be made available to applications. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require as well as what information they require.
The profiles presented here are designed for use with respect to static outcomes, that is, to the result of XML processing as (if) produced by a batch process. They do not attempt to address the question of the preservation or lack thereof of information itself, or of information invariants, in the course of incremental construction or in the face of piecemeal modification.
The profiles defined here are appropriate for processing both XML 1.0
The term
The profile
definitions which follow all assume that the starting point is a
Each profile is defined in terms of conformance requirements on processors
with respect to various XML-family specifications, and in terms of requirements
on the information they provide to applications. Information provision requirements are specified by
reference to classes of information items and properties, as further defined in
It is the
This specification defines four increasingly rich profiles, in terms of the kinds of processing performed and the amount of information provided to applications, starting from a profile very close to what many XML processors already do in their minimal configuration:
The precise nature of each of these profiles is described in the sections which follow.
To conform to the basic profile an XML processor
Process the document as
Maintain the
Accurately provide to the application the information in the document
corresponding to information items and properties in classes
To conform to the id profile an XML processor
Process the document as
Maintain the
Perform ID type assignment for all xml:id
attributes as
required by ID
to the application;
Accurately provide to the application the information in the document
corresponding to information items and properties
in classes
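As an informal illustration of the ID type assignment step, the following Python sketch collects xml:id attributes into a lookup table using the standard-library ElementTree module. It is not a conforming xml:id processor: the function name assign_ids, the choice to raise an exception on duplicate values, and the omission of any check that values are valid NCNames are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved xml: prefix as this namespace name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def assign_ids(root):
    """Return a dict mapping each normalized xml:id value to its element.

    Hypothetical helper for illustration only.
    """
    ids = {}
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        # Normalize as for ID-typed attributes: strip leading and trailing
        # spaces and collapse internal runs of spaces.
        value = " ".join(part for part in raw.split(" ") if part)
        if value in ids:
            # Sketch-level choice: treat a duplicate value as a hard failure.
            raise ValueError(f"duplicate xml:id value: {value!r}")
        elem.set(XML_ID, value)
        ids[value] = elem
    return ids


doc = ET.fromstring('<doc><sec xml:id=" intro "><p xml:id="p1"/></sec></doc>')
id_table = assign_ids(doc)
```

An application could then resolve a reference by looking up a value in id_table, much as it would with IDs declared in a DTD.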
To conform to the external declarations profile an XML processor
Process the document as
Maintain the
Perform ID type assignment for all xml:id
attributes as
required by ID
to the application;
Accurately provide to the application the information in the document
corresponding to information items and properties
in classes
To conform to the full profile an XML processor
Process the document as
Maintain the
Perform ID type assignment for all xml:id
attributes as
required by ID
to the application;
Recursively replace all include
elements in the XInclude
namespace, and carry out namespace, xml:base and xml:lang fixup of the result, as required for
conformance to
Accurately provide to the application the information in the document
corresponding to information items and properties
in classes
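The XInclude replacement step can be sketched informally with Python's standard-library xml.etree.ElementInclude module. The resource table RESOURCES, the file name chapter.xml, and the loader memory_loader are hypothetical and exist only to keep the example self-contained. Note that this sketch does not carry out the xml:base and xml:lang fixup which the full profile requires, so it illustrates only the replacement of include elements.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory resources, so the sketch needs no disk or network.
RESOURCES = {
    "chapter.xml": "<chapter><title>One</title></chapter>",
}


def memory_loader(href, parse, encoding=None):
    """Resolve href against RESOURCES instead of the file system."""
    text = RESOURCES[href]
    # parse="xml" expects an element; parse="text" expects a string.
    return ET.fromstring(text) if parse == "xml" else text


doc = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter.xml"/>'
    "</book>"
)

# Replace each xi:include element in place with the resource it names.
ElementInclude.include(doc, loader=memory_loader)
included_tags = [child.tag for child in doc]
```

After the call, the include element is no longer a child of the book element; the chapter element from the referenced resource appears in its place.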
The following
Processes its input as required by
Recognizes and xml:id
attributes in
conformance with
For the profile definitions above and the invariants below, we
categorize the information expressed in XML documents, which
may be made available to applications, into a number of
(overlapping) classes. What follows is a complete tabulation of all the
information items and their properties from
The glosses immediately below are explanatory; the actual class definitions are given in the subsequent table
Items and properties which are fundamental for
all XML applications and so
Items and properties which depend on declarations and so
Items and properties which only are relevant when entity declarations are
Items and properties which depend on declarations.
For
Items and properties which will be present for
validating processors, but for which support by
Items and properties for which support is
implementation-defined. Processors
The tabulation which follows defines the information classes by enumerating their membership in terms of information items and their properties—each class contains all and only those items and properties against which its name appears below.
the item itself |
|
[children] |
|
[document element] |
|
[notations] |
|
[unparsed entities] |
|
[base URI] |
|
[character encoding scheme] |
|
[standalone] |
|
[version] |
|
[all declarations processed] |
|
the item itself |
|
[namespace name] |
|
[local name] |
|
[prefix] |
|
[children] |
|
[attributes] |
|
[namespace attributes] |
|
[in-scope namespaces] |
|
[base URI] |
|
[parent] |
|
the item itself |
|
[namespace name] |
|
[local name] |
|
[prefix] |
|
[normalized value] |
|
[specified] |
|
[attribute type] |
|
[references] to Element Information Items, i.e. for attributes of types IDREF and IDREFS |
|
[references] to Notation and Unparsed Entity Information Items, i.e. for attributes of types ENTITY, ENTITIES and NOTATION |
|
[owner element] |
|
the item itself |
|
[target] |
|
[content] |
|
[base URI] |
|
[notation] |
|
[parent] |
|
This type of information item will not occur at all if standalone="yes"
the item itself |
|
|
|
the item itself |
|
[character code] |
|
[element content whitespace] |
|
[parent] |
|
the item itself |
|
[content] |
|
[parent] |
|
the item itself |
|
|
|
the item itself |
|
|
|
the item itself |
|
|
|
the item itself |
|
[prefix] |
|
[namespace name] |
|
Note: in an effort to maintain consistent relationships in the diagram, the label for the innermost circle, around “Full Profile”, has been omitted. It should be read as if it were labeled “Perform XInclude processing”.
Every instance of processing a given namespace-well-formed XML
document in conformance with the
In comparing two cases when a given namespace-well-formed XML
document is processed in conformance with
two
[normalized value],
[attribute type],
[references]—These properties may vary for xml:id
attributes
And all the differences listed in the next two sections.
Where an id processor reports an Unexpanded Entity Reference, richer ones will report the entity expansion, that is, they will report some number of information items and their associated properties. For this reason, the information reported from an id processor may differ from that reported by a processor conforming to a richer profile with respect to any or all of Element, Attribute, Character, Comment, Namespace, Processing Instruction and Unexpanded Entity Reference Information Items.
And all the differences listed in the next section.
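The unexpanded entity case described above can be observed informally with Python's pyexpat binding, which by default does not fetch the external subset. The document DOC, the DTD name doc.dtd, the entity name greeting, and the helper unexpanded_entities are all hypothetical. The sketch assumes that the entity would be declared only in the unread external subset, so the parser reports the reference rather than expanding it.

```python
import xml.parsers.expat


def unexpanded_entities(document: bytes):
    """Return the names of entity references reported but not expanded.

    Hypothetical helper for illustration only.
    """
    skipped = []
    parser = xml.parsers.expat.ParserCreate()
    # Called when a reference has no declaration available and this is not
    # a well-formedness error, e.g. because an external subset was not read.
    parser.SkippedEntityHandler = (
        lambda name, is_parameter_entity: skipped.append(name)
    )
    parser.Parse(document, True)
    return skipped


# The declaration of "greeting" is assumed to live in the unread doc.dtd.
DOC = b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc>&greeting;</doc>'
skipped_names = unexpanded_entities(DOC)
```

A processor that did read the external declarations would instead report the information items resulting from the expansion, which is the difference this section describes.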
Parallel to the case for expanding entity references in the previous section, XInclude processing in conformance with the full profile may replace some (XInclude) Element Information Items reported by processing in conformance to other profiles with some amount of different information, corresponding to Element, Attribute, Character, Comment, Namespace and Processing Instruction Information Items.
The profiles defined here can be used as a starting point for the definition of further profiles. For example, the media type registrations for stylesheet languages applicable to XML such as application/xslt+xml
or text/css
might define a profile specifying appropriate <?xml-stylesheet type="[their media type]" . . .?>
processing in addition to the processing required by
Conformance to this specification means conformance by XML processors to profiles, as specified in
Which profile or profiles an XML processor conforms to may depend on how it is configured. The conformance conditions for any specific
processor configuration with respect to each profile are specified in the
corresponding sub-section of
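As an informal illustration of configuration-dependent behaviour, the following Python sketch uses the standard xml.sax feature flags to switch the reading of external general entities on or off. The helper make_configured_parser is hypothetical, and the correspondence between these settings and the profiles is an assumption: setting a flag does not by itself establish conformance to any profile.

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_namespaces


def make_configured_parser(read_external: bool):
    """Build a SAX parser whose external-entity handling is explicit.

    Hypothetical helper: read_external=False roughly corresponds to the
    profiles that leave external entities unexpanded, read_external=True
    to the richer ones.
    """
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setFeature(feature_external_ges, read_external)
    return parser


minimal = make_configured_parser(False)
richer = make_configured_parser(True)
```

Making the setting explicit, rather than relying on a library default, is one way for an application to document which configuration it intends to use.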
Accordingly, any specification which references this one
normatively
is recommended to do so in terms such as "Conforming implementations
Specifying desired information outcomes is not sufficient to completely determine XML processor behaviour. In particular, if validation is performed and errors detected, the result may be no outcome at all.
A range of schema languages and approaches to validation exist. Some may provide for additional information items and/or properties which are not addressed by this specification. Also, the validation-dependent [element content whitespace] property of Character Information Items may only be reliably provided in conjunction with some approaches to validation, specifically DTD validation.
Furthermore, not all of the profiles defined above
Accordingly, specifications referencing this one should also specify
whether validation is forbidden, optional or required, with respect to which
schema language(s) with what validation control settings, if
any