This document is also available in these non-normative formats: XML, and a version showing differences from the previous Last Call Working Draft.
Copyright © 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification defines several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a public Working Draft for review by W3C members and other interested parties. This document is a product of the XML Processing Model Working Group which is part of the W3C XML Activity. The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/2003/03/Translations/byTechnology?technology=xproc.
This is a Last Call Working Draft for review by W3C members and other interested parties. It contains one significant addition to previous drafts: reporting requirements for each profile, specified in terms of a tabulation and categorization of information supplied by XML processors to applications. Once again, since this specification is not implementable as such, it is the Working Group's intention that no Candidate Recommendation version will be published, and that the next step for this specification will be Proposed Recommendation. Interested parties should take note and comment accordingly.
The effective deadline for comments is 16 May 2011. Please send comments on this draft to the public mailing list public-xml-processing-model-comments@w3.org (public archives are available).
As this specification is intended for use by other specifications which themselves define one or more XML languages, the Working Group particularly welcomes input from other Working Groups who are responsible for such specifications.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Background
1.1 Terminology
2 XML processor profiles
2.1 The minimum XML processor profile
2.2 The basic XML processor profile
2.3 The modest XML processor profile
2.4 The recommended XML processor profile
3 Classes of Information
4 Invariants
4.1 Data model invariants within a given profile
4.2 Data model variation between profiles
4.2.1 Between minimum and richer profiles
4.2.2 Between basic and richer profiles
4.2.3 Between modest and recommended profiles
5 Other profiles (non-normative)
6 Conformance
A References
A.1 Normative References
A.2 Non-normative References
1 Background
The XML specification [Extensible Markup Language (XML) 1.0 (Fifth Edition)] defines an XML processor as "a software module. . .used to read XML documents and provide access to their content and structure. . .on behalf of another module, called the application." XML applications are often defined by building on top of the [XML Information Set] vocabulary or XML data models such as [XML Path Language (XPath) Version 1.0] or [XQuery 1.0 and XPath 2.0 Data Model (XDM)], understood as the output of an XML processor. Such definitions have suffered to some extent from an uncertainty inherent in using that kind of foundation, in that the mapping XML processors perform from XML documents to data model is not rigid. Some of this stems from the XML specification itself, which is not always explicit about what information must be passed from processor to application, and which also leaves open the possibility of reading and interpreting external entities, or not. Another kind of uncertainty stems from the growth of the XML family of specifications: if the input document uses XInclude, for instance, it is not clear whether the data model should reflect the document before or after inclusion processing.
This specification addresses this issue by defining several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.
The profiles defined here are appropriate for processing both XML 1.0 [Extensible Markup Language (XML) 1.0 (Fifth Edition)] and XML 1.1 [Extensible Markup Language (XML) 1.1 (Second Edition)] documents. References to XML or XML Namespaces below should be understood as references to 1.0 or 1.1 as required by the relevant document or application.
1.1 Terminology
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC 2119].]
The term base URI is used in this specification as it is defined in [RFC 3986].
2 XML processor profiles
All of the profiles describe the steps necessary to construct a data model from a well-formed and namespace well-formed XML document. This specification does not consider documents that are not namespace well-formed. Documents which are not well-formed are not XML.
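The precondition stated above can be approximated in code. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser as the XML processor; the helper name `is_namespace_well_formed` is hypothetical and not part of this specification. Because that parser is namespace-aware, an undeclared prefix is rejected in the same way as any other well-formedness error. It is not claimed to check every namespace constraint.

```python
import xml.etree.ElementTree as ET


def is_namespace_well_formed(text: str) -> bool:
    """Return True if `text` parses with a namespace-aware parser.

    Hypothetical helper: an approximation of the precondition, not a
    complete check of every namespace constraint.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_namespace_well_formed('<doc xmlns:p="http://example.org/ns"><p:item/></doc>'))
print(is_namespace_well_formed("<doc><p:item/></doc>"))  # prefix p is not declared
print(is_namespace_well_formed("<doc><item></doc>"))     # mismatched tags
```

Only documents for which such a check succeeds are in scope for the profiles defined below.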
Each profile is defined in terms of conformance requirements with respect to various XML-family specifications, and in terms of requirements on the information it provides to applications. Information provision requirements are specified by reference to classes of information items and properties, as further defined in 3 Classes of Information.
2.1 The minimum XML processor profile
The minimum approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors without reading any external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Faithful provision of the information in the document corresponding to information items and properties in classes A, B', P and X.
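The base URI requirement in the list above can be illustrated with a short sketch. It assumes Python's `xml.etree.ElementTree` as the data model and `urllib.parse.urljoin` for reference resolution; the function name `base_uris`, the sample document and the example.org URIs are hypothetical. It covers only the resolution of `xml:base` attributes down the element tree, not every detail of [XML Base].

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(root, document_uri):
    """Map each element to its base URI, resolving xml:base top-down."""
    result = {}

    def visit(elem, inherited):
        # An xml:base value is resolved against the base URI in effect for
        # the parent (the document URI for the root element). Without an
        # xml:base attribute the inherited base URI is kept unchanged.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        result[elem] = base
        for child in elem:
            visit(child, base)

    visit(root, document_uri)
    return result


DOC = """<doc xml:base="http://example.org/docs/">
  <chapter xml:base="chapters/one.xml"><para/></chapter>
  <appendix/>
</doc>"""

root = ET.fromstring(DOC)
uris = base_uris(root, "http://example.org/main.xml")
for elem, uri in uris.items():
    print(elem.tag, uri)
```

In this sketch the `para` element inherits the base URI of `chapter`, while `appendix` inherits the one set on `doc`.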
2.2 The basic XML processor profile
The basic approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors without reading any external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0];
4. Faithful provision of the information in the document corresponding to information items and properties in classes A, B', P and X.
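The xml:id requirement in the list above might be sketched as follows, again assuming `xml.etree.ElementTree` as the data model. The function name `index_xml_ids` is hypothetical, and raising `ValueError` on a duplicate value is a choice made for this sketch rather than something taken from this specification. The sketch normalizes whitespace as for an ID-typed attribute but does not check that the value is an NCName.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def index_xml_ids(root):
    """Return a dict from normalized xml:id value to the owning element."""
    index = {}
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is None:
            continue
        # Normalize as for an ID-typed attribute: drop leading and trailing
        # spaces and collapse internal runs of spaces to a single space.
        value = " ".join(part for part in value.split(" ") if part)
        if value in index:
            # Sketch-level error handling (an assumption, see lead-in).
            raise ValueError(f"duplicate xml:id value: {value!r}")
        index[value] = elem
    return index


DOC = '<doc><sec xml:id=" intro "/><sec xml:id="body"/></doc>'
ids = index_xml_ids(ET.fromstring(DOC))
print(sorted(ids))
```

An application built on the basic profile could then treat such an index as the target set for ID-based lookups.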
2.3 The modest XML processor profile
The modest approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors while reading and processing all external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0];
4. Faithful provision of the information in the document corresponding to information items and properties in classes A, B and X.
2.4 The recommended XML processor profile
The recommended approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors while reading and processing all external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0];
4. Replacement of all include elements in the XInclude namespace, and namespace, xml:base and xml:lang fixup of the result, as required for conformance to [XML Inclusions (XInclude) Version 1.0 (Second Edition)];
5. Faithful provision of the information in the document corresponding to information items and properties in classes A, B and X.
The following [XProc: An XML Pipeline Language] pipeline implements 2.4 The recommended XML processor profile when executed by a conformant XProc processor which:
1. Processes its input as required by point (1) above;
2. Recognizes and preserves the ID type of all xml:id attributes in conformance with [xml:id Version 1.0].
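Independently of XProc, the inclusion step that distinguishes the recommended profile can be sketched in Python. This is an illustration only, assuming the standard `xml.etree.ElementInclude` module; the `RESOURCES` table and the `loader` function are hypothetical stand-ins for real resource retrieval. The sketch replaces include elements but does not attempt the xml:base and xml:lang fixup that the profile requires, so it is not a conformant implementation of the profile.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory resources standing in for retrievable documents.
RESOURCES = {
    "chapter.xml": "<chapter><title>One</title></chapter>",
}


def loader(href, parse, encoding=None):
    """Resolve an href against RESOURCES instead of the file system."""
    text = RESOURCES[href]
    return ET.fromstring(text) if parse == "xml" else text


MAIN = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter.xml"/>
</book>"""

root = ET.fromstring(MAIN)
# Replace each xi:include element in place with the document it refers to.
ElementInclude.include(root, loader=loader)
print([child.tag for child in root])
```

Before the call the only child of `book` is the include element; afterwards it is the included `chapter` element, which is the kind of difference between modest and recommended data models described later in this specification.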
3 Classes of Information
For the profile definitions above and the invariants below, we categorize the information expressed in XML documents into a number of (overlapping) classes. What follows is a complete tabulation of all the information items and their properties from [XML Information Set], annotated with one or more class labels.
A: Items and properties which must be provided by all profiles.
B: Items and properties which must be provided by 2.3 The modest XML processor profile and 2.4 The recommended XML processor profile.
B': Items and properties which must be provided by 2.1 The minimum XML processor profile and 2.2 The basic XML processor profile.
P: Items and properties which depend on declarations. For 2.1 The minimum XML processor profile and 2.2 The basic XML processor profile, they will not be provided if the relevant declaration is in an unprocessed external entity, or is after the first reference to an external entity which is not processed.
V: Items and properties which will be present for validating processors, but for which support by non-validating processors is implementation-defined. Non-validating processors should document whether they provide this information to applications or not.
X: Items and properties for which support is implementation-defined. Processors should document whether they provide this information to applications or not.
Note:
It is the information itself which is being labelled, not the particular packaging of it implied by the items and properties used in [XML Information Set]. For example, a data model that exposes the information packaged as Character Information Items in [XML Information Set] as an array of strings is in that regard satisfying requirement (3) of 2.1 The minimum XML processor profile.
Document Information Item
the item itself | A |
[children] | X |
[document element] | A |
[notations] | B, P |
[unparsed entities] | B, P |
[base URI] | A |
[character encoding scheme] | A |
[standalone] | A |
[version] | A |
[all declarations processed] | A |

Element Information Items
the item itself | A |
[namespace name] | A |
[local name] | A |
[prefix] | A |
[children] | A |
[attributes] | A |
[namespace attributes] | A |
[in-scope namespaces] | A |
[base URI] | A |
[parent] | A |

Attribute Information Items
the item itself | A |
[namespace name] | A |
[local name] | A |
[prefix] | A |
[normalized value] | B, P |
[specified] | A |
[attribute type] | B, P |
[references] to Element Information Items, i.e. for attributes of types IDREF and IDREFS | B, P |
[references] to Notation and Unparsed Entity Information Items, i.e. for attributes of types ENTITY, ENTITIES and NOTATION | X |
[owner element] | A |

Processing Instruction Information Items
the item itself | A |
[target] | A |
[content] | A |
[base URI] | A |
[notation] | X |
[parent] | A |

Unexpanded Entity Reference Information Items
Note:
This type of information item will not occur at all if standalone="yes".
the item itself | B' |
all properties | B' |

Character Information Items
the item itself | A |
[character code] | A |
[element content whitespace] | V |
[parent] | A |

Comment Information Items
the item itself | A |
[content] | A |
[parent] | A |

Document Type Declaration Information Item
the item itself | X |
all properties | X |

Unparsed Entity Information Items
the item itself | B, P |
all properties | B, P |

Notation Information Items
the item itself | B, P |
all properties | B, P |

Namespace Information Items
the item itself | A |
[prefix] | A |
[namespace name] | A |
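The relationship between profiles, classes and the tabulation can be captured as a small lookup. The following is a sketch under stated assumptions: the profile-to-class sets are copied from the profile definitions in this specification, while the `LABELS` excerpt, the item names used as keys (taken from [XML Information Set]) and the function `in_scope` are illustrative. Treating a row labelled "B, P" as belonging to either class is an interpretation made for this sketch.

```python
# Classes each profile lists in its information provision requirement.
PROFILE_CLASSES = {
    "minimum": {"A", "B'", "P", "X"},
    "basic": {"A", "B'", "P", "X"},
    "modest": {"A", "B", "X"},
    "recommended": {"A", "B", "X"},
}

# Illustrative excerpt of the tabulation, keyed by (item, property).
LABELS = {
    ("element", "[local name]"): {"A"},
    ("attribute", "[attribute type]"): {"B", "P"},
    ("unexpanded entity reference", "the item itself"): {"B'"},
    ("character", "[element content whitespace]"): {"V"},
}


def in_scope(profile, item, prop):
    """True if the row carries at least one class the profile lists."""
    return bool(PROFILE_CLASSES[profile] & LABELS[(item, prop)])


print(in_scope("minimum", "attribute", "[attribute type]"))
print(in_scope("modest", "unexpanded entity reference", "the item itself"))
print(in_scope("recommended", "character", "[element content whitespace]"))
```

A row in class P is still subject to the caveat in the class definition: for the minimum and basic profiles it is provided only when the relevant declaration has actually been processed.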
4 Invariants
Data models constructed in conformance with one of the profiles defined above will be guaranteed to share certain properties. The following sub-sections describe this in terms of invariants with respect to the information available in the data model.
4.1 Data model invariants within a given profile
Any two data models which are both constructed in conformance with the same profile from a given namespace-well-formed XML document will have exactly the same information with respect to the information items and properties which that profile is required to faithfully provision in the data model.
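One way to test such an invariant is to reduce each data model to a comparable value and check for equality. The sketch below assumes `xml.etree.ElementTree` trees as the data models; the `signature` function is hypothetical and covers only element names, attributes and character content, not every item and property a profile must provide. Both models here are built by the same library, so the comparison is illustrative rather than a real interoperability test.

```python
import io
import xml.etree.ElementTree as ET


def signature(elem):
    """Reduce an element subtree to a nested tuple of plain values."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        elem.text or "",
        tuple((signature(child), child.tail or "") for child in elem),
    )


DOC = '<doc a="1"><item>text</item>tail</doc>'

# Two independently constructed models of the same document.
first = ET.fromstring(DOC)
second = ET.parse(io.BytesIO(DOC.encode("utf-8"))).getroot()

print(signature(first) == signature(second))
```

Comparing models produced by two different processors would follow the same pattern, with each processor's output mapped onto the same signature.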
4.2 Data model variation between profiles
When two data models are constructed in conformance with two different profiles from a given namespace-well-formed XML document, the information contained therein will in some cases (depending on the specifics of the document in question) differ with respect to the following information items and properties (leaving aside the items and properties classified as implementation-defined above):
4.2.1 Between minimum and richer profiles
Attribute Information Items: [normalized value], [attribute type], [references]. These properties may vary for xml:id attributes.
And all the differences listed in the next two sections.
4.2.2 Between basic and richer profiles
Element Information Items: Entirely, in that where a basic processor reports an Unexpanded Entity Reference, richer ones will report the entity expansion, which may be or include entire elements.
Attribute Information Items: Entirely, for the same reason, or just with respect to [normalized value], [specified], [attribute type] and [references] where a basic processor has not processed the relevant declaration, but a richer one has.
Processing Instruction Information Items: Entirely, per the Element case above.
Unexpanded Entity Reference Information Items: Entirely, in the opposite sense to the Element case above.
Character Information Items: Entirely, per the Element case above.
Comment Information Items: Entirely, per the Element case above.
Namespace Information Items: Entirely, per the Element case above.
And all the differences listed in the next section.
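4.2.3 Between modest and recommended profiles
Element Information Items: Entirely, in that where a modest processor reports an xinclude element, a recommended processor will report the result of XInclude processing, which may be or include entire elements.
Attribute Information Items: Entirely, for the same reason.
Processing Instruction Information Items: Entirely, for the same reason.
Unexpanded Entity Reference Information Items: Entirely, for the same reason.
Character Information Items: Entirely, for the same reason.
Comment Information Items: Entirely, for the same reason.
Namespace Information Items: Entirely, for the same reason.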
5 Other profiles (non-normative)
The profiles defined here, particularly 2.4 The recommended XML processor profile, can be used as a starting point for the definition of further profiles. For example, the media type registrations for stylesheet languages applicable to XML such as application/xslt+xml or text/css might define a profile specifying appropriate <?xml-stylesheet type="[their media type]" . . .?> processing in addition to the processing required by 2.4 The recommended XML processor profile.
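As an illustration of what such an additional processing step would start from, the sketch below locates xml-stylesheet processing instructions and splits their pseudo-attributes. It assumes Python's `xml.dom.minidom`; the function name `stylesheet_pis` and the regular expression are hypothetical, and the sketch does not resolve character or entity references inside pseudo-attribute values.

```python
import re
from xml.dom import Node, minidom

# name="value" or name='value' pairs inside the PI content.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def stylesheet_pis(xml_text):
    """Return the pseudo-attributes of each top-level xml-stylesheet PI."""
    document = minidom.parseString(xml_text)
    found = []
    for node in document.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = {}
            for name, double_quoted, single_quoted in PSEUDO_ATTR.findall(node.data):
                attrs[name] = double_quoted or single_quoted
            found.append(attrs)
    return found


DOC = '<?xml-stylesheet type="text/css" href="style.css"?><doc/>'
print(stylesheet_pis(DOC))
```

A derived profile could then state which `type` values trigger stylesheet processing on top of the recommended profile.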
6 Conformance
Conformance is a matter for any specification which references this one to mandate, expressed in terms such as "Conforming implementations must construct input data models from XML documents as required by the recommended XML processor profile."