W3C

XML processor profiles

W3C Working Draft 18 May 2010

This version:
http://www.w3.org/TR/2010/WD-xml-proc-profiles-20100518/
Latest version:
http://www.w3.org/TR/xml-proc-profiles/
Editors:
Henry S. Thompson, University of Edinburgh <ht@inf.ed.ac.uk>
Norman Walsh, MarkLogic Corporation <norman.walsh@marklogic.com>

This document is also available in these non-normative formats: XML.


Abstract

This specification defines several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a first public Working Draft for review by W3C members and other interested parties. This document is a product of the XML Processing Model Working Group which is part of the W3C XML Activity. The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/2003/03/Translations/byTechnology?technology=xproc.

The Working Group invites review of this draft, which is likely to be the only draft before publication as a Last Call Working Draft. Please send comments on this draft to the public mailing list public-xml-processing-model-comments@w3.org (public archives are available). Please include the string "[xml-proc-profiles]" in your email subject line.

As this specification is intended for use by other specifications which themselves define one or more XML languages, the Working Group particularly welcomes input from other Working Groups who are responsible for such specifications.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Background
2 The minimum XML processor profile
3 The basic XML processor profile
4 Other profiles (non-normative)
5 Conformance

Appendix

A References
    A.1 Normative References
    A.2 Non-normative References


1 Background

The XML specification [Extensible Markup Language (XML) 1.0 (Fifth Edition)] defines an XML processor as "a software module. . .used to read XML documents and provide access to their content and structure. . .on behalf of another module, called the application." XML applications are often defined by building on top of the [XML Information Set (Second Edition)] or other similar XML data models such as [XML Path Language (XPath) Version 1.0] or [XQuery 1.0 and XPath 2.0 Data Model (XDM)], understood as the output of an XML processor. Such definitions have suffered to some extent from an uncertainty inherent in using that kind of foundation, in that the mapping XML processors perform from XML documents to data model is not rigid. Some of this stems from the XML specification itself, which leaves open the possibility of reading and interpreting external entities, or not. Some stems from the growth of the XML family of specifications: if the input document uses XInclude, for instance, the resulting data model depends on whether or not the inclusions are expanded.

This specification addresses this issue by defining several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.

The profiles defined here are appropriate for processing both XML 1.0 [Extensible Markup Language (XML) 1.0 (Fifth Edition)] and XML 1.1 [Extensible Markup Language (XML) 1.1 (Second Edition)] documents. References to XML or XML Namespaces below should be understood as references to 1.0 or 1.1 as required by the relevant document or application.

2 The minimum XML processor profile

The minimum approach to the construction of a data model from a well-formed and namespace well-formed XML document requires the following:

  1. Processing of the document as required of conformant non-validating XML processors while reading all external markup declarations;

  2. Maintenance of the base URI property of each element in conformance with [XML Base].
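
The base URI requirement in point (2) can be illustrated with a minimal sketch using only the Python standard library. It is not a conformant XML Base implementation: it resolves each xml:base value against the inherited base URI with urljoin and does not perform the character escaping that [XML Base] describes. The function name base_uris and the example.org and example.com URIs are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace name.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(root, document_uri):
    """Map every element under root to its base URI.

    An element's base URI is its xml:base value resolved against the
    base URI inherited from its parent, or the inherited base URI
    unchanged when the element carries no xml:base attribute.
    """
    result = {}

    def walk(elem, inherited):
        # urljoin returns the inherited base when the reference is empty.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        result[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, document_uri)
    return result


doc = ET.fromstring(
    '<doc xml:base="http://example.org/a/">'
    '<part xml:base="b/"><leaf/></part>'
    '<other/>'
    '</doc>'
)
uris = base_uris(doc, "http://example.com/start.xml")
```

In this sketch the leaf element inherits the base URI of part, and other inherits the base URI of doc, because neither carries an xml:base attribute of its own.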

3 The basic XML processor profile

The basic recommended approach to the construction of a data model from a well-formed and namespace well-formed XML document requires the following:

  1. Processing of the document as required of conformant non-validating XML processors while reading all external markup declarations;

  2. Maintenance of the base URI property of each element in conformance with [XML Base];

  3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0];

  4. Replacement of all include elements in the XInclude namespace, and namespace, xml:base and xml:lang fixup of the result, as required for conformance to [XML Inclusions (XInclude) Version 1.0 (Second Edition)].
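
Points (3) and (4) can be sketched with the Python standard library, under stated assumptions. The xml.etree.ElementInclude module offers only limited XInclude support and makes no claim of conformance. The loader below approximates xml:base fixup by recording the source URI on the included element, and no xml:lang fixup is attempted. The RESOURCES table, the loader and index_xml_ids helpers, and the example.org URI are hypothetical names introduced for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_BASE = "{%s}base" % XML_NS
XML_ID = "{%s}id" % XML_NS

# Hypothetical in-memory resources standing in for retrievable documents.
RESOURCES = {
    "http://example.org/chapter.xml":
        '<chapter xml:id="  ch1 "><title>One</title></chapter>',
}


def loader(href, parse, encoding=None):
    """Loader for ElementInclude that reads from RESOURCES."""
    if parse != "xml":
        return RESOURCES[href]  # parse="text": return the raw string
    node = ET.fromstring(RESOURCES[href])
    # Approximate xml:base fixup: record where the element came from,
    # unless it already declares its own xml:base.
    node.attrib.setdefault(XML_BASE, href)
    return node


def index_xml_ids(root):
    """Return (ids, errors): a table of xml:id values and any duplicates."""
    ids, errors = {}, []
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        # ID-type normalization: collapse runs of XML white space and trim.
        value = re.sub(r"[ \t\r\n]+", " ", raw).strip(" ")
        if value in ids:
            errors.append(value)  # duplicate ID; reported, not fatal here
        else:
            ids[value] = elem
    return ids, errors


root = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="b">'
    '<xi:include href="http://example.org/chapter.xml"/>'
    '</book>'
)
ElementInclude.include(root, loader=loader)  # expands xi:include in place
ids, errors = index_xml_ids(root)
```

Running inclusion before ID identification is a design choice of this sketch: it lets the ID table cover xml:id attributes that arrive with included content.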

The following [XProc: An XML Pipeline Language] pipeline, when implemented by a conformant processor which processes its input as required by point (1) above, implements the default process:

4 Other profiles (non-normative)

The profiles defined here, particularly the basic XML processor profile (section 3), can be used as a starting point for the definition of further profiles. For example, the media type registrations for stylesheet languages applicable to XML such as text/xsl or text/css might define a profile specifying appropriate <?xml-stylesheet type="[their media type]" . . .?> processing in addition to the processing required by the basic XML processor profile.
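
As an illustration only, the first step of such a further profile might be to locate the xml-stylesheet processing instructions in the prolog that carry a given media type. The following sketch uses the Python standard library. The function name stylesheet_pis and the sample document are assumptions, and the pseudo-attribute parsing is simplified: it does not expand character or entity references inside values.

```python
import re
from xml.dom import minidom

# Simplified pattern for name="value" or name='value' pseudo-attributes.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def stylesheet_pis(xml_text, media_type):
    """Return the pseudo-attributes of each prolog xml-stylesheet
    processing instruction whose type equals media_type."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            break  # the prolog ends at the document element
        if node.nodeType != node.PROCESSING_INSTRUCTION_NODE:
            continue
        if node.target != "xml-stylesheet":
            continue
        attrs = {}
        for match in PSEUDO_ATTR.finditer(node.data):
            name, double_quoted, single_quoted = match.groups()
            if double_quoted is not None:
                attrs[name] = double_quoted
            else:
                attrs[name] = single_quoted
        if attrs.get("type") == media_type:
            found.append(attrs)
    return found


sheets = stylesheet_pis(
    '<?xml-stylesheet type="text/css" href="style.css"?>'
    '<?xml-stylesheet type="text/xsl" href="view.xsl"?>'
    '<doc/>',
    "text/css",
)
```

What a profile then does with the selected stylesheets is for the defining specification to state; the sketch stops at selection.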

5 Conformance

Conformance is a matter for any specification which references this one to mandate, expressed in terms such as "Conforming implementations must construct input data models from XML documents as required by the basic XML processor profile."

A References

A.1 Normative References

XProc: An XML Pipeline Language
XProc: An XML Pipeline Language, Norman Walsh, Alex Milowski, and Henry S. Thompson, Editors. World Wide Web Consortium, 9 March 2010. This version is http://www.w3.org/TR/2010/REC-xproc-20100309/. The latest version is available at http://www.w3.org/TR/xproc/.
xml:id Version 1.0
xml:id Version 1.0, Norman Walsh, Daniel Veillard, and Jonathan Marsh, Editors. World Wide Web Consortium, 09 Sep 2005. This version is http://www.w3.org/TR/2005/REC-xml-id-20050909/. The latest version is available at http://www.w3.org/TR/xml-id/.
XML Inclusions (XInclude) Version 1.0 (Second Edition)
XML Inclusions (XInclude) Version 1.0 (Second Edition), David Orchard, Jonathan Marsh, and Daniel Veillard, Editors. World Wide Web Consortium, 15 Nov 2006. This version is http://www.w3.org/TR/2006/REC-xinclude-20061115/. The latest version is available at http://www.w3.org/TR/xinclude/.
Extensible Markup Language (XML) 1.0 (Fifth Edition)
Extensible Markup Language (XML) 1.0 (Fifth Edition), Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 26 Nov 2008. This version is http://www.w3.org/TR/2008/REC-xml-20081126/. The latest version is available at http://www.w3.org/TR/xml/.
Extensible Markup Language (XML) 1.1 (Second Edition)
Extensible Markup Language (XML) 1.1 (Second Edition), Tim Bray, John Cowan, Jean Paoli, et al., Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml11-20060816/. The latest version is available at http://www.w3.org/TR/xml11/.
Namespaces in XML 1.0 (Second Edition)
Namespaces in XML 1.0 (Second Edition), Tim Bray, Dave Hollander, Richard Tobin, and Andrew Layman, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names-20060816/. The latest version is available at http://www.w3.org/TR/xml-names/.
Namespaces in XML 1.1 (Second Edition)
Namespaces in XML 1.1 (Second Edition), Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816/. The latest version is available at http://www.w3.org/TR/xml-names11/.
XML Base
XML Base (Second Edition), Jonathan Marsh, Editor. World Wide Web Consortium, 28 January 2009. This version is http://www.w3.org/TR/2009/REC-xmlbase-20090128/. The latest version is available at http://www.w3.org/TR/xmlbase/.

A.2 Non-normative References

XML Information Set (Second Edition)
XML Information Set (Second Edition), John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204/. The latest version is available at http://www.w3.org/TR/xml-infoset/.
XML Path Language (XPath) Version 1.0
XML Path Language (XPath) Version 1.0, James Clark and Steven DeRose, Editors. World Wide Web Consortium, 16 Nov 1999. This version is http://www.w3.org/TR/1999/REC-xpath-19991116/. The latest version is available at http://www.w3.org/TR/xpath/.
XQuery 1.0 and XPath 2.0 Data Model (XDM)
XQuery 1.0 and XPath 2.0 Data Model (XDM), Ashok Malhotra, Jonathan Marsh, Norman Walsh, et al., Editors. World Wide Web Consortium, 21 Nov 2006. This version is http://www.w3.org/TR/2006/PR-xpath-datamodel-20061121/. The latest version is available at http://www.w3.org/TR/xpath-datamodel/.