

W3C

XProc: An XML Pipeline Language

W3C Working Draft (with revision marks) 1 May 2008

This Version:
http://www.w3.org/TR/2008/WD-xproc-20080501/
Latest Version:
http://www.w3.org/TR/xproc/
Previous versions:
http://www.w3.org/TR/2007/WD-xproc-20071129/
http://www.w3.org/TR/2007/WD-xproc-20070920/
http://www.w3.org/TR/2007/WD-xproc-20070706/
http://www.w3.org/TR/2007/WD-xproc-20070405/
Editors:
Norman Walsh, Sun Microsystems, Inc.
Alex Milowski, Invited expert
Henry S. Thompson, University of Edinburgh

This document is also available in these non-normative formats: XML, Revision markup


Abstract

This specification describes the syntax and semantics of XProc: An XML Pipeline Language, a language for describing operations to be performed on XML documents.

An XML Pipeline specifies a sequence of operations to be performed on zero or more XML documents. Pipelines generally accept zero or more XML documents as input and produce zero or more XML documents as output. Pipelines are made up of simple steps which perform atomic operations on XML documents and constructs similar to conditionals, iteration, and exception handlers which control which steps are executed.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was produced by the XML Processing Model Working Group which is part of the XML Activity. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Since the last public working draft, the Working Group has considered several hundred comments in nearly 150 threads. We've responded to many of these by changing the specification. Some of the significant changes in this draft are:

  1. Removed implicit pipeline inputs and outputs. See Section 2.3, “Primary Inputs and Outputs”.

  2. Support either XPath 1.0 or XPath 2.0 as the pipeline expression language.

  3. Support both XSLT 1.0 and XSLT 2.0 on a single p:xslt step.

  4. Added a p:language system property.

  5. Added fixup-xml-base and fixup-xml-lang options to p:xinclude.

  6. Renamed c:http-request and c:http-response to simply c:request and c:response.

  7. Added psvi-required attribute to pipelines.

  8. Fairly substantial syntax changes. A p:pipeline is now just syntactic sugar for a particular p:declare-step.

  9. Changed definition of p:error to better address localization issues.

  10. Significantly reworked the syntax and semantics of variables, options, and parameters. Added p:variable. Imposed a syntactic distinction between declaration (p:option) and use (p:with-option/p:with-param) of options and parameters.

  11. Clarified the scope of variables and options.

  12. Removed value attribute from p:variable, p:option, p:with-option, and p:with-param.

  13. Removed automatic declaration of parameter input ports.

  14. Added p:base-uri and p:resolve-uri functions to support pipelines that need access to the base URI of documents.

  15. Removed ignored namespaces, added p:pipeinfo.

  16. Redefined the p:label-elements step to use a step-local variable in the XPath context.

Please send comments about this document to public-xml-processing-model-comments@w3.org (public archives are available).

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


Table of Contents

1 Introduction
2 Pipeline Concepts
2.1 Steps
2.1.1 Step names
2.2 Inputs and Outputs
2.2.1 External Documents
2.3 Primary Inputs and Outputs
2.4 Connections
2.4.1 Namespace Fixup on Outputs
2.5 Environment
2.6 XPaths in XProc
2.6.1 Processor XPath Context
2.6.2 Step XPath Context
2.6.3 XPath Extension Functions
2.7 Variables
2.8 Options
2.9 Parameters
2.10 Security Considerations
2.11 Versioning Considerations
3 Syntax Overview
3.1 XProc Namespaces
3.2 Scoping of Names
3.3 Base URIs and xml:base
3.4 Unique identifiers
3.5 Associating Documents with Ports
3.6 Documentation
3.7 Processor annotations
3.8 Extension attributes
3.9 Syntax Summaries
4 Steps
4.1 p:pipeline
4.2 p:for-each
4.2.1 XPath Context
4.3 p:viewport
4.3.1 XPath Context
4.4 p:choose
4.4.1 p:xpath-context
4.4.2 p:when
4.4.3 p:otherwise
4.5 p:group
4.6 p:try
4.6.1 The Error Vocabulary
4.7 Atomic Steps
4.8 Extension Steps
4.8.1 Syntactic Shortcut for Option Values
5 Other pipeline elements
5.1 p:input
5.1.1 Document Inputs
5.1.2 Parameter Inputs
5.2 p:iteration-source
5.3 p:viewport-source
5.4 p:output
5.5 p:log
5.6 p:serialization
5.7 Variables, Options, and Parameters
5.7.1 p:variable
5.7.2 p:option
5.7.3 p:with-option
5.7.4 p:with-param
5.7.5 Namespaces on variables, options, and parameters
5.8 p:declare-step
5.8.1 Declaring atomic steps
5.8.2 Declaring pipelines
5.9 p:library
5.10 p:import
5.11 p:pipe
5.12 p:inline
5.13 p:document
5.14 p:empty
5.15 p:documentation
5.16 p:pipeinfo
6 Errors
6.1 Static Errors
6.2 Dynamic Errors
6.3 Step Errors
7 Standard Step Library
7.1 Required Steps
7.1.1 p:add-attribute
7.1.2 p:add-xml-base
7.1.3 p:compare
7.1.4 p:count
7.1.5 p:delete
7.1.6 p:directory-list
7.1.7 p:error
7.1.8 p:escape-markup
7.1.9 p:http-request
7.1.10 p:identity
7.1.11 p:insert
7.1.12 p:label-elements
7.1.13 p:load
7.1.14 p:make-absolute-uris
7.1.15 p:namespace-rename
7.1.16 p:pack
7.1.17 p:parameters
7.1.18 p:rename
7.1.19 p:replace
7.1.20 p:set-attributes
7.1.21 p:sink
7.1.22 p:split-sequence
7.1.23 p:store
7.1.24 p:string-replace
7.1.25 p:unescape-markup
7.1.26 p:unwrap
7.1.27 p:wrap
7.1.28 p:wrap-sequence
7.1.29 p:xinclude
7.1.30 p:xslt
7.2 Optional Steps
7.2.1 p:exec
7.2.2 p:hash
7.2.3 p:uuid
7.2.4 p:validate-with-relax-ng
7.2.5 p:validate-with-schematron
7.2.6 p:validate-with-xml-schema
7.2.7 p:www-form-urldecode
7.2.8 p:www-form-urlencode
7.2.9 p:xquery
7.2.10 p:xsl-formatter
7.3 Serialization Options

Appendices

1 Introduction

An XML Pipeline specifies a sequence of operations to be performed on a collection of XML input documents. Pipelines take zero or more XML documents as their input and produce zero or more XML documents as their output.

A pipeline consists of steps. Like pipelines, steps take zero or more XML documents as their inputs and produce zero or more XML documents as their outputs. The inputs of a step come from the web, from the pipeline document, from the inputs to the pipeline itself, or from the outputs of other steps in the pipeline. The outputs from a step are consumed by other steps, are outputs of the pipeline as a whole, or are discarded.

There are two kinds of steps: atomic steps and compound steps. Atomic steps carry out single operations and have no substructure as far as the pipeline is concerned, whereas compound steps control the execution of other steps, which they include in the form of one or more subpipelines.

This specification defines a standard library, Section 7, “Standard Step Library”, of steps. Pipeline implementations may support additional types of steps as well.

Figure 1, “A simple, linear XInclude/Validate pipeline” is a graphical representation of a simple pipeline that performs XInclude processing and validation on a document.

Figure 1. A simple, linear XInclude/Validate pipeline

This is a pipeline that consists of two atomic steps, XInclude and Validate with XML Schema. The pipeline itself has two inputs, “source” (a source document) and “schemas” (a sequence of W3C XML Schemas). How inputs are connected to XML documents outside the pipeline is implementation-defined. The XInclude step reads the pipeline input “source” and produces a result document. The Validate with XML Schema step reads the pipeline input “schemas” and the result of the XInclude step and produces its own result document. The result of the validation, “result”, is the result of the pipeline. (For consistency across the step vocabulary, the standard input is usually named “source” and the standard output is usually named “result”.)

The pipeline document for this pipeline is shown in Example 1, “A simple, linear XInclude/Validate pipeline”.

Example 1. A simple, linear XInclude/Validate pipeline
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
           name="xinclude-and-validate">
  <p:input port="source" primary="true"/>
  <p:input port="schemas" sequence="true"/>
  <p:output port="result">
    <p:pipe step="validated" port="result"/>
  </p:output>

  <p:xinclude name="included">
    <p:input port="source">
      <p:pipe step="xinclude-and-validate" port="source"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-xml-schema name="validated">
    <p:input port="source">
      <p:pipe step="included" port="result"/>
    </p:input>
    <p:input port="schema">
      <p:pipe step="xinclude-and-validate" port="schemas"/>
    </p:input>
  </p:validate-with-xml-schema>
</p:declare-step>


The example in Example 1, “A simple, linear XInclude/Validate pipeline” is very verbose. It makes all of the connections seen in the figure explicit. In practice, pipelines do not have to be this verbose. XProc supports defaults for many common cases:

  • If you use p:pipeline instead of p:declare-step, the “source” input port and “result” output port are implicitly declared for you.

  • Where inputs and outputs are connected between sequential sibling steps, they do not have to be made explicit.

The same pipeline, using XProc defaults, is shown in Example 2, “A simple, linear XInclude/Validate pipeline (simplified)”.

Example 2. A simple, linear XInclude/Validate pipeline (simplified)
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            name="xinclude-and-validate">
  <p:input port="schemas" sequence="true"/>

  <p:xinclude/>

  <p:validate-with-xml-schema>
    <p:input port="schema">
      <p:pipe step="xinclude-and-validate" port="schemas"/>
    </p:input>
  </p:validate-with-xml-schema>
</p:pipeline>

Figure 2, “A validate and transform pipeline” is a more complex example: it performs schema validation with an appropriate schema and then styles the validated document.

Figure 2. A validate and transform pipeline

The heart of this example is the conditional. The “choose” step evaluates an XPath expression over a test document. Based on the result of that expression, one or another branch is run. In this example, each branch consists of a single validate step.

Example 3. A validate and transform pipeline
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc">

  <p:choose>
    <p:when test="/*[@version &lt; 2.0]">
      <p:validate-with-xml-schema>
        <p:input port="schema">
          <p:document href="v1schema.xsd"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:when>

    <p:otherwise>
      <p:validate-with-xml-schema>
        <p:input port="schema">
          <p:document href="v2schema.xsd"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:otherwise>
  </p:choose>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="stylesheet.xsl"/>
    </p:input>
  </p:xslt>
</p:pipeline>

This example, like the preceding, relies on XProc defaults for simplicity. It is always valid to write the fully explicit form if you prefer.

The media type for pipeline documents is application/xml. Often, pipeline documents are identified by the extension .xpl.

2 Pipeline Concepts

[Definition: A pipeline is a set of connected steps, with outputs of one step flowing into inputs of another.] A pipeline is itself a step and must satisfy the constraints on steps. Connections between steps occur where the input of one step is bound to the output of another.

The result of evaluating a pipeline (or subpipeline) is the result of evaluating the steps that it contains, in an order consistent with the connections between them. A pipeline must behave as if it evaluated each step each time it occurs. Unless otherwise indicated, implementations must not assume that steps are functional (that is, that their outputs depend only on their inputs, options, and parameters) or side-effect free.

The pattern of connections between steps will not always completely determine their order of evaluation. The evaluation order of steps not connected to one another is implementation-dependent.

2.1 Steps

[Definition: A step is the basic computational unit of a pipeline.] A typical step has zero or more inputs, from which it receives XML documents to process, zero or more outputs, to which it sends XML document results, and can have options and/or parameters.

There are three kinds of steps: atomic, compound, and multi-container.

[Definition: An atomic step is a step that performs a unit of XML processing, such as XInclude or transformation, and has no internal subpipeline.] Atomic steps carry out fundamental XML operations and can perform arbitrary amounts of computation, but they are indivisible. An XSLT step, for example, performs XSLT processing; a Validate with XML Schema step validates one input with respect to some set of XML Schemas, etc.

There are many types of atomic steps. The standard library of atomic steps is described in Section 7, “Standard Step Library”, but implementations may provide others as well. What additional step types, if any, are provided is implementation-defined. Each use, or instance, of an atomic step invokes the processing defined by that type of step. A pipeline may contain instances of many types of steps and many instances of the same type of step.

Compound steps, on the other hand, control and organize the flow of documents through a pipeline, reconstructing familiar programming language functionality such as conditionals, iterators and exception handling. They contain other steps, whose evaluation they control.

[Definition: A compound step is a step that contains a subpipeline.] That is, a compound step differs from an atomic step in that its semantics are at least partially determined by the steps that it contains.

Finally, there are two “multi-container steps”: p:choose and p:try. [Definition: A multi-container step is a step that contains several alternate subpipelines.] Each subpipeline is identified by a non-step wrapper element: p:when and p:otherwise in the case of p:choose, p:group and p:catch in the case of p:try.

The runtime semantics of a multi-container step are that it behaves as if it evaluated exactly one of its subpipelines. In this sense, they function like compound steps.

[Definition: A compound step or multi-container step is a container for the steps directly within it or within non-step wrappers directly within it.] [Definition: The steps that occur directly within, or within non-step wrappers directly within, a step are called that step's contained steps. In other words, “container” and “contained steps” are inverse relationships.] [Definition: The ancestors of a step are its container, the container of its container, and all other containers above it.]

[Definition: Sibling steps (and the connections between them) form a subpipeline.] [Definition: The last step in a subpipeline is its last step in document order.]

Note that user-defined pipelines, pfx:user-pipeline, are atomic; although a pipeline declaration contains a subpipeline, a step which invokes a user-defined pipeline does not.

Steps have “ports” into which inputs and outputs are connected or “bound”. Each step has a number of input ports and a number of output ports; a step can have zero input ports and/or zero output ports. (All steps have an implicit output port for reporting errors that must not be declared.) The names of all ports on each step must be unique on that step (you can't have two input ports named “source”, nor can you have an input port named “schema” and an output port named “schema”).
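As a non-normative sketch of the port naming rule, the declaration below gives a step two input ports and one output port, each with a distinct name. The step type ex:merge and the namespace URI bound to the ex prefix are hypothetical, invented only for illustration; the declaration syntax is assumed to follow the p:declare-step form used in Example 1.

```xml
<!-- Hypothetical step type: ex:merge and its namespace URI are illustrative only. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/steps"
                type="ex:merge">
  <!-- Every port name is unique on this step. -->
  <p:input port="source"/>
  <p:input port="schema"/>
  <p:output port="result"/>
  <!-- Adding <p:output port="schema"/> here would violate the rule,
       because "schema" already names an input port on this step. -->
</p:declare-step>
```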

Steps may have any number of options, all with unique names. A step can have zero options.

Steps may have parameter input ports, on which parameters can be passed. The parameters passed on a particular parameter input port must be uniquely named. If multiple parameters with the same name are used, only one of the values will actually be available to the step. A step can have zero, one, or many parameter input ports, and each parameter port can have zero or more parameters passed on it.

All of the different instances of steps (atomic or compound) in a pipeline can be distinguished from one another by name. If the pipeline author does not provide a name for a step, a default name is manufactured automatically.

2.1.1 Step names

The name attribute on any step can be used to give it a name. The name must be unique within its scope, see Section 3.2, “Scoping of Names”.

If the pipeline author does not provide an explicit name, the processor manufactures a default name. All default names are of the form “!1.m.n…” where “m” is the position of the step's highest ancestor within the pipeline document or library which contains it, “n” is the position of the next-highest ancestor, and so on, including both steps and non-step wrappers. For example, consider the pipeline in Example 3, “A validate and transform pipeline”. The p:pipeline step has no name, so it gets the default name “!1”; the p:choose gets the name “!1.2”; the first p:when gets the name “!1.2.1”, etc. If the p:choose had had a name, it would not have received a default name, but it would still have been counted and its first p:when would still have been “!1.2.1”.

Providing every step in the pipeline with an interoperable name has several benefits:

  1. It provides a simple mechanism for identifying all steps from outside the pipeline. It allows implementors to refer to all steps in an interoperable fashion, for example, in error messages.

  2. Pragmatically, we say that readable ports are identified by a step name/port name pair. By manufacturing names for otherwise anonymous steps, we include implicit bindings without changing our model.

In a valid pipeline that runs successfully to completion, the manufactured names aren't visible (except perhaps in debugging or logging output).

Note

The format for defaulted names does not conform to the requirements of an NCName. This is an explicit design decision; it prevents pipelines from using the defaulted names on p:pipe elements. If an explicit connection is required, the pipeline author must provide an explicit name for the step.

2.2 Inputs and Outputs

Although some steps can read and write non-XML resources, what flows between steps through input ports and output ports are exclusively XML documents or sequences of XML documents.

For the purposes of this specification, an XML document is an [Infoset]. Implementations are free to transmit infosets as sequences of characters, sequences of events, object models, or any other representation that preserves the necessary infoset properties (see Section A.3, “Infoset Conformance”).

Most steps in this specification manipulate XML documents, or portions of XML documents. In these cases, we speak of changing elements, attributes, or nodes without prejudice to the actual representation used by an implementation.

An implementation may make it possible for a step to produce non-XML output (through channels other than a named output port)—for example, writing a PDF document to a URI—but that output cannot flow through the pipeline. Similarly, one can imagine a step that takes no pipeline inputs, reads a non-XML file from a URI, and produces an XML output. But the non-XML data cannot arrive on an input port to a step.

It is a dynamic error (err:XD0001) if a non-XML resource is produced on a step output or arrives on a step input.

The common case is that each step has one or more inputs and one or more outputs. Figure 3, “An atomic step” illustrates symbolically an atomic step with two inputs and one output.

An atomic step with two inputs and one output
Figure 3. An atomic step

All atomic steps are defined by a p:declare-step. The declaration of an atomic step type defines the input ports, output ports, and options of all steps of that type. For example, every p:validate-with-xml-schema step has two inputs, named “source” and “schema”, and one output named “result”, and the same set of options.

Like atomic steps, top level, user-defined pipelines also have declarations. The situation is slightly more complicated for the other compound steps because they don't have separate declarations; each instance of the compound step serves as its own declaration. On these steps, the number and names of the outputs can be different on each instance of the step.

Figure 4, “A compound step” illustrates symbolically a compound step with one subpipeline and one output. As you can see from the diagram, the output from the compound step comes from one of the outputs of the subpipeline within the step.

A compound step with two inputs and one output
Figure 4. A compound step

[Definition: The input ports declared on a step are its declared inputs.] [Definition: The output ports declared on a step are its declared outputs.] When a step is used in a pipeline, it is connected to other steps through its inputs and outputs.

When a step is used, all of the declared inputs of the step must be connected. Each input can be connected to:

  • The output port of some other step.

  • A fixed, inline document or sequence of documents.

  • A document read from a URI.

  • One of the inputs declared on one of its ancestors.

  • A special port provided by an ancestor compound step, for example, “current” in a p:for-each or p:viewport.

When an input accepts a sequence of documents, the documents can come from any combination of these locations.
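The following non-normative sketch combines several of the binding kinds listed above on a single input. It assumes that the p:count step accepts a sequence of documents on its “source” port and produces a single document; the file name extra.xml and the inline note element are made up for illustration.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" name="main">
  <p:xinclude name="expanded"/>

  <!-- Assumption: p:count accepts a sequence on "source" and emits one document. -->
  <p:count>
    <p:input port="source">
      <!-- The output port of some other step. -->
      <p:pipe step="expanded" port="result"/>
      <!-- A fixed, inline document. -->
      <p:inline><note>inline document</note></p:inline>
      <!-- A document read from a URI (hypothetical file name). -->
      <p:document href="extra.xml"/>
      <!-- An input declared on an ancestor, here the pipeline itself. -->
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:count>
</p:pipeline>
```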

The declared outputs of a step may be connected to:

  • The input port of some other step.

  • One of the outputs declared on its container.

The primary output port of a step must be connected, but other outputs can remain unconnected. Any documents produced on an unconnected output port are discarded.

Output ports on compound steps have a dual nature: from the perspective of the compound step's siblings, its outputs are just ordinary outputs and must be connected as described above. From the perspective of the subpipeline inside the compound step, they are inputs into which something must be connected.

Within a compound step, the declared outputs of the step can be connected to:

  • The output port of some contained step.

  • A fixed, inline document or sequence of documents.

  • A document read from a URI.

Each input and output is declared to accept or produce either a single document or a sequence of documents. It is not an error to connect a port that is declared to produce a sequence of documents to a port that is declared to accept only a single document. It is, however, an error if the former step actually produces more than one document at run time.

It is also not an error to connect a port that is declared to produce a single document to a port that is declared to accept a sequence. A single document is the same as a sequence of one document.
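A minimal sketch of the first case, written in the explicit style of Example 1: the pipeline input is declared to accept a sequence, but it is connected to a step input that is assumed to accept only a single document (the “source” port of p:xinclude). The connection itself is not an error; a problem can arise only at run time, for example if more than one document actually arrives.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" name="main">
  <!-- Declared to accept a sequence of documents. -->
  <p:input port="source" sequence="true"/>
  <!-- Declared to produce a single document. -->
  <p:output port="result">
    <p:pipe step="included" port="result"/>
  </p:output>

  <!-- Assumption: the "source" port of p:xinclude accepts a single document.
       Connecting a sequence port to it is allowed; an error occurs only if
       more than one document actually flows through at run time. -->
  <p:xinclude name="included">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:xinclude>
</p:declare-step>
```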

An output port may be connected to more than one input port. At runtime this will result in distinct copies of the output.
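The sketch below shows one output feeding two inputs. It assumes that p:store takes its target URI in an href option and that the option can be written as an attribute using the syntactic shortcut for option values; the file names are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" name="main">
  <p:input port="source"/>
  <p:output port="result">
    <p:pipe step="validated" port="result"/>
  </p:output>

  <p:xinclude name="included">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:xinclude>

  <!-- First reader of the "result" port of "included". -->
  <p:validate-with-xml-schema name="validated">
    <p:input port="source">
      <p:pipe step="included" port="result"/>
    </p:input>
    <p:input port="schema">
      <p:document href="schema.xsd"/>
    </p:input>
  </p:validate-with-xml-schema>

  <!-- Second reader of the same port; it receives its own copy. -->
  <p:store href="included-copy.xml">
    <p:input port="source">
      <p:pipe step="included" port="result"/>
    </p:input>
  </p:store>
</p:declare-step>
```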

[Definition: The signature of a step is the set of inputs, outputs, and options that it is declared to accept.] The declaration for a step provides a fixed signature which all its instances share.

[Definition: A step matches its signature if and only if it specifies an input for each declared input, it specifies no inputs that are not declared, it specifies an option for each option that is declared to be required, and it specifies no options that are not declared.] In other words, every input and required option must be specified and only inputs and options that are declared may be specified. Options that aren't required do not have to be specified.
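As a non-normative illustration of matching a signature, the sketch below declares a hypothetical atomic step type, ex:stamp, with one required option and one optional option, and then uses it. The ex prefix, its namespace URI, the step and option names, and the placement of both the declaration and the pipeline inside one p:library are all assumptions made for this example.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:ex="http://example.org/ns/steps">

  <!-- Hypothetical step type with a fixed signature. -->
  <p:declare-step type="ex:stamp">
    <p:input port="source"/>
    <p:output port="result"/>
    <p:option name="label" required="true"/>
    <p:option name="format"/>
  </p:declare-step>

  <!-- This use matches the signature: the single declared input is connected
       (implicitly, to the pipeline's "source"), the required option "label"
       is specified, and the optional "format" is simply left out. -->
  <p:pipeline type="ex:main">
    <ex:stamp>
      <p:with-option name="label" select="'draft'"/>
    </ex:stamp>
  </p:pipeline>
</p:library>
```

Omitting the p:with-option for “label”, or specifying an option named anything other than “label” or “format”, would mean the step no longer matches its signature.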

Steps may also produce error, warning, and informative messages. These messages are captured and provided on the error port inside of a p:catch. Outside of a try/catch, the disposition of error messages is implementation-dependent.
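A minimal sketch of reading captured messages, assuming the p:try, p:group, and p:catch structure described in Section 4.6: if validation fails, the p:catch branch reads its own “error” port and passes what it finds along as the result. The step name “recover” and the schema file name are hypothetical.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc">
  <p:try>
    <p:group>
      <p:validate-with-xml-schema>
        <p:input port="schema">
          <p:document href="schema.xsd"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:group>

    <p:catch name="recover">
      <!-- Inside the catch, the messages are available on the "error" port. -->
      <p:identity>
        <p:input port="source">
          <p:pipe step="recover" port="error"/>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
</p:pipeline>
```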

2.2.1 External Documents

It's common for some of the documents used in processing a pipeline to be read from URIs. Sometimes this occurs directly, for example with a p:document element. Sometimes it occurs indirectly, for example if an implementation allows the URI of a pipeline input to be specified on the command line or if an p:xslt step encounters an xsl:import in the stylesheet that it is processing. It's also common for some of the documents produced in processing a pipeline to be written to locations which have, or at least could have, a URI.

The process of dereferencing a URI to retrieve a document is often more interesting than it seems at first. On the web, it may involve caches, proxies, and various forms of indirection. Resolving a URI locally may involve resolvers of various sorts and possibly appeal to implementation-dependent mechanisms such as catalog files.

In XProc, the situation is made even more interesting by the fact that many intermediate results produced by steps in the pipeline have base URIs. Whether or not (and when and how) the intermediate results that pass between steps are ever written to a filesystem is implementation-dependent.

In Version 1.0 of XProc, how (or if) implementers provide local resolution mechanisms and how (or if) they provide access to intermediate results by URI is implementation-defined.

Version 1.0 of XProc does not require implementations to guarantee that multiple attempts to dereference the same URI always produce consistent results.

Note

On the one hand, this is a somewhat unsatisfying state of affairs because it leaves room for interoperability problems. On the other, it is not expected to cause such problems very often in practice.

If these problems arise in practice, implementers are encouraged to use the existing extension mechanisms to give users the control needed to circumvent them. Should such mechanisms become widespread, a standard mechanism could be added in some future version of the language.

2.3 Primary Inputs and Outputs

As a convenience for pipeline authors, each step may have one input port designated as the primary input port and one output port designated as the primary output port.

[Definition: If a step has a document input port which is explicitly marked “primary='true'”, or if it has exactly one document input port and that port is not explicitly marked “primary='false'”, then that input port is the primary input port of the step.] If a step has a single input port and that port is explicitly marked “primary='false'”, or if a step has more than one input port and none is explicitly marked as the primary, then the primary input port of that step is undefined. A step can have at most one primary input port.

[Definition: If a step has a document output port which is explicitly marked “primary='true'”, or if it has exactly one document output port and that port is not explicitly marked “primary='false'”, then that output port is the primary output port of the step.] If a step has a single output port and that port is explicitly marked “primary='false'”, or if a step has more than one output port and none is explicitly marked as the primary, then the primary output port of that step is undefined. A step can have at most one primary output port.

The special significance of primary input and output ports is that they are connected automatically by the processor if no explicit binding is given. Generally speaking, if two steps appear sequentially in a subpipeline, then the primary output of the first step will automatically be connected to the primary input of the second.
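The primary-port rules can be sketched with a hypothetical declaration. The step type ex:check, its namespace URI, and its port names are invented for illustration.

```xml
<!-- Hypothetical step type used only to illustrate primary port marking. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/steps"
                type="ex:check">
  <!-- Two inputs: only "source" is marked primary, so only it can be
       connected automatically. "rules" must always be bound explicitly. -->
  <p:input port="source" primary="true"/>
  <p:input port="rules"/>
  <!-- A lone output would be primary by default; primary="false" opts out,
       so this step has no primary output port. -->
  <p:output port="report" primary="false"/>
</p:declare-step>
```

With this declaration, a step that follows ex:check in a subpipeline would not receive its output automatically, because ex:check has no primary output to connect.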

Additionally, if a compound step has no declared outputs and the last step in its subpipeline has an unbound primary output, then an implicit primary output port will be added to the compound step (and consequently the last step's primary output will be bound to it). This implicit output port has no name. It inherits the sequence property of the port bound to it.
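As a hedged illustration of these defaulting rules, the fragment below relies entirely on automatic connections. The step names are invented for the example, and the fragment is assumed to sit inside a pipeline that supplies a primary input.

```xml
<!-- Reads the container's primary input, because no binding is given. -->
<p:xinclude name="expand"/>

<!-- p:for-each declares no outputs. Its last (only) contained step has an
     unbound primary output, so the p:for-each gains an implicit primary
     output that carries the results of that step. -->
<p:for-each name="each-section">
  <p:iteration-source select="//section"/>
  <p:identity name="copy-section"/>
</p:for-each>

<!-- Reads the implicit primary output of the p:for-each above. -->
<p:count name="count-sections"/>
```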

2.4 Connections

Steps are connected together by their input ports and output ports. It is a static error (err:XS0001) if there are any loops in the connections between steps: no step can be connected to itself nor can there be any sequence of connections through other steps that leads back to itself.
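The following fragment is a minimal sketch of a connection loop, with invented step names. Each step reads the output of the other, so a processor would be expected to reject it with err:XS0001.

```xml
<!-- "a" reads from "b" ... -->
<p:identity name="a">
  <p:input port="source">
    <p:pipe step="b" port="result"/>
  </p:input>
</p:identity>

<!-- ... and "b" reads from "a": a loop, which is a static error. -->
<p:identity name="b">
  <p:input port="source">
    <p:pipe step="a" port="result"/>
  </p:input>
</p:identity>
```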

2.4.1 Namespace Fixup on Outputs

XProc processors are expected, and sometimes required, to perform namespace fixup. Unless the semantics of a step explicitly says otherwise:

  • The in-scope namespaces associated with a node (even those that are inherited from namespace bindings that appear among its ancestors in the document in which it appears initially) are assumed to travel with it.

  • Changes to one part of a tree (wrapping or unwrapping a node or renaming an element, for example) do not change the in-scope namespaces associated with the descendants of the node so changed.

As a result, some steps can produce XML documents which have no direct serialization (because they include nodes with conflicting or missing namespace declarations, for example). [Definition: To produce a serializable XML document, the XProc processor must sometimes add additional namespace nodes, perhaps even renaming prefixes, to satisfy the constraints of Namespaces in XML. This process is referred to as namespace fixup.]

Implementors are encouraged to perform namespace fixup before passing documents between steps, but they are not required to do so. Conversely, an implementation which does serialize between steps and therefore must perform such fixups, or reject documents that cannot be serialized, is also conformant.

Except where the semantics of a step explicitly require changes, processors are required to preserve the information in the documents and fragments they manipulate. In particular, the information corresponding to the [Infoset] properties [attributes], [base URI], [children], [local name], [namespace name], [normalized value], [owner], and [parent] must be preserved.

The information corresponding to [prefix], [in-scope namespaces], [namespace attributes], and [attribute type] should be preserved, with changes to the first three only as required for namespace fixup. In particular, processors are encouraged to take account of prefix information in creating new namespace bindings, to minimize negative impact on prefixed names in content.

Except for cases which are specifically called out in Section 7, “Standard Step Library”, the extent to which namespace fixup, and other checks for outputs which cannot be serialized, are performed on intermediate outputs is implementation-defined.

Whenever an implementation serializes pipeline contents, for example for pipeline outputs, logging, or as part of steps such as p:store or p:http-request, it is a dynamic error if that serialization could not be done so as to produce a document which is both well-formed and namespace-well-formed, as specified in XML and Namespaces in XML, regardless of what serialization method, if any, is called for.
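To make the idea of namespace fixup concrete, consider a hypothetical case in which a step renames an element from no namespace into the namespace “http://example.com/ns” without supplying a prefix, leaving its child untouched. The element and namespace names are invented for the example, and the two serializations shown are only two of the results a processor might reasonably produce.

```xml
<!-- Before the rename: neither element is in a namespace. -->
<doc><title>Example</title></doc>

<!-- One possible fixup: declare a default namespace on the renamed element
     and undeclare it on the child, which remains in no namespace. -->
<doc xmlns="http://example.com/ns"><title xmlns="">Example</title></doc>

<!-- Another possible fixup: invent a prefix for the renamed element. -->
<ns1:doc xmlns:ns1="http://example.com/ns"><title>Example</title></ns1:doc>
```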

2.5 Environment

[Definition: The environment is a context-dependent collection of information available within subpipelines.] Most of the information in the environment is static and can be computed for each subpipeline before evaluation of the pipeline as a whole begins. The in-scope bindings have to be calculated as the pipeline is being evaluated.

The environment consists of:

  1. A set of readable ports. [Definition: The readable ports are a set of step name/port name pairs.] Inputs and outputs can only be connected to readable ports.

  2. A default readable port. [Definition: The default readable port, which may be undefined, is a specific step name/port name pair from the set of readable ports.]

  3. A set of in-scope bindings. [Definition: The in-scope bindings are a set of name-value pairs, based on option and variable bindings.]

[Definition: The empty environment contains no readable ports, an undefined default readable port and no in-scope bindings.]

Unless otherwise specified, the environment of a contained step is its inherited environment. [Definition: The inherited environment of a contained step is an environment that is the same as the environment of its container with the standard modifications. ]

The standard modifications made to an inherited environment are:

  • The declared inputs of the container are added to the readable ports.

    In other words, contained steps can see the inputs to their container.

  • The union of all the declared outputs of all of the container's contained steps is added to the readable ports.

    In other words, sibling steps can see each other's outputs in addition to the outputs visible to their container.

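  • If there is a preceding sibling step element:

      • If that preceding sibling has a primary output port, then that output port becomes the default readable port.

      • Otherwise, the default readable port is undefined.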

  • If there is not a preceding sibling step element, the default readable port is the primary input port of the container, if it has one, otherwise the default readable port is unchanged.

  • The names and values from each p:variable present at the beginning of the container are added, in document order, to the in-scope bindings. A new binding replaces an old binding with the same name. See Section 5.7.1, “p:variable”, for the specification of variable evaluation.

A step with no parent inherits the empty environment.
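The fragment below sketches how the readable ports and the default readable port play out for three sibling steps. The step names are invented, and the fragment is assumed to appear inside a container that has a primary input.

```xml
<!-- No preceding sibling: the default readable port is the primary input
     of the container. -->
<p:identity name="first"/>

<!-- The preceding sibling has a primary output, so the default readable
     port here is the "result" port of "first". -->
<p:identity name="second"/>

<!-- Sibling outputs are readable ports, so an explicit binding can name any
     of them, not only the default readable port. -->
<p:identity name="third">
  <p:input port="source">
    <p:pipe step="first" port="result"/>
    <p:pipe step="second" port="result"/>
  </p:input>
</p:identity>
```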

2.6 XPaths in XProc

XProc uses XPath as an expression language. XPath expressions are evaluated by the XProc processor in several places: on compound steps, to compute the default values of options and the values of variables; on atomic steps, to compute the actual values of options and the values of parameters.

XPath expressions are also passed to some steps. These expressions are evaluated by the implementations of the individual steps.

This distinction can be seen in the following example:

<p:variable name="home" select="'http://example.com/docs'"/>

<p:load name="read-from-home">
  <p:with-option name="href" select="concat($home,'/document.xml')"/>
</p:load>

<p:split-sequence name="select-chapters" test="@role='chapter'">
  <p:input port="source" select="//section"/>
</p:split-sequence>

The href option of the p:load step is evaluated by the XProc processor. The actual href option received by the step is simply the string literal “http://example.com/docs/document.xml”. (The selection on the source input of the select-chapters step is also evaluated by the XProc processor.)

The XPath expression “@role='chapter'” is passed literally to the test option on the p:split-sequence step. That's because the nature of the p:split-sequence is that it evaluates the expression. Only some options on some steps expect XPath expressions.

The XProc processor evaluates all of the XPath expressions in select attributes on variables, options, parameters, and inputs, in match attributes on p:viewport, and in test attributes on p:when steps.

An XProc implementation can use either [XPath 1.0] or [XPath 2.0] to evaluate these expressions. This is a compromise driven entirely by the timing of XProc development. During the development of this specification, the community indicated that it was too early to mandate that all implementations use XPath 2.0 and too late to mandate that all implementations use XPath 1.0.

Many, many expressions that are likely to be used in XProc pipelines are the same in both versions (simple element tests, ancestor and descendant tests, string-based attribute tests, etc.).

As an aid to interoperability, pipeline authors may indicate the version of XPath that they are using. The attribute xpath-version may be used on p:pipeline, p:declare-step, or p:library to identify the XPath version that should be used to evaluate XPath expressions on the pipeline(s). The attribute is lexically scoped, but see below.

If an xpath-version is specified on a p:pipeline or p:declare-step, then that is the version of XPath that the step uses. If it does not specify a version, but a version is specified on one of its ancestors, the nearest ancestor version specified is the version that it uses. If no version is specified on the step or among its ancestors, then its XPath version is implementation-defined.

Note

The decision about which XPath version applies can be made dynamically. For example, if a pipeline explicitly labeled with xpath-version “1.0” imports a library that does not specify a version, the implementation may elect to make the implementation-defined XPath version of the steps in the library also “1.0”. If the same implementation imports that library into a pipeline explicitly labeled with xpath-version “2.0”, it can make the implementation-defined version of those steps “2.0”.
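The lexical scoping of xpath-version might look like the following sketch. The “ex” prefix, its namespace URI, and the step type names are invented for illustration, and the subpipelines are left out.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:ex="http://example.com/ns/steps"
           xpath-version="2.0">

  <!-- No xpath-version here: the nearest ancestor value, "2.0", applies. -->
  <p:pipeline type="ex:inherits-version">
    <!-- subpipeline omitted -->
  </p:pipeline>

  <!-- An explicit value takes precedence over the value on the library. -->
  <p:pipeline type="ex:overrides-version" xpath-version="1.0">
    <!-- subpipeline omitted -->
  </p:pipeline>

</p:library>
```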

The following rules determine how the indicated version and the implementation's actual version interact:

  1. If the indicated version and the implementation version are the same, then that version is used.

  2. If the indicated version is 1.0 and the implementation uses XPath 2.0 (or later), the expression must be evaluated in XPath 1.0 compatibility mode. It is a static error (err:XS0046) if the processor does not support XPath 1.0 compatibility mode.

  3. If the indicated version is 2.0 (or later) and the implementation uses XPath 1.0, the implementation must not evaluate any expression that it cannot determine will give the same result in XPath 1.0 that it would have given if XPath 2.0 had been used. It is a static error (err:XS0047) if the processor cannot determine that the expression would yield the same result.
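One expression whose result differs between the two versions is a relational comparison of two string literals. The fragment below is a sketch with invented step names. It shows why the rules above require compatibility mode in one direction and a guarantee of equivalent results in the other.

```xml
<p:choose>
  <!-- XPath 1.0, or XPath 2.0 in XPath 1.0 compatibility mode: both
       operands are converted to numbers, 10 < 9 is false, and the
       p:otherwise branch is selected. -->
  <!-- XPath 2.0 without compatibility mode: the operands are compared as
       strings by codepoint, "10" sorts before "9", and this branch is
       selected. -->
  <p:when test="'10' &lt; '9'">
    <p:identity name="compared-as-strings"/>
  </p:when>
  <p:otherwise>
    <p:identity name="compared-as-numbers"/>
  </p:otherwise>
</p:choose>
```

An XPath 1.0 implementation asked to evaluate this expression for a pipeline that indicates version 2.0 cannot produce the XPath 2.0 result, which is the situation rule 3 addresses.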

2.6.1 Processor XPath Context

When the XProc processor evaluates an XPath expression using XPath 1.0, unless otherwise indicated by a particular step, it does so with the following context:

context node

The document node of a document. The document is either specified with a binding or is taken from the default readable port. It is a dynamic error (err:XD0008) if a document sequence appears where a document to be used as the context node is expected.

If there is no binding and there is no default readable port then the context node is an empty document node.

context position and context size

The context position and context size are both “1”.

variable bindings

The union of the in-scope options and variables is available as variable bindings to the XPath processor.

function library

The [XPath 1.0] core function library and the functions described in Section 2.6.3, “XPath Extension Functions”.

in-scope namespaces

The namespace bindings in-scope on the element where the expression occurred.

When the XProc processor evaluates an XPath expression using XPath 2.0, unless otherwise indicated by a particular step, it does so with the following static context:

XPath 1.0 compatibility mode

Is true if the indicated XPath version is 1.0, false otherwise.

Statically known namespaces

The namespace declarations in-scope for the containing element or made available through p:namespaces.

Default element/type namespace

The null namespace.

Default function namespace

The [XPath 2.0] function namespace.

In-scope schema definitions

None.

In-scope variables

The union of the in-scope options and variables is available as variable bindings to the XPath processor.

Context item static type

Document.

Function signatures
Statically known collations

Implementation defined but must include the Unicode codepoint collation. The version of Unicode supported is implementation-defined, but it is recommended that the most recent version of Unicode be used.

Default collation

Unicode codepoint collation.

Base URI

The base URI of the element on which the expression occurs.

Statically known documents

None.

Statically known collections

None.

And the following dynamic context:

context item

The document node of a document. The document is either specified with a binding or is taken from the default readable port. It is a dynamic error (err:XD0008) if a document sequence appears where a document to be used as the context node is expected.

If there is no binding and there is no default readable port then the context node is undefined.

context position and context size

The context position and context size are both “1”.

Variable values

The union of the in-scope options and variables is available as variable bindings to the XPath processor.

Function implementations
Current dateTime
Implicit timezone

The implicit timezone is implementation defined.

Available documents

The set of available documents (those that may be retrieved with a URI) is implementation dependent.

Available collections

None.

Default collection

None.

2.6.2 Step XPath Context

When a step evaluates an XPath expression using XPath 1.0, it does so with the following context:

When a step evaluates an XPath expression using XPath 2.0, unless otherwise indicated by a particular step, it does so with the following static context:

XPath 1.0 compatibility mode

Is true if the indicated XPath version is 1.0, false otherwise.

Statically known namespaces

The namespace declarations in-scope for the containing element or made available through p:namespaces.

Default element/type namespace

The null namespace.

Default function namespace

The [XPath 2.0] function namespace.

In-scope schema definitions

None.

In-scope variables

None, unless otherwise specified by the step.

Context item static type

Document.

Function signatures

The signatures of the [XPath 2.0 Functions and Operators].

Statically known collations

Implementation defined but must include the Unicode codepoint collation.

Default collation

Unicode codepoint collation.

Base URI

The base URI of the element on which the expression occurs.

Statically known documents

None.

Statically known collections

None.

And the following dynamic context:

context item

The document node of the document that appears on the primary input of the step, unless otherwise specified by the step.

context position and context size

The context position and context size are both “1”, unless otherwise specified by the step.

Variable values

None, unless otherwise specified by the step.

Function implementations

The [XPath 2.0 Functions and Operators].

Current dateTime

An implementation defined point in time.

Implicit timezone

The implicit timezone is implementation defined.

Available documents

The set of available documents (those that may be retrieved with a URI) is implementation dependent.

Available collections

None.

Default collection

None.

2.6.3 XPath Extension Functions

The XProc processor must support a few additional functions in XPath expressions evaluated by the processor.

In the following descriptions, the names of types (string, boolean, etc.) should be taken to mean the corresponding [W3C XML Schema: Part 2] data types for an implementation that uses XPath 2.0 and as the most appropriate XPath 1.0 types for an XPath 1.0 implementation.

2.6.3.1 System Properties

XPath expressions within a pipeline document can interrogate the processor for information about the current state of the pipeline. Various aspects of the processor are exposed through the p:system-property function in the pipeline namespace:

Function: string p:system-property(string property)

The property string must have the form of a QName; the QName is expanded into a name using the namespace declarations in scope for the expression. The p:system-property function returns the string representing the value of the system property identified by the QName. If there is no such property, the empty string must be returned.

Implementations must provide the following system properties, which are all in the XProc namespace:

p:episode

Returns a string which should be unique for each invocation of the pipeline processor. In other words, if a processor is run several times in succession, or if several processors are running simultaneously, each invocation of each processor should get a distinct value from p:episode.

The unique identifier must consist of ASCII alphanumeric characters and must start with an alphabetic character. Thus, the string is syntactically an XML name.

p:language

Returns a string which identifies the current language, for example, for message localization purposes. The exact format of the language string is implementation defined but should be the same as the xml:lang attribute.

p:product-name

Returns a string containing the name of the implementation, as defined by the implementer. This should normally remain constant from one release of the product to the next. It should also be constant across platforms in cases where the same source code is used to produce compatible products for multiple execution platforms.

p:product-version

Returns a string identifying the version of the implementation, as defined by the implementer. This should normally vary from one release of the product to the next, and at the discretion of the implementer it may also vary across different execution platforms.

p:vendor

Returns a string which identifies the vendor of the processor.

p:vendor-uri

Returns a URI which identifies the vendor of the processor. Often, this is the URI of the vendor's web site.

p:version

Returns the version of XProc implemented by the processor; for processors implementing the version of XProc specified by this document, the value is “1.0”. The value of the version attribute is a token (i.e., an xs:token per [W3C XML Schema: Part 2]).

p:xpath-version

Returns the version of XPath implemented by the processor for evaluating XPath expressions on XProc elements.
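A pipeline might read these properties as in the sketch below. The variable names, the “ex” prefix, and the property ex:no-such-property are invented for the example. The prefix “p” is assumed to be bound to the XProc namespace where the expressions occur.

```xml
<!-- Builds a string from two of the required properties. -->
<p:variable name="processor"
            select="concat(p:system-property('p:product-name'), ' ',
                           p:system-property('p:product-version'))"/>

<!-- Assuming the processor defines no such property, the function returns
     the empty string, so this variable is empty. -->
<p:variable name="missing"
            xmlns:ex="http://example.com/ns/properties"
            select="p:system-property('ex:no-such-property')"/>
```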

2.6.3.2 Step Available

The p:step-available function reports whether or not a particular type of step is understood by the processor.

Function: boolean p:step-available(string step-type)

The step-type string must have the form of a QName; the QName is expanded into a name using the namespace declarations in scope for the expression. The p:step-available function returns true if and only if the processor knows how to evaluate steps of the specified type.
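A typical use is to guard an extension step with p:choose, as in this sketch. The step type ex:fancy-step and its namespace are hypothetical.

```xml
<p:choose xmlns:ex="http://example.com/ns/steps">
  <!-- Use the extension step only if the processor knows how to evaluate it. -->
  <p:when test="p:step-available('ex:fancy-step')">
    <ex:fancy-step/>
  </p:when>
  <!-- Otherwise pass the document through unchanged. -->
  <p:otherwise>
    <p:identity/>
  </p:otherwise>
</p:choose>
```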

2.6.3.3 Iteration Position

In the context of a p:for-each or a p:viewport, the p:iteration-position function reports the position of the document being processed in the sequence of documents that will be processed. In the context of other standard XProc compound steps, it returns 1.

Function: integer p:iteration-position()

In the context of an extension compound step, the value returned by p:iteration-position is implementation-defined.

2.6.3.4 Iteration Size

In the context of a p:for-each or a p:viewport, the p:iteration-size function reports the number of documents in the sequence of documents that will be processed. In the context of other standard XProc compound steps, it returns 1.

Function: integer p:iteration-size()

In the context of an extension compound step, the value returned by p:iteration-size is implementation-defined.
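The two iteration functions could be combined as in the following sketch, which labels each selected chapter with its position. The step name and attribute name are invented, and the option names used on p:add-attribute are assumptions about the standard step library, not something this section defines.

```xml
<p:for-each name="number-chapters">
  <p:iteration-source select="//chapter"/>

  <!-- The select expression is evaluated by the processor for each document
       in the sequence, in the context of the p:for-each. For the second of
       five chapters the value would be the string "2 of 5". -->
  <p:add-attribute match="/chapter" attribute-name="position">
    <p:with-option name="attribute-value"
                   select="concat(p:iteration-position(), ' of ',
                                  p:iteration-size())"/>
  </p:add-attribute>
</p:for-each>
```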

2.6.3.5 Base URI

Returns the base URI of the specified node, if it has one. The semantics of this function are the same as the semantics of the XPath 2.0 fn:base-uri() function.

Function: string p:base-uri()

Function: string p:base-uri(Node node)

If no node is specified, the base URI of the context node is returned.

2.6.3.6 Resolve URI

Resolves a relative URI with respect to a particular base URI. The semantics of this function are the same as the semantics of the XPath 2.0 fn:resolve-uri() function.

Function: string p:resolve-uri(String relative)

Function: string p:resolve-uri(String relative, String base)

If no base is specified, the base URI of the context node is used.
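As a sketch of how the two URI functions fit together, the fragment below loads a stylesheet located relative to the document on the default readable port. The step name and the relative path are invented for the example.

```xml
<p:load name="load-stylesheet">
  <!-- p:base-uri(/*) returns the base URI of the document element of the
       context document. p:resolve-uri makes the relative reference absolute
       against that base. -->
  <p:with-option name="href"
                 select="p:resolve-uri('style/main.xsl', p:base-uri(/*))"/>
</p:load>
```

The one-argument form, p:resolve-uri('style/main.xsl'), would resolve against the base URI of the context node instead.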

2.6.3.7 Other XPath Extension Functions

It is implementation defined if the processor supports any other XPath extension functions.

2.7 Variables

Variables are name/value pairs. Pipeline authors can create variables to hold computed values.

[Definition: A variable is a name/value pair where the name is an expanded name and the value must be a string.] If a document, node, or other value is given, its XPath string value is computed and that string is used.

Variables and options share the same scope and may shadow each other.

2.8 Options

Some steps accept options. Options are name/value pairs, like variables. Unlike variables, the value of an option can be changed by the caller.

[Definition: An option is a name/value pair where the name is an expanded name and the value must be a string.] If a document, node, or other value is given, its XPath string value is computed and that string is used.

[Definition: The options declared on a step are its declared options.] Option names are always expressed as literal values; pipelines cannot construct option names dynamically.

[Definition: The options on a step which have specified values, either because a p:with-option element specifies a value or because the declaration included a default value, are its specified options.]
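The distinction between declared and specified options can be sketched as follows. The step type ex:publish, its namespace, and the option names are hypothetical, and the use of a select attribute on p:option to supply a default is an assumption about the declaration syntax.

```xml
<p:declare-step type="ex:publish" xmlns:ex="http://example.com/ns/steps">
  <p:input port="source"/>
  <p:output port="result"/>
  <!-- Two declared options: "format" has a default value, "title" does not. -->
  <p:option name="format" select="'html'"/>
  <p:option name="title"/>
</p:declare-step>

<!-- On this instance only "format" is a specified option. Its value is
     "pdf", given with p:with-option. Without the p:with-option it would
     still be specified, with the default value "html". -->
<ex:publish xmlns:ex="http://example.com/ns/steps" name="make-pdf">
  <p:with-option name="format" select="'pdf'"/>
</ex:publish>
```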

2.9 Parameters

Some steps accept parameters. Parameters are name/value pairs, like variables and options. Unlike variables and options, which have names known in advance to the pipeline, parameters are not declared and their names may be unknown to the pipeline author. Pipelines can dynamically construct sets of parameters. Steps can read dynamically constructed sets on parameter input ports.

[Definition: A parameter is a name/value pair where the name is an expanded name and the value must be a string.] If a document, node, or other value is given, its XPath string value is computed and that string is used.

[Definition: A parameter input port is a distinguished kind of input port which accepts (only) dynamically constructed parameter name/value pairs.] See Section 5.1.2, “Parameter Inputs”.

Analogous to primary input ports, steps that have parameter inputs may designate at most one parameter input port as a primary parameter input port.

[Definition: If a step has a parameter input port which is explicitly marked “primary='true'”, or if it has exactly one parameter input port and that port is not explicitly marked “primary='false'”, then that parameter input port is the primary parameter input port of the step.] If a step has a single parameter input port and that port is explicitly marked “primary='false'”, or if a step has more than one parameter input port and none is explicitly marked as the primary, then the primary parameter input port of that step is undefined.
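A parameter might be passed to a step as in this sketch. It assumes that p:xslt has a single parameter input port (which is therefore its primary parameter input port) and that p:with-param is the element used to supply a parameter. The stylesheet URI and the parameter name are invented.

```xml
<p:xslt name="render">
  <p:input port="stylesheet">
    <p:document href="http://example.com/style.xsl"/>
  </p:input>
  <!-- "draft" is not declared anywhere in the pipeline. It is delivered to
       the step's primary parameter input port, and its value is the
       string "yes". -->
  <p:with-param name="draft" select="'yes'"/>
</p:xslt>
```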

2.10 Security Considerations

An XProc pipeline may attempt to access arbitrary network resources: steps such as p:load and p:http-request can attempt to read from an arbitrary URI; steps such as p:store can attempt to write to an arbitrary location; p:exec can attempt to execute an arbitrary program. Note, also, that some steps, such as p:xslt and p:xquery, include extension mechanisms which may attempt to execute arbitrary code.

In some environments, it may be inappropriate to provide the XProc pipeline with access to these resources. In a server environment, for example, it may be impractical to allow pipelines to store data. In environments where the pipeline cannot be trusted, allowing the pipeline to access arbitrary resources or execute arbitrary code may be a security risk.

It is a dynamic error (err:XD0021) for a pipeline to attempt to access a resource for which it has insufficient privileges or perform a step which is forbidden.

A conformant XProc processor may limit the resources available to any or all steps in a pipeline. A conformant implementation may raise dynamic errors, or take any other corrective action, for any security problems that it detects.

2.11 Versioning Considerations

A pipeline author may identify the version of XProc against which a particular pipeline was authored by explicitly importing the library that identifies the steps defined by that version of XProc. For the version defined by this specification, the library is “http://www.w3.org/2008/xproc-1.0.xpl”.

If the version is not explicitly identified, the implicit version should be the most recent version known to the processor.

When a processor encounters a version it does not recognize, it proceeds in forwards-compatible mode. In forwards-compatible mode:

  1. The library that identifies the version of XProc is imported, see p:import. This provides the processor with declarations for any new step types.

  2. It is a dynamic error to attempt to evaluate a step type for which no implementation is known, but conditional processing and the step-available function can be used to write backwards-compatible pipelines.

  3. It is a static error if the signature of a known step in the version library has changed, except for new options.

  4. New options on known steps are ignored in the pipeline.

As a consequence, future specifications must not change the semantics of existing step types without changing their names.

3 Syntax Overview

This section describes the normative XML syntax of XProc. This syntax is sufficient to represent all the aspects of a pipeline, as set out in the preceding sections. [Definition: XProc is intended to work equally well with [XML 1.0] and [XML 1.1]. Unless otherwise noted, the term “XML” refers equally to both versions.] [Definition: Unless otherwise noted, the term Namespaces in XML refers equally to [Namespaces 1.0] and [Namespaces 1.1].] Support for pipeline documents written in XML 1.1 and pipeline inputs and outputs that use XML 1.1 is implementation-defined.

Elements in a pipeline document represent the pipeline, the steps it contains, the connections between those steps, the steps and connections contained within them, and so on. Each step is represented by an element; a combination of elements and attributes specifies how the inputs and outputs of each step are connected and how options and parameters are passed.

Conceptually, we can speak of steps as objects that have inputs and outputs, that are connected together and which may contain additional steps. Syntactically, we need a mechanism for specifying these relationships.

Containment is represented naturally using nesting of XML elements. If a particular element identifies a compound step then the step elements that are its immediate children form its subpipeline.

The connections between steps are expressed using names and references to those names.

Six kinds of things are named in XProc:

  1. Step types,
  2. Steps,
  3. Input ports (both parameter and document),
  4. Output ports,
  5. Options and variables, and
  6. Parameters

3.1 XProc Namespaces

There are four namespaces associated with XProc:

http://www.w3.org/ns/xproc

The namespace of the XProc XML vocabulary described by this specification; by convention, the namespace prefix “p:” is used for this namespace.

http://www.w3.org/ns/xproc-step

The namespace used for documents that are inputs to and outputs from several standard and optional steps described in this specification. Some steps, such as p:http-request and p:store, have defined input or output vocabularies. We use this namespace for all of those documents. The conventional prefix “c:” is used for this namespace.

http://www.w3.org/ns/xproc-error

The namespace used for errors. The conventional prefix “err:” is used for this namespace.

3.2 Scoping of Names

Names are used to identify step types, steps, ports, options and variables, and parameters. Step types, options, variables, and parameters are named with QNames. Steps and ports are named with NCNames. The scope of a name is a measure of where it is available in a pipeline. [Definition: If two names are in the same scope, we say that they are visible to each other. ]

The scope of the names of the step types is the union of all the pipelines and pipeline libraries available directly or via p:import. In other words, the step types visible in a pipeline or library are:

  • The standard, built-in types (p:pipeline, p:choose, etc.).

  • Any implementation-provided types.

  • The types visible in any library that is imported.

  • The step types declared in the pipeline or library.

  • The pipelines that are imported.

  • For a pipeline, the pipeline itself.

  • For a pipeline in a library, the types visible in the containing library.

All the step types in a pipeline must have unique names: it is a static error (err:XS0036) if any step type name is built-in and/or declared or defined more than once in the same scope.

The scope of the names of the steps themselves is determined by the environment of each step. In general, the name of a step, the names of its sibling steps, the names of any steps that it contains directly, the names of its ancestors, and the names of the siblings of its ancestors are all in a common scope. All the named steps in the same scope must have unique names: it is a static error (err:XS0002) if two steps with the same name appear in the same scope.

The scope of an input or output port name is the step on which it is defined. The names of all the ports on any step must be unique.

Taken together, these uniqueness constraints guarantee that the combination of a step name and a port name uniquely identifies exactly one port on exactly one in-scope step.

The scope of option and variable names is determined by where they are declared. When an option is declared with p:option (or a variable with p:variable), unless otherwise specified, its scope consists of the sibling elements that follow its declaration and the descendants of those siblings.

Parameter names are not scoped; they are distinct on each step.

3.3 Base URIs and xml:base

When a relative URI appears in an option value, the base URI against which it must be made absolute is the base URI of the p:option element. If an option value is specified using a syntactic shortcut, the base URI of the step on which the shortcut attribute appears must be used. In general, whenever a relative URI appears, its base URI is the base URI of the nearest ancestor element.

The pipeline author can control the base URIs of elements within the pipeline document with the xml:base attribute. The xml:base attribute may appear on any element in a pipeline and has the semantics outlined in [XML Base].
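As a minimal sketch of how xml:base affects relative URIs, consider the fragment below. The base URI and the relative path style/main.xsl are hypothetical, chosen only for illustration; the step layout mirrors the sample pipeline elsewhere in this document.

```xml
<p:pipeline name="main" xmlns:p="http://www.w3.org/ns/xproc"
            xml:base="http://example.com/pipelines/">
  <p:xslt>
    <p:input port="stylesheet">
      <!-- The nearest base URI is inherited from the p:pipeline element,
           so this relative reference would be made absolute as
           http://example.com/pipelines/style/main.xsl -->
      <p:document href="style/main.xsl"/>
    </p:input>
  </p:xslt>
</p:pipeline>
```

Placing xml:base on the outermost element keeps every relative reference in the document anchored to one location; an xml:base on an inner element would override it for that element and its descendants.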

3.4 Unique identifiers

A pipeline author can provide a globally unique identifier for any element in a pipeline with the xml:id attribute.

The xml:id attribute may appear on any element in a pipeline and has the semantics outlined in [xml:id].

3.5 Associating Documents with Ports

[Definition: A binding associates an input or output port with some data source.] A document or a sequence of documents can be bound to a port in four ways: by source, by URI, by providing an inline document, or by making it explicitly empty. Each of these mechanisms is allowed on the p:input, p:output, p:xpath-context, p:iteration-source, and p:viewport-source elements.

Specified by URI

[Definition: A document is specified by URI if it is referenced with a URI.] The href attribute on the p:document element is used to refer to documents by URI.

In this example, the input to the p:identity step named “otherstep” comes from “http://example.com/input.xml”.

<p:output port="result"/>

<p:identity name="otherstep">
  <p:input port="source">
    <p:document href="http://example.com/input.xml"/>
  </p:input>
</p:identity>

It is a dynamic error (err:XD0002) if the processor attempts to retrieve the URI specified on a p:document and fails. (For example, if the resource does not exist or is not accessible with the user's authentication credentials.)

Specified by source

[Definition: A document is specified by source if it references a specific port on another step.] The step and port attributes on the p:pipe element are used for this purpose.

In this example, the “source” input to the p:xinclude step named “expand” comes from the “result” port of the step named “otherstep”.

<p:xinclude name="expand">
  <p:input port="source">
    <p:pipe step="otherstep" port="result"/>
  </p:input>
</p:xinclude>

When a p:pipe is used, the specified port must be in the readable ports of the current environment. It is a static error (err:XS0003) if the port specified by a p:pipe is not in the readable ports of the environment.

Specified inline

[Definition: An inline document is specified directly in the body of the element that binds it.] The content of the p:inline element is used for this purpose.

In this example, the “stylesheet” input to the XSLT step named “xform” comes from the content of the p:input element itself.

<p:xslt name="xform">
  <p:input port="stylesheet">
    <p:inline>
      <xsl:stylesheet version="1.0">
        ...
      </xsl:stylesheet>
    </p:inline>
  </p:input>
</p:xslt>

Inline documents are considered “quoted”. The pipeline processor passes them literally to the port, even if they contain elements from the XProc namespace or other namespaces that would have other semantics outside of the p:inline.

Specified explicitly empty

[Definition: An empty sequence of documents is specified with the p:empty element.]

In this example, the “source” input to the XSLT 2.0 step named “generate” is explicitly empty:

<p:xslt name="generate" version="2.0">
  <p:input port="source">
    <p:empty/>
  </p:input>
  <p:input port="stylesheet">
    <p:inline>
      <xsl:stylesheet version="2.0">
        ...
      </xsl:stylesheet>
    </p:inline>
  </p:input>
  <p:with-option name="template-name" select="'someName'"/>
</p:xslt>

If you omit the binding on a primary input port, a binding to the default readable port will be assumed. Making the binding explicitly empty guarantees that the binding will be to an empty sequence of documents.

It is inconsistent with the [XPath 1.0] specification to specify an empty binding as the context for evaluating an XPath expression. When an empty binding is specified for an XPath 1.0 expression, an empty document node must be used instead as the context node.

Note that a p:input or p:output element may contain more than one p:pipe, p:document, or p:inline element. If more than one binding is provided, then the specified sequence of documents is made available on that port in the same order as the bindings.
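The ordering rule for multiple bindings can be sketched as follows. The step name “otherstep” is borrowed from the earlier example; the URI and the inline element name are hypothetical.

```xml
<p:identity name="combined" xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="source">
    <!-- First document in the sequence: retrieved by URI -->
    <p:document href="http://example.com/first.xml"/>
    <!-- Next: whatever appears on the "result" port of "otherstep" -->
    <p:pipe step="otherstep" port="result"/>
    <!-- Last: a literal inline document -->
    <p:inline>
      <third/>
    </p:inline>
  </p:input>
</p:identity>
```

Under the rule stated above, the documents would appear on the “source” port in the same order as the three bindings are written.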

3.6 Documentation

Pipeline authors may add documentation to their pipeline documents with the p:documentation element. Except when it appears as a descendant of p:inline, the p:documentation element is completely ignored by pipeline processors; it exists simply for documentation purposes. (If a p:documentation is provided as a descendant of p:inline, it has no special semantics; it is treated literally as part of the document to be provided on that port.)

Pipeline processors that inspect the contents of p:documentation elements and behave differently on the basis of what they find are not conformant. Processor extensions must be specified with p:pipeinfo.
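A short illustration of documentation attached to a step might look like the following. The para element and its text are hypothetical; this section does not prescribe any particular content model for p:documentation.

```xml
<p:xinclude name="expand" xmlns:p="http://www.w3.org/ns/xproc">
  <p:documentation>
    <!-- Ignored by the processor; present only for human readers -->
    <para>Expands XInclude elements before validation.</para>
  </p:documentation>
</p:xinclude>
```

A conformant processor would be expected to evaluate this step exactly as if the p:documentation element were absent.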

3.7 Processor annotations

Pipeline authors may add annotations to their pipeline documents with the p:pipeinfo element. The semantics of p:pipeinfo elements are implementation-defined. Processors should specify a way for their annotations to be identified, perhaps with extension attributes.

Where p:documentation is intended for human consumption, p:pipeinfo elements are intended for processor consumption. A processor might, for example, use annotations to identify some particular aspect of an implementation, to request additional, perhaps non-standard features, to describe parallelism constraints, etc.

When a p:pipeinfo appears as a descendant of p:inline, it has no special semantics; in that context it must be treated literally as part of the document to be provided on that port.
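As a sketch only, a processor annotation could be written as shown below. The ex namespace, the ex:hint element, and its cache attribute are entirely hypothetical; the semantics of anything inside p:pipeinfo are implementation-defined, so no processor is known to recognize this particular annotation.

```xml
<p:xslt name="xform" xmlns:p="http://www.w3.org/ns/xproc"
        xmlns:ex="http://example.com/ns/hypothetical-extension">
  <p:pipeinfo>
    <!-- Hypothetical processor-specific hint; other processors
         would be free to disregard it -->
    <ex:hint cache="true"/>
  </p:pipeinfo>
  <p:input port="stylesheet">
    <p:document href="http://example.com/xsl/html.xsl"/>
  </p:input>
</p:xslt>
```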

3.8 Extension attributes

[Definition: An element from the XProc namespace may have any attribute not from the XProc namespace, provided that the expanded-QName of the attribute has a non-null namespace URI. Such an attribute is called an extension attribute.] Extension attributes are always allowed and do not have to be declared with ignored namespaces.

The presence of an extension attribute must not cause the connections between steps to differ from the connections that would arise in the absence of the attribute. They must not cause the processor to fail to signal an error that would be signalled in the absence of the attribute.

A processor which encounters an extension attribute that it does not recognize must behave as if the attribute was not present.
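For illustration, an extension attribute might be attached to a step as in this sketch. The ex namespace and the ex:trace attribute are hypothetical names, not part of XProc or of any known implementation.

```xml
<p:xinclude name="expand" xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.com/ns/hypothetical-extension"
            ex:trace="true"/>
<!-- ex:trace is in a non-null namespace other than the XProc namespace,
     so it qualifies as an extension attribute. A processor that does not
     recognize it behaves as if it were not present. -->
```

Because the attribute must not alter the connections between steps, it is best suited to side concerns such as diagnostics.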

Extension elements

An extension element is any element that is not in the XProc namespace and is not a step. The presence of an extension element must not cause the connections between steps to differ from the connections that any other conformant XProc processor would produce. They must not cause the processor to fail to signal an error that a conformant processor is required to signal. This means that an extension element must not change the effect of any XProc element except to the extent that the effect is implementation-defined or implementation-dependent.

An element is only an extension element if it is an ignorable element that occurs as a direct child of a p:pipeline or p:pipeline-library. In other words, elements in a subpipeline are interpreted as follows:

  • In XProc namespace?

      • Names a built-in compound step? Check against grammar, interpret per spec.

      • Names a built-in atomic step? Check against grammar and built-in declaration, interpret per spec.

      • Otherwise, error.

  • Is in ignorable namespace?

      • Is a known extension? Process as appropriate.

      • Otherwise, ignore.

  • Names a declared step type? Check against grammar and supplied step declaration, interpret per spec.

  • Names a defined pipeline? Check against pipeline definition, interpret per spec.

  • Otherwise, error.

3.9 Syntax Summaries

The description of each element in the pipeline namespace is accompanied by a syntactic summary that provides a quick overview of the element's syntax:

<p:some-element
  some-attribute? = some-type>
    (some |
     elements |
     allowed)*,
    other-elements?
</p:some-element>

For clarity of exposition, some attributes and elements are elided from the summaries:

The types given for attributes should be understood as follows:

  • ID, NCName, NMTOKEN, NMTOKENS, anyURI, boolean, integer, string: As per [W3C XML Schema: Part 2] including whitespace normalization as appropriate.

  • QName: With whitespace normalization as per [W3C XML Schema: Part 2] and according to the following definition: [Definition: In the context of XProc, a QName is almost always a QName in the Namespaces in XML sense. Note, however, that p:option and p:with-param values can get their namespace declarations in a non-standard way (with p:namespaces) and QNames that have no prefix are always in no-namespace, irrespective of the default namespace.]

  • PrefixList: As a list with [item type] NMTOKEN, per [W3C XML Schema: Part 2], including whitespace normalization.

  • XPathExpression, XSLTMatchPattern: As a string per [W3C XML Schema: Part 2], including whitespace normalization, and the further requirement to be a conformant Expression per [XPath 1.0] or [XPath 2.0], as appropriate, or Match pattern per [XSLT 1.0] or [XSLT 2.0], as appropriate.

A number of errors apply generally:

If an XProc processor can determine statically that a dynamic error will always occur, it may report that error statically provided that the error does not occur among the descendants of a p:try. Errors inside a p:try must always be raised dynamically so that p:catch processing may be performed on them.

4 Steps

This section describes the core steps of XProc.

4.1 p:pipeline

A p:pipeline declares a pipeline that can be evaluated by an XProc processor. It encapsulates the behavior of a subpipeline. Its children declare the inputs, outputs, and options that the pipeline exposes and identify the steps in its subpipeline. (A p:pipeline is a simplified form of step declaration.)

All p:pipeline pipelines have an implicit primary input port named “source” and an implicit primary output port named “result”. Any input or output ports that the p:pipeline declares explicitly are in addition to those ports and may not be declared primary.

<p:pipeline
  name? = NCName
  type? = QName
  psvi-required? = boolean
  xpath-version? = string>
    (p:input |
     p:output |
     p:option |
     p:log |
     p:serialization)*,
    ((p:declare-step |
      p:import)*,
     subpipeline)?
</p:pipeline>

Viewed from the outside, a p:pipeline is a black box which performs some calculation on its inputs and produces its outputs. From the pipeline author's perspective, the computation performed by the pipeline is described in terms of contained steps which read the pipeline's inputs and produce the pipeline's outputs.

The environment inherited by the contained steps of a p:pipeline is the empty environment with these modifications:

  • All of the declared inputs of the pipeline are added to the readable ports in the environment.

  • If the pipeline has a primary input port, that input is the default readable port, otherwise the default readable port is undefined.

  • All of the declared options of the pipeline are added to the in-scope options in the environment.

If the p:pipeline has a primary output port and that port has no binding, then it is bound to the primary output port of the last step in the subpipeline. It is a static error if the primary output port has no binding and the last step in the subpipeline does not have a primary output port.

If the pipeline does not have a type then that pipeline cannot be invoked as a step.

The p:pipeline element is just a simplified form of step declaration. A document that reads:

<p:pipeline some-attributes>
  some-content
</p:pipeline>

can be interpreted as if it read:

<p:declare-step some-attributes>
  <p:input port='source' primary='true'/>
  <p:input port='parameters' kind='parameter' primary='true'/>
  <p:output port='result' primary='true'/>
  some-content
</p:declare-step>

See p:declare-step for more details.

If the pipeline initially invoked by the processor has inputs or outputs, those ports are bound to documents outside of the pipeline in an implementation-defined manner.

If a pipeline has a type, then that type may be used as the name of a step to invoke the pipeline. This most often occurs when it has been imported into another pipeline, but pipelines may also invoke themselves recursively.

4.1.1 Example

A pipeline might accept a document and a stylesheet as input; perform XInclude, validation, and transformation; and produce the transformed document as its output.

Example 4. A Sample Pipeline Document
<p:pipeline name="pipeline" xmlns:p="http://www.w3.org/ns/xproc">
<p:input port="document" primary="true"/>
<p:input port="stylesheet"/>
<p:output port="result" primary="true"/>

<p:xinclude/>

<p:validate-with-xml-schema>
  <p:input port="schema">
    <p:document href="http://example.com/path/to/schema.xsd"/>
  </p:input>
</p:validate-with-xml-schema>

<p:xslt>
  <p:input port="stylesheet">
    <p:document href="http://example.com/path/to/stylesheet.xsl"/>
  </p:input>
</p:xslt>

</p:pipeline>

4.2 p:for-each

A for-each is specified by the p:for-each element. It is a compound step that processes a sequence of documents, applying its subpipeline to each document in turn.

<p:for-each
  name? = NCName>
    ((p:iteration-source? &
      (p:output |
       p:log)*),
     subpipeline)
</p:for-each>

When a pipeline needs to process a sequence of documents using a subpipeline that only processes a single document, the p:for-each construct can be used as a wrapper around that subpipeline. The p:for-each will apply that subpipeline to each document in the sequence in turn.

The result of the p:for-each is a sequence of documents produced by processing each individual document in the input sequence. If the p:for-each has one or more output ports, what appears on each of those ports is the sequence of documents that is the concatenation of the sequence produced by each iteration of the loop on the port to which it is connected. If the iteration source for a p:for-each is an empty sequence, then the subpipeline is never run and an empty sequence is produced on all of the outputs.

The p:iteration-source is an anonymous input: its binding provides a sequence of documents to the p:for-each step. If no iteration sequence is explicitly provided, then the iteration source is read from the default readable port.

A portion of each input document can be selected using the select attribute. If no selection is specified, the document node of each document is selected. Each subtree selected by the p:iteration-source is wrapped in a document node (unless it is a document) and provided to the subpipeline. The processor provides each document, one at a time, to the subpipeline represented by the children of the p:for-each on a port named current.

For each declared output, the processor collects all the documents that are produced for that output from all the iterations, in order, into a sequence. The result of the p:for-each on that output is that sequence of documents.

The environment inherited by the contained steps of a p:for-each is the inherited environment with these modifications:

If the p:for-each has a primary output port (explicit or supplied by default) and that port has no binding, then it is bound to the primary output port of the last step in the subpipeline. It is a static error (err:XS0006) if the primary output port has no binding and the last step in the subpipeline does not have a primary output port.

Note that outputs declared for a p:for-each serve a dual role. Inside the p:for-each, they are used to read results from the subpipeline. Outside the p:for-each, they provide the aggregated results.

The sequence attribute on a p:output inside a p:for-each only applies inside the step. From the outside, all of the outputs produce sequences.

4.2.1 XPath Context

Within a p:for-each, the p:iteration-position and p:iteration-size are taken from the sequence of documents that will be processed by the p:for-each. The total number of documents is the p:iteration-size; the ordinal value of the current document (the document appearing on the current port) is the p:iteration-position.

Note to implementers

In the case where no XPath expression that must be evaluated by the processor makes any reference to p:iteration-size, its value does not actually have to be calculated (and the entire input sequence does not, therefore, need to be buffered so that its size can be calculated before processing begins).

4.2.2 Example

A p:for-each might accept a sequence of chapters as its input, process each chapter in turn with XSLT, a step that accepts only a single input document, and produce a sequence of formatted chapters as its output.

Example 5. A Sample For-Each
<p:for-each name="chapters">
  <p:iteration-source select="//chapter"/>
  <p:output port="html-results">
    <p:pipe step="make-html" port="result"/>
  </p:output>
  <p:output port="fo-results">
    <p:pipe step="make-fo" port="result"/>
  </p:output>

  <p:xslt name="make-html">
    <p:input port="stylesheet">
      <p:document href="http://example.com/xsl/html.xsl"/>
    </p:input>
  </p:xslt>

  <p:xslt name="make-fo">
    <p:input port="source">
      <p:pipe step="chapters" port="current"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="http://example.com/xsl/fo.xsl"/>
    </p:input>
  </p:xslt>
</p:for-each>

The //chapter elements of the document are selected. Each chapter is transformed into HTML and XSL Formatting Objects using an XSLT step. The resulting HTML and FO documents are aggregated together and appear on the html-results and fo-results ports, respectively, of the chapters step itself.

4.3 p:viewport

A viewport is specified by the p:viewport element. It is a compound step that processes a single document, applying its subpipeline to one or more subtrees of the document.

<p:viewport
  name? = NCName
  match = XSLTMatchPattern>
    ((p:viewport-source? &
      p:output? &
      p:log?),
     subpipeline)
</p:viewport>

The result of the p:viewport is a copy of the original document where the selected subtrees have been replaced by the results of applying the subpipeline to them.

The p:viewport-source is an anonymous input: its binding provides a single document to the p:viewport step. If no document is explicitly provided, then the viewport source is read from the default readable port. It is a dynamic error (err:XD0003) if the viewport source does not provide exactly one document.

The match attribute specifies an XSLT match pattern. Each matching node in the source document is wrapped in a document node, as necessary, and provided, one at a time, to the viewport's subpipeline on a port named current. The base URI of the resulting document that is passed to the subpipeline is the base URI of the matched element or document. It is a dynamic error if the match expression on p:viewport does not match an element or document.

After a match is found, the entire subtree rooted at that match is processed as a unit. No further attempts are made to match nodes among the descendants of any matched node.

The environment inherited by the contained steps of a p:viewport is the inherited environment with these modifications:

The p:viewport must contain a single, primary output port declared explicitly or supplied by default. If that port has no binding, then it is bound to the primary output port of the last step in the subpipeline. It is a static error (err:XS0006) if the primary output port has no binding and the last step in the subpipeline does not have a primary output port.

What appears on the output from the p:viewport will be a copy of the input document where each matching node is replaced by the result of applying the subpipeline to the subtree rooted at that node. In other words, if the match pattern matches a particular element then that element is wrapped in a document node and provided on the current port, the subpipeline in the p:viewport is evaluated, and the result that appears on the output port replaces the matched element.

If no documents appear on the output port, the matched element will effectively be deleted. If exactly one document appears, the contents of that document will replace the matched element. If a sequence of documents appears, then the contents of each document in that sequence (in the order it appears in the sequence) will replace the matched element.
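The first of these cases can be sketched as follows: a subpipeline that produces no documents, so each matched element is effectively deleted. The class value “draft” is a hypothetical choice; the fragment otherwise uses only constructs shown elsewhere in this document.

```xml
<p:viewport match="h:div[@class='draft']"
            xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:h="http://www.w3.org/1999/xhtml">
  <!-- p:identity copies its input to its output; with an explicitly
       empty input it produces an empty sequence, so nothing replaces
       the matched h:div -->
  <p:identity>
    <p:input port="source">
      <p:empty/>
    </p:input>
  </p:identity>
</p:viewport>
```

Binding the input to several documents instead of p:empty would illustrate the third case, where the contents of each document replace the matched element in sequence order.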

The output of the p:viewport itself is a single document that appears on a port named “result”. Note that the semantics of p:viewport are special. The output port in the p:viewport is used only to access the results of the subpipeline. The output of the step itself appears on a port with the fixed name “result” that is never explicitly declared.

4.3.1 XPath Context

Within a p:viewport, the p:iteration-position and p:iteration-size are taken from the sequence of documents that will be processed by the p:viewport. The total number of documents is the p:iteration-size; the ordinal value of the current document (the document appearing on the current port) is the p:iteration-position.

Note to implementers

In the case where no XPath expression that must be evaluated by the processor makes any reference to p:iteration-size, its value does not actually have to be calculated (and the entire input sequence does not, therefore, need to be buffered so that its size can be calculated before processing begins).

4.3.2 Example

A p:viewport might accept an XHTML document as its input, add an hr element at the beginning of all div elements that have the class value “chapter”, and return an XHTML document that is the same as the original except for that change.

Example 6. A Sample Viewport
<p:viewport match="h:div[@class='chapter']"
            xmlns:h="http://www.w3.org/1999/xhtml">
  <p:insert position="first-child">
    <p:input port="insertion">
      <p:inline>
        <hr xmlns="http://www.w3.org/1999/xhtml"/>
      </p:inline>
    </p:input>
  </p:insert>
</p:viewport>

The nodes which match h:div[@class='chapter'] (according to the rules of ) in the input document are selected. An hr is inserted as the first child of each h:div and the resulting version replaces the original h:div. The result of the whole step is a copy of the input document with a horizontal rule as the first child of each selected h:div.

4.4 p:choose

A choose is specified by the p:choose element. It is a multi-container step that selects exactly one of a list of alternative subpipelines based on the evaluation of XPath expressions.

<p:choose
  name? = NCName>
    (p:xpath-context?,
     p:variable*,
     p:when*,
     p:otherwise?)
</p:choose>

A p:choose has no inputs. It contains an arbitrary number of alternative subpipelines, exactly one of which will be evaluated.

The list of alternative subpipelines consists of zero or more subpipelines guarded by an XPath expression, followed optionally by a single default subpipeline.

The p:choose considers each subpipeline in turn and selects the first (and only the first) subpipeline for which the guard expression evaluates to true in its context. If there are no subpipelines for which the expression evaluates to true, the default subpipeline, if it was specified, is selected.

After a subpipeline is selected, it is evaluated as if only it had been present.

The outputs of the p:choose are taken from the outputs of the selected subpipeline. The p:choose has the same number of outputs as the selected subpipeline with the same names. If the selected subpipeline has a primary output port, the port with the same name on the p:choose is also a primary output port.

In order to ensure that the output of the p:choose is consistent irrespective of the subpipeline chosen, each subpipeline must declare the same number of outputs with the same names and the same settings with respect to sequences. If any of the subpipelines specifies a primary output port, each subpipeline must specify exactly the same output as primary. It is a static error (err:XS0007) if two subpipelines in a p:choose declare different outputs.

It is a dynamic error (err:XD0004) if no subpipeline is selected by the p:choose and no default is provided.

The p:choose can specify the context node against which the XPath expressions that occur on each branch are evaluated. The context node is specified as a binding for the p:xpath-context. If no binding is provided, the default p:xpath-context is the document on the default readable port.

Each conditional subpipeline is represented by a p:when element. The default branch is represented by a p:otherwise element.

4.4.1 p:xpath-context

A p:xpath-context element specifies the context against which an XPath expression will be evaluated. When it appears in a p:when, it specifies the context for that p:when’s test attribute. When it appears in p:choose, it specifies the default context for all of the p:when elements in that p:choose.

<p:xpath-context>
    (p:empty |
      p:pipe |
      p:document |
      p:inline)?
</p:xpath-context>

Only one binding is allowed and it works the same way that bindings work on a p:input. No select expression is allowed. It is a dynamic error (err:XD0005) if the xpath-context is bound to a sequence of documents.

In an XPath 1.0 implementation, if the context node is bound to p:empty, or is unbound and the default readable port is undefined, an empty document node is used instead as the context. In an XPath 2.0 implementation, the context item is undefined.
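A minimal sketch of a default context shared by every branch is shown below. The step named “config”, the config element, and its mode attribute are hypothetical; the sketch assumes that step is in the readable ports of the environment.

```xml
<p:choose xmlns:p="http://www.w3.org/ns/xproc">
  <!-- Default context for every p:when in this p:choose -->
  <p:xpath-context>
    <p:pipe step="config" port="result"/>
  </p:xpath-context>

  <p:when test="/config/@mode = 'strict'">
    <p:validate-with-xml-schema>
      <p:input port="schema">
        <p:document href="http://example.com/path/to/schema.xsd"/>
      </p:input>
    </p:validate-with-xml-schema>
  </p:when>

  <p:otherwise>
    <p:identity/>
  </p:otherwise>
</p:choose>
```

The test expression is evaluated against the configuration document, while the steps inside each branch still read the document on the default readable port.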

4.4.2 p:when

A when specifies one subpipeline guarded by a test expression.

<p:when
  test = XPathExpression>
    (p:xpath-context?,
     (p:output |
      p:log)*,
     subpipeline)
</p:when>

Each p:when branch of the p:choose has a test attribute which must contain an XPath expression. That XPath expression's effective boolean value is the guard expression for the subpipeline contained within that p:when.

It is a dynamic error if the value of the test attribute is not a valid XPath expression. The p:when can specify a context node against which its test expression is to be evaluated. That context node is specified as a binding for the p:xpath-context. If no context is specified on the p:when, the context of the p:choose is used.
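A context specified on an individual p:when might be sketched as follows. The settings document URI and its validate attribute are hypothetical.

```xml
<p:choose xmlns:p="http://www.w3.org/ns/xproc">
  <p:when test="/settings/@validate = 'yes'">
    <!-- Context for this test only; it overrides the context
         of the enclosing p:choose -->
    <p:xpath-context>
      <p:document href="http://example.com/settings.xml"/>
    </p:xpath-context>
    <p:validate-with-xml-schema>
      <p:input port="schema">
        <p:document href="http://example.com/path/to/schema.xsd"/>
      </p:input>
    </p:validate-with-xml-schema>
  </p:when>

  <p:otherwise>
    <p:identity/>
  </p:otherwise>
</p:choose>
```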

4.4.3 p:otherwise

An otherwise specifies the default branch; the subpipeline selected if no test expression on any preceding p:when evaluates to true.

<p:otherwise>
    ((p:output |
      p:log)*,
     subpipeline)
</p:otherwise>

4.4.4 Example

A p:choose might test the version attribute of the document element and validate with an appropriate schema.

Example 7. A Sample Choose
<p:choose name="version">
  <p:when test="/*[@version = 2]">
    <p:validate-with-xml-schema>
      <p:input port="schema">
        <p:document href="v2schema.xsd"/>
      </p:input>
    </p:validate-with-xml-schema>
  </p:when>

  <p:when test="/*[@version = 1]">
    <p:validate-with-xml-schema>
      <p:input port="schema">
        <p:document href="v1schema.xsd"/>
      </p:input>
    </p:validate-with-xml-schema>
  </p:when>

  <p:when test="/*[@version]">
    <p:identity/>
  </p:when>

  <p:otherwise>
    <p:output port="result">
      <!-- this output is necessary so that all the branches have
           the same outputs; it'll never really matter because
           we're just about to raise an error. -->
      <p:inline>
        <nop/>
      </p:inline>
    </p:output>
    <p:error code="NOVERSION">
      <p:input port="source">
        <p:inline>
          <message>Required version attribute missing.</message>
        </p:inline>
      </p:input>
    </p:error>
  </p:otherwise>
</p:choose>

4.6 p:try

A try/catch is specified by the p:try element. It is a multi-container step that isolates a subpipeline, preventing any dynamic errors that arise within it from being exposed to the rest of the pipeline.

<p:try
  name? = NCName>
    (p:variable*,
      p:group,
      p:catch)
</p:try>

The p:group represents the initial subpipeline and the recovery (or “catch”) pipeline is identified with a p:catch element.

The p:try step evaluates the initial subpipeline and, if no errors occur, the outputs of that pipeline are the outputs of the p:try step. However, if any errors occur, the p:try abandons the first subpipeline, discarding any output that it might have generated, and evaluates the recovery subpipeline.

If the recovery subpipeline is evaluated, the outputs of the recovery subpipeline are the outputs of the p:try step. If the recovery subpipeline is evaluated and a step within that subpipeline fails, the p:try fails.

The outputs of the p:try are taken from the outputs of the initial subpipeline or the recovery subpipeline if an error occurred in the initial subpipeline. The p:try has the same number of outputs as the selected subpipeline with the same names. If the selected subpipeline has a primary output port, the port with the same name on the p:try is also a primary output port.

In order to ensure that the output of the p:try is consistent irrespective of whether the initial subpipeline provides its output or the recovery subpipeline does, both subpipelines must declare the same number of outputs with the same names and the same settings with respect to sequences. If either of the subpipelines specifies a primary output port, both subpipelines must specify exactly the same output as primary. It is a static error (err:XS0009) if the p:group and p:catch subpipelines declare different outputs.

A pipeline author can cause an error to occur with the p:error step.

The recovery subpipeline of a p:try is identified with a p:catch:

<p:catch>
    ((p:output |
      p:log)*,
     subpipeline)
</p:catch>

The environment inherited by the contained steps of the p:catch is the inherited environment with this modification: the error port of the p:catch is made available to the contained steps as a readable port.

What appears on the error output port is an error document. The error document may contain messages generated by steps that were part of the initial subpipeline. Not all messages that appear are indicative of errors; for example, it is common for all xsl:message output from the XSLT component to appear on the error output port. It is possible that the component which fails may not produce any messages at all. It is also possible that the failure of one component may cause others to fail so that there may be multiple failure messages in the document.

4.6.1 The Error Vocabulary

In general, it is very difficult to predict error behavior. Step failure may be catastrophic (programmer error), or it may be the result of user error, resource failures, etc. Steps may detect more than one error, and the failure of one step may cause other steps to fail as well.

The p:try/p:catch mechanism gives pipeline authors the opportunity to process the errors that caused the p:try to fail. In order to facilitate some modicum of interoperability among processors, errors that are reported on the error output port of a p:catch should conform to the format described here.

4.6.1.3 Error Example

Consider the following XSLT stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="/">
  <xsl:message terminate="yes">
    <xsl:text>This stylesheet is </xsl:text>
    <emph>pointless</emph>
    <xsl:text>.</xsl:text>
  </xsl:message>
</xsl:template>

</xsl:stylesheet>

If it was used in a step named “xform” in a p:try, the following error document might be produced:

<c:errors xmlns:c="http://www.w3.org/2007/03/xproc-step">
  <c:error name="xform" type="p:xslt"
             href="style.xsl" line="6">This stylesheet is <emph>pointless</emph>.</c:error>
</c:errors>

It is not an error for steps to generate non-standard error output as long as it is well-formed.

4.6.2 Example

A pipeline might attempt to process a document by dispatching it to some web service. If the web service succeeds, then those results are passed to the rest of the pipeline. However, if the web service cannot be contacted or reports an error, the p:catch step can provide some sort of default for the rest of the pipeline.

Example 9. An Example Try/Catch
<p:try>
  <p:group>
    <p:http-request>
      <p:input port="source">
        <p:inline>
          <c:request method="post" href="http://example.com/form-action">
            <c:entity-body content-type="application/x-www-form-urlencoded">
              <c:body>name=W3C&amp;spec=XProc</c:body>
            </c:entity-body>
          </c:request>
        </p:inline>
      </p:input>
    </p:http-request>
  </p:group>
  <p:catch>
    <p:identity>
      <p:input port="source">
        <p:inline>
          <c:error>HTTP Request Failed</c:error>
        </p:inline>
      </p:input>
    </p:identity>
  </p:catch>
</p:try>

4.7 Atomic Steps

In addition to the six step types described in the preceding sections, XProc provides a standard library of atomic step types. The full vocabulary of standard steps is described in Section 7, “Standard Step Library”.

All of the standard, atomic steps are invoked in the same way:

<pfx:atomic-step
  name? = NCName>
    (p:input |
     p:with-option |
     p:with-param |
     p:log)*
</pfx:atomic-step>

Where pfx:atomic-step must be in the XProc namespace and must be declared in either the standard library for the XProc version supported by the processor or explicitly imported by the surrounding pipeline (see Section 2.11, “Versioning Considerations”).

4.8 Extension Steps

Pipeline authors may also have access to additional steps not defined or described by this specification. Atomic extension steps are invoked just like standard steps:

<pfx:atomic-step
  name? = NCName>
    (p:input |
     p:with-option |
     p:with-param |
     p:log)*
</pfx:atomic-step>

Extension steps must not be in the XProc namespace and there must be a visible step declaration at the point of use (see Section 3.2, “Scoping of Names”).

If the relevant step declaration has no subpipeline, then that step invokes the declared atomic step, which the processor must know how to perform. These steps are implementation-defined extensions.

If the relevant step declaration has a subpipeline, then that step runs the declared subpipeline. These steps are user- or implementation-defined extensions. Pipelines can refer to themselves (recursion is allowed), to pipelines defined in imported libraries, and to other pipelines in the same library if they are in a library.

It is a static error (err:XS0010) if a pipeline contains a step whose specified inputs, outputs, and options do not match the signature for steps of that type.

It is a dynamic error (err:XD0017) if the running pipeline attempts to invoke a step which the processor does not know how to perform.

The presence of other compound steps is implementation-defined; XProc provides no standard mechanism for defining them or describing what they can contain. It is a static error (err:XS0048) to use a declared step as a compound step.

4.8.1 Syntactic Shortcut for Option Values

Namespace qualified attributes on a step are extension attributes. Attributes, other than name, that are not namespace qualified are treated as a syntactic shortcut for specifying the value of an option. In other words, the following two steps are equivalent:

The first step uses the standard p:with-option syntax:

<ex:stepType>
  <p:with-option name="option-name" select="'some value'"/>
</ex:stepType>

The second step uses the syntactic shortcut:

<ex:stepType option-name="some value"/>

Note that there are significant limitations to this shortcut syntax:

  1. It only applies to option names that are not in a namespace.

  2. It only applies to option names that are not otherwise used on the step, such as “name”.

  3. It can only be used to specify a constant value. Options that are computed at runtime must be written using the longer form.

It is a static error (err:XS0027) if an option is specified with both the shortcut form and the long form. It is a static error (err:XS0031) to use an option on an atomic step that is not declared on steps of that type.

The syntactic shortcuts apply equally to standard atomic steps and extension atomic steps.
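A brief, non-normative sketch of these limitations follows. It assumes a hypothetical extension step ex:stepType that declares an option named option-name, and an input document carrying a /config/@level attribute; both are illustrative assumptions.

```xml
<p:group xmlns:p="http://www.w3.org/ns/xproc"
         xmlns:ex="http://example.com/ns/steps">
  <!-- Constant value: the shortcut form is sufficient. -->
  <ex:stepType option-name="some value"/>

  <!-- Value computed at runtime from the default readable port:
       the long form is required. -->
  <ex:stepType>
    <p:with-option name="option-name" select="/config/@level"/>
  </ex:stepType>
</p:group>
```

Using both forms for the same option on a single step would be the static error err:XS0027 described above.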

5 Other pipeline elements

5.1 p:input

A p:input identifies an input port for a step. In some contexts, p:input declares that a port with the specified name exists and identifies the properties of that port. In other contexts, it provides a binding for a port with a specified name (in which case it must have been declared elsewhere). And in some contexts, it does both. The semantics of p:input are complicated further by the fact that there are two kinds of inputs, ordinary “document” inputs and “parameter” inputs.

On a p:declare-step, the p:input element is only a declaration. On a p:pipeline, it is both a declaration and a binding. In other contexts, it is only a binding.

5.1.1 Document Inputs

The declaration of a document input identifies the name of the port, whether or not the port accepts a sequence, whether or not the port is a primary input port, and may provide a default binding for the port. It is a static error for a p:pipe to appear in a default binding. An input declaration has the following form:

<p:input
  port = NCName
  sequence? = boolean
  primary? = boolean
  kind? = "document"
  select? = XPathExpression>
    (p:empty |
      (p:document |
       p:inline)+)?
</p:input>

The port attribute defines the name of the port. It is a static error (err:XS0011) to identify two ports with the same name on the same step.

The sequence attribute determines whether or not a sequence of documents is allowed on the port. If sequence is not specified, or has the value “false”, then it is a dynamic error (err:XD0006) unless exactly one document appears on the declared port.

The primary attribute is used to identify the primary input port. An input port is a primary input port if primary is specified with the value “true” or if the step has only a single input port and primary is not specified. It is a static error (err:XS0030) to specify that more than one input port is the primary.

The kind attribute distinguishes between the two kinds of inputs: document inputs and parameter inputs. An input port is a document input port if kind is specified with the value “document” or if kind is not specified.

If a default binding is provided, then select may be used to select a portion of the input identified by the p:empty, p:document, or p:inline elements in the p:input.

On an atomic step, a p:input specifies a binding for the input. An input binding has the following form:

<p:input
  port = NCName
  select? = XPathExpression>
    (p:empty |
      (p:pipe |
       p:document |
       p:inline)+)?
</p:input>

If no binding is provided for a primary input port, the input will be bound to the default readable port. It is a static error (err:XS0032) if no binding is provided and the default readable port is undefined.

A select expression may also be provided with a binding. The select expression, if specified, applies the specified XPath select expression to the document(s) that are read. Each selected node is wrapped in a document (unless it is a document) and provided to the input port. In other words,

<p:input port="source">
  <p:document href="http://example.org/input.html"/>
</p:input>

provides a single document, but

<p:input port="source" select="//html:div" xmlns:html="http://www.w3.org/1999/xhtml">
  <p:document href="http://example.org/input.html"/>
</p:input>

provides a sequence of zero or more documents, one for each html:div in http://example.org/input.html. (Note that in the case of nested html:div elements, this may result in the same content being returned in several documents.)

A select expression can equally be applied to input read from another step. This input:

<p:input port="source" select="//html:div" xmlns:html="http://www.w3.org/1999/xhtml">
  <p:pipe step="origin" port="result"/>
</p:input>

provides a sequence of zero or more documents, one for each html:div in the document (or each of the documents) that is read from the result port of the step named origin.

It is a dynamic error (err:XD0016) if the select expression on a p:input returns anything other than a possibly empty set of element or document nodes.

When a p:input is used in any context where it provides only a binding (e.g., on an atomic step), it is a static error (err:XS0012) if the port given does not match the name of an input port specified in the step's declaration.

An input declaration may include a default binding. If no binding is provided for an input port which has a default binding, then the input is treated as if the default binding appeared.

A default binding does not satisfy the requirement that a primary input port is automatically connected by the processor, nor is it used when no default readable port is defined. In other words, a p:declare-step or a p:pipeline can define defaults for all of its inputs, whether they are primary or not, but defining a default for a primary input usually has no effect. It's never used by an atomic step since the step, when it's called, will always bind the primary input port to the default readable port (or cause a static error). The only case where it has value is on a p:pipeline when that pipeline is invoked directly by the processor. In that case, the processor must use the default binding if no external binding is provided for the port.
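The one useful case can be sketched as follows (non-normative; the document URI is an assumption made for illustration). If a processor runs this pipeline directly and no external binding is supplied for source, the default binding is used.

```xml
<p:pipeline name="main" xmlns:p="http://www.w3.org/ns/xproc">
  <!-- Declaration with a default binding; p:pipe is not allowed here -->
  <p:input port="source">
    <p:document href="http://example.com/default-input.xml"/>
  </p:input>
  <p:output port="result"/>

  <!-- The unbound primary input of p:identity reads the default readable port -->
  <p:identity/>
</p:pipeline>
```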

5.1.2 Parameter Inputs

The declaration of a parameter input identifies the name of the port and that the port is a parameter input.

<p:input
  port = NCName
  sequence? = boolean
  primary? = boolean
  kind = "parameter" />

The port attribute defines the name of the port. It is a static error (err:XS0011) to identify two ports with the same name on the same step.

The sequence attribute determines whether or not a sequence of documents is allowed on the port. A sequence of documents is always allowed on a parameter input port. It is a static error (err:XS0040) to specify any value other than “true”.

The primary attribute is used to identify the primary parameter input port. An input port is a primary parameter input port if it is a parameter input port and primary is specified with the value “true” or if the step has only a single parameter input port and primary is not specified. It is a static error (err:XS0030) to specify that more than one parameter input port is the primary.

The kind attribute distinguishes between the two kinds of inputs: document inputs and parameter inputs. An input port is a parameter input port only if the kind attribute is specified with the value “parameter”. It is a static error (err:XS0033) to specify any kind of input other than “document” or “parameter”.

A parameter input port is a distinguished kind of input port. It exists only to receive computed parameters; if a step does not have a parameter input port then it cannot receive parameters. A parameter input port must satisfy all the constraints of a normal, document input port.

It is a static error (err:XS0035) if the declaration of a parameter input port contains a binding; parameter input port declarations must be empty.

When used on a step, parameter input ports always accept a sequence of documents. If no binding is provided for a primary parameter input port, then the port will be bound to the primary parameter input port of the pipeline which contains the step. If no binding is provided for a parameter input port other than the primary parameter input port, then the port will be bound to an empty sequence of documents. It is a static error (err:XS0055) if a primary parameter input port has no binding and the pipeline that contains the step has no primary parameter input port.

If a parameter input port on a p:pipeline is not bound, it is treated as if it was bound to an automatically created p:sink step. In other words, if a p:pipeline does not contain any steps that have parameter input ports, or if those ports are all explicitly bound elsewhere, the parameter input port is ignored. In this one case, it is not an error for an input port to be unbound.

If a binding is manufactured for a primary parameter input port, that binding occurs logically last among the other parameters, options, and bindings passed to the step. In other words, the parameter values that appear on that port will be used even if other values were specified with p:with-param elements. Users can change this priority by making the binding explicit and placing any p:with-param elements that they wish to function as overrides after the binding.

All of the documents that appear on a parameter input must either be c:param documents or c:param-set documents.

A step which accepts a parameter input reads all of the documents presented on that port, using each c:param (either at the root or inside the c:param-set) to establish the value of the named parameter. If the same name appears more than once, the last value specified is used. If the step also has literal p:with-param elements, they are also considered in document order. In other words, p:with-param elements that appear before the parameter input may be overridden by the computed parameters; p:with-param elements that appear after may override the computed values.

Consider the example in Example 10, “A Parameter Example”.

Example 10. A Parameter Example
<p:pipeline name="main"
       xmlns:p="http://www.w3.org/ns/xproc"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<p:input port="source"/>
<p:input port="parameters" kind="parameter"/>
<p:output port="result"/>

<p:xslt>
  <p:input port="source">
    <p:pipe step="main" port="source"/>
  </p:input>
  <p:input port="stylesheet">
    <p:document href="http://example.com/stylesheets/doc.xsl"/>
  </p:input>
  <p:with-param name="output-type" select="'html'"/>
  <p:input port="parameters">
    <p:pipe step="main" port="parameters"/>
  </p:input>
</p:xslt>

</p:pipeline>

This p:pipeline declares that it accepts parameters. Suppose that (through some implementation-defined mechanism) I have passed the parameters “output-type=fo” and “profile=unclassified” to the pipeline. These parameters are available on the parameters input port.

When the XSLT step runs, it will read those parameters and combine them with any parameters specified literally on the step. Because the parameter input comes after the literal declaration for output-type on the step, the XSLT stylesheet will see both values that I passed in (“output-type=fo” and “profile=unclassified”).

If the parameter input came before the literal declaration, then the XSLT stylesheet would see “output-type=html” and “profile=unclassified”.
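For comparison, that reordered variant of the p:xslt step might be written as shown below. This is a non-normative sketch that reuses the names from Example 10; only the position of the parameter input changes.

```xml
<p:xslt xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="source">
    <p:pipe step="main" port="source"/>
  </p:input>
  <p:input port="stylesheet">
    <p:document href="http://example.com/stylesheets/doc.xsl"/>
  </p:input>
  <!-- Computed parameters are read first... -->
  <p:input port="parameters">
    <p:pipe step="main" port="parameters"/>
  </p:input>
  <!-- ...so this later, literal value overrides any output-type passed in -->
  <p:with-param name="output-type" select="'html'"/>
</p:xslt>
```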

Most steps don't bother to declare parameter inputs, or provide explicit bindings for them, and “the right thing” usually happens.

5.1.2.1 The c:param element

A c:param represents a parameter on a parameter input.

<c:param
  name = QName
  namespace? = anyURI
  value = string />

The name attribute of the c:param must have the lexical form of a QName.

If the namespace attribute is specified, then the expanded name of the parameter is constructed from the specified namespace and the local-name part of the name value (in other words, the prefix, if any, is ignored).

If the namespace attribute is not specified, and the name contains a colon, then the expanded name of the parameter is constructed using the name value and the namespace declarations in-scope on the c:param element.

If the namespace attribute is not specified, and the name does not contain a colon, then the expanded name of the parameter is in no namespace.

Any namespace-qualified attribute names that appear on the c:param element are ignored. It is a dynamic error (err:XD0014) for any unqualified attribute names other than “name”, “namespace”, or “value” to appear on a c:param element.
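The three naming rules can be sketched side by side (non-normative). The ex prefix and both example.com namespace URIs are assumptions made for illustration; the c namespace is the one used in the error example above.

```xml
<c:param-set xmlns:c="http://www.w3.org/2007/03/xproc-step"
             xmlns:ex="http://example.com/ns/a">
  <!-- No colon and no namespace attribute:
       the parameter depth is in no namespace. -->
  <c:param name="depth" value="3"/>

  <!-- Colon, no namespace attribute: the prefix is resolved with the
       in-scope declarations, giving {http://example.com/ns/a}depth. -->
  <c:param name="ex:depth" value="4"/>

  <!-- namespace attribute present: the prefix is ignored,
       giving {http://example.com/ns/b}depth. -->
  <c:param name="ex:depth" namespace="http://example.com/ns/b" value="5"/>
</c:param-set>
```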

5.1.2.2 The c:param-set element

A c:param-set represents a set of parameters on a parameter input.

<c:param-set>
    c:param*
</c:param-set>

The c:param-set contains zero or more c:param elements. It is a dynamic error (err:XD0018) if the parameter list contains any elements other than c:param.

Any namespace-qualified attribute names that appear on the c:param-set element are ignored. It is a dynamic error (err:XD0014) for any unqualified attribute names to appear on a c:param-set element.

5.2 p:iteration-source

A p:iteration-source identifies input to a p:for-each.

<p:iteration-source
  select? = XPathExpression>
    (p:empty |
      (p:pipe |
       p:document |
       p:inline)+)?
</p:iteration-source>

The select attribute and binding of a p:iteration-source work the same way that they do in a p:input.
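A non-normative sketch, reusing the document URI and select expression from the p:input examples above; the choice of p:identity as the contained step is an assumption made for illustration.

```xml
<p:for-each xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- One iteration for each html:div selected from the bound document -->
  <p:iteration-source select="//html:div">
    <p:document href="http://example.org/input.html"/>
  </p:iteration-source>

  <p:identity/>
</p:for-each>
```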

5.3 p:viewport-source

A p:viewport-source identifies input to a p:viewport.

<p:viewport-source>
    (p:pipe |
      p:document |
      p:inline)?
</p:viewport-source>

Only one binding is allowed and it works the same way that bindings work on a p:input. It is a dynamic error (err:XD0006) unless exactly one document appears on the p:viewport-source. No select expression is allowed.

5.4 p:output

A p:output identifies an output port, optionally declaring it, if necessary.

<p:output
  port = NCName
  sequence? = boolean
  primary? = boolean />

The port attribute defines the name of the port. It is a static error (err:XS0011) to identify two ports with the same name on the same step. It is a static error if the port given does not match the name of an output port specified in the step's declaration.

An output declaration can indicate if a sequence of documents is allowed to appear on the declared port. If sequence is specified with the value “true”, then a sequence is allowed. If sequence is not specified on p:output, or has the value “false”, then it is a dynamic error (err:XD0007) if the step does not produce exactly one document on the declared port.

The primary attribute is used to identify the primary output port. An output port is a primary output port if primary is specified with the value “true” or if the step has only a single output port and primary is not specified. It is a static error (err:XS0014) to identify more than one output port as primary.

On compound steps, the declaration may be accompanied by a binding for the output.

<p:output
  port = NCName
  sequence? = boolean
  primary? = boolean>
    (p:empty |
      (p:pipe |
       p:document |
       p:inline)+)?
</p:output>

It is a static error (err:XS0029) to specify a binding for a p:output inside a p:declare-step for an atomic step.

If a binding is provided for a p:output, documents are read from that binding and those documents form the output that is written to the output port. In other words, placing a p:document inside a p:output causes the processor to read that document and provide it on the output port. It does not cause the processor to write the output to that document.
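The distinction can be sketched as follows (non-normative). The port name notes, the step name copy, and the document URI are assumptions made for illustration.

```xml
<p:group xmlns:p="http://www.w3.org/ns/xproc">
  <!-- Output bound to the result of a contained step -->
  <p:output port="result">
    <p:pipe step="copy" port="result"/>
  </p:output>

  <!-- This document is read and delivered on the notes port;
       nothing is written to the URI -->
  <p:output port="notes">
    <p:document href="http://example.com/notes.xml"/>
  </p:output>

  <p:identity name="copy"/>
</p:group>
```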

5.5 p:log

A p:log element is a debugging aid. It associates a URI with a specific output port on a step:

<p:log
  port = NCName
  href? = anyURI />

The semantics of p:log are that it writes to the specified URI whatever document or documents appear on the specified port. If the href attribute is not specified, the location of the log file or files is implementation-defined.

How a sequence of documents is represented in a p:log is implementation-defined.

It is a static error (err:XS0026) if the port specified on the p:log is not the name of an output port on the step in which it appears or if more than one p:log element is applied to the same port.

Implementations may, at user option, ignore all p:log elements.
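A non-normative sketch of p:log on an atomic step. The stylesheet URI and log file location are assumptions made for illustration, as is the use of result as the output port name of p:xslt.

```xml
<p:xslt xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="stylesheet">
    <p:document href="http://example.com/stylesheets/doc.xsl"/>
  </p:input>
  <!-- Whatever appears on the result port is also written to this URI -->
  <p:log port="result" href="file:///tmp/xslt-result.xml"/>
</p:xslt>
```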

Note

This element represents a potential security risk: running unexamined 3rd-party pipelines could result in vital system resources being overwritten.

5.6 p:serialization

The p:serialization element allows the user to request serialization properties on a p:pipeline output.

<p:serialization
  port = NCName
  byte-order-mark? = boolean
  cdata-section-elements? = NMTOKENS
  doctype-public? = string
  doctype-system? = string
  encoding? = string
  escape-uri-attributes? = boolean
  include-content-type? = boolean
  indent? = boolean
  media-type? = string
  method? = QName
  normalization-form? = NFC|NFD|NFKC|NFKD|fully-normalized|none|xs:NMTOKEN
  omit-xml-declaration? = boolean
  standalone? = true|false|omit
  undeclare-prefixes? = boolean
  version? = string />

If the pipeline processor serializes the output on the specified port, it must use the serialization options specified. If the processor is not serializing (if, for example, the pipeline has been called from another pipeline), then the p:serialization must be ignored. The processor may reject statically a pipeline that requests serialization options that it cannot provide.

The default value of any unspecified serialization option is implementation-defined.

The semantics of the attributes on a p:serialization are described in Section 7.3, “Serialization Options”.

It is a static error (err:XS0039) if the port specified on the p:serialization is not the name of an output port on the pipeline in which it appears or if more than one p:serialization element is applied to the same port.
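A non-normative sketch of a pipeline that requests serialization properties for its result port. The particular option values are assumptions chosen for illustration, and the placement of p:serialization after the port declarations is likewise an assumption.

```xml
<p:pipeline name="main" xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Honoured only if the processor itself serializes the result port -->
  <p:serialization port="result"
                   method="xml"
                   encoding="UTF-8"
                   indent="true"
                   omit-xml-declaration="false"/>

  <p:identity/>
</p:pipeline>
```

If this pipeline is called from another pipeline, the p:serialization element is ignored, as described above.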

5.7 Variables, Options, and Parameters

Variables, options, and parameters provide a mechanism for pipeline authors to construct temporary results and hold onto them for reuse.

Variables are created in compound steps and, like XSLT variables, are single assignment, though they may be shadowed by subsequent declarations of other variables with the same name.

Options can be declared on atomic or compound steps. The value of an option can be specified by the caller invoking the step. Any value specified by the caller takes precedence over any default value specified in the declaration.

Parameters, unlike options and variables, have names that can be computed at runtime. The most common use of parameters is to pass parameter values to XSLT stylesheets.

5.7.1 p:variable

A p:variable declares a variable and associates a value with it.

The name of the variable must be a QName. If it does not contain a prefix then it is in no namespace. It is a static error (err:XS0028) to declare an option or variable in the XProc namespace.

<p:variable
  name = QName
  select = XPathExpression>
    ((p:empty |
      p:pipe |
      p:document |
      p:inline)? &
     p:namespaces*)
</p:variable>

If a select expression is given, it is evaluated as an XPath expression using the context defined in Section 2.6.1, “Processor XPath Context”, for the enclosing container, with the addition of bindings for all preceding-sibling p:variable and p:option elements. Regardless of the implicit type of the expression, when XPath 1.0 is being used, the string value of the expression becomes the value of the variable; when XPath 2.0 is being used, the type is treated as an untypedAtomic.

Since all in-scope bindings are present in the Processor XPath Context as variable bindings, select expressions may refer to the value of in-scope bindings by variable reference. If a variable reference uses a QName that is not the name of an in-scope binding, an XPath evaluation error will occur.

If a select expression is given, the readable ports available for document binding are the readable ports in the environment inherited by the first step in the surrounding container's contained steps. However, in order to avoid ordering paradoxes, it is a static error (err:XS0019) for a variable's document binding to refer to the output port of any step in the surrounding container's contained steps.

If a select expression is given but no document binding is provided, the implicit binding is to the default readable port in the environment inherited by the first step in the surrounding container's contained steps. It is a static error (err:XS0032) if no document binding is provided and the default readable port is undefined. It is a dynamic error (err:XD0008) if a document sequence is specified in the document binding for a p:variable. In an XPath 1.0 implementation, if p:empty is given as the document binding, an empty document node is used as the context node. In an XPath 2.0 implementation, the context item is undefined.
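A non-normative sketch of variable declarations in a compound step. The step ex:report, its heading option, and the document URI are hypothetical names introduced for illustration.

```xml
<p:group xmlns:p="http://www.w3.org/ns/xproc"
         xmlns:ex="http://example.com/ns/steps">
  <!-- The context for this select expression comes from an explicit
       document binding -->
  <p:variable name="title" select="/doc/title">
    <p:document href="http://example.com/doc.xml"/>
  </p:variable>

  <!-- Refers to the preceding-sibling variable; the expression needs no
       context document, so p:empty is given as the binding -->
  <p:variable name="heading" select="concat($title, ' (draft)')">
    <p:empty/>
  </p:variable>

  <ex:report>
    <p:with-option name="heading" select="$heading">
      <p:empty/>
    </p:with-option>
  </ex:report>
</p:group>
```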

5.7.2 p:option

A p:option declares an option and may associate a default value with it. The p:option tag can only be used in a p:declare-step or a p:pipeline (which is a syntactic abbreviation for a step declaration).

The name of the option must be a QName. If it does not contain a prefix then it is in no namespace. It is a static error (err:XS0028) to declare an option or variable in the XProc namespace.

<p:option
  name = QName
  required? = boolean />

An option may be declared as required. If an option is required, it is a static error (err:XS0018) to invoke the step without specifying a value for that option.

If an option is not declared to be required, it may be given a default value. The value is specified with a select attribute.

It is a static error (err:XS0017) to specify that an option is both required and has a default value.

If a select attribute is specified, its content is an XPath expression which will be evaluated to provide the value of the option, which may differ from one instance of the step type to another.

<p:option
  name = QName
  select = XPathExpression />

The select expression is only evaluated when its actual value is needed by an instance of the step type being declared. In this case, it is evaluated as described in Section 5.7.3, “p:with-option”, except that

  • the context node is an empty document node;

  • the variable bindings consist only of bindings for options whose declaration precedes the p:option itself in the surrounding step signature;

  • the in-scope namespaces are the in-scope namespaces of the p:option itself.

It follows that if the select expression contains a variable reference that uses a QName that is not the name of a preceding sibling p:option declaration, an XPath evaluation error will occur.

Regardless of the implicit type of the expression, when XPath 1.0 is being used, the string value of the expression becomes the value of the option; when XPath 2.0 is being used, the value is an untypedAtomic.
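A non-normative sketch of option declarations. The step type ex:publish and its option names are hypothetical, and the type attribute on p:declare-step is assumed here rather than defined in this section.

```xml
<p:declare-step type="ex:publish"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns/steps">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Required: invoking the step without a value is err:XS0018 -->
  <p:option name="target" required="true"/>

  <!-- Optional, with a constant default value -->
  <p:option name="format" select="'html'"/>

  <!-- The default may refer to an option declared earlier in the signature -->
  <p:option name="suffix" select="concat('.', $format)"/>
</p:declare-step>
```

Declaring suffix before format would leave $format unbound when the default is evaluated, producing the XPath evaluation error described above.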

5.7.3 p:with-option

A p:with-option provides an actual value for an option when a step is invoked.

The name of the option must be a QName. If it does not contain a prefix then it is in no namespace. It is a static error (err:XS0031) to use an option name in p:with-option if the step type being invoked has not declared an option with that name.

It is a static error (err:XS0004) to include more than one p:with-option with the same option name as part of the same step invocation.

The actual value is specified with a select attribute. It is a static error (err:XS0016) if the select attribute is not specified. The value of the select attribute is an XPath expression which will be evaluated to provide the value of the option.

<p:with-option
  name = QName
  select = XPathExpression>
    ((p:empty |
      p:pipe |
      p:document |
      p:inline)? &
     p:namespaces*)
</p:with-option>

Regardless of the implicit type of the expression, when XPath 1.0 is being used, the string value of the expression becomes the value of the option; when XPath 2.0 is being used, the value is an untypedAtomic.

All in-scope bindings for the step instance itself are present in the Processor XPath Context as variable bindings, so select expressions may refer to any option or variable bound in those in-scope bindings by variable reference. If the variable reference uses a QName that is not the name of an in-scope binding or preceding sibling option, an XPath evaluation error will occur.

If a select expression is used but no document binding is provided, the implicit binding is to the default readable port. It is a static error (err:XS0032) if no document binding is provided and the default readable port is undefined. It is a dynamic error (err:XD0008) if a document sequence is specified in the binding for a p:with-option. In an XPath 1.0 implementation, if p:empty is given as the document binding, an empty document node is used as the context node. In an XPath 2.0 implementation, the context item is undefined.
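
As a non-normative sketch of an explicit document binding, the fragment below takes the value of the match option from a document produced by another step. The step name “config” and the path /config/@pattern are assumptions made for illustration only; they are not defined by this specification.

<p:delete xmlns:p="http://www.w3.org/ns/xproc">
  <!-- "config" is a hypothetical step whose result port is assumed to
       yield a single document such as <config pattern="div"/>. -->
  <p:with-option name="match" select="/config/@pattern">
    <p:pipe step="config" port="result"/>
  </p:with-option>
</p:delete>

Because the binding is explicit, the default readable port is not consulted. If the hypothetical “config” step produced a sequence of documents on that port, err:XD0008 would be raised.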

5.7.4 p:with-param

The p:with-param element is used to establish the value of a parameter. The parameter must be given a value when it is used. (Parameter names aren't known in advance; there's no provision for declaring them.)

The name of the parameter must be a QName. If it does not contain a prefix then it is in no namespace. It is a static error (err:XS0028) to use the XProc namespace in the name of a parameter.

<p:with-param
  name = QName
  select = XPathExpression
  port? = NCName>
    ((p:empty |
      p:pipe |
      p:document |
      p:inline)? &
     p:namespaces*)
</p:with-param>

The values of parameters for a step must be computed after all the options in the step's signature have had their values computed. If a select expression is given on a p:with-param, it is evaluated as an XPath expression using the context defined in Section 2.6.1, “Processor XPath Context”, for the surrounding step, with the addition of variable bindings for all options declared in the surrounding step's signature.

Regardless of the implicit type of the expression, when XPath 1.0 is being used, the string value of the expression becomes the value of the parameter; when XPath 2.0 is being used, the value is an untypedAtomic.

All in-scope bindings for the step instance itself are present in the Processor XPath Context as variable bindings, so select expressions may refer to any option or variable bound in those in-scope bindings, as well as to any option declared in the step signature, by variable reference. If a variable reference uses a QName that is not the name of an in-scope binding or declared option, an XPath evaluation error will occur.

If a select expression is used but no document binding is provided, the implicit binding is to the default readable port. It is a static error (err:XS0032) if no document binding is provided and the default readable port is undefined. It is a dynamic error if a document sequence is specified in the binding for a p:with-param. In an XPath 1.0 implementation, if p:empty is given as the document binding, an empty document node is used as the context node. In an XPath 2.0 implementation, the context item is undefined.

If the optional port attribute is specified, then the parameter appears on the named port, otherwise the parameter appears on the step's primary parameter input port. It is a static error (err:XS0034) if the specified port is not a parameter input port or if no port is specified and the step does not have a primary parameter input port.
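
The following non-normative sketch shows both forms. It assumes that the p:xslt step has an input port named “stylesheet” and a primary parameter input port named “parameters”; those port names, the file style.xsl, the parameter names, and the /doc/@status path are illustrative assumptions, not statements made in this section.

<p:xslt xmlns:p="http://www.w3.org/ns/xproc">
  <p:input port="stylesheet">
    <!-- Hypothetical stylesheet location. -->
    <p:document href="style.xsl"/>
  </p:input>
  <!-- No port attribute: the parameter appears on the step's primary
       parameter input port. -->
  <p:with-param name="heading" select="'Release notes'"/>
  <!-- Explicit port. Under XPath 1.0 the boolean result is passed as
       its string value, "true" or "false". -->
  <p:with-param name="draft" select="/doc/@status = 'draft'" port="parameters"/>
</p:xslt>

The second p:with-param has no document binding, so its select expression is evaluated against the default readable port.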

5.7.5 Namespaces on variables, options, and parameters

Variable, option and parameter values carry with them not only their literal or computed string value but also a set of namespaces. To see why this is necessary, consider the following step:

<p:delete xmlns:p="http://www.w3.org/ns/xproc">
  <p:with-option name="match" select="'html:div'"
            xmlns:html="http://www.w3.org/1999/xhtml"/>
</p:delete>

The p:delete step will delete elements that match the expression “html:div”, but that expression can only be correctly interpreted if there's a namespace binding for the prefix “html” so that binding has to travel with the option.

The default namespace bindings associated with a variable, option or parameter value are computed as follows:

  1. If the select attribute was used to specify the value and it consisted of a single VariableReference (per [XPath 1.0] or [XPath 2.0], as appropriate), then the namespace bindings from the referenced option or variable are used.

  2. If the select attribute was used to specify the value and it evaluated to a node-set, then the in-scope namespaces from the first node in the selected node-set (or, if it's not an element, its parent) are used.

    The expression is evaluated in the appropriate context. See Section 2.6, “XPaths in XProc”.

  3. Otherwise, the in-scope namespaces from the element providing the value are used.

The default namespace is never included in the namespace bindings for a variable, option or parameter value. Unqualified names are always in no-namespace.
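
As a non-normative illustration of the first rule, the sketch below passes an option straight through to p:delete. The step type ex:delete-matching and the option name “pattern” are hypothetical.

<p:pipeline type="ex:delete-matching"
            xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.org/ns/ex">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="pattern" required="true"/>

<p:delete>
  <!-- The select attribute is a single VariableReference, so the
       namespace bindings carried by $pattern are reused for match. -->
  <p:with-option name="match" select="$pattern"/>
</p:delete>

</p:pipeline>

Whatever prefixes the caller used in the value of “pattern” therefore remain resolvable when p:delete interprets the match expression.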

Unfortunately, in more complex situations, there may be no single variable, option or parameter that can reliably be expected to have the correct set of namespace bindings. Consider this pipeline:

<p:pipeline type="ex:delete-in-div"
            xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.org/ns/ex"
            xmlns:h="http://www.w3.org/1999/xhtml">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="divchild" required="true"/>

<p:delete>
  <p:with-option name="match" select="concat('h:div/',$divchild)"/>
</p:delete>

</p:pipeline>

It defines an atomic step (“ex:delete-in-div”) that deletes elements that occur inside of XHTML div elements. It might be used as follows:

<ex:delete-in-div xmlns:p="http://www.w3.org/ns/xproc"
                  xmlns:ex="http://example.org/ns/ex">
  <p:with-option name="divchild" select="'html:p[@class=&quot;delete&quot;]'"
                 xmlns:html="http://www.w3.org/1999/xhtml"/>
</ex:delete-in-div>

In this case, the match option passed to the p:delete step needs both the namespace binding of “h” specified in the ex:delete-in-div pipeline definition and the namespace binding of “html” specified in the divchild option on the call of that pipeline. It's not sufficient to provide just one of the sets of bindings.

The p:namespaces element can be used as a child of p:variable, p:with-option or p:with-param to provide explicit bindings.

<p:namespaces
  binding? = QName
  element? = XPathExpression
  except-prefixes? = prefix list />

The namespace bindings specified by a p:namespaces element are determined as follows:

  1. If the binding attribute is specified, it must contain the name of a single in-scope binding. The namespace bindings associated with that binding are used. It is a static error (err:XS0020) if the binding attribute on p:namespaces is specified and its value is not the name of an in-scope binding.

  2. If the element attribute is specified, it must contain an XPath expression which identifies a single element node (the input binding for this expression is the same as the binding for the p:option or p:with-param which contains it). The in-scope namespaces of that node are used.

    The expression is evaluated in the appropriate context. See Section 2.6, “XPaths in XProc”.

    It is a dynamic error (err:XD0009) if the element attribute on p:namespaces is specified and it does not identify a single element node.

  3. If neither binding nor element is specified, the in-scope namespaces on the p:namespaces element itself are used.

Irrespective of how the set of namespaces is determined, the except-prefixes attribute can be used to exclude one or more namespaces. The value of the except-prefixes attribute must be a sequence of tokens, each of which must be a prefix bound to a namespace in the in-scope namespaces of the p:namespaces element. All bindings of prefixes to each of the namespaces thus identified are excluded. It is a static error (err:XS0051) if the except-prefixes attribute on p:namespaces does not contain a list of tokens or if any of those tokens is not a prefix bound to a namespace in the in-scope namespaces of the p:namespaces element.

It is a static error (err:XS0041) to specify both binding and element on the same p:namespaces element.

If a p:variable, p:with-option or p:with-param includes one or more p:namespaces elements, then the union of all the namespaces specified on those elements is used as the bindings for the variable, option or parameter value. In this case, the in-scope namespaces on the p:variable, p:with-option or p:with-param are ignored. It is a dynamic error (err:XD0013) if the specified namespace bindings are inconsistent; that is, if the same prefix is bound to two different namespace names.

For example, this would allow the preceding example to work:

<p:pipeline type="ex:delete-in-div"
            xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.org/ns/ex"
            xmlns:h="http://www.w3.org/1999/xhtml">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="divchild" required="true"/>

<p:delete>
  <p:with-option name="match" select="concat('h:div/',$divchild)">
    <p:namespaces xmlns:h="http://www.w3.org/1999/xhtml"
                  xmlns:html="http://www.w3.org/1999/xhtml"/>
  </p:with-option>
</p:delete>

</p:pipeline>

The p:namespaces element provides namespace bindings for both of the prefixes necessary to correctly interpret the expression ultimately passed to the p:delete step.

This solution has the weakness that it depends on knowing the bindings that will be used by the caller. A more flexible solution would use the binding attribute to copy the bindings from the caller's option value.

<p:pipeline type="ex:delete-in-div"
            xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.org/ns/ex"
            xmlns:h="http://www.w3.org/1999/xhtml">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="divchild" required="true"/>

<p:delete>
  <p:with-option name="match" select="concat('h:div/',$divchild)">
    <p:namespaces binding="divchild"/>
    <p:namespaces xmlns:h="http://www.w3.org/1999/xhtml"/>
  </p:with-option>
</p:delete>

</p:pipeline>

This example will succeed as long as the caller-specified option does not bind the “h” prefix to something other than the XHTML namespace.
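
The except-prefixes attribute can be sketched in the same setting. The fragment below is non-normative and is assumed to sit inside the ex:delete-in-div pipeline, so that $divchild is in scope; the namespace declarations are repeated on p:delete only to keep the fragment self-contained.

<p:delete xmlns:p="http://www.w3.org/ns/xproc"
          xmlns:ex="http://example.org/ns/ex"
          xmlns:h="http://www.w3.org/1999/xhtml">
  <p:with-option name="match" select="concat('h:div/',$divchild)">
    <p:namespaces binding="divchild"/>
    <!-- Contributes the in-scope namespaces of this element, minus the
         namespaces bound to "p" and "ex"; the "h" binding remains. -->
    <p:namespaces except-prefixes="p ex"/>
  </p:with-option>
</p:delete>

Excluding the pipeline's own housekeeping prefixes reduces the chance that one of them conflicts with a binding supplied by the caller, which would otherwise raise err:XD0013.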

5.8 p:declare-step

A p:declare-step provides the type and signature of an atomic step or pipeline. It declares the inputs, outputs, and options for all steps of that type.

<p:declare-step
  name? = NCName
  type? = QName
  psvi-required? = boolean
  xpath-version? = string>
    (p:input |
     p:output |
     p:option |
     p:log |
     p:serialization)*,
    ((p:declare-step |
      p:import)*,
     subpipeline)?
</p:declare-step>

The value of the type can be from any namespace provided that the expanded-QName of the value has a non-null namespace URI. It is a static error (err:XS0025) if the expanded-QName value of the type attribute is in no namespace. Except as described in Section 2.11, “Versioning Considerations”, the XProc namespace must not be used in the type of steps. Neither users nor implementers may define additional steps in the XProc namespace.

Irrespective of the context in which the p:declare-step occurs, there are initially no option or variable names in-scope inside a p:declare-step. That is, p:option and p:variable elements can refer to values declared by their preceding siblings, but not by any of their ancestors.

A step declaration is not a step in its own right. Sibling steps cannot refer to the inputs or outputs of a p:declare-step using p:pipe; only instances of the type can be referenced.

5.8.1 Declaring atomic steps

When declaring an atomic step, the subpipeline in the declaration must be empty. And, conversely, if the subpipeline in a declaration is empty, the declaration must be for an atomic step.

Implementations may use extension attributes to provide implementation-dependent information about a declared step. For example, such an attribute might identify the code which implements steps of this type.
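
A minimal, non-normative sketch of an atomic step declaration follows. The step type ex:spell-check, the “impl” namespace, the impl:class attribute and its value are all hypothetical; no processor is known to recognize them.

<p:declare-step type="ex:spell-check"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/ex"
                xmlns:impl="http://example.org/ns/impl"
                impl:class="org.example.SpellCheck">
  <!-- impl:class is an extension attribute: it is in a namespace other
       than the XProc namespace and is meaningful only to a processor
       that understands it. -->
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="language" select="'en'"/>
  <!-- The subpipeline is empty, so this declares an atomic step. -->
</p:declare-step>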

It is not an error for a pipeline to include declarations for steps that a particular processor does not know how to implement. It is, of course, an error to attempt to evaluate such steps.

If p:log or p:serialization elements appear in the declaration of an atomic step, they will only be used if the atomic step is directly evaluated by the processor. They have no effect if the step appears in a subpipeline; only the serialization options of the “top level” step or pipeline are used because that is the only step which the processor is required to serialize.

5.8.2 Declaring pipelines

When a p:declare-step declares a pipeline, that pipeline encapsulates the behavior of the specified subpipeline. Its children declare inputs, outputs, and options that the pipeline exposes and identify the steps in its subpipeline.

The subpipeline may include declarations of additional steps (e.g., other pipelines or other step types that are provided by a particular implementation or in some implementation-defined way) and import other pipelines. If a pipeline has been imported, it may be invoked as a step within the subpipeline that imported it.
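
The sketch below, which is non-normative, declares a pipeline that imports another pipeline and invokes it as a step. The step type ex:clean-page and the file library.xml are hypothetical; library.xml is assumed to contain a declaration of ex:delete-in-div like the one shown in Section 5.7.5.

<p:declare-step type="ex:clean-page"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/ex">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library assumed to declare ex:delete-in-div. -->
  <p:import href="library.xml"/>

  <!-- The imported pipeline is invoked like any other step. -->
  <ex:delete-in-div>
    <p:with-option name="divchild" select="'html:p'"
                   xmlns:html="http://www.w3.org/1999/xhtml"/>
  </ex:delete-in-div>
</p:declare-step>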

The environment inherited by the