XProc V2.0 Requirements

xproc-v2-req, 2013-11-05

Alex Milowski, Invited expert, alex@milowski.org
James Fuller, Invited expert, jim.fuller@webcomposite.com
Norman Walsh, MarkLogic Corporation, norman.walsh@marklogic.com

This is the requirements document for XProc V2.0.

This section describes the status of this document at
the time of its publication. Other documents may supersede this
document. A list of current W3C publications and the latest revision
of this technical report can be found in the W3C technical reports index
at http://www.w3.org/TR/.

This document is a First Public Working Draft produced by the
XML Processing Model
Working Group which is part of the
XML Activity.
Publication as a First Public Working Draft does not imply
endorsement by the W3C Membership. This is a draft document and may be
updated, replaced or obsoleted by other documents at any time. It is
inappropriate to cite this document as other than work in
progress.

Please send comments about this document to
public-xml-processing-model-comments@w3.org (public
archives are available).

This document was produced by a group operating under the 5
February 2004 W3C Patent Policy. W3C maintains a public
list of any patent disclosures made in connection with the
deliverables of the group; that page also includes instructions for
disclosing a patent. An individual who has actual knowledge of a
patent which the individual believes contains Essential
Claim(s) must disclose the information in accordance with section
6 of the W3C Patent Policy.

Introduction

User and implementor experience with XProc
has exposed a number of ways in which the XProc language could be
improved. The Working Group's focus for V2.0 is on usability
improvements.
The requirements in this document are divided into two groups,
a set of “must” requirements and a set of
“should” requirements. The Working Group feels that
the requirements in the “must” category are absolutely essential. The
requirements listed in the “should” category are viewed as either less
critical or more speculative in nature.

MUST Requirements

The following requirements are considered “must”
requirements for XProc V2.0.

Simplify parameters

Experience with parameters in XProc 1.0 reveals that they are
too complicated. They often cause user confusion and introduce
syntactic complexity not justified by their function. XProc v2.0 must
dramatically simplify parameters, perhaps simply removing parameter
ports altogether without replacing them with a new mechanism of
equivalent power (and complexity).
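
To make the complaint concrete, the following is a minimal XProc 1.0 sketch of passing one parameter to a stylesheet. The stylesheet name `style.xsl` and the parameter name `mode` are hypothetical, chosen only for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <!-- A dedicated parameter input port; p:xslt's own "parameters" port
       is bound to it implicitly. -->
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt>
    <p:input port="stylesheet">
      <!-- Hypothetical stylesheet location. -->
      <p:document href="style.xsl"/>
    </p:input>
    <!-- Parameters are set with p:with-param, not p:with-option,
         and travel on the parameter port rather than as options. -->
    <p:with-param name="mode" select="'draft'"/>
  </p:xslt>
</p:declare-step>
```

Even in this small case, the author has to understand a separate kind of port, its implicit binding rules, and a second element for setting values alongside p:with-option.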

Integrate non-XML documents into pipelines

Experience has shown that real-world pipelines often involve non-XML
documents. Several workarounds have been invented for special cases.
The limitation that V1.0 can only pass XML between steps makes some
pipelines difficult, if not impossible, to write.

Allowing non-XML documents to flow between
steps opens up the possibility of writing simple pipelines to work
with images, JSON, Turtle, EPUB, etc.

Align with XPath 3.0 technologies

Alignment with XQuery 3.0/XSLT 3.0
will keep features of XProc
consistent with modern XML technologies: error handling, serialization
options, features, etc. In addition, support for XPath 1.0 no
longer seems relevant; it adds complexity to the specification and is
unlikely to be implemented today. XPath 1.0 support will be removed
from XProc.

Add explicit flow handling

There are many pipelines for which the flow analysis does not
provide a convenient or predictable ordering of steps. Because some
steps have side effects not manifest in the pipeline, it may be
necessary to ensure a particular order. This facility is not supported
by XProc 1.0, but is available in implementation-defined extensions.
XProc 2.0 will standardize this facility.
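
As a sketch of the problem, consider a pipeline that stores a document and then loads it again. Nothing connects the two steps, so XProc 1.0 flow analysis alone does not order them. The example assumes an implementation-defined extension attribute of the kind XML Calabash documents as `cx:depends-on`; the attribute name, its namespace, and the file path are assumptions to check against the processor in use, not part of XProc 1.0.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Side effect: writes the source document to disk.
       The path is hypothetical. -->
  <p:store name="save" href="out/snapshot.xml"/>

  <!-- No port connects this step to "save", so the order is undefined
       unless the (assumed) extension attribute states the dependency. -->
  <p:load name="reload" href="out/snapshot.xml" cx:depends-on="save"/>
</p:declare-step>
```

A standardized facility would let pipelines like this one be portable across processors.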

Allow arbitrary XDM values in variables

XProc 1.0 restricts the values of variables, options, and
parameters to be only strings. This has proven to be an inconvenient
limitation. XProc 2.0 will allow variables, options, and parameters to
have any value insofar as possible. XProc 2.0 will also allow the
required types of variables, options, and parameters to be specified.

Allow attribute value templates

The syntactic sugar that allows step options to be expressed
concisely as attribute values on a step is foiled whenever the value
of the option must be computed by the pipeline. Allowing those options
to contain XSLT-style attribute value templates (AVTs) would simplify
many pipelines. Additionally, allowing AVTs in other places, such as
the href attribute on p:document, will be considered.
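
The difference can be sketched as follows. The pipeline is valid XProc 1.0; the option names `base` and `n` are hypothetical, and the shorthand shown in the comment is only one possible shape for the V2.0 feature, not syntax defined by this document.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:output port="result"/>
  <!-- Hypothetical pipeline options used to compute a URI. -->
  <p:option name="base" required="true"/>
  <p:option name="n" select="'1'"/>

  <!-- XProc 1.0: a computed option value cannot use the attribute
       shortcut, so the long p:with-option form is required. -->
  <p:load>
    <p:with-option name="href"
                   select="concat($base, '/chapter', $n, '.xml')"/>
  </p:load>

  <!-- Hypothetical V2.0 shorthand with an attribute value template:
       <p:load href="{$base}/chapter{$n}.xml"/>
  -->
</p:declare-step>
```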

XSLT 3.0 introduces a feature which allows
expressions in curly braces to be evaluated in element content. This
feature is similar to the facility provided by the p:template
step. Extending XProc to support curly braces in a manner
consistent with XSLT 3.0 will be considered.

Support a variety of syntactic simplifications

XProc 1.0 offers relatively few default behaviors, requiring
instead that pipelines specify every construct fully. User experience
has demonstrated that this leads to very verbose pipelines and has
been a constant source of complaint. XProc 2.0 will introduce a
variety of syntactic simplifications as an aid to readability and
usability, including but not limited to:

- <p:pipe step="name"/> binds to the primary
  output port of the step named “name”.
- <p:pipe port="secondary"/> binds to the
  “secondary” port of the step on which the default readable port
  occurs.
- <p:input port="portname" href="..."/> is a
  shortcut for a nested p:document.
- <p:input port="portname"/> is a shortcut for
  a nested p:empty.
- Allow p:inline to be optional.
- Allow curly brace expansion in p:inline (with an
  attribute to control whether or not that behavior is enabled).
- Provide a select attribute on
  p:for-each/p:viewport.
- Change all steps with a single non-primary output to have a
  single primary output.
- Consider harmonizing p:viewport-source and
  p:iteration-source.
- Add an AVT value attribute to
  options, parameters, and variables (to be used instead of
  select).
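
The first few shortcuts in the list above can be compared against what XProc 1.0 requires today. The pipeline below is a 1.0 sketch with every binding written out; the step names and `style.xsl` are hypothetical. The comment at the end shows how the same step might read with the proposed shortcuts, and is illustrative only.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:identity name="original"/>

  <!-- XProc 1.0: each binding is spelled out in full. -->
  <p:xslt name="transform">
    <p:input port="source">
      <p:pipe step="original" port="result"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="style.xsl"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- With the proposed shortcuts the same step might become:
       <p:xslt name="transform">
         <p:input port="source"><p:pipe step="original"/></p:input>
         <p:input port="stylesheet" href="style.xsl"/>
         <p:input port="parameters"/>
       </p:xslt>
  -->
</p:declare-step>
```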

Document backwards-incompatibilities

Backwards incompatibility is painful for users and will be
avoided wherever possible. However, XProc 2.0 will introduce language
features that are not backwards compatible with 1.0. The specification
must document these incompatibilities.

SHOULD Requirements

Editorial improvements

Implementation experience has demonstrated that there are areas
of the specification that didn't get the balance right between
precision for implementors and clarity for users, for example
“non-step wrappers”. The XProc 2.0 specification should attempt to
resolve these problems without introducing inordinate complexity.
The 1.0 specification also defines the p:pipeline
element as a syntactic shortcut for a particular form of
p:declare-step. While convenient in some circumstances, it
has proven to be a source of some confusion especially among new
users. XProc 2.0 may remove the p:pipeline element.

Associate arbitrary metadata with documents

Adding metadata to documents is a natural thing for pipelines to
do, either for subsequent use by the pipeline or for eventual output.
For example, the serialization options provided in an XSLT stylesheet
could be carried forward to the eventual serialization of the result
document by the pipeline. In XProc 1.0, there's no way to maintain
that association. XProc 2.0 should support the ability to associate
processor and user-defined metadata with documents.

Support steps with a dynamic number of ports

While most steps have a predetermined and static number of
inputs and outputs, this is not universally the case. In XProc 1.0, a
putative p:eval step which could run a dynamically
constructed pipeline, for example, suffers from the limitation that
the signature of the p:eval step usually differs from the
signature of the evaluated pipeline.
XProc 2.0 should provide a facility for supporting steps with a
variable number of inputs and outputs.

Provide improved status information

XProc 1.0 provides scant support for reporting the status of a
pipeline and for aiding users who are attempting to debug pipelines.
Implementation-defined extensions have demonstrated that some
additional facilities, such as a p:message step, would be
an aid to users.

XProc 2.0 will add some mechanism for reporting status messages
and will consider adding additional steps and/or language features to
aid in analysing the behavior of a running pipeline.

Provide a mechanism for importing user-defined functions

Experience with user-defined functions in XQuery and XSLT
reveals that they can be a powerful addition to the language.
Providing some feature that allowed users to extend the vocabulary of
functions available in, for example, the test expressions on
p:when elements would greatly simplify some pipelines.
Such a mechanism might take the form of the ability to load
extension functions defined in, for example, XQuery, or it might
include adding the ability to define functions in XProc.

Enhance try/catch

Support for catching errors in XProc 1.0 is limited to a simple
p:try/p:catch pair, which catches and handles
all errors uniformly. To align XProc with modern languages, the
try/catch mechanism will be extended to catch
specific errors, possibly with the addition of a “finally”
construct.
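
For reference, a minimal XProc 1.0 sketch of the existing mechanism is shown below. The file name `input.xml` and the step name `failed` are hypothetical. The single p:catch runs for every error raised inside the p:group.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:output port="result"/>

  <p:try>
    <p:group>
      <!-- Any failure in here, whatever its cause, jumps to p:catch. -->
      <p:load href="input.xml"/>
    </p:group>
    <p:catch name="failed">
      <!-- The error port of the catch carries a c:errors document;
           here it is simply passed through as the result. -->
      <p:identity>
        <p:input port="source">
          <p:pipe step="failed" port="error"/>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

Telling one failure from another in 1.0 means inspecting that c:errors document by hand, for example with a p:choose that tests c:error/@code, and there is no place to put cleanup work that should run whether or not an error occurred.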

Write a primer

A new user introduction to XProc would aid adoption.

Consider using XDM everywhere

In addition to supporting XDM values in
variables, options, and parameters, XProc 2.0 might allow XDM values in more places, such as allowing p:for-each to
iterate over a sequence of strings or integers.

Consider dividing the specification

XProc 1.0 is a specification that consists of both the language
definition and the inventory of required and optional steps. Release
management might be simplified by separating the language core from
the vocabulary of steps and providing some sort of versioning strategy
that allowed the vocabulary of steps to be revised more frequently.
XProc 2.0 may be defined in more than one Rec-track specification
document.

Consider additional steps and enhancements

The vocabulary of steps available in XProc is extensible. Users
and implementors have developed additional steps, for example to
support pipelines that produce EPUB documents or manipulate files on
disk. It is worth considering which, if any, new steps should be
elevated to the XProc namespace. The candidates include, but are not
limited to:

- p:zip and p:unzip
- p:template and p:in-scope-names
- p:eval
- Semantic web steps (p:sparql, p:rdfa, ...)?
- Operating system steps (p:env, ...)?
- File system steps (p:mkdir, p:copy, ...)?

Simplify p:viewport and allow it to have multiple outputs

The use of an optional, single p:output binding in p:viewport
creates confusion for users. The binding is used both to connect the
inner workings of the viewport and as the name of the output port as
seen from the outside.
In addition, the fact that viewport can produce only a single
result means that for some tasks, multiple passes are required, using
a combination of p:viewport and p:for-each.
Consider the task of changing image references in an XHTML document
from .svg to .png and generating the sequence of .png images. In XProc
1.0, this requires a p:viewport and a p:for-each.
Adding an explicit p:viewport-result allows us to
remove the confusion between the input and the name of the output.
Allowing multiple outputs allows us to collapse the
p:viewport and p:for-each logic into a single
step.

    ((p:viewport-source? &
      p:viewport-result? &
      p:output* &
      p:log?),
     subpipeline)

The viewport-result connects the transformation inside the
viewport back into the source document over which viewport is
operating. The transformed document always appears on a port named
'result'. Any other outputs are simply sequences analogous to
p:for-each. It's a static error to name one of those outputs 'result'.
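
The two-pass XProc 1.0 approach to the image example can be sketched as follows. This is an illustration under stated assumptions: the processor is assumed to support XPath 2.0 (for `ends-with` and `replace`), the step names are hypothetical, and the p:identity inside the p:for-each is only a placeholder for whatever conversion step an implementation provides.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:h="http://www.w3.org/1999/xhtml"
                name="main" version="1.0">
  <p:input port="source"/>
  <p:output port="result" primary="true">
    <p:pipe step="rewrite" port="result"/>
  </p:output>
  <p:output port="images" primary="false" sequence="true">
    <p:pipe step="convert" port="result"/>
  </p:output>

  <!-- Pass 1: rewrite each .svg reference in place. -->
  <p:viewport name="rewrite" match="h:img[ends-with(@src, '.svg')]">
    <p:output port="result"/>
    <p:string-replace match="h:img/@src"
                      replace="replace(., '\.svg$', '.png')"/>
  </p:viewport>

  <!-- Pass 2: visit the same elements again to produce one
       document per image. -->
  <p:for-each name="convert">
    <p:iteration-source select="//h:img[ends-with(@src, '.svg')]">
      <p:pipe step="main" port="source"/>
    </p:iteration-source>
    <p:output port="result"/>
    <!-- Placeholder: a real pipeline would convert the image here. -->
    <p:identity/>
  </p:for-each>
</p:declare-step>
```

The same selection is written twice and the document is traversed twice; a viewport with additional outputs would let both results come from a single step.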

Provide a way to specify the base URI of a document

The base URI of a document created by the
p:inline element is the base URI of the p:inline element.
Specifying an xml:base attribute on the root
element of the document does not help as that only applies to that element and
its descendants.

Additionally, in some pipelines, it is desirable to be able to change the
base URI of documents produced by other steps. No convenient mechanism exists in
XProc V1.0 to satisfy these requirements.

References

[RFC 2119] Key words for use in RFCs to Indicate Requirement Levels.
S. Bradner. Network Working Group, IETF, Mar 1997.

[XProc] XProc: An XML Pipeline Language. Norman Walsh, Alex Milowski,
and Henry S. Thompson, editors. W3C Recommendation 11 May 2010.

[XDM] XQuery and XPath Data Model (XDM) 3.0. Norman Walsh, John Snelson,
Editors. World Wide Web Consortium, 08 January 2013.

[XQuery 3.0] XQuery 3.0: An XML Query Language. Jonathan Robie,
Don Chamberlin, Michael Dyck, John Snelson, Editors. World Wide Web
Consortium, 08 January 2013.

[XSLT 3.0] XSL Transformations (XSLT) Version 3.0. Michael Kay, Editor.
World Wide Web Consortium, 10 July 2012.