Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines a streamable profile of XPath 1.0 suitable for use with XML Signature 2.0.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a W3C Working Draft of "XML Signature Streaming Profile of XPath 1.0".
This document is derived from material in the previous publication of XML Signature 2.0, see http://www.w3.org/TR/2010/WD-xmldsig-core2-20100304/#sec-XPath-2.0.
This document was published by the XML Security Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-xmlsec@w3.org (subscribe, archives). All feedback is welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document specifies a streamable profile of XPath 1.0 [XPATH] for use in XML Signature 2.0 [XMLDSIG-CORE2]. The motivation for introducing this profile is outlined here.
XML Signature lets one sign parts of the XML document. In 1.x version of XML Signature [XMLDSIG-CORE1], the part to be signed can be identified in one of the following ways:
ID based references This is the simplest and the most popular mechanism. An ID attribute is added to the element to be signed, and the signature refers to this element by this ID. However is has certain problems:
XPath Filter Transform This is the original XPath mechanism in XML Signature 1.0. It solves the three problems mentioned above, but it introduces new problems:
//chapter/
(i.e. all chapter
)
descendants has to be expressed as ancestor-or-self::chapter
(i.e. a boolean expression which evaluates
to true for a node, if that node has an ancestor call chapter
). Not only is this very hard to understand
, but also some XPaths cannot be expressed like this at all. For example it is not possible to express
/book/chapter[3]
in this model.XPath Filter 2 Transform This was introduced to solve the problems in above problems [XMLDSIG-XPATH-FILTER2]. However it has a few problems of its own:
XPointer At the time the XML Signature 1.0 was written, XPointer supported a full XPath model,
but it was still under development. Later on it split up into multiple specs, the full XPath support remained as a Working draft [XPTR-XPOINTER],
and only the XPointer element scheme [XPTR-ELEMENT] became an official recommendation. The XPointer
element scheme does not support generic XPaths, it only
allows basic addressing of XML elements e.g. element(/1/2)
identifies the 2nd child of the root element.
XML Signature 2.0, retains all 1.x mechanisms for backwards compatibility, but it introduces a new mechanism
<ds2:Selection>
which can be viewed as a very simplified form of XPath Filter 2 Transform. It
consists of URI
which can be used for ID based references and an IncludedXPath
and ExcludedXPath
for inclusions and exclusions. The result of the selection is subtrees identified by included XPath, minus the subtrees
identified by excluded XPath. These IncludedXPath
and ExcludedXPath
take a profile of XPath, and
and it is this profile that this document describes.
This 2.0 selection mechanism, has all the advantages of XPath Filter 2 transform, and additionally it is designed to support streaming. However it does the restrict the kind of selection - only subtrees with one round of subtree exclusion can be selected. This restriction is required for high performance.
The XPath profile defined in this document is one-pass streamable. This means
It should be possible to evaluate this XPath on large XML documents without having having to load the entire document into memory. The implementation should read the XML one chunk at a time, and do a single forward only pass over the document.
More specifically, the XPath engine should work off a streaming XML parser. A streaming parser is a software module that reads the XML document, and constructs a stream of XML events like "beginElement", "text", "endElement" etc. [StAX] is an example of a streaming parser. At any point the streaming parser only has the current event in memory. The XPath engine needs to evaluate the XPath based on this stream of events, so it can only work off the current event, and some limited state e.g. the ancestors of this element. This means that the XPath should be evaluatable with this limited information.
The XPath itself might select a large portion of the document, or even the whole document. The result of the XPath evaluation should not be loaded all into memory if it is large. Instead it should be pipelined to the processing stage, which in case of XML signature 2.0 is canonicalization. i.e. an XML Signature 2.0 implementation should be able to select portions of the XML document using XPath, canonicalize the selected sections and then compute a digest, all in a pipeline with a limited amount of memory.
The XML document may have very large text nodes. The XPath engine should not be required to load such large nodes in their entirety.
In the context of XML Signature, streaming is absolutely essential for network appliances which need to perform XML Signature and Encryption operations for messages on the wire. It is also important for performance sensitive application as streaming, by conserving memory, greatly reduces memory allocation, deallocation and garbage collection calls, thereby improving performance.
Apart from streaming, the XPath profile also needs to satisfy the following requirements:
The profile should produce results that are compatible with the C14N 2.0 data model. i.e it should only result in element nodes or attribute nodes (but not xml: attribute and namespace attributes).
This XPath profile should include some of the known usages of XPath in XML Signatures.
@SOAP:actor
attribute matches
a certain value. Refer section 4.1.3 of [EBXML-MSG]GovTalkMessage/Body
subtree, but exclude the
GovTalkMessage/Body/IRevenvelope/IRHeader/IRmark
subtree. [HMRMC]
The following table defines this XPath profile. It is expressed as a diff of the XPath 1.0 grammar, and the rule numbers here match the rule numbers
in the XPath 1.0 grammar [XPATH]. Insertions are
underlined and deletions are in strikeout.
Grammar |
Explanation |
[0a] IncludedXPath ::= ( LocationPath '|' )* LocationPath [0b] ExcludedXPath ::= ( LocationPath '|' )* LocationPath |
The Included and Excluded XPath do not use the generic XPath Expr. Instead they are just a union of LocationPath. There is a slight difference between IncludedXPath and ExcludedXPath, ExcludedXpath can select attributes and element, whereas IncludedXPath can only select elements. |
[1] LocationPath ::= |
RelativeLocationPath is not allowed, only Absolute is allowed. This is because relative would be most useful in Enveloped signatures in which the signature is contained inside the element to be signed. In such a situation, the relative signature XPath would need to use the ancestor axis to reach the element to be signed, however the ancestor axis is not streamable. |
[4] Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep [4a] StepAttributeOnly ::= 'attribute' '::' NameTest | '@' '::' NameTest |
Added a new versions of Step, which can contain a attribute only.
e.g. in this XPath expression:
|
[5] AxisSpecifier ::= AxisName '::' | AbbreviatedAxisSpecifier |
unchanged |
[6] AxisName ::= |
All the non streamable axes have been removed - |
[7] NodeTest ::= NameTest |
|
[8] Predicate ::= '[' PredicateExpr ']' [9] PredicateExpr ::= Expr |
unchanged But the definition of Expr has changed, so it is only a additive/relative expressions of StepAttributeOnly and Literals. |
[10] AbbreviatedAbsoluteLocationPath ::= '//' RelativeLocationPath [11] AbbreviatedRelativeLocationPath ::= RelativeLocationPath '//' Step [12] AbbreviatedStep ::= '.' | '..' [13] AbbreviatedAxisSpecifier ::= '@'? |
unchanged |
[14] Expr ::= OrExpr [15] PrimaryExpr ::= VariableReference | '(' Expr ')' | Literal | Number | FunctionCall |
unchanged |
[16] FunctionCall ::= FunctionName '(' ( Argument ( ',' Argument )* )? ')' [17] Argument ::= Expr |
unchanged |
|
UnionExpr, PathExpr and FilterExpr have been removed. |
[21] OrExpr ::= AndExpr | OrExpr 'or' AndExpr [22] AndExpr ::= EqualityExpr | AndExpr 'and' EqualityExpr [23] EqualityExpr ::= RelationalExpr | EqualityExpr '=' RelationalExpr | EqualityExpr '!=' RelationalExpr [24] RelationalExpr ::= AdditiveExpr | RelationalExpr '<' AdditiveExpr | RelationalExpr '>' AdditiveExpr | RelationalExpr '<=' AdditiveExpr | RelationalExpr '>=' AdditiveExpr |
unchanged |
[25] AdditiveExpr ::= MultiplicativeExpr | AdditiveExpr '+' MultiplicativeExpr | AdditiveExpr '-' MultiplicativeExpr [26] MultiplicativeExpr ::= UnaryExpr | MultiplicativeExpr MultiplyOperator UnaryExpr | MultiplicativeExpr 'div' UnaryExpr | MultiplicativeExpr 'mod' UnaryExpr [27] UnaryExpr ::= |
The unaryExpr is changed to only allow a PrimaryExpr or StepAttributeOnly |
[28] ExprToken ::= '(' | ')' | '[' | ']' | '.' | '..' | '@' | ',' | '::' | NameTest | NodeType | Operator | FunctionName | AxisName | Literal | Number | VariableReference [29] Literal ::= '"' [^"]* '"' | "'" [^']* &uot;'" [30] Number ::= Digits ('.' Digits?)? | '.' Digits [31] Digits ::= [0-9]+ [32] Operator ::= OperatorName | MultiplyOperator | '/' | '//' | '|' | '+' | '-' | '=' | '!=' | '<' | '<=' | '>' | '>=' [33] OperatorName ::= 'and' | 'or' | 'mod' | 'div' [34] MultiplyOperator ::= '*' [35] FunctionName ::= QName - NodeType [36] VariableReference ::= '$' QName [37] NameTest ::= '*' | NCName ':' '*' | QName [38] NodeType ::= |
unchanged, expect for the NodeTest |
Node set functions
|
Note: A predicate's expression can only involve attribute nodes of the current element. Functions can also be used inside this expression, but this function's arguments also have to expressions involving attribute nodes. There is no way to use elements, text nodes comments and processing instructions in predicate expressions. The "string-value" of an attribute node, is the attributes value. The String, number and boolean functions are all supported. However the no argument forms of However the no argument forms of |
This sections explains the profile with some XPath expressions that are part of this profile and some that aren't.
All the XPath examples below are based on the following XML document.
<book> <foreword> </foreword> <chapter type="preface"> </chapter> <chapter> <title>Hybridism</title> </chapter> <chapter> </chapter> </book>
Examples of XPath expression that are included in the profile.
# | Example | Description | XML Dsig 2.0 |
---|---|---|---|
1 | /book/chapter |
all chapter children of book |
Y |
2 | /book/chapter[3] |
third chapter child of book |
Y |
3 | /book/chapter[@type="preface"] |
all chapter children of book that have a type attribute with value preface |
Y |
4 | /book/chapter[@type="preface"][1] |
the first chapter child of book that has a title attribute with
value preface. |
Y |
5 | /book/chapter[2]/title[1] |
the first title child of the second chapter child of book . |
Y |
6 | /book/chapter[contains(@type,"pre")] |
all chapter children of book that have an attribute
type whose value contains the string "pre". |
Y |
7 | /child::book/child::chapter[contains(attribute::type,"pre")] |
non abbreviated form of the above | Y |
8 | /book/chapter[1][local-name(@*) = "type"] |
the first chapter of book if its first attribute's local-name is "type". |
Y |
9 | /book/chapter[position() mod 2 != 0] |
chapter children of book whose position is an odd number,
i.e. the odd numbered chapters of the book. |
Y |
10 | /book/chapter[position() mod 2 != 0][@type="preface"] |
all the odd numbered chapters whose type is "preface". |
Y |
11 | //chapter |
all chapter descendants |
Y |
12 | /book/chapter | /book/foreword |
all the chapter children of book and all the foreword |
Y |
13 | //* |
all the element nodes in the document. | Y Note: when this is used to identify a selection in XML Dsig 2.0, it is exactly equivalent to "/*" which select only the document root element. |
These are examples of XPath expressions that are NOT included in the profile.
# | Example | Description | XML Dsig 2.0 |
---|---|---|---|
1 | /book/chapter[title="Hybridism"] |
all chapter children of book that have a title sub element with value Hybridism |
N expressions can only involve attributes |
2 | (/book)/chapter |
Evaluate the (/book) expression and set that to the context node, and get the chapter child of that context node. | N the top level expression cannot have parenthesis or any other operators except the union operator "|" |
3 | count(/book/chapter) |
count the number of chapter children of book |
N |
4 | chapter |
all chapter element children of context node. |
N relative location paths not allowed. |
5 | . |
The context node. | N relative location paths are not allowed. |
6 | /book/chapter/title/ancestor-or-self::chapter |
the chapter ancestor of /book/chapter/title . |
N Only child, descendant and self axes are allowed. |
7 | /book/chapter/title/text() |
the text child of /book/chapter/title . |
N text, comment and processing-instructions cannot be selected. |
8 | id("i1") |
elements that have ID, whose value is "i1".. | N ID will work only if there is a DTD. |
9 | /book[chapter/title] |
the book element if it has a chapter/title grandchild. |
N only attributes are allowed in predicates. |
10 | /book/*[local-name(self::node()) = "chapter"] |
the children of book element whose local name is "chapter". |
N Only attributes are allowed in predicates. |
11 | /book/chapter[2]/node() |
all the child elements of the second chapter of the book . |
N node() function is not allowed, because it can select text nodes too. |
12 | /book/chapter or /book/foreword |
boolean result is true, i.e. either of the location paths evaluate to non empty. | N or operator is not allowed at top level. Top level expression can only be union of location paths. |
This section outlines an algorithm for a Streaming XPath engine that can execute this XPath subset.
For parsing:
"|"
. i.e.
break up the locationPath | locationPath | ..
into individual
location paths./ step / step / step .. / step [
predicate ]
to get the steps and optional predicate. Two slashes
together indicates descendant axis.@a = "foo"
and @b > "bar"
You need to have an expression parsing and evaluating engine to do this.
For executing:
For each locationPath
, match up the step
s to the ancestor stack,
If they match, evaluate the predicate with the current element's
expression. If that passes too, this element and all its descendants are
included.
Dated references below are to the latest known or appropriate edition of the referenced work. The referenced works may be subject to revision, and conformant implementations may follow, and are encouraged to investigate the appropriateness of following, some or all more recent editions or replacements of the works cited. It is in each case implementation-defined which editions are supported.