Canonical XML Version 2.0

1 Introduction

1.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [Keywords].

See [Namespaces] for the definition of QName.

document subset: A document subset is a portion of an XML document that may not include all of the nodes in the document.
canonical form: The canonical form of an XML document is the physical representation of the document produced by the method described in this specification.
canonical XML: The term canonical XML refers to XML that is in canonical form. The XML canonicalization method is the algorithm defined by this specification that generates the canonical form of a given XML document or document subset. The term XML canonicalization refers to the process of applying the XML canonicalization method to an XML document or document subset.
subtree: Subtree refers to one XML element node and all that it contains. In XPath terminology it is an element node and all its descendant nodes.

1.2 Applications

Since the XML 1.0 Recommendation [XML] and the Namespaces in XML 1.0 Recommendation [Namespaces] define multiple syntactic methods for expressing the same information, XML applications tend to take liberties with changes that have no impact on the information content of the document. XML canonicalization is designed to be useful to applications that require the ability to test whether the information content of a document or document subset has been changed. This is done by comparing the canonical form of the original document before application processing with the canonical form of the document result of the application processing.

For example, a digital signature over the canonical form of an XML document or document subset would allow the signature digest calculations to be oblivious to changes in the original document's physical representation, provided that the changes are defined to be logically equivalent by the XML 1.0 or Namespaces in XML 1.0. During signature generation, the digest is computed over the canonical form of the document. The document is then transferred to the relying party, which validates the signature by reading the document and computing a digest of the canonical form of the received document. The equivalence of the digests computed by the signing and relying parties (and hence the equivalence of the canonical forms over which they were computed) ensures that the information content of the document has not been altered since it was signed.
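The digest comparison described above can be sketched in Python. This is an illustration only: it assumes the standard library function `xml.etree.ElementTree.canonicalize` (Python 3.8 or later) as the canonicalizer and SHA-256 as the digest, neither of which is mandated by this section. The helper name `canonical_digest` and the sample documents are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text: str) -> str:
    """Digest the canonical form of a document rather than its raw bytes."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same information content, different physical representation:
# attribute order, quote style, spacing inside the tag, empty-element syntax.
original = '<doc b="2" a="1"><empty/></doc>'
reserialized = "<doc  a='1'   b='2'><empty></empty></doc>"

# A real change to the information content (attribute value differs).
tampered = '<doc b="2" a="9"><empty/></doc>'

same = canonical_digest(original) == canonical_digest(reserialized)
changed = canonical_digest(original) != canonical_digest(tampered)
```

Here `same` reflects that the two serializations canonicalize to the same octets, while `changed` reflects that an altered attribute value produces a different digest.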

Note: Although not stated as a requirement on implementations, nor formally proved to be the case, it is the intent of this specification that if the text generated by canonicalizing a document according to this specification is itself parsed and canonicalized according to this specification, the text generated by the second canonicalization will be the same as that generated by the first canonicalization.

1.3 Limitations

1.4 Requirements for 2.0

XML Canonicalization 2.0 solves most of the major issues that have been identified by implementers with Canonical XML 1.0 [C14N10] and 1.1 [C14N11].

1.4.1 Performance

C14N11 canonicalization is often a major factor in the performance issues noted in XML Signature. Canonicalization will be slow if the implementation follows the Canonical XML 1.1 specification as a formula without any attempt at optimization. This specification rectifies this problem by incorporating lessons learned from implementation into the specification. Most mature C14N implementations solve the performance problem by inspecting the signature first, to see if it can be canonicalized using a simple tree walk algorithm whose performance is similar to regular XML serialization. If not, they fall back to the expensive nodeset based algorithm.

The use cases that cannot be solved by the simple tree walk algorithm are mostly edge use cases. This specification restricts the input of the canonicalization algorithm, so that implementations can always use the simple tree walk algorithm.

C14N 1.x uses an "XPath 1.0 Nodeset" to describe a document subset. This is the root cause of the performance problem and can be solved by not using a Nodeset. This version of the spec does not use a nodeset, visits each node exactly once, and it only visits the nodes that are being canonicalized.

1.4.2 Streaming

A streaming implementation is required to be able to process very large documents without holding them entirely in memory, i.e. it should be able to process the document one chunk at a time.

1.4.3 Robustness

Whitespace handling was a common cause of signature breakages. XML libraries allow one to "pretty print" an XML document, and most people wrongly assume that the white space introduced by pretty printing will be removed by canonicalization but that is not the case. This specification adds three techniques to improve robustness:

Remove leading and trailing whitespace from text nodes,
Allow for QNames in content especially in the xsi:type attribute,
Rewrite prefixes
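The whitespace problem and the first technique can be illustrated with a short Python sketch. It assumes the standard library `xml.etree.ElementTree.canonicalize` function, whose `strip_text` option plays a role similar to text node trimming; the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

compact = "<order><item>widget</item></order>"
# The same document after "pretty printing": only indentation was added.
pretty = "<order>\n  <item>widget</item>\n</order>"

# Without trimming, the indentation survives canonicalization,
# so the canonical forms (and any digest over them) differ.
plain_equal = ET.canonicalize(compact) == ET.canonicalize(pretty)

# With leading and trailing whitespace removed from text nodes,
# both documents canonicalize to the same text.
trimmed_equal = (
    ET.canonicalize(compact, strip_text=True)
    == ET.canonicalize(pretty, strip_text=True)
)
```

In this sketch `plain_equal` is false and `trimmed_equal` is true, which is the signature breakage and its remedy in miniature.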

1.4.4 Simplicity

C14N 1.x algorithms are complex and depend on a full XPath library. This makes it very hard for scripting languages to use XML Signatures. This specification addresses this issue by not using the complex nodeset model, and therefore not relying completely on XPath. It also introduces a minimal canonicalization mode.

2 XML Canonicalization

2.1 Data Model

The input to the canonicalization algorithm consists of an XML document subset and a set of options. The XML document subset can be expressed in two ways: with a DOM model or with a Stream model.

In a DOM model the XML subset is expressed as

either a whole document, or a list of one or more disjoint subtrees.
a list of exclusion subtrees or exclusion attribute nodes. (Note: this model purposely does not support re-inclusion, i.e. all the exclusions are applied after all the inclusions. So this is not like the XPath Filter 2 model [XPath-Filter-2] where there is an ordered list of union, intersect and subtract operations)

Note: exclusion is very limited. Only complete subtrees and attribute nodes can be excluded; other kinds of nodes, such as text nodes, comment nodes and PI nodes, cannot be excluded. Even attribute exclusion has limitations: namespace declarations and attributes in the XML namespace cannot be excluded.

Note: This input model is a very limited form of the generic XPath Nodeset that was the input model for Canonical XML 1.x. It is designed to be simple and allow a high performance algorithm, while still allowing the essential use cases. Specifically this model does not allow these kinds of document subsets

an attribute all by itself
an attribute in the document subset, without its owner element being also in the document subset
a text node all by itself
a text node in the document subset, without its parent element being also in the document subset
an element without some of its text node children

2.2 Parameters

Instead of separate algorithms for each variant of canonicalization, this specification goes with the approach of a single algorithm, which does slightly different things depending on the parameters.

Name	Values	Description	Default
exclusiveMode	true or false	whether to do inclusive or exclusive dealing of namespaces. In exclusive mode the inclusiveNamespacePrefixList parameter can be specified listing the prefixes that are to be treated in an inclusive mode	false
inclusiveNamespacePrefixList	space separated list of prefixes	list of prefixes to be treated inclusively. Special token #default indicates the default namespace.	empty
ignoreComments	true or false	whether to ignore comments during canonicalization	true
trimTextNodes	true or false	whether to trim (i.e. remove leading and trailing whitespaces) all text nodes when canonicalizing. Adjacent text nodes must be coalesced prior to trimming. If an element has an xml:space="preserve" attribute, then text nodes descendants of that element are not trimmed regardless of the value of this parameter.	false
serialization	XML or EXI	whether to do the normal XML serialization, or do an EXI serialization - which is useful if the original document to be signed is already in EXI format.	XML
prefixRewrite	none, sequential, derived	with none, prefixes are not changed, with sequential prefixes are changed to n0, n1, n2 ... and with derived, each prefix is changed to nSuffix, where the suffix is derived by doing a digest of the namespace URI.	none
sortAttributes	true or false	whether the attributes need to be sorted before canonicalization. In some environments the order of attributes changes in transit so sorting is important.	true
ignoreDTD	true or false	if set to true, ignore the DTD completely, which means do not normalize attributes, do not look into entity definitions, do not add default attributes to each element	false
expandEntities	true or false	if set to true ignore all entity declarations, and expand only the predefined entities (lt, gt, amp, apos, quot) and character references. (Entity declarations are potential attack points, [BradHill] mentions an entity that is 2 GB in length, also expanding external entities can lead to cross site scripting attacks)	true
xmlBaseAncestors	inherit, none, combine	whether to inherit xml:base attributes from ancestors (like C14N 1.0) or not (like Exc C14n 1.0) or combine them (like C14n 1.1)	combine
xmlIdAncestors	inherit, none	whether to inherit xml:id attributes from ancestors (like C14N 1.0) or not (like C14N 1.1 or Exc C14n 1.0)	none
xmlLangAncestors	inherit, none	whether to inherit xml:lang attributes from ancestors (like C14N 1.0 and C14n 1.1) or not (Exc C14n 1.0)	inherit
xmlSpaceAncestors	inherit, none	whether to inherit xml:space attributes from ancestors (like C14N 1.0 and C14n 1.1) or not (Exc C14n 1.0)	inherit
xsiTypeAware	true or false	if set to true, looks for namespace prefix usages in xsi:type attributes as well, otherwise xsi:type attributes are treated just like regular attributes.	false

The defaults are chosen to produce Canonical XML 1.1 without comments.
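One way to carry these parameters around in an implementation is a single immutable record whose defaults mirror the Default column of the table. The following Python sketch is illustrative only; the class name `C14NParameters` is hypothetical, and the field names simply reuse the parameter names from the table.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class C14NParameters:
    """Parameter record; defaults copied from the table above."""
    exclusiveMode: bool = False
    inclusiveNamespacePrefixList: str = ""   # space separated, "#default" allowed
    ignoreComments: bool = True
    trimTextNodes: bool = False
    serialization: str = "XML"               # "XML" or "EXI"
    prefixRewrite: str = "none"              # "none", "sequential" or "derived"
    sortAttributes: bool = True
    ignoreDTD: bool = False
    expandEntities: bool = True
    xmlBaseAncestors: str = "combine"        # "inherit", "none" or "combine"
    xmlIdAncestors: str = "none"             # "inherit" or "none"
    xmlLangAncestors: str = "inherit"        # "inherit" or "none"
    xmlSpaceAncestors: str = "inherit"       # "inherit" or "none"
    xsiTypeAware: bool = False


defaults = C14NParameters()

# A variant is derived by overriding only the parameters that differ.
exclusive = replace(defaults, exclusiveMode=True,
                    inclusiveNamespacePrefixList="#default")
```

Keeping the record frozen means a parameter set cannot change halfway through a canonicalization run.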

Implementations are not required to support all possible combinations of these parameters. Instead, these parameters are grouped into various "named parameter sets", and an implementation can choose to support one or more of these.

canonical-xml-1.1-nocomments: exclusiveMode=false, xsiTypeAware=false ...

This produces exactly the same output as Canonical XML 1.1
exclusive-canonical-xml-1.0-nocomments: exclusiveMode=true, xsiTypeAware=false ...

This produces exactly the same output as Exc Canonical XML 1.0
minimal-canonicalization:sortAttributes=false,...

Very low processing, required in situations where the XML content is expected to be mostly unchanged during transport

2.3 Processing Model for DOM

The basic canonicalization process consists of traversing the tree and outputting octets for each node. The algorithm here is presented in pseudo-code using a recursive function to traverse the tree.

Sort the subtrees by document order, and then start processing each subtree.

canonicalize(list of subtree, list of exclusion elements and attributes, properties)
{
   put the exclusion elements and attributes in hash table for easier lookup
   
   sort the multiple subtrees by document order
   
   for each subtree
      canonicalizeSubtree(subtree) 
}

Note: these subtrees should be distinct, i.e. one subtree should not include any of the other subtrees. If that is not the case, ignore the included subtrees.
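The sorting step and the rule about nested subtrees can be sketched as follows. This is a minimal Python illustration using `xml.etree.ElementTree` as a stand-in for a DOM; the function name `normalize_subtrees` is hypothetical. Because ElementTree elements have no parent pointer, the sketch builds a child-to-parent map first.

```python
import xml.etree.ElementTree as ET


def normalize_subtrees(root, subtrees):
    """Drop subtrees nested inside another selected subtree, then sort
    the remaining subtree roots by document order."""
    order = {elem: i for i, elem in enumerate(root.iter())}
    parent = {child: p for p in root.iter() for child in p}
    selected = set(subtrees)

    def is_nested(elem):
        # Walk up the ancestor chain looking for another selected root.
        ancestor = parent.get(elem)
        while ancestor is not None:
            if ancestor in selected:
                return True
            ancestor = parent.get(ancestor)
        return False

    kept = [e for e in subtrees if not is_nested(e)]
    return sorted(kept, key=order.__getitem__)


root = ET.fromstring("<r><a><b/></a><c/></r>")
a, c = root[0], root[1]
b = a[0]

# b lies inside a, so it is ignored; the rest is put in document order.
result = normalize_subtrees(root, [c, b, a])
```

After this call `result` holds `a` followed by `c`.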

For the special case when the subtree is actually the whole document, or the document root, directly start processing the node. Otherwise build a list of ancestors for that subtree, and then look for namespace declarations in these ancestor nodes. Also look for any xml: attributes that need to be inherited, temporarily put them in the subtree root, and then start processing the subtree root.

canonicalizeSubtree(node)
{
   initialize namespaceContext to contain the default prefix, mapped
   to an empty URI, and hasBeenOutput to true 
   
   if (node is the document node or a document root element) 
   {
      // (whole document is being processed, no ancestors to worry about)
      call processNode(node, namespaceContext)
   }
   else
   {
      starting from the element, walk up the tree to collect a list of
      ancestors 
        
      for each of these ancestor elements starting with the document
      root, but not including the element itself 
        addNamespaces(ancestorElem, namespaceContext)

      initialize xmlattribContext to empty

      for each of these ancestor elements starting with the document
      root, and also including the element itself 
        addXMLAttribute(ancestorElem, xmlattribContext)
          
      if there are any attributes in xmlattribContext 
         temporarily add/replace these XML attributes in node
          
      processNode(node, namespaceContext)
          
      restore the original XML attributes
   }   
}

2.3.1 XML Attribute Processing

Special processing is required for xml:id, xml:lang, xml:space and xml:base attributes. To process these, keep a hash table of attribute name to attribute value.

xmlattribContext is a hash table of  name -> value

While processing the ancestors of each subtree, these special XML attributes need to be inherited, combined or ignored depending on the parameters.

addXMLAttribute(element, xmlattribContext)
{
   for each of the xml: attributes of this element
   {
      case xml:id attribute: 
        if xmlIdAncestors is inherit then store this attribute value, else do nothing

      case xml:lang attribute 
        if xmlLangAncestors is inherit then store this attribute value, else do nothing

      case xml:space attribute 
        if xmlSpaceAncestors is inherit then store this attribute value, else do nothing

      case xml:base attribute 
        if xmlBaseAncestors is inherit then store this attribute value,
        else if xmlBaseAncestors is combine, and there is a previous value of xml:base
           then do a "join-URI-References" to combine the new value and the old value 
        else do nothing
   } 
}
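The pseudo-code above can be sketched in Python with a plain dictionary as the xmlattribContext. Several things here are assumptions rather than part of the specification text: the function name `add_xml_attribs`, the keyword arguments standing in for the xml*Ancestors parameters, the `join_uri` callable that stands in for join-URI-References, and the choice to store the first xml:base value when the mode is combine and no previous value exists.

```python
def add_xml_attribs(element_attrs, ctx, *, xml_id="none", xml_lang="inherit",
                    xml_space="inherit", xml_base="combine", join_uri=None):
    """Fold one element's xml: attributes into ctx (name -> value).

    element_attrs maps names such as "xml:lang" to values. Call this for
    each ancestor in turn, starting at the document root.
    """
    simple_modes = {"xml:id": xml_id, "xml:lang": xml_lang,
                    "xml:space": xml_space}
    for name, value in element_attrs.items():
        if name in simple_modes:
            if simple_modes[name] == "inherit":
                ctx[name] = value          # nearer ancestors override
        elif name == "xml:base":
            if xml_base == "inherit":
                ctx[name] = value
            elif xml_base == "combine":
                if name in ctx:
                    ctx[name] = join_uri(ctx[name], value)
                else:
                    # Assumption: the first value seen is stored as-is.
                    ctx[name] = value


# Stand-in joiner for the demonstration only; not join-URI-References.
concat = lambda old, new: old + new

ctx = {}
add_xml_attribs({"xml:lang": "en", "xml:id": "x", "xml:base": "a/"},
                ctx, join_uri=concat)
add_xml_attribs({"xml:lang": "fr", "xml:base": "b/"}, ctx, join_uri=concat)
```

With the default modes, xml:id is not inherited, xml:lang takes the value of the nearest ancestor, and the two xml:base values are combined.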

2.3.1.1 join-URI-References function

The join-URI-References function takes the xml:base attribute values from all the ancestor elements and combines them to create a value for an updated xml:base attribute. A simple method for doing this is similar to that found in sections 5.2.1, 5.2.2 and 5.2.4 of RFC 3986 with the following modifications:

Perform RFC 3986 section 5.2.1. "Pre-parse the Base URI" modified as follows.
- The scheme component is not required in the base URI (Base). (i.e. Base.scheme may be null)
- Replace a trailing ".." segment with "../" segment before processing.
Section 5.2.4. "Remove Dot Segments" is modified as follows:
- Keep leading "../" segments
- Replace multiple consecutive "/" characters with a single "/" character.
- Append a "/" character to a trailing ".." segment
The "Remove Dot Segments" algorithm is modified to ensure that a combination of two xml:base attribute values that include relative path components (i.e., path components that do not begin with a '/' character) results in an attribute value that is a relative path component.
Perform RFC 3986 section 5.2.2. "Transform References" modified as follows to ignore the fragment part of R
- After parsing R set R.fragment = null
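The path handling in the modified "Remove Dot Segments" step can be sketched in Python. This is an illustration under stated assumptions, not the full join-URI-References function: it handles only a path string (no scheme, authority, query or fragment), and the function name `remove_dot_segments_c14n2` is hypothetical.

```python
def remove_dot_segments_c14n2(path: str) -> str:
    """Path-only sketch of the modified "Remove Dot Segments" step."""
    absolute = path.startswith("/")
    # Splitting and dropping empty segments collapses consecutive "/".
    segments = [s for s in path.split("/") if s]
    # A trailing "." or ".." behaves as if it were followed by "/".
    trailing_slash = path.endswith("/") or (
        bool(segments) and segments[-1] in (".", "..")
    )

    out = []
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            if out and out[-1] != "..":
                out.pop()            # cancel the previous real segment
            elif not absolute:
                out.append("..")     # keep leading "../" on relative paths
            # on an absolute path there is nothing above the root
        else:
            out.append(seg)

    result = "/".join(out)
    if absolute:
        result = "/" + result
    if trailing_slash and out:
        result += "/"
    return result


combined = remove_dot_segments_c14n2("abc/" + "../")
relative = remove_dot_segments_c14n2("../" + "../")
```

Here `combined` is the empty string and `relative` is `"../../"`, so two relative values combine into a relative value.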

The following examples illustrate the modification of the "Remove Dot Segments" algorithm:

"abc/" and "../" should result in ""
"../" and "../" are combined as "../../" and the result is "../../"
".." and ".." are combined as "../../" and the result is "../../"

2.3.2 Node Processing

The following pseudo-code uses a recursive function processNode(node) to traverse the tree.

Generic node: Redirect to appropriate node processing function

processNode(node, namespaceContext)
{
  call the appropriate function - processElement, processTextNode, ... 
}

Document node: Loop through all the children

processDocument(document, namespaceContext)
{
  Loop through all child nodes and call
    processNode(child, namespaceContext)
}

Element nodes: First check whether the element matches an exclusion node, in which case completely ignore this element and all its descendants. Otherwise process the namespaces as described in the next section. This will return a list of namespaces to be output and also compute the rewritten prefix value. Now output the element start tag, then the list of namespaces and then the list of attributes. After that loop through all the children, and then output the element end tag.
```
processElement(element, namespaceContext)
{
  if this exists in the exclusion hash table
    return
    
  make a copy of xmlattribContext and namespaceContext
  //(by copying, any changes made can be undone when this function returns)
  
  nsToBeOutputList = processNamespaces(element, namespaceContext)
  
  output('<')
  output(element QName)  

  for each of the namespaces in the nsToBeOutputList
    output this namespace declaration 
    
  sort the non-namespace attributes by namespace URI first, then by attribute name.
  output each of these attributes

  output('>')
  
  Loop through all child nodes and call
    processNode(child, namespaceContext)
  
  output('</')
  output(element QName)
  output('>')
  
  restore xmlattribContext and namespaceContext
}
```
Note: Take special care when the prefixRewrite parameter is set. In that case use the new prefix value for all QNames, element names, attribute names, and also QNames in xsi:type attributes.
Text nodes: Ignore text nodes outside the document root. For text nodes inside the document root, replace special characters. Also, if trimTextNodes is set to true and there is no xml:space="preserve" declaration in scope, trim leading and trailing space.
```
processTextNode(textNode)
{
  if this text node is outside document root
     return
     
  in the text replace 
    all ampersands by &amp;, 
    all open angle brackets (<) by &lt;, 
    all closing angle brackets (>) by &gt;, 
    and all  #xD characters by &#xD;.
    
  If trimTextNodes is true and there is no xml:space=preserve declaration in scope
    trim leading and trailing space
      
  output(text)
}                                       
```
Note: The DOM parser might have split up a long text node into multiple adjacent text nodes, some of which may be empty. In that case be careful when trimming the leading and trailing space: the net result should be the same as if the adjacent text nodes were concatenated into one.
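The text node rules, including the handling of adjacent text nodes, can be sketched in Python. The function name `canonical_text` and its arguments are hypothetical; the sketch concatenates the adjacent chunks first, escapes, and then trims, mirroring the order of the steps in the pseudo-code above.

```python
def canonical_text(chunks, trim=False, preserve_space=False):
    """Serialize one run of adjacent text nodes.

    chunks: the string values of adjacent text nodes, in document order.
    trim: the trimTextNodes parameter.
    preserve_space: True if xml:space="preserve" is in scope.
    """
    # Coalesce first so trimming acts on the run as a whole.
    text = "".join(chunks)

    # Ampersand must be replaced first, otherwise the "&" introduced by
    # the other replacements would be escaped a second time.
    text = (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\r", "&#xD;"))

    if trim and not preserve_space:
        text = text.strip(" \t\n")
    return text


pieces = ["  a & ", "", "b < c  "]
trimmed = canonical_text(pieces, trim=True)
kept = canonical_text(pieces, trim=True, preserve_space=True)
```

Trimming each chunk separately would have removed the space between `&` and `b`; coalescing first avoids that.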
Processing Instruction (PI) Nodes: If the string value is empty, do not add the leading space. Also, output a trailing #xA after the closing PI symbol for PI children of the root node which are before the document element, and output a leading #xA before the opening PI symbol of PI children of the root node which are after the document element.
```
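processPINode(piNode)
{
  // leading #xA for a PI that follows the document element
  if after document element
    output('#xA')
    
  output('<?')
  output(the PI target name of the node)
  if the PI string value is not empty
  {
    output(a leading space)
    output(the PI string value)
  }
  output('?>') 

  // trailing #xA for a PI that precedes the document element
  if before document element
    output('#xA')
}                                        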
```
Comment Nodes: Output nothing if generating canonical XML without comments. For canonical XML with comments, generate the opening comment symbol (<!--), the string value of the node, and the closing comment symbol (-->). Also, output a trailing #xA after the closing comment symbol for comment children of the root node which are before the document element, and output a leading #xA before the opening comment symbol of comment children of the root node which are after the document element. (Comment children of the root node represent comments outside of the top-level document element and outside of the document type declaration).
```
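processCommentNode(commentNode)
{
  if ignoreComments
    return
    
  // leading #xA for a comment that follows the document element
  if after document element
    output('#xA')
    
  output('<!--')
  output(the string value of the node)
  output('-->')

  // trailing #xA for a comment that precedes the document element
  if before document element
    output('#xA')
}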
```

2.3.3 Namespace Processing

Explicit and implicit namespace declarations: In DOM, there is no special node for namespace declarations; they are present as regular Attribute nodes whose prefix is "xmlns" and whose localName is the prefix being declared. DOM also allows declaring a namespace "implicitly", i.e. if a new DOM element or attribute is constructed using the createElementNS and createAttributeNS methods, but there is no declaration for that prefix, the declaration is automatically added when serializing the document.
Default namespace: The default namespace is declared by xmlns="...". If such a declaration does not exist, it means that the default namespace is null.
Visibly utilized: This concept is required for exclusive canonicalization. A namespace prefix is visibly utilized by an element when
- The element itself uses the prefix. (Note: if an element does not have a prefix, that means it visibly utilizes the default namespace.)
- An attribute of that element uses that prefix, and that attribute is not in the exclusion list. (Note: unlike elements, if an attribute doesn't have a prefix, it means it is a locally scoped attribute. It does NOT mean that the attribute visibly utilizes the default namespace.)
- xsiTypeAware is true, and the element has an xsi:type attribute, and this attribute's value uses this prefix.
Namespace context
```
namespaceContext is a hash table of  prefix -> (uri, hasBeenOutput, newPrefix)
```
While traversing the subtrees, maintain a "namespace context", which is a mapping of prefixes to URIs. Each prefix should also have
- a boolean flag hasBeenOutput - whether the namespace declaration has been output
- a new prefix value - used for prefix rewriting.
At the beginning of the canonicalization initialize this to contain only one entry: the default namespace mapped to an empty URI, with hasBeenOutput = true. A prefix value of null can be used to denote the default namespace.

The addNamespaces function is called for every ancestor element, and also at every element of the subtrees (minus the exclusion elements). It adds the namespace declarations to the namespaceContext.

addNamespaces(element, namespaceContext)
{
  for each of the explicit and implicit namespace declarations in the element
  {
     if there is already a declaration for this prefix, and this
     declaration is different from the existing declaration 
     overwrite the URI, and set hasBeenOutput to false
      
     if there is no entry for this prefix
     add an entry for this prefix and URI, and set hasBeenOutput to false
         
  } 
}
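A namespace context with the three fields described above can be sketched in Python as a dictionary of small records. The names `NsEntry`, `new_namespace_context` and `add_namespaces` are hypothetical, and the sketch takes the element's declarations as a plain prefix-to-URI mapping rather than reading them from a DOM element.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NsEntry:
    uri: str
    has_been_output: bool = False
    new_prefix: Optional[str] = None   # filled in by prefix rewriting


NsContext = Dict[Optional[str], NsEntry]


def new_namespace_context() -> NsContext:
    # The default namespace (prefix None) maps to the empty URI and
    # counts as already output.
    return {None: NsEntry("", has_been_output=True)}


def add_namespaces(declarations: Dict[Optional[str], str],
                   ctx: NsContext) -> None:
    """Merge one element's declarations (prefix -> URI) into ctx."""
    for prefix, uri in declarations.items():
        entry = ctx.get(prefix)
        if entry is None or entry.uri != uri:
            # New prefix, or prefix rebound to a different URI:
            # the declaration still has to be output.
            ctx[prefix] = NsEntry(uri)
        # Same prefix and same URI: keep the entry and its flag.


ctx = new_namespace_context()
add_namespaces({"a": "urn:a"}, ctx)
ctx["a"].has_been_output = True        # pretend it was serialized
add_namespaces({"a": "urn:a"}, ctx)    # redundant redeclaration
redundant_flag = ctx["a"].has_been_output
add_namespaces({"a": "urn:b"}, ctx)    # prefix rebound
rebound_flag = ctx["a"].has_been_output
```

Replacing the record instead of mutating it keeps earlier shallow copies of the context intact, which suits the copy-and-restore approach used in processElement.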

At every element of the subtree (minus the exclusion subtrees), compute a list of namespaces that need to be output. In inclusive mode output the namespace declaration right away, but in exclusive mode delay the output until the namespace prefix is visibly utilized. After computing the list of namespaces to be output, do prefix rewriting.

If prefixRewrite is none, just sort the namespaces to be output by prefix name. The default prefix is an empty string, so if present it goes first.
If prefixRewrite is sequential, sort the namespaces to be output by URI. Then sequentially assign them prefixes n0, n1, n2 .... For this keep a counter variable that is initialized to 0 at the beginning of the canonicalization and then incremented to get the prefixes.
If prefixRewrite is derived, sort the namespaces to be output by URI. Then assign them prefixes based on a SHA1 digest of the URI, which is then base64 encoded, with the base64 characters '/' and '+' replaced by '_' and '-' to satisfy XML name rules.

processNamespaces(element, namespaceContext)
{
  addNamespaces(element, namespaceContext)
  
  initialize nsToBeOutputList to empty list
  
  for each prefix in the namespaceContext for which hasBeenOutput is false
  {
     if exclusiveMode and this prefix is not in the inclusiveNamespacePrefixList
     {
        if the prefix is visibly utilized by this element
           add the prefix to the nsToBeOutputList and set
           hasBeenOutput to true 
     }
     else
        add the prefix to the nsToBeOutputList and set hasBeenOutput to true    
  }
  
  if (prefixRewrite is none)
  {
    sort the nsToBeOutputList by the prefix
  }
  else if (prefixRewrite is sequential) 
  {
    sort the nsToBeOutputList by URI
    assign new prefix values "nN" to each prefix in this
    nsToBeOutputList where N represents an incremented counter value,
    i.e. n0, n1, n2 .. 
    // the counter should be set to 0 in the beginning of the canonicalization
    // note: prefix numbers are assigned in the order that the
    // prefixes are present in nsToBeOutputList 
  }
  else if (prefixRewrite is derived)
  {
    sort the nsToBeOutputList by URI
    assign new prefix values "nD" to each prefix in this nsToBeOutputList where
      D represents the SHA1 digest of the URI represented as a Base64
      string 
    // refer to presentation by Ed Simon  
  }
  
  return nsToBeOutputList    
}
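The two prefix rewriting schemes can be sketched in Python. The function names are hypothetical. One detail is an assumption: the base64 encoding of a 20-byte SHA1 digest ends in a "=" padding character, which is not allowed in an XML name, and since the text above does not say how to handle it, this sketch strips it.

```python
import base64
import hashlib


def sequential_prefixes(uris, start=0):
    """Sort by URI and assign n<counter>; returns URI -> new prefix.

    start is the current counter value (0 at the beginning of the
    canonicalization).
    """
    return {uri: f"n{start + i}" for i, uri in enumerate(sorted(uris))}


def derived_prefix(uri: str) -> str:
    """'n' followed by the base64 form of the SHA1 digest of the URI."""
    digest = hashlib.sha1(uri.encode("utf-8")).digest()
    # urlsafe_b64encode already maps '+' to '-' and '/' to '_'.
    token = base64.urlsafe_b64encode(digest).decode("ascii")
    # Assumption: drop the '=' padding, which cannot appear in a name.
    return "n" + token.rstrip("=")


seq = sequential_prefixes(["urn:b", "urn:a"])
derived = derived_prefix("urn:a")
```

Sequential prefixes depend on the counter state and therefore on what was output earlier, whereas a derived prefix depends only on the namespace URI.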

2.3.4 Output rules

The document is encoded in UTF-8
Line breaks normalized to #xA on input (automatically done by a DOM parser)
Attribute values are normalized, if ignoreDTD is false
Character and parsed entity references are replaced
CDATA sections are replaced with their character content
The XML declaration and document type declaration are removed
Empty elements are converted to start-end tag pairs
Whitespace outside of the document element and within start and end tags is normalized
Attribute value delimiters are set to quotation marks (double quotes)
Special characters in attribute values and character content are replaced by character references
Default attributes are added to each element, if ignoreDTD is false
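Several of these rules can be observed with a small Python example. It assumes the standard library `xml.etree.ElementTree.canonicalize` function, which implements a C14N 2.0 canonicalizer but is not guaranteed to match every detail of this draft; the sample document is made up.

```python
import xml.etree.ElementTree as ET

source = (
    '<?xml version="1.0"?>\n'
    "<doc b='2' a='1'><![CDATA[x < y]]><e/></doc>"
)

canonical = ET.canonicalize(source)

# Observable effects in the result:
#  - the XML declaration is removed
#  - attribute value delimiters become double quotes
#  - the CDATA section is replaced by its escaped character content
#  - the empty element becomes a start-end tag pair
expected = '<doc a="1" b="2">x &lt; y<e></e></doc>'
```

The function returns a string; encoding that string as UTF-8 gives the octets over which a digest would be computed.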

2.3.5 Other ideas considered

QNames in content: Have another parameter listing other element / attribute names that can have QNames, besides xsi:type. Or simply search all text content for QNames.
Significant white space: Have a parameter listing elements in which whitespace is significant. Instead of listing individual element names, an entire target namespace URI can be specified, e.g. in many elements in the xhtml namespace whitespace is significant.

2.4 Processing model for Streaming XML parsers

Unlike DOM parsers, which represent an XML document as a tree of nodes, streaming parsers represent an XML document as a stream of events like "start-element", "end-element", "text" etc. A document subset can also be represented as a stream of events. This stream of events is in exactly the same order as a tree walk, so the above canonicalization algorithm can also be used to canonicalize an event stream.
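Event-driven canonicalization can be sketched in Python with the standard library: `xml.etree.ElementTree.C14NWriterTarget` receives parser events and writes canonical text as they arrive, so no tree is built. The helper name `canonicalize_stream` and the chunk size are hypothetical, and the standard library canonicalizer is used as a stand-in rather than as an implementation of this draft.

```python
import io
import xml.etree.ElementTree as ET


def canonicalize_stream(chunks, write, **options):
    """Feed the document piece by piece; canonical text is passed to
    write() as parser events arrive."""
    parser = ET.XMLParser(target=ET.C14NWriterTarget(write, **options))
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()


document = '<r b="2" a="1"><x>hello</x><y/></r>'
# Simulate reading the input five characters at a time.
pieces = [document[i:i + 5] for i in range(0, len(document), 5)]

out = io.StringIO()
canonicalize_stream(pieces, out.write)
streamed = out.getvalue()
whole = ET.canonicalize(document)
```

The in-memory buffer is used here only so the result can be compared with `whole`; passing the `write` method of a file or of a digest wrapper keeps memory use independent of document size.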

3 References

C14N-20000119: Canonical XML Version 1.0, W3C Working Draft. T. Bray, J. Clark, J. Tauber, and J. Cowan. January 19, 2000. http://www.w3.org/TR/2000/WD-xml-c14n-20000119.html.
C14N-Issues: Known Issues with Canonical XML 1.0, W3C Working Group Note. J. Kahan, K. Lanz. December 2006. http://www.w3.org/TR/C14N-issues/.
C14N10: Canonical XML Version 1.0, W3C Recommendation. ed. J. Boyer. 15 March 2001. http://www.w3.org/TR/xml-c14n.
C14N11: Canonical XML Version 1.1, W3C Recommendation. ed. J. Boyer, G. Marcy. 2 May 2008. http://www.w3.org/TR/xml-c14n11/.
CharModel: Character Model for the World Wide Web, W3C Working Draft. eds. Martin J. Dürst, François Yergeau, Misha Wolf, Asmus Freytag and Tex Texin. http://www.w3.org/TR/charmod/.
CowanExample: Example of Harmful Effect of Character Model Normalization, Letter in XML Signature Working Group Mail Archive. John Cowan, July 7, 2000. http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000JulSep/0038.html.
DSig-Usage: Using XML Digital Signatures in the 2006 XML Environment, W3C Working Group Note. Thomas Roessler. December 2006. http://www.w3.org/TR/DSig-usage/.
ISO-8859-1: ISO-8859-1 Latin 1 Character Set. http://www.utoronto.ca/webdocs/HTMLdocs/NewHTML/iso_table.html or http://www.iso.org/iso/iso_catalogue.htm.
Infoset: XML Information Set, W3C Working Draft. eds. John Cowan and Richard Tobin. http://www.w3.org/TR/xml-infoset/.
Keywords: Key words for use in RFCs to Indicate Requirement Levels, IETF RFC 2119. S. Bradner. March 1997. http://www.ietf.org/rfc/rfc2119.txt.
NFC: TR15, Unicode Normalization Forms. M. Davis, M. Dürst. Revision 18: November 1999. http://www.unicode.org/unicode/reports/tr15/tr15-18.html.
NFC-Corrigendum: Normalization Corrigendum. The Unicode Consortium. http://www.unicode.org/unicode/uni2errata/Normalization_Corrigendum.html.
Namespaces: Namespaces in XML 1.0 (Second Edition), W3C Recommendation. eds. Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin. http://www.w3.org/TR/REC-xml-names/.
URI: Uniform Resource Identifiers (URI): Generic Syntax, IETF RFC 3986. T. Berners-Lee, R. Fielding, L. Masinter. January 2005. http://www.ietf.org/rfc/rfc3986.txt.
UTF-16: UTF-16, an encoding of ISO 10646, IETF RFC 2781. P. Hoffman, F. Yergeau. February 2000. http://www.ietf.org/rfc/rfc2781.txt.
UTF-8: UTF-8, a transformation format of ISO 10646, IETF RFC 2279. F. Yergeau. January 1998. http://www.ietf.org/rfc/rfc2279.txt.
Unicode: The Unicode Standard, version 3.0. The Unicode Consortium. ISBN 0-201-61633-5. http://www.unicode.org/unicode/standard/versions/Unicode3.0.html.
XBase: XML Base ed. Jonathan Marsh. 27 June 2001. http://www.w3.org/TR/xmlbase/.
XML: Extensible Markup Language (XML) 1.0 (Fourth Edition), W3C Recommendation. eds. Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, François Yergeau and Eve Maler. 16 August 2006. http://www.w3.org/TR/REC-xml/.
XML ID: xml:id Version 1.0, W3C Recommendation. eds. Norman Walsh, Daniel Veillard and Jonathan Marsh. 9 September 2005. http://www.w3.org/TR/xml-id/.
XML Plenary Decision: W3C XML Plenary Decision on relative URI References In namespace declarations, W3C Document. 11 September 2000. http://lists.w3.org/Archives/Public/xml-uri/2000Sep/0083.html.
XML-DSig: XML-Signature Syntax and Processing, IETF Draft/W3C Candidate Recommendation. D. Eastlake, J. Reagle, D. Solo, M. Bartel, J. Boyer, B. Fox, and E. Simon. 31 October 2000. http://www.w3.org/TR/xmldsig-core/.
XMLDSIG2: XML Signature Syntax and Processing, Version 2.0, W3C Working Draft 22 October 2009. http://www.w3.org/TR/2009/WD-xmldsig-core2-20091022/
XMLDSIG2nd: XML Signature Syntax and Processing (Second Edition), W3C Recommendation 10 June 2008 http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/
XPath: XML Path Language (XPath) Version 1.0, W3C Recommendation. eds. James Clark and Steven DeRose. 16 November 1999. http://www.w3.org/TR/1999/REC-xpath-19991116.
XPath-Filter-2: XML-Signature XPath Filter 2.0. W3C Recommendation. J. Boyer, M. Hughes, J. Reagle. November 2002. http://www.w3.org/TR/2002/REC-xmldsig-filter2-20021108/

Input	Output
no/.././/pseudo-netpath/seg/file.ext	pseudo-netpath/seg/file.ext
no/..//.///pseudo-netpath/seg/file.ext	pseudo-netpath/seg/file.ext
yes/no//..//.///pseudo-netpath/seg/file.ext	yes/pseudo-netpath/seg/file.ext
no/../yes	yes
no/../yes/	yes/
no/../yes/no/..	yes/
../../no/../..	../../../
no/../..	../
no/..
no/../
/a/b/c/./../../g	/a/g
mid/content=5/../6	mid/6
../../..	../../../
no/../../	../
..yes/..no/..no/..no/../../../..yes	..yes/..yes
..yes/..no/..no/..no/../../../..yes/	..yes/..yes/
../..	../../
../../../	../../../
.
./
./.
//no/..	/
../../no/..	../../
../../no/../	../../
yes/no/../	yes/
yes/no/no/../..	yes/
yes/no/no/no/../../..	yes/
yes/no/../yes/no/no/../..	yes/yes/
yes/no/no/no/../../../yes	yes/yes
yes/no/no/no/../../../yes/	yes/yes/
/no/../	/
/yes/no/../	/yes/
/yes/no/no/../..	/yes/
/yes/no/no/no/../../..	/yes/
../../..no/..	../../
../../..no/../	../../
..yes/..no/../	..yes/
..yes/..no/..no/../..	..yes/
..yes/...no/..no/..no/../../..	..yes/
..yes/..no/../..yes/..no/..no/../..	..yes/..yes/
/..no/../	/
/..yes/..no/../	/..yes/
/..yes/..no/..no/../..	/..yes/
/..yes/..no/..no/..no/../../..	/..yes/
/	/
/.	/
/./	/
/./.	/
/././	/
/..	/
/../..	/
/../../..	/
/../../..	/
//..	/
//..//..	/
//..//..//..	/
/./..	/
/./.././..	/
/./.././.././..	/
.
./
./.
..	../
../	../