W3C

Test cases for Canonical XML 2.0

W3C Working Group Note

This version:
http://www.w3.org/TR/2013/NOTE-xml-c14n2-testcases-20130618/
Latest published version:
http://www.w3.org/TR/xml-c14n2-testcases/
Latest editor's draft:
http://www.w3.org/2008/xmlsec/Drafts/c14n-20/test-cases/
Previous version:
http://www.w3.org/TR/2013/NOTE-xml-c14n2-testcases-20130411/
Editors:
Pratik Datta
Frederick Hirsch

Abstract

This document outlines test cases for Canonical XML 2.0 [XML-C14N20].

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document outlines test cases for Canonical XML 2.0 [XML-C14N20]. Changes since the previous publication include a correction to the text in section 3.4 Namespace Re-declarations.

This document was published by the XML Security Working Group as a Working Group Note. If you wish to make comments regarding this document, please send them to public-xmlsec@w3.org. All comments are welcome.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

This document contains test cases for Canonical XML 2.0 [XML-C14N20]. All the test files are available in this directory: files.
The test cases are organized as tables. The file contents are embedded in each table, and the actual file can be downloaded by following the hyperlink in the table header.

2. Examples from Canonical XML 1.0

The examples in this section assume a non-validating processor, primarily so that a document type declaration can be used to declare entities as well as default attributes and attributes of various types (such as ID and enumerated) without having to declare all attributes for all elements in the document. As well, one example contains an element that deliberately violates a validity constraint (because it is still well-formed).

2.1 PIs, Comments, and Outside of Document Element

Demonstrates:
Original Data (Example 1) | After c14n (Example 2) | After c14n with Comments (Example 3)
Example 1
<?xml version="1.0"?>

<?xml-stylesheet   href="doc.xsl"
   type="text/xsl"   ?>

<!DOCTYPE doc SYSTEM "doc.dtd">

<doc>Hello, world!<!-- Comment 1 --></doc>

<?pi-without-data     ?>

<!-- Comment 2 -->

<!-- Comment 3 -->
Example 2
<?xml-stylesheet href="doc.xsl"
   type="text/xsl"   ?>
<doc>Hello, world!</doc>
<?pi-without-data?>
Example 3
<?xml-stylesheet href="doc.xsl"
   type="text/xsl"   ?>
<doc>Hello, world!<!-- Comment 1 --></doc>
<?pi-without-data?>
<!-- Comment 2 -->
<!-- Comment 3 -->
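
The comment handling shown in this table can be reproduced with a short script. This is only a sketch, assuming Python 3.8 or later, whose standard library provides `xml.etree.ElementTree.canonicalize` as an implementation of C14N 2.0. That function and its `with_comments` option are Python names, not names from this document, and the input is a shortened variant of the original data with the DOCTYPE omitted.

```python
from xml.etree.ElementTree import canonicalize

# Shortened variant of the "Original Data" input (DOCTYPE omitted; assumption).
source = (
    '<?xml version="1.0"?>\n\n'
    '<?xml-stylesheet   href="doc.xsl"\n   type="text/xsl"   ?>\n\n'
    '<doc>Hello, world!<!-- Comment 1 --></doc>\n\n'
    '<?pi-without-data     ?>\n\n'
    '<!-- Comment 2 -->\n'
)

# Default canonicalization drops comments; with_comments=True keeps them.
without_comments = canonicalize(source)
with_comments = canonicalize(source, with_comments=True)

print(without_comments)
print(with_comments)
```

In both results the XML declaration is gone and the whitespace between the PI target and its data is reduced to a single space, while whitespace inside the PI data is kept.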

2.2 Whitespace in Document Content

"After c14n" Demonstrates:

Note: For "After c14n", the input document and canonical form are identical. Both end with '>' character.

"After c14n with Trim whitespace" Demonstrates:
Original Data (Example 4) | After c14n (Example 5) | After c14n with Trim whitespace (Example 6)
Example 4
<doc>
   <clean>   </clean>
   <dirty>   A   B   </dirty>
   <mixed>
      A
      <clean>   </clean>
      B
      <dirty>   A   B   </dirty>
      C
   </mixed>
</doc>
Example 5
<doc>
   <clean>   </clean>
   <dirty>   A   B   </dirty>
   <mixed>
      A
      <clean>   </clean>
      B
      <dirty>   A   B   </dirty>
      C
   </mixed>
</doc>
Example 6
<doc><clean></clean><dirty>A   B</dirty><mixed>A<clean></clean>B<dirty>A   B</dirty>C</mixed></doc>
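
As a hedged illustration, the two canonical forms above can be generated with Python's standard library (3.8 or later). The function `xml.etree.ElementTree.canonicalize` and its `strip_text` option are Python names and are assumed here to correspond to the Trim whitespace behaviour; they are not part of this test suite.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input, without a trailing line break.
source = (
    "<doc>\n"
    "   <clean>   </clean>\n"
    "   <dirty>   A   B   </dirty>\n"
    "   <mixed>\n"
    "      A\n"
    "      <clean>   </clean>\n"
    "      B\n"
    "      <dirty>   A   B   </dirty>\n"
    "      C\n"
    "   </mixed>\n"
    "</doc>"
)

unchanged = canonicalize(source)                # whitespace in content is retained
trimmed = canonicalize(source, strip_text=True) # leading/trailing text whitespace removed

print(unchanged == source)
print(trimmed)
```

Note that whitespace inside the text "A   B" is retained even when trimming, because only leading and trailing whitespace of each text node is removed.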

2.3 Start and End Tags

"After c14n" Demonstrates:

Note: Some start tags in the canonical form are very long, but each start tag in this example is entirely on a single line.

Note: In e5, b:attr precedes a:attr because the primary key is namespace URI not namespace prefix, and attr2 precedes b:attr because the default namespace is not applied to unqualified attributes (so the namespace URI for attr2 is empty).

Original Data (Example 7) | After c14n (Example 8) | After c14n with PrefixRewrite (Example 9)
Example 7
<!DOCTYPE doc [<!ATTLIST e9 attr CDATA "default">]>
<doc>
   <e1   />
   <e2   ></e2>
   <e3   name = "elem3"   id="elem3"   />
   <e4   name="elem4"   id="elem4"   ></e4>
   <e5 a:attr="out" b:attr="sorted" attr2="all" attr="I'm"
      xmlns:b="http://www.ietf.org"
      xmlns:a="http://www.w3.org"
      xmlns="http://example.org"/>
   <e6 xmlns="" xmlns:a="http://www.w3.org">
      <e7 xmlns="http://www.ietf.org">
         <e8 xmlns="" xmlns:a="http://www.w3.org">
            <e9 xmlns="" xmlns:a="http://www.ietf.org"/>
         </e8>
      </e7>
   </e6>
</doc> 
Example 8
<doc>
   <e1></e1>
   <e2></e2>
   <e3 id="elem3" name="elem3"></e3>
   <e4 id="elem4" name="elem4"></e4>
   <e5 xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I'm" attr2="all" b:attr="sorted" a:attr="out"></e5>
   <e6>
      <e7 xmlns="http://www.ietf.org">
         <e8 xmlns="">
            <e9 attr="default"></e9>
         </e8>
      </e7>
   </e6>
</doc>
Example 9
<n0:doc xmlns:n0="">
   <n0:e1></n0:e1>
   <n0:e2></n0:e2>
   <n0:e3 id="elem3" name="elem3"></n0:e3>
   <n0:e4 id="elem4" name="elem4"></n0:e4>
   <n1:e5 xmlns:n1="http://example.org" xmlns:n2="http://www.ietf.org" xmlns:n3="http://www.w3.org" attr="I'm" attr2="all" n2:attr="sorted" n3:attr="out"></n1:e5>
   <n0:e6>
      <n2:e7 xmlns:n2="http://www.ietf.org">
         <n0:e8>
            <n0:e9 attr="default"></n0:e9>
         </n0:e8>
      </n2:e7>
   </n0:e6>
</n0:doc>
After c14n with Trim whitespace (Example 10)
Example 10
<doc><e1></e1><e2></e2><e3 id="elem3" name="elem3"></e3><e4 id="elem4" name="elem4"></e4><e5 xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I'm" attr2="all" b:attr="sorted" a:attr="out"></e5><e6><e7 xmlns="http://www.ietf.org"><e8 xmlns=""><e9 attr="default"></e9></e8></e7></e6></doc>
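
The ordering rules described in the notes above (namespace declarations first, then attributes sorted with namespace URI as the primary key) can be observed with a small script. This is a sketch under the assumption that Python 3.8 or later is available; `xml.etree.ElementTree.canonicalize` is Python's C14N 2.0 function, and the input is a reduced fragment of the original data.

```python
from xml.etree.ElementTree import canonicalize

# Reduced fragment of the "Original Data" input: e1, e3 and e5 only.
source = (
    '<doc>'
    '<e1   />'
    '<e3   name = "elem3"   id="elem3"   />'
    '<e5 a:attr="out" b:attr="sorted" attr2="all" attr="I\'m"'
    ' xmlns:b="http://www.ietf.org"'
    ' xmlns:a="http://www.w3.org"'
    ' xmlns="http://example.org"/>'
    '</doc>'
)

canonical = canonicalize(source)
print(canonical)
```

Empty elements become start-end tag pairs, and in e5 the attribute b:attr precedes a:attr because "http://www.ietf.org" sorts before "http://www.w3.org".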

2.4 Character Modifications and Character References

Demonstrates:

Note: The last element, normId, is well-formed but violates a validity constraint for attributes of type ID. For testing canonical XML implementations based on validating processors, remove the line containing this element from the input and canonical form. In general, XML consumers should be discouraged from using this feature of XML.

Note: Whitespace character references other than &#x20; are not affected by attribute value normalization [XML10].

Note: In the canonical form, the value of the attribute named attr in the element norm begins with a space, an apostrophe (single quote), then four spaces before the first character reference.

Note: The expr attribute of the second compute element contains no line breaks.

Original Data (Example 11) | After c14n (Example 12) | After c14n with Trim whitespace (Example 13)
Example 11
<!DOCTYPE doc [
<!ATTLIST normId id ID #IMPLIED>
<!ATTLIST normNames attr NMTOKENS #IMPLIED>
]>
<doc>
   <text>First line&#x0d;&#10;Second line</text>
   <value>&#x32;</value>
   <compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>
   <compute expr='value>"0" &amp;&amp; value&lt;"10" ?"valid":"error"'>valid</compute>
   <norm attr=' &apos;   &#x20;&#13;&#xa;&#9;   &apos; '/>
   <normNames attr='   A   &#x20;&#13;&#xa;&#9;   B   '/>
   <normId id=' &apos;   &#x20;&#13;&#xa;&#9;   &apos; '/>
</doc>
Example 12
<doc>
   <text>First line&#xD;
Second line</text>
   <value>2</value>
   <compute>value&gt;"0" &amp;&amp; value&lt;"10" ?"valid":"error"</compute>
   <compute expr="value>&quot;0&quot; &amp;&amp; value&lt;&quot;10&quot; ?&quot;valid&quot;:&quot;error&quot;">valid</compute>
   <norm attr=" '    &#xD;&#xA;&#x9;   ' "></norm>
   <normNames attr="A &#xD;&#xA;&#x9; B"></normNames>
   <normId id="' &#xD;&#xA;&#x9; '"></normId>
</doc>
Example 13
<doc><text>First line&#xD;
Second line</text><value>2</value><compute>value&gt;"0" &amp;&amp; value&lt;"10" ?"valid":"error"</compute><compute expr="value>&quot;0&quot; &amp;&amp; value&lt;&quot;10&quot; ?&quot;valid&quot;:&quot;error&quot;">valid</compute><norm attr=" '    &#xD;&#xA;&#x9;   ' "></norm><normNames attr="A &#xD;&#xA;&#x9; B"></normNames><normId id="' &#xD;&#xA;&#x9; '"></normId></doc>
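
The character-level rules of this section (CDATA sections replaced by their content, special characters escaped, carriage returns written as character references) can be checked with a minimal sketch. It assumes Python 3.8 or later and its `xml.etree.ElementTree.canonicalize` function, which is not named in this document; the input is a reduced fragment modelled on the text and compute elements, with a shortened expr attribute.

```python
from xml.etree.ElementTree import canonicalize

# Fragment modelled on the text and compute elements of the example input.
source = (
    '<doc>'
    '<text>First line&#x0d;&#10;Second line</text>'
    '<compute><![CDATA[value>"0" && value<"10" ?"valid":"error"]]></compute>'
    "<compute expr='value>\"0\" &amp;&amp; value&lt;\"10\"'>valid</compute>"
    '</doc>'
)

canonical = canonicalize(source)
print(canonical)
```

In text content the characters &, < and > are escaped, whereas in attribute values > is left as-is and the quotation mark is written as &quot; because canonical attribute values are always delimited by double quotes.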

2.5 Entity References

Demonstrates:
Original Data (Example 14) | After c14n (Example 15) | After c14n with Trim whitespace (Example 16)
Example 14
<!DOCTYPE doc [
<!ATTLIST doc attrExtEnt CDATA #IMPLIED>
<!ENTITY ent1 "Hello">
<!ENTITY ent2 SYSTEM "world.txt">
<!ENTITY entExt SYSTEM "earth.gif" NDATA gif>
<!NOTATION gif SYSTEM "viewgif.exe">
]>
<doc attrExtEnt="entExt">
   &ent1;, &ent2;!
</doc>

<!-- Let world.txt contain "world" (excluding the quotes) -->
Example 15
<doc attrExtEnt="entExt">
   Hello, world!
</doc>
Example 16
<doc attrExtEnt="entExt">Hello, world!</doc>
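
Replacement of an internal entity reference can be sketched as follows. The script assumes Python 3.8 or later and its `xml.etree.ElementTree.canonicalize` function (a Python name, not part of this document). It covers only the internal entity ent1; the external entity ent2 and the unparsed entity entExt are left out, because resolving them would need parser setup that this sketch does not attempt.

```python
from xml.etree.ElementTree import canonicalize

# Reduced input: only the internal entity ent1 is declared and referenced.
source = (
    '<!DOCTYPE doc [\n'
    '<!ENTITY ent1 "Hello">\n'
    ']>\n'
    '<doc>\n'
    '   &ent1;, world!\n'
    '</doc>'
)

canonical = canonicalize(source)
trimmed = canonicalize(source, strip_text=True)

print(canonical)
print(trimmed)
```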

2.6 UTF-8 Encoding

Demonstrates:

Note: In the canonical form, the content of the doc element is two octets whose hexadecimal values are C2 and A9, which is the UTF-8 encoding of the UCS codepoint for the copyright sign (©).

Original Data (Example 17) | After c14n (Example 18)
Example 17
<?xml version="1.0" encoding="ISO-8859-1"?>
<doc>&#169;</doc>
Example 18
<doc>©</doc>
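
The octet-level claim in the note above can be verified with a small sketch, assuming Python 3.8 or later. `xml.etree.ElementTree.canonicalize` is Python's function and returns text, so the result is encoded to UTF-8 explicitly; the input here carries the copyright sign as the single ISO-8859-1 octet A9 rather than as a character reference, which is an assumption made for illustration.

```python
import io
from xml.etree.ElementTree import canonicalize

# ISO-8859-1 encoded input: the copyright sign is the single octet A9.
latin1_bytes = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<doc>\xa9</doc>'

canonical_text = canonicalize(from_file=io.BytesIO(latin1_bytes))
canonical_bytes = canonical_text.encode("utf-8")  # canonical XML is UTF-8

print(canonical_bytes)
```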

3. Dealing with Namespaces

3.1 Namespace declarations are pushed down

In this example there are three prefix declarations: "a", "b" and "c". Of these, "c" is not visibly utilized at all, so it does not appear in the canonicalized output. "b" is not used by the top-level "a:foo" element but by each of its children, so canonicalization "pushes down" the declaration to where it is actually utilized. Note that the three "b:bar" elements utilize the "b" prefix in the element name, whereas the last element, "a:bar", utilizes that declaration not in the element name but in the "b:att1" attribute.
Original Data (Example 19) | After c14n (Example 20) | After c14n with PrefixRewrite (Example 21)
Example 19
<a:foo xmlns:a="http://a" xmlns:b="http://b" xmlns:c="http://c">
 <b:bar/>
 <b:bar/>
 <b:bar/>
 <a:bar b:att1="val"/>
</a:foo>
Example 20
<a:foo xmlns:a="http://a">
 <b:bar xmlns:b="http://b"></b:bar>
 <b:bar xmlns:b="http://b"></b:bar>
 <b:bar xmlns:b="http://b"></b:bar>
 <a:bar xmlns:b="http://b" b:att1="val"></a:bar>
</a:foo>
Example 21
<n0:foo xmlns:n0="http://a">
 <n1:bar xmlns:n1="http://b"></n1:bar>
 <n1:bar xmlns:n1="http://b"></n1:bar>
 <n1:bar xmlns:n1="http://b"></n1:bar>
 <n0:bar xmlns:n1="http://b" n1:att1="val"></n0:bar>
</n0:foo>
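
The push-down behaviour can be reproduced with a short script. This is a sketch that assumes Python 3.8 or later; `xml.etree.ElementTree.canonicalize` and its `rewrite_prefixes` option are Python names, assumed here to correspond to the default and PrefixRewrite columns.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input of this section.
source = (
    '<a:foo xmlns:a="http://a" xmlns:b="http://b" xmlns:c="http://c">\n'
    ' <b:bar/>\n'
    ' <b:bar/>\n'
    ' <b:bar/>\n'
    ' <a:bar b:att1="val"/>\n'
    '</a:foo>'
)

pushed_down = canonicalize(source)
rewritten = canonicalize(source, rewrite_prefixes=True)

print(pushed_down)
print(rewritten)
```

The declaration of "b" appears four times in the output, once on each child that visibly uses it, and the unused declaration of "c" is dropped.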

3.2 Default namespace declarations

In this example there are three prefixes: "a", "b" and the default prefix. Of these, "a" is not visibly utilized at all, so it does not appear in the canonicalized output. The "foo" element uses the default prefix. Note, however, that the "att2" attribute does NOT use the default prefix; it is simply in the scope of the "b:bar" element. With prefix rewriting, both the default prefix and the "b" prefix are rewritten.
Original Data (Example 22) | After c14n (Example 23) | After c14n with PrefixRewrite (Example 24)
Example 22
<foo xmlns:a="http://a" xmlns:b="http://b">
 <b:bar b:att1="val" att2="val"/>
</foo>
Example 23
<foo>
 <b:bar xmlns:b="http://b" att2="val" b:att1="val"></b:bar>
</foo>
Example 24
<n0:foo xmlns:n0="">
 <n1:bar xmlns:n1="http://b" att2="val" n1:att1="val"></n1:bar>
</n0:foo>
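
A minimal sketch of the default canonicalization for this input, assuming Python 3.8 or later and its `xml.etree.ElementTree.canonicalize` function (a Python name, not part of this document). Only the default variant is exercised here.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input of this section.
source = (
    '<foo xmlns:a="http://a" xmlns:b="http://b">\n'
    ' <b:bar b:att1="val" att2="val"/>\n'
    '</foo>'
)

canonical = canonicalize(source)
print(canonical)
```

The unqualified attribute att2 sorts before b:att1 because its namespace URI is empty, and the unused declaration of "a" does not appear.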

3.3 Sorting namespace declarations

In this example there are four prefix declarations: "a", "b", "c" and "d". They map to namespace URIs "http://z3", "http://z2", "http://z1" and "http://z0" respectively.

Notice the following things in the default canonicalization ("After c14n"):

And notice these things in canonicalization with prefix rewriting ("After c14n with PrefixRewrite"):
Original Data (Example 25) | After c14n (Example 26) | After c14n with PrefixRewrite (Example 27)
Example 25
<a:foo xmlns:a="http://z3" xmlns:b="http://z2" b:att1="val1" c:att3="val3" b:att2="val2" xmlns:c="http://z1" xmlns:d="http://z0">
 <c:bar/>
 <c:bar d:att3="val3"/>
</a:foo>
Example 26
<a:foo xmlns:a="http://z3" xmlns:b="http://z2" xmlns:c="http://z1" c:att3="val3" b:att1="val1" b:att2="val2">
 <c:bar></c:bar>
 <c:bar xmlns:d="http://z0" d:att3="val3"></c:bar>
</a:foo>
Example 27
<n2:foo xmlns:n0="http://z1" xmlns:n1="http://z2" xmlns:n2="http://z3" n0:att3="val3" n1:att1="val1" n1:att2="val2">
 <n0:bar></n0:bar>
 <n0:bar xmlns:n3="http://z0" n3:att3="val3"></n0:bar>
</n2:foo>
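
The two sort orders can be compared side by side with a short script. It is a sketch assuming Python 3.8 or later; `xml.etree.ElementTree.canonicalize` and `rewrite_prefixes` are Python names, not taken from this document.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input of this section.
source = (
    '<a:foo xmlns:a="http://z3" xmlns:b="http://z2" b:att1="val1" '
    'c:att3="val3" b:att2="val2" xmlns:c="http://z1" xmlns:d="http://z0">\n'
    ' <c:bar/>\n'
    ' <c:bar d:att3="val3"/>\n'
    '</a:foo>'
)

default_form = canonicalize(source)
rewritten = canonicalize(source, rewrite_prefixes=True)

# Compare only the root start tag, which carries the sorted declarations.
print(default_form.splitlines()[0])
print(rewritten.splitlines()[0])
```

In the default form the namespace declarations are sorted by prefix and the attributes by namespace URI. With prefix rewriting, the new prefixes n0, n1 and n2 are assigned in namespace URI order, so both orders coincide.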

3.4 Namespace Re-declarations

In this example there are three prefixes: "a", "b" and the default prefix. The "foo" element defines them to be "http://z3", "http://z2" and "" respectively. But the "bar" element redeclares these prefixes to "http://z2", "http://z3" and "http://z0" respectively.

Notice the following things in the default canonicalization ("After c14n"):

And notice these things in canonicalization with prefix rewriting ("After c14n with PrefixRewrite"):

Original Data (Example 28) | After c14n (Example 29) | After c14n with PrefixRewrite (Example 30)
Example 28
<foo xmlns:a="http://z3" xmlns:b="http://z2" a:att1="val1" b:att2="val2"> 
 <bar xmlns="http://z0" xmlns:a="http://z2" a:att1="val1" b:att2="val2" xmlns:b="http://z3" />
</foo>
Example 29
<foo xmlns:a="http://z3" xmlns:b="http://z2" b:att2="val2" a:att1="val1"> 
 <bar xmlns="http://z0" xmlns:a="http://z2" xmlns:b="http://z3" a:att1="val1" b:att2="val2"></bar>
</foo>
Example 30
<n0:foo xmlns:n0="" xmlns:n1="http://z2" xmlns:n2="http://z3" n1:att2="val2" n2:att1="val1"> 
 <n3:bar xmlns:n3="http://z0" n1:att1="val1" n2:att2="val2"></n3:bar>
</n0:foo>

3.5 Superfluous Namespace declarations

In this example there are five prefixes, "a", "b", "c", "d" and the default prefix, and they are all declared with the same namespace URI "http://z0". The "a" prefix is declared twice, once in the "foo" element and again in the "c:bar" element; the declaration of "a" in "c:bar" is unnecessary.

Notice the following things in the default canonicalization ("After c14n"):

And notice these things in canonicalization with prefix rewriting ("After c14n with PrefixRewrite"):
Original Data (Example 31) | After c14n (Example 32) | After c14n with PrefixRewrite (Example 33)
Example 31
<foo xmlns:a="http://z0" xmlns:b="http://z0" a:att1="val1" b:att2="val2" xmlns="http://z0"> 
 <c:bar xmlns:a="http://z0" xmlns:c="http://z0" c:att3="val3"/>
 <d:bar xmlns:d="http://z0"/>
</foo>
Example 32
<foo xmlns="http://z0" xmlns:a="http://z0" xmlns:b="http://z0" a:att1="val1" b:att2="val2"> 
 <c:bar xmlns:c="http://z0" c:att3="val3"></c:bar>
 <d:bar xmlns:d="http://z0"></d:bar>
</foo>
Example 33
<n0:foo xmlns:n0="http://z0" n0:att1="val1" n0:att2="val2"> 
 <n0:bar n0:att3="val3"></n0:bar>
 <n0:bar></n0:bar>
</n0:foo>
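
With prefix rewriting, all five prefixes collapse into a single generated prefix, which a short script can confirm. This is a sketch assuming Python 3.8 or later; `xml.etree.ElementTree.canonicalize` and `rewrite_prefixes` are Python names, and only the prefix-rewriting variant is exercised here.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input of this section.
source = (
    '<foo xmlns:a="http://z0" xmlns:b="http://z0" a:att1="val1" '
    'b:att2="val2" xmlns="http://z0"> \n'
    ' <c:bar xmlns:a="http://z0" xmlns:c="http://z0" c:att3="val3"/>\n'
    ' <d:bar xmlns:d="http://z0"/>\n'
    '</foo>'
)

rewritten = canonicalize(source, rewrite_prefixes=True)
print(rewritten)
```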

3.6 Special namespaces "xml", "xsi", "xsd"

In this example there are three special namespaces: the "xml" namespace used in the attribute xml:id="23", and the "xsi" and "xsd" namespaces used in xsi:type="xsd:string".

Canonicalization only treats "xml" as a special namespace. It is never rewritten by prefix-rewriting. "xsi" and "xsd" are treated as regular namespaces.

Notice the following things in the default canonicalization ("After c14n"):

Notice these things in canonicalization with prefix rewriting ("After c14n with PrefixRewrite"):

Notice these things in canonicalization with QName awareness ("After c14n with QNameAware"):

Notice these things in canonicalization with QName awareness and prefix rewriting ("After c14n with QNameAware and PrefixRewrite"):
Original Data (Example 34) | After c14n (Example 35) | After c14n with PrefixRewrite (Example 36)
Example 34
<foo xmlns="http://z0" xml:id="23">
  <bar xsi:type="xsd:string" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">data</bar>
</foo>
Example 35
<foo xmlns="http://z0" xml:id="23">
  <bar xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xsd:string">data</bar>
</foo>
Example 36
<n0:foo xmlns:n0="http://z0" xml:id="23">
  <n0:bar xmlns:n1="http://www.w3.org/2001/XMLSchema-instance" n1:type="xsd:string">data</n0:bar>
</n0:foo>
After c14n with QNameAware (Example 37) | After c14n with QNameAware and PrefixRewrite (Example 38)
Example 37
<foo xmlns="http://z0" xml:id="23">
  <bar xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xsd:string">data</bar>
</foo>
Example 38
<n0:foo xmlns:n0="http://z0" xml:id="23">
  <n0:bar xmlns:n1="http://www.w3.org/2001/XMLSchema" xmlns:n2="http://www.w3.org/2001/XMLSchema-instance" n2:type="n1:string">data</n0:bar>
</n0:foo>
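
The four variants of this section can be generated with one script. This is a sketch assuming Python 3.8 or later. `xml.etree.ElementTree.canonicalize` and its options `rewrite_prefixes` and `qname_aware_attrs` are Python names, and the constant `XSI_TYPE` is a helper introduced here for illustration.

```python
from xml.etree.ElementTree import canonicalize

# The "Original Data" input of this section.
source = (
    '<foo xmlns="http://z0" xml:id="23">\n'
    '  <bar xsi:type="xsd:string" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema">data</bar>\n'
    '</foo>'
)

# Hypothetical helper: the xsi:type attribute name in {uri}local notation.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

default_form = canonicalize(source)
rewritten = canonicalize(source, rewrite_prefixes=True)
qname_aware = canonicalize(source, qname_aware_attrs={XSI_TYPE})
aware_rewritten = canonicalize(
    source, qname_aware_attrs={XSI_TYPE}, rewrite_prefixes=True
)

for form in (default_form, rewritten, qname_aware, aware_rewritten):
    print(form)
```

Without QName awareness the "xsd" declaration is dropped because the prefix occurs only inside an attribute value. With QName awareness the declaration is kept, and under prefix rewriting the value itself is rewritten. The "xml" prefix is left untouched in every variant.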

3.7 Prefixes in Element content

Notice the following things in the default canonicalization ("After c14n"):

Notice the following things in canonicalization with QName awareness for <a:bar> ("After c14n with QNameAware <a:bar>"):

Notice the following things in canonicalization with QName awareness for <a:bar> and <dsig2:IncludedXPath> ("After c14n with QNameAware <a:bar> and <dsig2:IncludedXPath>"):

Notice the following things in canonicalization with QName awareness for <a:bar> and <dsig2:IncludedXPath> and with prefix rewriting ("After c14n with QNameAware <a:bar> and <dsig2:IncludedXPath> and PrefixRewrite"):

Original Data (Example 39) | After c14n (Example 40) | After c14n with QNameAware <a:bar> (Example 41)
Example 39
<a:foo xmlns:a="http://a" xmlns:b="http://b" xmlns:child="http://c" xmlns:soap-env="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <a:bar>xsd:string</a:bar>
 <dsig2:IncludedXPath xmlns:dsig2="http://www.w3.org/2010/xmldsig2#">/soap-env:body/child::b:foo[@att1 != "c:val" and @att2 != 'xsd:string']</dsig2:IncludedXPath>
</a:foo>
Example 40
<a:foo xmlns:a="http://a">
 <a:bar>xsd:string</a:bar>
 <dsig2:IncludedXPath xmlns:dsig2="http://www.w3.org/2010/xmldsig2#">/soap-env:body/child::b:foo[@att1 != "c:val" and @att2 != 'xsd:string']</dsig2:IncludedXPath>
</a:foo>
Example 41
<a:foo xmlns:a="http://a">
 <a:bar xmlns:xsd="http://www.w3.org/2001/XMLSchema">xsd:string</a:bar>
 <dsig2:IncludedXPath xmlns:dsig2="http://www.w3.org/2010/xmldsig2#">/soap-env:body/child::b:foo[@att1 != "c:val" and @att2 != 'xsd:string']</dsig2:IncludedXPath>
</a:foo>
After c14n with QNameAware <a:bar> and <dsig2:IncludedXPath> (Example 42) | After c14n with QNameAware <a:bar> and <dsig2:IncludedXPath> and PrefixRewrite (Example 43)
Example 42
<a:foo xmlns:a="http://a">
 <a:bar xmlns:xsd="http://www.w3.org/2001/XMLSchema">xsd:string</a:bar>
 <dsig2:IncludedXPath xmlns:b="http://b" xmlns:dsig2="http://www.w3.org/2010/xmldsig2#" xmlns:soap-env="http://schemas.xmlsoap.org/wsdl/soap/">/soap-env:body/child::b:foo[@att1 != "c:val" and @att2 != 'xsd:string']</dsig2:IncludedXPath>
</a:foo>
Example 43
<n0:foo xmlns:n0="http://a">
 <n0:bar xmlns:n1="http://www.w3.org/2001/XMLSchema">n1:string</n0:bar>
 <n4:IncludedXPath xmlns:n2="http://b" xmlns:n3="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:n4="http://www.w3.org/2010/xmldsig2#">/n3:body/child::n2:foo[@att1 != "c:val" and @att2 != 'xsd:string']</n4:IncludedXPath>
</n0:foo>
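
QName awareness for element content can be sketched for the <a:bar> element alone. The script assumes Python 3.8 or later. `xml.etree.ElementTree.canonicalize` and its `qname_aware_tags` option are Python names, the constant `A_BAR` is a helper introduced for illustration, and the input is reduced to the <a:bar> child. The XPath-aware handling of <dsig2:IncludedXPath> is not attempted here.

```python
from xml.etree.ElementTree import canonicalize

# Reduced input: only the a:bar child of the "Original Data" document.
source = (
    '<a:foo xmlns:a="http://a" xmlns:b="http://b" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema">\n'
    ' <a:bar>xsd:string</a:bar>\n'
    '</a:foo>'
)

# Hypothetical helper: the a:bar element name in {uri}local notation.
A_BAR = "{http://a}bar"

default_form = canonicalize(source)
qname_aware = canonicalize(source, qname_aware_tags={A_BAR})
aware_rewritten = canonicalize(
    source, qname_aware_tags={A_BAR}, rewrite_prefixes=True
)

for form in (default_form, qname_aware, aware_rewritten):
    print(form)
```

In the default form the "xsd" prefix in the text content has no declaration in the output. With QName awareness the declaration is emitted on <a:bar>, and with prefix rewriting the text content follows the rewritten prefix.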

A. References

Dated references below are to the latest known or appropriate edition of the referenced work. The referenced works may be subject to revision, and conformant implementations may follow, and are encouraged to investigate the appropriateness of following, some or all more recent editions or replacements of the works cited. It is in each case implementation-defined which editions are supported.

A.1 Normative references

[XML-C14N20]
John Boyer; Glen Marcy; Pratik Datta; Frederick Hirsch. Canonical XML Version 2.0. 11 April 2013. W3C Working Group Note. URL: http://www.w3.org/TR/xml-c14n2/
[XML10]
C. M. Sperberg-McQueen et al. Extensible Markup Language (XML) 1.0 (Fifth Edition). 26 November 2008. W3C Recommendation. URL: http://www.w3.org/TR/xml/