Copyright © 1999 The Internet Society & W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This is an immature XML Signature Scenarios WG draft. At some point, a future version may be advanced as a W3C Note/IETF Informational RFC. It is based on the submissions of the authors.
Please send comments to the editor <firstname.lastname@example.org> and cc: the list <email@example.com>. Publication as a Working Draft does not imply endorsement by the W3C membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite W3C Drafts as other than "work in progress". A list of current W3C working drafts can be found at http://www.w3.org/TR.
This document provides scenarios that exemplify specific signature problems that should be addressed by the XML Digital Signature specification.
The XML 1.0 Recommendation [XML] describes the syntax of a class of data objects called XML documents. The mission of the XML DSig working group is to develop an XML syntax used for representing signatures on digital content and procedures for computing and verifying such signatures. Signatures will provide data integrity, authentication, and/or non-repudiatability.
The purpose of this document is to help inform the specification of XML signatures by providing specific examples of signature problems that should be considered and, whenever possible, solved by the XML Signature specification.
Each example begins with text describing the problem. Often an example containing XML markup will be given, and the text will describe how the markup or some portion of it should be signed. Some examples will also use the syntax described in the Brown (990618) draft to show how such a signature might be expressed. These examples will be changed to test/apply the syntax developed by the WG. Naturally, in this early stage, such examples may leave out portions of the signature element defined by Brown if they have no bearing on the example.
The problem is to digitally sign an arbitrary piece of information provided that such information can be enveloped by markup representing an XML signature.
The piece of information could be an entire XML document, an entire document in another (possibly binary) format, or any possible byte stream for which a signature is required. Since the information is being enveloped (and may not even be in XML format), the markup representing the XML signature must provide a root XML document element.
It is assumed that the application is responsible for placing the data into the correct location within the markup representing the XML signature.
The current XML DSig requirements specify the use of a manifest method for generating digital signatures, and the Brown draft syntax is appropriate (at this time) for expressing the WG's expectations.
A Document element is used as the root element, and the Package element is used to envelop the arbitrary piece of information. The actual content is stored in the Value subelement of the Package, which allows either Base 64 encoding or no encoding (except for the normal XML use of entity references, CDATA and so forth). The example below shows Base 64 encoding on a hypothetical binary document.
The Package element described above is indicated by a Locator element in a Resource element in the Manifest. The Resource element also contains a Digest subelement, which in turn contains an Algorithm and a Value subelement.
It is assumed that before signature value generation is requested, the application will have obtained the piece of information and placed it into the Package element's Value, computed the digital fingerprint (hash) of the Package element using the Digest Algorithm, and placed the result in the Digest Value.
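The application steps just described can be sketched in Python. This is only an illustration under stated assumptions: the element names follow the Brown-style vocabulary used in this section, but the `id` and `href` attributes, the `encoding="base64"` attribute value, the choice of SHA-256, and the use of plain serialization in place of a real canonical form are all hypothetical and not defined by the WG.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def build_enveloped_document(payload: bytes) -> ET.Element:
    """Wrap payload in a Package and record its digest in a Manifest Resource."""
    document = ET.Element("Document")

    # Package/Value carries the arbitrary bytes, Base 64 encoded.
    package = ET.SubElement(document, "Package", {"id": "pkg1"})
    value = ET.SubElement(package, "Value", {"encoding": "base64"})
    value.text = base64.b64encode(payload).decode("ascii")

    # Manifest/Resource points at the Package and stores its fingerprint.
    manifest = ET.SubElement(document, "Manifest")
    resource = ET.SubElement(manifest, "Resource")
    ET.SubElement(resource, "Locator", {"href": "#pkg1"})
    digest = ET.SubElement(resource, "Digest")
    ET.SubElement(digest, "Algorithm").text = "sha256"

    # Assumption: plain serialization stands in for a canonical form.
    package_bytes = ET.tostring(package, encoding="utf-8")
    fingerprint = hashlib.sha256(package_bytes).digest()
    ET.SubElement(digest, "Value").text = base64.b64encode(fingerprint).decode("ascii")
    return document
```

Only after these steps would the application request generation of the signature value over the Manifest.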
It is unclear at this time whether the following will be core signature behavior or application behavior. The signer's identity must be associated with the document hash in some way. This can be done by placing the unique identifying information for the signer in the KeyingInfo element (which is, and must be, inside the Manifest to secure the association between signer and content). In this example, the information is a PKI certificate.
Core behavior for generating a signature will include hashing a canonical version of the manifest, signing that with the signer's private key, and placing the resulting encrypted hash into the Signature element's Value subelement. Core behavior for verification includes checking the encrypted hash against a newly computed hash of the Manifest. Application behavior is required to check whether the package contents match the fingerprint (hash) stored by the application in the Digest Value of the associated Resource.
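The split between core and application behavior can be illustrated with a small Python sketch. It makes several assumptions that are not part of any draft: an HMAC over a shared key stands in for signing with the signer's private key (the standard library has no public-key signing), plain serialization stands in for canonicalization, and the `id`/`href` attributes and hex-encoded values are hypothetical.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

DOC = (
    '<Document>'
    '<Package id="pkg1"><Value>hello</Value></Package>'
    '<Manifest><Resource><Locator href="#pkg1"/>'
    '<Digest><Algorithm>sha256</Algorithm><Value/></Digest>'
    '</Resource></Manifest>'
    '<Signature><Value/></Signature>'
    '</Document>'
)


def serialize(elem: ET.Element) -> bytes:
    # Assumption: plain serialization stands in for canonicalization.
    return ET.tostring(elem, encoding="utf-8")


def sign(doc: ET.Element, key: bytes) -> None:
    # Application behavior: fingerprint the Package into the Digest Value.
    package_hash = hashlib.sha256(serialize(doc.find("Package"))).hexdigest()
    doc.find("Manifest/Resource/Digest/Value").text = package_hash
    # Core behavior: sign the Manifest (HMAC is a stand-in for a private key).
    mac = hmac.new(key, serialize(doc.find("Manifest")), hashlib.sha256)
    doc.find("Signature/Value").text = mac.hexdigest()


def core_verify(doc: ET.Element, key: bytes) -> bool:
    # Core behavior: recompute over the Manifest only.
    mac = hmac.new(key, serialize(doc.find("Manifest")), hashlib.sha256)
    return hmac.compare_digest(mac.hexdigest(), doc.find("Signature/Value").text)


def application_verify(doc: ET.Element) -> bool:
    # Application behavior: compare the Package against the stored fingerprint.
    stored = doc.find("Manifest/Resource/Digest/Value").text
    return hashlib.sha256(serialize(doc.find("Package"))).hexdigest() == stored
```

In this sketch, changing the text of the Package's Value after signing leaves `core_verify` returning true while `application_verify` returns false, which is the division of responsibility the paragraph above describes.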
There are a number of concerns with the package method of envelopment. One is the fact that, in this scenario, the true intent is to sign the enveloped content yet the core signature behavior does not validate the enveloped content. Another concern is that the extra level of indirection adds to the size of the signed result, which may perceptibly decrease efficiency when signing very small pieces of data such as protocol messages.
One suggested method is to permit the XML elements that represent a signature to directly contain the message to be signed. Again using a syntax like Brown's, this would imply simply allowing a Resource element to have the option of directly containing the Package rather than a Locator and Digest. Thus, when the Manifest is canonicalized, it would contain the package contents such that the encrypted hash stored in the Signature Value would directly sign the enveloped information.
Note that the specific syntax of this example is irrelevant. The relevant point of this example is to indicate a need to envelop some data with any XML syntax representing a signature as follows:
[Start of any XML signature syntax]
[End of any XML signature syntax]
such that the associated semantic is that the encrypted hash stored in the XML signature syntax was computed over a block of data that included the Value element. Furthermore, the use of the dsig namespace and the keywords Value and encoding are also irrelevant. However, the ability to specify an attribute that indicates either no encoding or base-64 encoding of the information is relevant.
Another example of this is suggested by an upcoming protocol named IMPP (Instant Messaging and Presence Protocol). IMPP messages have the property that their human-readable content is typically quite short, so signature-block size has a distinct impact on the overall bandwidth used by an IM system. Were we to create signature markup using the indirection of a manifest, the manifest itself could be about as large as the IMPP message! The direct method reduces the overall size, and the size can be further reduced by not even putting the content in the Resource element. The example below still places the IMPP message in the Manifest since the Manifest contains that which must be hashed and signed. Furthermore, an application could still use a Value element directly in the Manifest, esp. if base-64 encoding is required. The reason is that the XML signature will be applied to anything in the manifest.
This example has one user (impp://microsoft.com/bal) asking to be notified whenever the status of another user (impp://citibank.com/dsolo) changes.
The problem is to digitally sign an XML document where it is expected that the element(s) representing the signature will be embedded within the XML document (rather than having the document enveloped inside the signature as in the previous scenario).
With the indirection of the manifest method, this case is almost the same as the Enveloped Internal Content example. The Resource Locator element should simply identify the root element of the document containing the signature element(s).
The application is assumed to produce a digital fingerprint of each Resource in the manifest. In this case, the fingerprint would be of the entire XML document containing the signature element(s).
The ultimate intent of this scenario is to sign the resource, not the manifest. It should be disturbing that an XML signature whose core validation succeeds can appear on an XML document that has been changed. There is WG consensus that there are scenarios where it is appropriate and efficient to validate the manifest but not the resource. However, there are also (probably just as many) scenarios where a change to a resource should imply that the signature is invalid, esp. when that resource is within the same document as the signature elements.
It should be possible to create a syntax that permits both scenarios. Rather than requiring the application to supply a Digest Value for each element of the manifest, the core signature behavior could be changed such that the omission of the Digest element from the Resource implies that the Resource contents should be appended to the canonicalized Manifest. The result would be that the encrypted hash would directly sign such resources.
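The proposed rule can be sketched as follows. This is a hypothetical illustration, not WG-defined processing: the `id`/`href` fragment convention, the `Order` element, SHA-256, and plain serialization in place of canonicalization are all assumptions made for the example.

```python
import hashlib
import xml.etree.ElementTree as ET


def signed_octets(doc: ET.Element) -> bytes:
    """Return the bytes the core signature would cover under the proposed rule."""
    manifest = doc.find("Manifest")
    # Assumption: plain serialization stands in for the canonicalized Manifest.
    octets = ET.tostring(manifest, encoding="utf-8")
    for resource in manifest.findall("Resource"):
        if resource.find("Digest") is None:
            # No Digest: append the located resource so it is signed directly.
            target_id = resource.find("Locator").get("href").lstrip("#")
            target = doc.find(f".//*[@id='{target_id}']")
            octets += ET.tostring(target, encoding="utf-8")
    return octets


def core_hash(doc: ET.Element) -> str:
    return hashlib.sha256(signed_octets(doc)).hexdigest()


# A Resource without a Digest: the Order itself becomes part of the signed data.
DIRECT = (
    '<Doc><Order id="o1"><Qty>1</Qty></Order>'
    '<Manifest><Resource><Locator href="#o1"/></Resource></Manifest></Doc>'
)

# A Resource with a Digest: only the Manifest (and the stored digest) is signed.
INDIRECT = (
    '<Doc><Order id="o1"><Qty>1</Qty></Order>'
    '<Manifest><Resource><Locator href="#o1"/>'
    '<Digest><Value>stored-by-application</Value></Digest>'
    '</Resource></Manifest></Doc>'
)
```

With `DIRECT`, editing the `Qty` text changes `core_hash`, so core validation would fail. With `INDIRECT`, the same edit leaves `core_hash` unchanged and detection is left to the application.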
The problem is to use an XML signature to sign Web resources that are external to the XML document. These resources may or may not be in XML format.
The current WG strategy is to indicate such objects using an XLink. For example, in the Brown draft syntax, the content of a Locator element within a Resource element in the Manifest could indicate a resource external to the XML document as easily as it can indicate internal resources.
The WG considers it to be the application's responsibility to provide the digital fingerprint of the external resource, and validation of the resource's fingerprint is also viewed as the application's responsibility and not part of the core signature validation. This choice appears to be more appropriate to external resources than to the internal resources of the previous scenario for two reasons. First, an external resource may not even be in XML. Second, an external resource may not be accessible at validation time.
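A minimal sketch of the application-side check follows. It assumes a caller-supplied `fetch` function (hypothetical) that returns the resource bytes or `None` when the resource cannot be reached, and it assumes SHA-256 with hex-encoded digests; none of these choices comes from the WG.

```python
import hashlib
from typing import Callable, Optional


def check_external_resource(
    url: str,
    stored_hex_digest: str,
    fetch: Callable[[str], Optional[bytes]],
) -> str:
    """Classify an external resource as 'match', 'mismatch' or 'unavailable'."""
    data = fetch(url)
    if data is None:
        # The resource may simply be unreachable at validation time.
        return "unavailable"
    # The bytes are hashed as-is: the resource need not be XML.
    actual = hashlib.sha256(data).hexdigest()
    return "match" if actual == stored_hex_digest else "mismatch"
```

The three-way result reflects the second reason given above: an inaccessible resource is a different outcome from a changed one, and only the application can decide how to treat it.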
The problem is to include within a signature (either directly or indirectly) a portion of an XML document rather than the whole XML document. XML-based standards for indicating a portion of an XML document are currently under development, and there is WG consensus to use these methods.
This scenario appears to be solvable in part using simple variations allowed by an XLink in a Resource Locator. The subsections below provide elaboration of some of the difficulties with the basic method.
One reason for having the ability to sign parts of an XML document is to facilitate the creation of multiple signatures. The intent in this section is to exclude application-specific information such as the intent of the signer but instead focus on how different types of signatures might imply different requirements on what portion of the document is signed.
One example is the need for a co-signer. In this example, the original signer's signature could be represented by one signature element, and the co-signer's signature could be represented by a second signature element. It would appear that this problem is solvable by one of the following two methods:
It is necessary to omit the co-signer signature element from the signer signature since the co-signer signature will be added after the signer signature. If the signer signature envelops its content, then no further effort is required. If the signer signature is unenveloped direct, then its manifest must cause the omission of the co-signer signature element. If the signer signature is unenveloped indirect, then the manifest must not include any ancestor of the co-signer signature element (this restriction may be lifted if the WG adopts some method of excluding subelements from an included resource).
A second example of the need for multiple signers is Mutual Agreement. In this case, both signers sign an agreement, but not each other's signatures. Implied in this is the ability to have the signatures executed independently and merged into a single file at some later time without disturbing the signatures. Clearly the use of locators is preferable since only one of the signatures could use direct envelopment.
A third example occurs when different signers must sign differing portions of the same document. It is a common experience to find a form with an 'Office Use Only' section. In the digital version of such a form, a portion must be filled out and signed by a party initiating a transaction, the 'Office Use Only' section is then filled out and signed by a second party, and then the whole document is eventually signed by an approving party.
The easiest solution appears to be the use of unenveloped signatures. Each signer's manifest can include only those elements that are relevant to the signer (except see the subsections on Document Closure and Preserving Non-continuous Element Order).
One serious problem with the use of manifest Locators as currently being considered by the WG is that the manifest performs exclusion implicitly. Resources can be included by explicit reference, but there is currently no way of saying "Sign everything in this resource except for A, B, C, ...".
Although it is possible to add more elements to an XML document without breaking the XML signatures in that document, it is also true that the signatures clearly list what has actually been signed. Thus, it could be argued that the notion of document closure is an application issue. However, the need for document closure is closely linked to the need to sign portions of an XML document. If a document closure method were to become a co-requirement of partial document signing, then the result would be that XML compliant signature verification could occur without the knowledge of specific application requirements.
For example, a workflow engine that validates signatures could detect an XML document closure error without having to maintain a table specifying whether or not document closure is required for each known document type.
Here are some suggestions for offering this feature that have come up:
The XML language does not forbid an element from deriving meaning from the attributes of its ancestors (for example, the xmlns namespace attribute), from the tags of its ancestors (which at a level of abstraction can be viewed as just another attribute), and even from the element depth at which it appears. This is because XML does not define the words of a markup language, so it cannot define their interoperation.
A specification for signing XML should account for the possibilities inherent in XML rather than only offering full security for XML documents that meet certain criteria. If this is accepted as axiomatic, then it follows that an XML signature should permit the ability to capture information carried by ancestor elements. This is not to say that such information is always required. There are XML extension languages which have been designed to be self-contained. However, this design cannot be guaranteed.
For a specific example, suppose we have a hypothetical declarative computation language in which each computed XML element derives a current value tagged by cval based on evaluation of a mathematical expression contained in a subelement called compute. Now suppose that the author of such a document wishes to digitally sign the computations to guarantee to users that the interpreting application of the XML document will behave as the author intended. To do this, the cval elements must be omitted from the signature so that the computations can be operated in the interpreting application without breaking the author's signature.
Note that this example places no semantics on the elements containing the computations. The elements are vertices in a directed acyclic graph of computational dependencies established implicitly by the compute elements. Efforts to move the data managed by this system into, say, a presentation layer are not at issue, though it is conceivable that subsets of these vertex elements could have semantics of providing views for various applications such as a GUI, a database system, or a workflow engine.
The problem with this signature is that the compute elements being signed are no longer associated with the elements for which they provide computations. The direct ancestor of a compute provides important information about the compute element. It is possible to move the compute elements to different vertices in the graph without breaking the signature.
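The weakness can be demonstrated with a short Python sketch. The `calc`, `total` and `tax` element names, the `id` attributes used to locate each compute element, and SHA-256 are assumptions made for illustration. The `with_parent` option shows one possible way (not a WG proposal) of binding a compute element to its direct ancestor by mixing the parent's tag name into the digest.

```python
import hashlib
import xml.etree.ElementTree as ET

SOURCE = (
    '<calc>'
    '<total><compute id="c1">price * qty</compute><cval>20</cval></total>'
    '<tax><compute id="c2">total * 0.1</compute><cval>2</cval></tax>'
    '</calc>'
)


def digest_computes(root: ET.Element, with_parent: bool = False) -> dict:
    """Digest every compute element, keyed by its id; cval is never hashed."""
    digests = {}
    for parent in root.iter():
        for child in parent:
            if child.tag != "compute":
                continue
            h = hashlib.sha256()
            if with_parent:
                # Assumption: bind the compute to its direct ancestor's tag.
                h.update(parent.tag.encode("utf-8") + b"/")
            h.update(ET.tostring(child, encoding="utf-8"))
            digests[child.get("id")] = h.hexdigest()
    return digests


def swap_computes(root: ET.Element) -> None:
    """Move each compute element under the other vertex."""
    total, tax = root.find("total"), root.find("tax")
    c_total, c_tax = total.find("compute"), tax.find("compute")
    total.remove(c_total)
    tax.remove(c_tax)
    total.insert(0, c_tax)
    tax.insert(0, c_total)
```

Changing a `cval` leaves both kinds of digest untouched, as intended. After `swap_computes`, the plain digests are still identical, so a signature over them would still validate, whereas the parent-bound digests differ.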
Finally, note that this example is simplified by the fact that the same subelement is desired in every element. The fact that XML is a grammar with almost no lexicon means it is conceivable that different subelements could be signed within each element. Further, it is possible to construct examples in which the set of desired subelements is governed by attributes or tag names of the ancestor element.
Here are some suggestions for offering this feature that have come up:
Suppose we have an XML signature that signs three (or more) elements that are separated by elements which are not signed. The current plan is to indicate each element being signed using a resource locator in the manifest. The problem with this approach is that the signature does not capture the order in which the signed elements appear. XML does not forbid extension languages from deriving information based on physical position within the resource relative to other elements. Furthermore, XML does not forbid the interspersion of elements from differing namespaces, so the ability to sign only those elements in a given namespace may be hindered if we cannot capture the order in which those elements appear.
Since it is considered to be core functionality to provide applications with a way of signing partial documents, a natural result is the ability to implicitly omit elements that should not or need not be in a particular signature (some reasons for this are given in this section). Therefore, it should be core behavior to provide a way to capture element order even when some elements are being omitted between any two elements being signed.
Here is an example in which three signed elements have an arbitrary number of interceding unsigned elements. Elements A, C and E are signed (in this case because they are from the same namespace), but they could be moved into any relative order without disturbing the signature. In terms of the security of a hashing algorithm, if changing the order of substrings did not result in a different hash, then a hash algorithm would be considered broken. XML signatures should be able to offer this same level of security.
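The reordering problem can be reproduced with a few lines of Python. The document below, its `id` attributes, and the use of SHA-256 are assumptions for illustration. `per_element_digests` models a manifest with one locator and digest per signed element, while `ordered_digest` hashes the signed elements as one stream in document order.

```python
import hashlib
import xml.etree.ElementTree as ET

SOURCE = '<doc><A id="a">1</A><B>x</B><C id="c">2</C><D>y</D><E id="e">3</E></doc>'
SIGNED_IDS = ("a", "c", "e")


def per_element_digests(root: ET.Element) -> dict:
    """One digest per located element, as a per-resource manifest would hold."""
    return {
        el.get("id"): hashlib.sha256(ET.tostring(el, encoding="utf-8")).hexdigest()
        for el in root
        if el.get("id") in SIGNED_IDS
    }


def ordered_digest(root: ET.Element) -> str:
    """A single digest over the signed elements in document order."""
    h = hashlib.sha256()
    for el in root:
        if el.get("id") in SIGNED_IDS:
            h.update(ET.tostring(el, encoding="utf-8"))
    return h.hexdigest()


def reverse_children(root: ET.Element) -> None:
    """Reverse the order of the root's children in place."""
    children = list(root)
    for child in children:
        root.remove(child)
    for child in reversed(children):
        root.append(child)
```

After `reverse_children`, the per-element digests are unchanged, so a manifest built from them still validates, but `ordered_digest` returns a different value because the order of the hashed substrings changed.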
Here are some suggestions for offering this feature that have come up:
At this time, the only solution that seems to cover all of the problems is some version of element filtering (see for example Boyer, XFDL) such that exclusion can be explicitly stated. The idea to put this information in the c14n algorithm parameters appeared in a letter by Brown.
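A minimal sketch of element filtering with explicit exclusion is given below. It is not the XFDL mechanism or any syntax proposed by the WG: the `form` document, the rule of excluding elements by tag name, SHA-256, and plain serialization in place of canonicalization are all assumptions for illustration.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET

FORM = '<form><name>Ann</name><amount>10</amount><Signature id="signer"/></form>'


def filtered_digest(root: ET.Element, excluded_tags: set) -> str:
    """Digest the whole document except elements whose tag is excluded."""
    clone = copy.deepcopy(root)  # never modify the document being signed
    for parent in list(clone.iter()):
        for child in list(parent):
            if child.tag in excluded_tags:
                parent.remove(child)
    # Everything that remains is hashed as one stream, in document order.
    return hashlib.sha256(ET.tostring(clone, encoding="utf-8")).hexdigest()
```

Because the filter states "everything except Signature elements", a co-signer's Signature element can be appended later without changing the digest, while adding any other element, or reordering the signed elements, changes it. In this sketch that single mechanism addresses the co-signer case, document closure and element order together.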
An example of this appears in the Brown draft. The IOTP example could be good here. This section is in search of an author.
The editor requests the reader to consider further scenarios. It would be helpful if the prospective author would first consider whether the scenario fits into one of the scenarios currently in this document. If not, then a submission is certainly warranted. However, submissions that match current scenarios are also solicited if they contain better examples than the rather generic ones appearing in some of the sections. Note that the editor(s) may use parts of your scenario to improve a section or sections, but in any case you will be credited in the list of authors if your work is used.
Feedback to the WG is also sought on the issues listed in some of the scenarios. Once the WG has decided how to resolve an issue, it may be deleted, be the subject of further elaboration within the scenario, or become a new scenario.