Refactoring XML Signature and XML Encryption

Joseph Reagle <reagle@w3.org>
$Revision: 1.2 $ on $Date: 2002/10/30 23:19:00 $ GMT


This document describes some possible areas and issues related to future work on XML Signature (XDSIG) and XML Encryption (XENC), jointly referred to as XSE, by capturing common questions and their (sometimes incomplete) answers. This is not intended to be an introduction, tutorial, or FAQ about these specifications; instead, it is a fairly technical discussion of XML architecture and dependencies.

Status of this document

This is a personal draft, with no formal standing, soliciting discussion and comment. I'm sure there are errors and points of confusion, which readers are encouraged to note and send to the author.

Table of Contents

  1. Introduction
  2. Infoset
    1. Is XDSIG based on XML Infoset?
    2. Can XDSIG work with Infoset applications?
    3. Can XENC applications encrypt Infoset Items?
  3. Schema
    1. Can XDSIG work with XML Schema applications?
    2. How does XENC affect an instance's validity?
  4. XML 1.1 and Namespaces 1.1
    1. Will XSE applications work with XML 1.1 and Namespaces 1.1?

1 Introduction

It is expected that all specifications chartered under the XSE Activities will be completed by December 2002. Those specifications are firmly rooted in the XML 1.0, Namespaces 1.0, and XPath 1.0 Recommendations. Given the maturation of the XML Infoset 1.0 and XML Schema 1.0 specifications, there is now interest in integrating the XSE technologies more closely with these newer specifications. This document captures questions, answers, and issues for further work related to this migration.

2 Infoset

2.1 Is XDSIG based on XML Infoset?

No. XDSIG uses the XPath 1.0 data model to abstractly represent a parsed XML instance. This model is used within the XDSIG Transform Processing to select and pass XML instances and/or fragments between transforms. For instance, an XPath element node might be selected, transformed, and then serialized. While the XPath 1.0 and XML Infoset 1.0 data models are similar, there are crucial differences. Since the XDSIG Transform Processing depends on both XPath's data model and its node processing (i.e., selection), it would be best for future versions to use a version of XPath based on the Infoset. XPath 2.0 is based on the Infoset but also requires additional "support for XML Schema types" and for representing "collections of documents and of complex values." These features support XML Query 1.0 and XSLT 2.0 requirements and may also be beneficial to some XSE applications, though they might be unnecessary and costly to others.
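The select, transform, and serialize sequence described above can be sketched roughly as follows. This is an illustration only and rests on several assumptions that are not part of XDSIG: it uses Python's ElementTree path subset rather than full XPath 1.0, the standard library's `canonicalize` (which implements Canonical XML 2.0 rather than the canonicalization XDSIG specifies), and SHA-256 as a stand-in digest. The function name `digest_selected` and the sample documents are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest_selected(xml_text, path):
    """Select one element, canonicalize its serialization, and hash the octets."""
    root = ET.fromstring(xml_text)
    # Selection: ElementTree supports only a subset of XPath 1.0.
    node = root.find(path)
    if node is None:
        raise ValueError("path selected no element: " + path)
    # Drop any text that follows the element so only the element is serialized.
    node.tail = None
    serialized = ET.tostring(node, encoding="unicode")
    # Transform: canonicalization (C14N 2.0 here, an assumption of this sketch).
    canonical = ET.canonicalize(serialized)
    # Digest over the canonical octets (SHA-256 chosen only for illustration).
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two documents that differ in attribute order, quoting, and whitespace.
DOC_A = '<order><item sku="a" qty="2"/><note>rush</note></order>'
DOC_B = "<order>\n  <item qty='2'   sku='a'></item>\n  <note>rush</note>\n</order>"

same = digest_selected(DOC_A, "item") == digest_selected(DOC_B, "item")
print("digests match:", same)
```

The point of the sketch is that the digest is computed over a serialization of a node in some data model, so the choice of model (XPath 1.0 today, possibly an Infoset-based one later) determines what is and is not covered by the signature.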

2.2 Can XDSIG work with Infoset applications?

Yes. Most Infoset applications are still XML 1.0 applications: they are able to serialize and parse an infoset. Consequently, an application that processes an XML document as information items need only define a mapping between its model and the XPath 1.0 model, or simply serialize the infoset and let the XML Signature application parse the serialized instance into an XPath node-set. During this translation or serialization/parsing, some information might be lost (given the differences alluded to above); this loss may or may not be of consequence to the application.

2.3 Can XENC applications encrypt Infoset Items?

Partially, and only once serialized. Encryption algorithms require octets (or bits) as input. XENC provides a framework for encrypting different types of information via its Type mechanism, and presently provides two specific identifiers and processing rules for the XML 1.0 "element" and "element content" character productions as octets. These are, of course, XML 1.0 serializations of parts of an "Element Information Item" and its "[children]". However, as described in XENC's Section 4.3.3 Serializing XML and shown in an email from Ross Thompson on behalf of the Schema WG, this can lead to problems, such as the loss of infoset information like the binding between a namespace prefix and its URI.
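The loss of a prefix binding can be illustrated with a small sketch. It assumes Python's ElementTree serializer, which only emits namespace declarations for prefixes used in element and attribute names; the document, the `kind` attribute, and the `http://example.org/ns` URI are hypothetical. Other serializers behave differently, but the underlying issue is the same: a QName that appears in content depends on a declaration that lives outside the serialized element.

```python
import xml.etree.ElementTree as ET

# The prefix "p" is declared on the root but used only inside an
# attribute VALUE of the child, where a serializer cannot see it.
DOC = (
    '<root xmlns:p="http://example.org/ns">'
    '<item kind="p:widget">text</item>'
    "</root>"
)

root = ET.fromstring(DOC)
item = root.find("item")

# Serialize the child element on its own, as one would before encrypting it.
fragment = ET.tostring(item, encoding="unicode")
print(fragment)

# The value "p:widget" survives, but nothing in the fragment binds "p".
binding_lost = "p:widget" in fragment and "xmlns" not in fragment
print("prefix binding lost:", binding_lost)
```

If such a fragment is later decrypted and placed into a different context, the prefix may be unbound or bound to a different URI, which is the kind of problem the text above describes.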

Other types and their processing could be defined. For instance, a specification could provide an identifier for a complete serialized representation of an infoset item, a pickled Python DOM node, or compressed data. These specifications would also have to define the encryption, decryption, and replacement processing: how an object (e.g., an infoset item) would be removed, encrypted, decrypted, and reinserted into another infoset, and what "fix-ups" would have to be done (e.g., XInclude's adjustment of IDREFs in the resulting infoset).
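One way to picture the Type mechanism is as a table from type identifiers to rules for producing octets. The sketch below is only a thought experiment: the two `xmlenc#` identifiers are the ones XENC defines, while `urn:example:type:compressed-element`, the function names, and the registry itself are hypothetical. No encryption is performed; the sketch covers only the step that turns an object into the octets a cipher would consume.

```python
import zlib
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XENC_ELEMENT = "http://www.w3.org/2001/04/xmlenc#Element"
XENC_CONTENT = "http://www.w3.org/2001/04/xmlenc#Content"
# Hypothetical identifier for a newly defined type (not in any specification).
EXAMPLE_COMPRESSED = "urn:example:type:compressed-element"


def element_to_octets(elem):
    """Serialize the element itself (start tag, content, end tag) as UTF-8."""
    return ET.tostring(elem, encoding="unicode").encode("utf-8")


def content_to_octets(elem):
    """Serialize only the element's content: its text and child elements."""
    parts = [escape(elem.text or "")]
    # ET.tostring on a child includes the text that follows it (its tail).
    parts.extend(ET.tostring(child, encoding="unicode") for child in elem)
    return "".join(parts).encode("utf-8")


def compressed_to_octets(elem):
    """Hypothetical new type: the serialized element, zlib-compressed."""
    return zlib.compress(element_to_octets(elem))


SERIALIZERS = {
    XENC_ELEMENT: element_to_octets,
    XENC_CONTENT: content_to_octets,
    EXAMPLE_COMPRESSED: compressed_to_octets,
}


def octets_for(type_uri, elem):
    """Return the octets that would be handed to the encryption algorithm."""
    try:
        return SERIALIZERS[type_uri](elem)
    except KeyError:
        raise ValueError("no processing rules defined for type: " + type_uri)


sample = ET.fromstring("<a>x<b>y</b>z</a>")
element_octets = octets_for(XENC_ELEMENT, sample)
content_octets = octets_for(XENC_CONTENT, sample)
compressed_octets = octets_for(EXAMPLE_COMPRESSED, sample)
```

A real specification for a new type would also need the inverse of each entry, plus the replacement and fix-up rules discussed above; the table only makes visible that each type carries its own processing.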

Potential Requirement: to work in the context of Infoset-only applications where complete infoset items are encrypted, one REQUIRES a serialization for infoset items that is unambiguous and lossless.

3 Schema

3.1 Can XDSIG work with XML Schema applications?

Yes, with some caveats. XML Schema 1.0 does two things. First, it validates an instance according to a schema; this has no effect on XDSIG. Second, it might modify the infoset resulting from parsing and validating the instance. For example, "Schemas may also provide for the specification of additional document information, such as normalization and defaulting of attribute and element values." These changes are typically reflected in the "augmented infoset" as a "[schema normalized value]". In a naive mapping to the XPath data model, these changes would not be retained. A more careful mapping or serialization/re-parsing is more likely to retain this information. Regardless, XDSIG's Transform extensibility permits careful treatment of this issue, as shown by the UDDI Schema Centric Canonicalization. However, for XDSIG to adopt and support schema-augmented infosets as its native data model (and in its processing/transforms), it would have to be modified to rely upon schema-aware XPath 2.0 and related specifications.

3.2 How does XENC affect an instance's validity?

As reflected in the XENC Requirements, XENC does not concern itself with maintaining the validity of an instance when it encrypts parts of it:

  1. ...
  2. XML Instance Validity {WS}
    1. Encrypted instances must be well-formed but need not be valid against their original definition (i.e. applications that encrypt the element structure are purposefully hiding that structure.)
    2. Instance authors that want to validate encrypted instances must do one of the following:
      1. Write the original schema so as to validate resulting instances given the change in its structure and inclusion of element types from the XML Encryption namespace.
      2. Provide a post-encryption schema for validating encrypted instances.
      3. Provide information on how to restore the document to its original state via application context (e.g., headers). {List: Reagle}


Ross Thompson on behalf of the Schema WG reiterates some of these options and provides some examples of how a schema might be written to accommodate them.

4 XML 1.1 and Namespaces 1.1

4.1 Will XSE applications work with XML 1.1 and Namespaces 1.1?

No. The XSE specifications are dependent on the {XML, NS, XPath} 1.0 set of specifications. For XSE to be migrated, aside from updating the normative references, one might have to make some minor, but substantive, changes to the specifications and their implementations. Furthermore, the relationship between the {XML Infoset 1.0, XML Schema, XPath 2.0} set of specifications and {XML 1.1, NS 1.1} is unclear.