W3C

SOAP Messages with Attachments

W3C Note 11 December 2000

This version:
http://www.w3.org/TR/2000/NOTE-SOAP-attachments-20001211
Latest version:
http://www.w3.org/TR/SOAP-attachments
Authors:
John J. Barton, Hewlett Packard Labs
Satish Thatte, Microsoft
Henrik Frystyk Nielsen, Microsoft

Abstract

This document defines a binding for a SOAP 1.1 message to be carried within a MIME multipart/related message in such a way that the processing rules for the SOAP 1.1 message are preserved. The MIME multipart mechanism for encapsulation of compound documents can be used to bundle entities related to the SOAP 1.1 message such as attachments. Rules for the usage of URI references to refer to entities bundled within the MIME package are specified.

Status of this document

This document is a submission to the World Wide Web Consortium (see Submission Request, W3C Staff Comment) as a suggestion for message packaging for the W3C XML Activity on XML Protocols. For a full list of all acknowledged Submissions, please see Acknowledged Submissions to W3C.

Comments are welcome to the authors, but you are encouraged to share your views on the W3C's public mailing list xml-dist-app@w3.org (see archives).

This document is a NOTE made available by the W3C for discussion only. Publication of this Note by W3C indicates no endorsement by W3C or the W3C Team, or any W3C Members. W3C has had no editorial control over the preparation of this Note. This document is a work in progress and may be updated, replaced, or rendered obsolete by other documents at any time.

A list of current W3C technical documents can be found at the Technical Reports page.

Table of contents

  1. Introduction
  2. SOAP Message Packages
  3. SOAP References to Attachments
  4. Relationship to SOAP 1.1
  5. HTTP Binding
  6. References

1. Introduction

A SOAP message may need to be transmitted together with attachments of various sorts, ranging from facsimile images of legal documents to engineering drawings. Such data are often in some binary format. For example, most images on the Internet are transmitted using either GIF or JPEG data formats. In this document we describe a standard way to associate a SOAP message with one or more attachments in their native format in a multipart MIME structure for transport. The specification combines specific usage of the Multipart/Related MIME media type (RFC 2387) and the URI schemes discussed in RFC 2111 and RFC2557 for referencing MIME parts.

The methods described here treat the multipart MIME structure as essentially a part of the transfer protocol binding, i.e., on par with the transfer protocol headers as far as the SOAP message is concerned. The multipart structure, though given a name (SOAP message package), is not an entity that can be unambiguously identified as such because there is no token explicitly expressing the intent to make it such an entity. A conscious choice in this document was to avoid adding a new entity type based on a recognizable token. The purpose of this document is to show how to use existing facilities in SOAP and standard MIME mechanisms to carry and reference attachments. In other words, we take a minimalist approach to show what is already possible with existing standards without inventing anything. More rigorous semantics for message packages require a new entity type. Such a type can be built by extending the approach described here with a new SOAP header entry which, for instance, may be used to provide a manifest of the complete contents of the message package.

Most Internet communication protocols are capable of transporting MIME encoded content, although some special considerations are required for HTTP as described in the HTTP binding section.

2. SOAP Message Packages

A "SOAP message package" contains a primary SOAP 1.1 message. It may also contain additional entities that are not lexically within the SOAP message but are related in some manner. These entities may contain data in formats other than XML. The primary SOAP 1.1 message in a message package may reference the additional entities. Such additional entities are often informally referred to as "attachments." This section describes how to construct SOAP message packages and how SOAP processors will process them.

A SOAP message package is constructed using the Multipart/Related media type, which is defined in RFC 2387. The rules for the construction of SOAP message packages are as follows:

  1. The primary SOAP 1.1 message must be carried in the root body part of the Multipart/Related structure. Consequently the type parameter of the Multipart/Related media header will always equal the Content-Type header for the primary SOAP 1.1 message, i.e., text/xml.
  2. Referenced MIME parts must contain either a Content-ID MIME header structured in accordance with RFC 2045, or a Content-Location MIME header structured in accordance with RFC 2557.

It is strongly recommended that the root part contain a Content-ID MIME header structured in accordance with RFC 2045, and that in addition to the required parameters for the Multipart/Related media type, the start parameter (optional in RFC 2387) always be present. This permits more robust error detection.
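As an illustration only, a receiver could check the construction rules and the recommendation above roughly as follows. This is a minimal sketch using Python's standard email package; the names check_package, _bare and SAMPLE are assumptions made for this example and are not defined by this specification.

```python
from email import message_from_string

# Illustrative package; the header is kept on one line, as senders should do.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml; start="<claim061400a.xml@claiming-it.com>"

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-ID: <claim061400a.xml@claiming-it.com>

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"/>

--MIME_boundary
Content-Type: image/tiff
Content-ID: <claim061400a.tiff@claiming-it.com>

...image data...
--MIME_boundary--
"""


def _bare(value):
    """Strip whitespace and angle brackets so Content-ID values compare equal."""
    return (value or "").strip().strip("<>")


def check_package(msg):
    """Return a list of problems; an empty list means every check passed."""
    if msg.get_content_type() != "multipart/related":
        return ["top-level media type is not Multipart/Related"]
    problems = []
    parts = msg.get_payload()
    # RFC 2387: the root is the part named by start, otherwise the first part.
    root = parts[0]
    start = msg.get_param("start")
    if start is None:
        problems.append("recommended: start parameter is absent")
    else:
        named = [p for p in parts if _bare(p.get("Content-ID")) == _bare(start)]
        if named:
            root = named[0]
        else:
            problems.append("start parameter matches no Content-ID")
    # Rule 1: the root carries the SOAP message and type mirrors its media type.
    if root.get_content_type() != "text/xml":
        problems.append("root part is not text/xml")
    if (msg.get_param("type") or "").lower() != root.get_content_type():
        problems.append("type parameter differs from the root Content-Type")
    if "Content-ID" not in root:
        problems.append("recommended: root part has no Content-ID")
    return problems


report = check_package(message_from_string(SAMPLE))
```

The sketch separates required rules from recommendations only by the "recommended:" prefix; a real processor would likely treat the two classes differently.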

A SOAP processor compliant with this specification that receives a SOAP 1.1 message carried in the root body part of a Multipart/Related MIME message must process the SOAP message according to the rules for processing SOAP 1.1 messages as defined by SOAP 1.1. In particular, a SOAP processor that receives an invalid message must generate a Client fault code as described in SOAP 1.1, section 4.4.1.
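For illustration, a Client fault of the kind mentioned above might be produced as in the following sketch. It assumes the SOAP 1.1 fault layout (a Fault element in the Body with unqualified faultcode and faultstring children); the function name client_fault and the fault string are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Ask ElementTree to serialize the envelope namespace with the usual prefix.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def client_fault(reason):
    """Build a SOAP 1.1 envelope whose body is a Client fault."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    fault = ET.SubElement(body, f"{{{SOAP_ENV}}}Fault")
    # faultcode is a qualified name; its prefix is declared on the envelope
    # because the envelope element itself uses that namespace.
    ET.SubElement(fault, "faultcode").text = "SOAP-ENV:Client"
    ET.SubElement(fault, "faultstring").text = reason
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical fault string for a malformed primary message.
fault_xml = client_fault("The root part does not contain a valid SOAP 1.1 message")
```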

The MIME Multipart/Related encapsulation of a SOAP message is semantically equivalent to a SOAP protocol binding in that the SOAP message itself is not aware that it is being encapsulated. That is, there is nothing in the primary SOAP message proper that indicates that the SOAP message is encapsulated (see section 5).

The following example shows a SOAP 1.1 message with an attached facsimile image of the signed claim form (claim061400a.tiff):

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<claim061400a.xml@claiming-it.com>"
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <claim061400a.xml@claiming-it.com>

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
..
<theSignedForm href="cid:claim061400a.tiff@claiming-it.com"/>
..
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: binary
Content-ID: <claim061400a.tiff@claiming-it.com>

...binary TIFF image...
--MIME_boundary--

(In these examples the "Content-Type" header line has been continued across two lines so the example prints easily. SOAP message senders should send headers on a single long line.)
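A package like the one in the example above could be assembled with Python's standard email package, as in the following sketch. The TIFF bytes are a placeholder and the variable names are assumptions made for this example. Note that this library base64-encodes the image (and, for UTF-8 text, usually the root part as well), whereas the example above uses 8bit and binary transfer encodings.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

ROOT_CID = "<claim061400a.xml@claiming-it.com>"
TIFF_CID = "<claim061400a.tiff@claiming-it.com>"

envelope = (
    "<?xml version='1.0' ?>\n"
    "<SOAP-ENV:Envelope\n"
    'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">\n'
    "<SOAP-ENV:Body>\n"
    '<theSignedForm href="cid:claim061400a.tiff@claiming-it.com"/>\n'
    "</SOAP-ENV:Body>\n"
    "</SOAP-ENV:Envelope>\n"
)

# type mirrors the root part's media type; start names the root's Content-ID.
package = MIMEMultipart(
    "related", boundary="MIME_boundary", type="text/xml", start=ROOT_CID
)

# Rule 1: the primary SOAP 1.1 message travels in the root body part.
root = MIMEText(envelope, "xml", "utf-8")
root["Content-ID"] = ROOT_CID
package.attach(root)

# Rule 2: a referenced part carries a Content-ID (or a Content-Location).
tiff_bytes = b"II*\x00placeholder bytes, not a real image"
image = MIMEImage(tiff_bytes, _subtype="tiff")
image["Content-ID"] = TIFF_CID
package.attach(image)

wire = package.as_string()
```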

3. SOAP References to Attachments

Both the header entries and body of the primary SOAP 1.1 message may need to refer to other entities in the message package. In this section we specify a method to accomplish this using existing mechanisms in SOAP and MIME.

The data encoding rules given in section 5 of SOAP 1.1 allow the value of an accessor to be given by reference, i.e., as a resource referenced by a URI given as the value of an href attribute. We observe that the SOAP encoding schema allows the value of an href attribute to be any URI reference, and the attribute may therefore be used to reference not just XML fragments within a SOAP 1.1 message, but any resource whatsoever.

This specification describes a usage pattern of the SOAP href attribute in SOAP 1.1 to allow attribute values to be references to attachments carried as MIME parts in the SOAP message package. The resolution process for URI references (including references used in href attributes) in the primary SOAP 1.1 message in a SOAP message package is based on the rules specified in RFC2557 for multipart MIME messages with text/html root documents. We adapt these rules from the HTML and rendering context and apply them to the SOAP 1.1 messaging context. In addition, we base the relative URI syntax and absolutization rules on RFC2396 rather than on the now obsolete RFC1808 used in RFC2557.

The resolution process operates in two steps: first convert all URI references to absolute references, and then resolve the absolute references. We provide rules for both steps here. Note that this process does not apply to same-document references as defined in section 4.2 of RFC 2396. The semantics of the SOAP 1.1 pattern that involves using an href attribute with a fragment identifier to reference an XML element in the same SOAP 1.1 message based on a label defined by an ID attribute remains unchanged.

The authoritative process for converting relative URI references to absolute references is defined in RFC 2396. The aspect of this process we need to specify relates to the establishment of the base URI. RFC 2396 specifies a process skeleton for establishing a base URI, based on the following options, listed in order of precedence. Next to each option we describe its application in the context of the SOAP message package format described in the last section.

  1. Base URI within Document Content: the mechanism for explicit specification of a base URI within a SOAP 1.1 message will be the XML base mechanism.
  2. Base URI from an Encapsulating Entity: If there is a Content-Location header containing an absolute URI in any MIME entity enclosing the primary SOAP 1.1 message, then the URI from the closest such Content-Location header is the base URI for the entity.
  3. Base URI from the Retrieval URI: the retrieval URI for a SOAP message package is never allowed to be used as a base URI.
  4. Default Base URI: the default base URI will be "thismessage:/" in accordance with RFC 2557.
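The order of precedence above, followed by the conversion to absolute form, can be sketched as below. This is a simplified illustration: it assumes that any XML base value passed in is already absolute, and it works around the fact that Python's urljoin does not apply relative resolution to schemes it does not know, such as "thismessage". The names choose_base, absolutize and the placeholder host are assumptions made for this example.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_BASE = "thismessage:/"
# Hypothetical stand-in used only to borrow the generic resolution rules.
_PLACEHOLDER = "http://thismessage.invalid"


def choose_base(xml_base=None, enclosing_locations=()):
    """Pick a base URI following the four options in order of precedence.

    enclosing_locations holds the Content-Location values of the enclosing
    MIME entities, ordered from the closest to the outermost.
    """
    if xml_base and urlsplit(xml_base).scheme:
        return xml_base                       # option 1: XML base
    for location in enclosing_locations:      # option 2: closest absolute one
        if urlsplit(location).scheme:
            return location
    # Option 3: the retrieval URI is never used, so it is not a parameter.
    return DEFAULT_BASE                       # option 4


def absolutize(ref, base):
    """Convert a URI reference to absolute form against the given base."""
    if urlsplit(ref).scheme:
        return ref                            # already absolute
    if base.startswith("thismessage:"):
        # urljoin returns the reference unchanged for unknown schemes, so
        # resolve against a placeholder http base and swap the scheme back.
        borrowed = urljoin(_PLACEHOLDER + base[len("thismessage:"):], ref)
        return "thismessage:" + borrowed[len(_PLACEHOLDER):]
    return urljoin(base, ref)


explicit = absolutize("claim061400a.tiff", choose_base(None, ["http://claiming-it.com/"]))
default = absolutize("the_signed_form.tiff", choose_base())
```

With these inputs, explicit is "http://claiming-it.com/claim061400a.tiff" and default is "thismessage:/the_signed_form.tiff".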

Every MIME part in the Multipart/Related structure that constitutes a SOAP message package has at least one absolute URI label. There are three cases.

  1. If a Content-Location header is present with an absolute URI value then that URI is a label for the part.
  2. If a Content-Location header is present with a relative URI value then rules 2 and 4 above are applied to establish the base URI for the process of converting the relative URI to an absolute one. The resulting absolute URI is a label for the part.
  3. If a Content-ID header is present, then an absolute URI label for the part is formed using the CID URI scheme as described in RFC 2111.

Resolution of absolute URI references works as follows. For each referencing URI in the primary SOAP 1.1 message, compare the value of the referencing URI, after conversion to absolute form as described above, with the URI labels derived from Content-ID and Content-Location headers for other body parts in the surrounding Multipart/Related structure. The rules for URI comparison are given in RFC2396. If a match is found, the entity contained in the MIME part is the referent. If no match is found, use normal resolution rules based on the URI scheme. In case of conflicting labels based on Content-ID and Content-Location headers, use the rules in section 8.3 of RFC2557 to resolve the conflict.
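The labelling and matching steps described above might look like the following sketch. It makes several simplifying assumptions that we state plainly: no XML base is in effect, URIs are compared by exact string equality rather than by the full RFC 2396 rules, Content-ID values are not URL-escaped when the cid label is formed, and the conflict rules of RFC 2557 section 8.3 are not implemented. All function names and the sample package are hypothetical.

```python
from email import message_from_string
from urllib.parse import urljoin, urlsplit

_PLACEHOLDER = "http://thismessage.invalid"  # hypothetical stand-in host

SAMPLE = """\
MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml; start="<b6f4ccrt@15.4.9.92/s445>"

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-ID: <b6f4ccrt@15.4.9.92/s445>
Content-Location: claim061400a.xml

<theSignedForm href="the_signed_form.tiff"/>

--MIME_boundary
Content-Type: image/tiff
Content-ID: <a34ccrt@15.4.9.92/s445>
Content-Location: the_signed_form.tiff

...image data...
--MIME_boundary--
"""


def absolutize(ref, base):
    """Convert a URI reference to absolute form against the given base."""
    if urlsplit(ref).scheme:
        return ref
    if base.startswith("thismessage:"):
        # urljoin leaves references alone for unknown schemes; borrow http.
        borrowed = urljoin(_PLACEHOLDER + base[len("thismessage:"):], ref)
        return "thismessage:" + borrowed[len(_PLACEHOLDER):]
    return urljoin(base, ref)


def package_base(package):
    """Base URI from the enclosing entity, else the default base URI."""
    location = (package.get("Content-Location") or "").strip()
    return location if urlsplit(location).scheme else "thismessage:/"


def part_labels(part, base):
    """Return the absolute URI labels of one MIME part (the three cases)."""
    labels = []
    location = part.get("Content-Location")
    if location:
        # Cases 1 and 2: absolute values pass through, relative ones are joined.
        labels.append(absolutize("".join(location.split()), base))
    content_id = part.get("Content-ID")
    if content_id:
        # Case 3: cid label formed from the Content-ID without angle brackets.
        labels.append("cid:" + content_id.strip().strip("<>"))
    return labels


def resolve(href, package):
    """Return the part an href refers to, or None if no label matches."""
    if href.startswith("#"):
        return None  # same-document reference; outside this process
    base = package_base(package)
    target = absolutize(href, base)
    for part in package.get_payload():
        if target in part_labels(part, base):
            return part
    return None  # caller falls back to normal scheme-based resolution


package = message_from_string(SAMPLE)
attachment = resolve("the_signed_form.tiff", package)
```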

The example in section 2 illustrates the use of the CID reference in the body of the SOAP 1.1 message. Clearly, the example could have used a reference to a remote resource. Here is the example from section 2 above, rewritten using absolute URIs referencing entities labeled using Content-Location headers:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<http://claiming-it.com/claim061400a.xml>"
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <http://claiming-it.com/claim061400a.xml>
Content-Location: http://claiming-it.com/claim061400a.xml

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
..
<theSignedForm href="http://claiming-it.com/claim061400a.tiff"/>
..
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: binary
Content-ID: <http://claiming-it.com/claim061400a.tiff>
Content-Location: http://claiming-it.com/claim061400a.tiff

...binary TIFF image...
--MIME_boundary--

Here is the same example, this time using relative URIs that use the Content-Location header at the base of the MIME Multipart/Related structure for their base URI:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<http://claiming-it.com/claim061400a.xml>"
Content-Description: This is the optional message description.
Content-Location: http://claiming-it.com/

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <http://claiming-it.com/claim061400a.xml>
Content-Location: claim061400a.xml

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
..
<theSignedForm href="claim061400a.tiff"/>
..
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: binary
Content-Location: claim061400a.tiff

...binary TIFF image...
--MIME_boundary--

Finally, here is an example that uses relative URIs but no explicit base URI so that rule 4 from section 3 for establishment of base URI applies, causing relative URIs in the SOAP message and Content-Location labels to use the base URI of "thismessage:/":

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<b6f4ccrt@15.4.9.92/s445>"
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <b6f4ccrt@15.4.9.92/s445>
Content-Location: claim061400a.xml

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
..
<theSignedForm href="the_signed_form.tiff"/>
..
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: binary
Content-ID: <a34ccrt@15.4.9.92/s445>
Content-Location: the_signed_form.tiff

...binary TIFF image...
--MIME_boundary--

Note that within a SOAP message, the fact that a URI reference occurs as the value of a SOAP href attribute does not by itself imply that the receiving SOAP processor must resolve the URI. It is up to the SOAP processor to determine whether resolution of the URI is required, based on the processing semantics of the message. The receiving SOAP processor may choose to ignore the URI even if it is referencing a MIME attachment. Conversely, not every attachment that appears in the SOAP message package need be referenced in the root SOAP message.

It is outside the scope of this specification to provide a means within a SOAP message to explicitly mark it as the root of a SOAP message package, for instance with a distinguished header entry that enumerates message package contents. A separate specification may describe such a mechanism and define message integrity semantics based on it.

4. Relationship to SOAP 1.1

This specification defines an extension to the transport binding mechanisms defined in SOAP 1.1. The packaging of a SOAP 1.1 message in the root part of a Multipart/Related MIME structure along with other content is to be viewed as a specific method for carrying SOAP 1.1 messages in any protocol capable of transferring MIME-encoded content. A SOAP processor that is capable of supporting both the MIME-based encoding described here and the base transport over which it is carried must treat the SOAP 1.1 message in the root part as the message to be processed, following all the rules of SOAP 1.1 for the SOAP 1.1 message and for the base transport binding used. An example of the latter is the HTTP binding described in section 6 of SOAP 1.1.

In the next section, we complete the specification of message packages by describing the rules for carrying a compound SOAP message in an HTTP message.

5. HTTP Binding

As in the case of the base SOAP 1.1 specification, this specification does not prescribe either an asynchronous messaging or a synchronous request/response interaction pattern. Our description of the HTTP binding therefore describes the relationship between HTTP headers and the MIME headers used in constructing a SOAP message package, without restricting the interaction pattern in any way.

The basic approach to carrying multipart MIME structure in an HTTP message in this specification is to confine MIME-encoded content to the MIME parts and use the multipart media type header at the HTTP level as a native HTTP header. The rules for forming an HTTP message containing a SOAP message package are as follows:

  1. The Content-Type: Multipart/Related MIME header must appear as an HTTP header. The rules for parameters of this header specified in section 2 apply here as well.
  2. No other headers with semantics defined by MIME specifications (such as Content-Transfer-Encoding) are permitted to appear as HTTP headers. Specifically, the "MIME-Version: 1.0" header must not appear as an HTTP header. Note that HTTP itself uses many MIME-like headers with semantics defined by HTTP 1.1. These may, of course, appear freely.
  3. The MIME parts containing the SOAP message and the attachments constitute the HTTP entity body and must appear as described in section 2, including appropriate MIME headers.
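The three rules above can be sketched as a conversion from a MIME message object to HTTP headers plus an entity body. This is only an illustration built on Python's standard email package; the function name to_http, the sample package and the placeholder image bytes are assumptions made for this example, and other HTTP headers such as Host and SOAPAction are left to the caller.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def to_http(package):
    """Split a Multipart/Related message into HTTP headers and entity body."""
    raw = package.as_bytes(policy=package.policy.clone(linesep="\r\n"))
    # Everything before the first blank line is the top-level MIME header
    # block (including MIME-Version); rule 2 says it must not reach HTTP.
    _mime_headers, _, body = raw.partition(b"\r\n\r\n")
    headers = {
        # Rule 1: Content-Type is promoted to HTTP, unfolded to a single line.
        "Content-Type": " ".join(package["Content-Type"].split()),
        "Content-Length": str(len(body)),
    }
    # Rule 3: the MIME parts, with their own MIME headers, form the body.
    return headers, body


# A small illustrative package with placeholder content.
package = MIMEMultipart(
    "related",
    boundary="MIME_boundary",
    type="text/xml",
    start="<claim061400a.xml@claiming-it.com>",
)
root = MIMEText(
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"/>',
    "xml",
    "utf-8",
)
root["Content-ID"] = "<claim061400a.xml@claiming-it.com>"
package.attach(root)
photo = MIMEImage(b"placeholder bytes, not a real image", _subtype="jpeg")
photo["Content-ID"] = "<claim061400a.jpeg@claiming-it.com>"
package.attach(photo)

http_headers, http_body = to_http(package)
```

Computing Content-Length from the serialized body, rather than estimating it, keeps the header consistent with whatever transfer encodings the library chose for the parts.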

It is worth noting that unlike HTTP, MIME semantics apply at the SMTP message level, and therefore for SMTP transport, the multipart MIME headers could simply merge with the SMTP headers.

The following example shows an HTTP message containing a SOAP message package including two attachments that constitutes an automobile insurance claim. The SOAP 1.1 message contains the claim data, and is transmitted along with a facsimile image of the signed claim form (claim.tiff) and a digital photo of the damaged car (car.jpeg).

POST /insuranceClaims HTTP/1.1
Host: www.risky-stuff.com
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<claim061400a.xml@claiming-it.com>"
Content-Length: XXXX
SOAPAction: http://schemas.risky-stuff.com/Auto-Claim
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <claim061400a.xml@claiming-it.com>

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<claim:insurance_claim_auto id="insurance_claim_document_id"
xmlns:claim="http://schemas.risky-stuff.com/Auto-Claim">
<theSignedForm href="cid:claim061400a.tiff@claiming-it.com"/>
<theCrashPhoto href="cid:claim061400a.jpeg@claiming-it.com"/>
<!-- ... more claim details go here... -->
</claim:insurance_claim_auto>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: base64
Content-ID: <claim061400a.tiff@claiming-it.com>

...Base64 encoded TIFF image...
--MIME_boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-ID: <claim061400a.jpeg@claiming-it.com>

...Raw JPEG image...
--MIME_boundary--

(As in the previous examples the "Content-Type" header line has been continued across two lines so the example prints easily. SOAP message senders should send headers on a single long line.)

6. References

7. Acknowledgements

The authors are grateful for suggestions from Andrew Layman and Jim Stearns.