The Message Content-Type in MIME

7.3 The Message Content-Type

It is frequently desirable, in sending mail, to encapsulate another mail message. For this common operation, a special Content-Type, "message", is defined. The primary subtype, message/rfc822, has no required parameters in the Content-Type field. Additional subtypes, "partial" and "External-body", do have required parameters. These subtypes are explained below.

NOTE: It has been suggested that subtypes of message might be defined for forwarded or rejected messages. However, forwarded and rejected messages can be handled as multipart messages in which the first part contains any control or descriptive information, and a second part, of type message/rfc822, is the forwarded or rejected message. Composing rejection and forwarding messages in this manner will preserve the type information on the original message and allow it to be correctly presented to the recipient, and hence is strongly encouraged.

As stated in the definition of the Content-Transfer-Encoding field, no encoding other than "7bit", "8bit", or "binary" is permitted for messages or parts of type "message". The message header fields are always US-ASCII in any case, and data within the body can still be encoded, in which case the Content-Transfer-Encoding header field in the encapsulated message will reflect this. Non-ASCII text in the headers of an encapsulated message can be specified using the mechanisms described in [RFC-1342].

Mail gateways, relays, and other mail handling agents are commonly known to alter the top-level header of an RFC 822 message. In particular, they frequently add, remove, or reorder header fields. Such alterations are explicitly forbidden for the encapsulated headers embedded in the bodies of messages of type "message."

7.3.1 The Message/rfc822 (primary) subtype

A Content-Type of "message/rfc822" indicates that the body contains an encapsulated message, with the syntax of an RFC 822 message.

7.3.2 The Message/Partial subtype

A subtype of message, "partial", is defined in order to allow large objects to be delivered as several separate pieces of mail and automatically reassembled by the receiving user agent. (The concept is similar to IP fragmentation/reassembly in the basic Internet Protocols.) This mechanism can be used when intermediate transport agents limit the size of individual messages that can be sent. Content-Type "message/partial" thus indicates that the body contains a fragment of a larger message.

Three parameters must be specified in the Content-Type field of type message/partial: The first, "id", is a unique identifier, as close to a world-unique identifier as possible, to be used to match the parts together. (In general, the identifier is essentially a message-id; if placed in double quotes, it can be any message-id, in accordance with the BNF for "parameter" given earlier in this specification.) The second, "number", an integer, is the part number, which indicates where this part fits into the sequence of fragments. The third, "total", another integer, is the total number of parts. This third subfield is required on the final part, and is optional on the earlier parts. Note also that these parameters may be given in any order.

Thus, part 2 of a 3-part message may have either of the following header fields:

     Content-Type: Message/Partial;
          number=2; total=3;
          id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

     Content-Type: Message/Partial; 
          id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; 
          number=2

But part 3 MUST specify the total number of parts:

     Content-Type: Message/Partial;
          number=3; total=3;
          id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

Note that part numbering begins with 1, not 0.
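The parameter rules above can be illustrated with a short sketch. It is not part of this specification; it assumes Python's standard "email" package, and the function name partial_info is hypothetical. It extracts "id", "number", and "total" from a header block and checks that numbering starts at 1.

```python
from email.parser import HeaderParser


def partial_info(raw_header):
    """Return (id, number, total) for a message/partial header block.

    'total' is None when the parameter is absent, which is allowed on
    every part except the final one.
    """
    msg = HeaderParser().parsestr(raw_header)
    if msg.get_content_type() != "message/partial":
        raise ValueError("not a message/partial entity")

    ident = msg.get_param("id")
    number = msg.get_param("number")
    total = msg.get_param("total")
    if ident is None or number is None:
        raise ValueError("'id' and 'number' are required")

    number = int(number)
    total = int(total) if total is not None else None
    if number < 1:
        raise ValueError("part numbering begins with 1, not 0")
    if total is not None and number > total:
        raise ValueError("part number exceeds total")
    return ident, number, total
```

Because the parameters are looked up by name, the order in which they appear in the field does not matter. A caller that has collected all the parts would additionally need to check that the highest-numbered part carries "total".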

When the parts of a message broken up in this manner are put together, the result is a complete RFC 822 format message, which may have its own Content-Type header field, and thus may contain any other data type.

Message fragmentation and reassembly: The semantics of a reassembled partial message must be those of the "inner" message, rather than of a message containing the inner message. This makes it possible, for example, to send a large audio message as several partial messages, and still have it appear to the recipient as a simple audio message rather than as an encapsulated message containing an audio message. That is, the encapsulation of the message is considered to be "transparent".

When generating and reassembling the parts of a message/partial message, the headers of the encapsulated message must be merged with the headers of the enclosing entities. In this process the following rules must be observed:

(1) All of the headers from the initial enclosing entity (part one), except those that start with "Content-" and "Message-ID", must be copied, in order, to the new message.
(2) Only those headers in the enclosed message which start with "Content-" and "Message-ID" must be appended, in order, to the headers of the new message. Any headers in the enclosed message which do not start with "Content-" (except for "Message-ID") will be ignored.
(3) All of the headers from the second and any subsequent messages will be ignored.

For example, if an audio message is broken into two parts, the first part might look something like this:

     X-Weird-Header-1: Foo 
     From: Bill@host.com 
     To: joe@otherhost.com 
     Subject: Audio mail 
     Message-ID: id1@host.com 
     MIME-Version: 1.0 
     Content-type: message/partial; 
          id="ABC@host.com"; 
          number=1; total=2 

     X-Weird-Header-1: Bar 
     X-Weird-Header-2: Hello 
     Message-ID: anotherid@foo.com 
     Content-type: audio/basic 
     Content-transfer-encoding: base64 

     ... first half of encoded audio data goes here...

and the second half might look something like this:

     From: Bill@host.com 
     To: joe@otherhost.com 
     Subject: Audio mail 
     MIME-Version: 1.0 
     Message-ID: id2@host.com 
     Content-type: message/partial; 
          id="ABC@host.com"; number=2; total=2 

     ... second half of encoded audio data goes here...

Then, when the fragmented message is reassembled, the resulting message to be displayed to the user should look something like this:

     X-Weird-Header-1: Foo 
     From: Bill@host.com 
     To: joe@otherhost.com 
     Subject: Audio mail 
     Message-ID: anotherid@foo.com 
     MIME-Version: 1.0 
     Content-type: audio/basic 
     Content-transfer-encoding: base64 

     ... first half of encoded audio data goes here... 
     ... second half of encoded audio data goes here...
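The merging rules (1) through (3) can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: it uses Python's standard "email" package, assumes all parts are available as strings, and the name reassemble is hypothetical. Following the rules literally, it appends the inner "Content-" and "Message-ID" fields after the copied outer fields, so the field order may differ slightly from the example above.

```python
from email.parser import HeaderParser


def _belongs_to_inner(name):
    """True for fields that must come from the enclosed message."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered == "message-id"


def reassemble(raw_parts):
    """Reassemble message/partial pieces into (header fields, body)."""
    parser = HeaderParser()
    parts = [parser.parsestr(raw) for raw in raw_parts]
    parts.sort(key=lambda m: int(m.get_param("number")))

    ids = {m.get_param("id") for m in parts}
    numbers = [int(m.get_param("number")) for m in parts]
    total = parts[-1].get_param("total")
    if len(ids) != 1:
        raise ValueError("parts carry different ids")
    if numbers != list(range(1, len(parts) + 1)):
        raise ValueError("missing or duplicate part")
    if total is None or int(total) != len(parts):
        raise ValueError("final part must give the correct total")

    # The concatenated bodies form the complete inner message.
    inner = parser.parsestr("".join(m.get_payload() for m in parts))

    # Rule (1): outer fields of part one, minus Content-* and Message-ID.
    merged = [(k, v) for k, v in parts[0].items() if not _belongs_to_inner(k)]
    # Rule (2): only Content-* and Message-ID from the enclosed message.
    merged += [(k, v) for k, v in inner.items() if _belongs_to_inner(k)]
    # Rule (3): headers of later parts are never consulted.
    return merged, inner.get_payload()
```

If the merged result is itself of type message/partial, the same procedure would simply be applied again once its sibling parts have arrived.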

It should be noted that, because some message transfer agents may choose to automatically fragment large messages, and because such agents may use different fragmentation thresholds, it is possible that the pieces of a partial message, upon reassembly, may prove themselves to comprise a partial message. This is explicitly permitted.

It should also be noted that the inclusion of a "References" field in the headers of the second and subsequent pieces of a fragmented message that references the Message-Id on the previous piece may be of benefit to mail readers that understand and track references. However, the generation of such "References" fields is entirely optional.

7.3.3 The Message/External-Body subtype

The external-body subtype indicates that the actual body data are not included, but merely referenced. In this case, the parameters describe a mechanism for accessing the external data.

When a message body or body part is of type "message/external-body", it consists of a header, two consecutive CRLFs, and the message header for the encapsulated message. If another pair of consecutive CRLFs appears, this of course ends the message header for the encapsulated message. However, since the encapsulated message's body is itself external, it does NOT appear in the area that follows. For example, consider the following message:

     Content-type: message/external-body;
          access-type=local-file;
          name="/u/nsb/Me.gif"

     Content-type:  image/gif 

     THIS IS NOT REALLY THE BODY!

The area at the end, which might be called the "phantom body", is ignored for most external-body messages. However, it may be used to contain auxiliary information for some such messages: of the access-types defined by this document, the phantom body is used only when the access-type is "mail-server". In all other cases, the phantom body is ignored.
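The layout just described can be made concrete with a small sketch. It is offered only as an illustration: it assumes Python's standard "email" package, and the function name split_external_body is hypothetical. It separates the outer header, the encapsulated header, and the phantom body, and discards the phantom body unless the access-type is "mail-server".

```python
from email.parser import HeaderParser


def split_external_body(raw_entity):
    """Return (access_type, encapsulated_header, phantom_body_or_None)."""
    parser = HeaderParser()
    outer = parser.parsestr(raw_entity)
    if outer.get_content_type() != "message/external-body":
        raise ValueError("not a message/external-body entity")

    access_type = outer.get_param("access-type")
    if access_type is None:
        raise ValueError("access-type is mandatory")
    access_type = access_type.lower()  # values are case-insensitive

    # The body of the entity is itself "header, blank line, phantom body".
    inner = parser.parsestr(outer.get_payload())
    phantom = inner.get_payload()

    # Only the mail-server access-type gives the phantom body a meaning.
    if access_type != "mail-server":
        phantom = None
    return access_type, inner, phantom
```

Applied to the example above, this yields the access-type "local-file", an encapsulated header declaring image/gif, and no phantom body, since the trailing text is ignored for that access-type.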

The only always-mandatory parameter for message/external-body is "access-type"; all of the other parameters may be mandatory or optional depending on the value of access-type.

ACCESS-TYPE: One or more case-insensitive words, comma-separated, indicating supported access mechanisms by which the file or data may be obtained. Values include, but are not limited to, "FTP", "ANON-FTP", "TFTP", "AFS", "LOCAL-FILE", and "MAIL-SERVER". Future values, except for experimental values beginning with "X-", must be registered with IANA, as described in Appendix F.

In addition, the following three parameters are optional for ALL access-types:

EXPIRATION: The date (in the RFC 822 "date-time" syntax, as extended by RFC 1123 to permit 4 digits in the date field) after which the existence of the external data is not guaranteed.
SIZE: The size (in octets) of the data. The intent of this parameter is to help the recipient decide whether or not to expend the necessary resources to retrieve the external data.
PERMISSION: A field that indicates whether or not it is expected that clients might also attempt to overwrite the data. By default, or if permission is "read", the assumption is that they are not, and that if the data is retrieved once, it is never needed again. If PERMISSION is "read-write", this assumption is invalid, and any local copy must be considered no more than a cache. "Read" and "Read-write" are the only defined values of permission.
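A receiving agent might interpret these optional parameters along the following lines. This is a sketch under assumptions rather than a prescribed procedure: it relies on Python's standard "email" package, the function name common_params and the keys of the returned dictionary are hypothetical, and a date carrying no usable zone information is treated as UTC.

```python
from datetime import datetime, timezone
from email.parser import HeaderParser
from email.utils import parsedate_to_datetime


def common_params(raw_header, now=None):
    """Interpret EXPIRATION, SIZE and PERMISSION from a header block."""
    msg = HeaderParser().parsestr(raw_header)
    now = now or datetime.now(timezone.utc)

    permission = (msg.get_param("permission") or "read").lower()
    if permission not in ("read", "read-write"):
        raise ValueError("undefined permission value: " + permission)

    expires_at = None
    expiration = msg.get_param("expiration")
    if expiration is not None:
        expires_at = parsedate_to_datetime(expiration)
        if expires_at.tzinfo is None:
            # Assumption: treat a zone-less result as UTC.
            expires_at = expires_at.replace(tzinfo=timezone.utc)

    size = msg.get_param("size")
    return {
        # After this date the data is not guaranteed to exist.
        "expired": expires_at is not None and now > expires_at,
        # Size in octets, or None when the sender did not say.
        "size": int(size) if size is not None else None,
        # With "read-write", a local copy is no more than a cache.
        "local_copy_is_cache_only": permission == "read-write",
    }
```

The "now" argument exists only so that the expiration check can be exercised with a fixed clock; a real agent would normally omit it.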

The precise semantics of the access-types defined here are described in the sections that follow.

7.3.3.1 The "ftp" and "tftp" access-types

An access-type of FTP or TFTP indicates that the message body is accessible as a file using the FTP [RFC-959] or TFTP [RFC-783] protocols, respectively. For these access-types, the following additional parameters are mandatory:

NAME: The name of the file that contains the actual body data.
SITE: A machine from which the file may be obtained, using the given protocol.

Before the data is retrieved, using these protocols, the user will generally need to be asked to provide a login id and a password for the machine named by the site parameter.

In addition, the following optional parameters may also appear when the access-type is FTP or ANON-FTP:

DIRECTORY: A directory from which the data named by NAME should be retrieved.
MODE: A transfer mode for retrieving the information, e.g. "image".

7.3.3.2 The "anon-ftp" access-type

The "anon-ftp" access-type is identical to the "ftp" access type, except that the user need not be asked to provide a name and password for the specified site. Instead, the ftp protocol will be used with login "anonymous" and a password that corresponds to the user's email address.

7.3.3.3 The "local-file" and "afs" access-types

An access-type of "local-file" indicates that the actual body is accessible as a file on the local machine. An access-type of "afs" indicates that the file is accessible via the global AFS file system. In both cases, only a single parameter is required:

NAME -- The name of the file that contains the actual body data.

The following optional parameter may be used to describe the locality of reference for the data, that is, the site or sites at which the file is expected to be visible:

SITE -- A domain specifier for a machine or set of machines that are known to have access to the data file. Asterisks may be used for wildcard matching to a part of a domain name, such as "*.bellcore.com", to indicate a set of machines on which the data should be directly visible, while a single asterisk may be used to indicate a file that is expected to be universally available, e.g., via a global file system.
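As an illustration of the wildcard form of SITE, the following sketch tests whether a host falls within a SITE pattern. It is one possible reading, not a definition: it assumes shell-style matching as provided by Python's standard "fnmatch" module, in which an asterisk may span several labels of a domain name, and the function name site_matches is hypothetical.

```python
from fnmatch import fnmatchcase


def site_matches(site_pattern, hostname):
    """Return True if hostname is covered by the SITE pattern.

    Domain names are compared case-insensitively; "*" alone matches
    every host, i.e. a file expected to be universally available.
    """
    return fnmatchcase(hostname.lower(), site_pattern.lower())
```

For instance, site_matches("*.bellcore.com", "thumper.bellcore.com") is true, while the same pattern does not match "bellcore.com" itself, because the pattern requires a leading label followed by a dot.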

7.3.3.4 The "mail-server" access-type

The "mail-server" access-type indicates that the actual body is available from a mail server. The mandatory parameter for this access-type is:

SERVER -- The email address of the mail server from which the actual body data can be obtained.
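The mandatory parameters of the access-types defined in sections 7.3.3.1 through 7.3.3.4 can be collected into a single table, as in the sketch below. It is an illustration only: it assumes Python's standard "email" package, and the names REQUIRED and missing_parameters are hypothetical. Access-types not defined in this document, including experimental "X-" values, are reported as unknown rather than rejected.

```python
from email.parser import HeaderParser

# Mandatory parameters per access-type, as listed in this document.
REQUIRED = {
    "ftp": ("name", "site"),
    "tftp": ("name", "site"),
    "anon-ftp": ("name", "site"),
    "local-file": ("name",),
    "afs": ("name",),
    "mail-server": ("server",),
}


def missing_parameters(raw_header):
    """Map each listed access-type to its missing mandatory parameters.

    The value is None for an access-type this table does not know.
    """
    msg = HeaderParser().parsestr(raw_header)
    access_value = msg.get_param("access-type")
    if access_value is None:
        raise ValueError("access-type is always mandatory")

    report = {}
    # ACCESS-TYPE may hold several comma-separated words.
    for access in access_value.split(","):
        access = access.strip().lower()
        required = REQUIRED.get(access)
        if required is None:
            report[access] = None
        else:
            report[access] = [p for p in required
                              if msg.get_param(p) is None]
    return report
```

An empty list for an access-type means that every parameter it requires is present.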

Because mail servers accept a variety of syntax, some of which is multiline, the full command to be sent to a mail server is not included as a parameter on the content-type line. Instead, it may be provided as the "phantom body" when the content-type is message/external-body and the access-type is mail-server.

Note that MIME does not define a mail server syntax. Rather, it allows the inclusion of arbitrary mail server commands in the phantom body. Implementations should include the phantom body in the body of the message they send to the mail server address to retrieve the relevant data.

7.3.3.5 Examples and Further Explanations

With the emerging possibility of very wide-area file systems, it becomes very hard to know in advance the set of machines where a file will and will not be accessible directly from the file system. Therefore it may make sense to provide both a file name, to be tried directly, and the name of one or more sites from which the file is known to be accessible. An implementation can try to retrieve remote files using FTP or any other protocol, using anonymous file retrieval or prompting the user for the necessary name and password. If an external body is accessible via multiple mechanisms, the sender may include multiple parts of type message/external-body within an entity of type multipart/alternative.

However, the external-body mechanism is not intended to be limited to file retrieval, as shown by the mail-server access-type. Beyond this, one can imagine, for example, using a video server for external references to video clips.

If an entity is of type "message/external-body", then the body of the entity will contain the header fields of the encapsulated message. The body itself is to be found in the external location. This means that if the body of the "message/external-body" message contains two consecutive CRLFs, everything after that pair is NOT part of the message itself. For most message/external-body messages, this trailing area must simply be ignored. However, it is a convenient place for additional data that cannot be included in the content-type header field. In particular, if the "access-type" value is "mail-server", then the trailing area must contain commands to be sent to the mail server at the address given by the value of the SERVER parameter.
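For the mail-server case, the retrieval request might be composed as in the following sketch. It is illustrative only: it assumes Python's standard "email" package, takes the server address from the SERVER parameter defined for the mail-server access-type, and uses the hypothetical names build_mail_server_request and requester. Nothing is actually sent; the function only builds the request message.

```python
from email.message import EmailMessage
from email.parser import HeaderParser


def build_mail_server_request(raw_entity, requester):
    """Build the message that asks a mail server for the external body."""
    parser = HeaderParser()
    outer = parser.parsestr(raw_entity)

    access_type = (outer.get_param("access-type") or "").lower()
    if access_type != "mail-server":
        raise ValueError("entity does not use the mail-server access-type")
    server = outer.get_param("server")
    if not server:
        raise ValueError("mail-server requires the SERVER parameter")

    # Skip the encapsulated header; what follows is the trailing area
    # holding the commands for the mail server.
    commands = parser.parsestr(outer.get_payload()).get_payload()

    request = EmailMessage()
    request["From"] = requester
    request["To"] = server
    request.set_content(commands)  # commands are passed through as-is
    return request
```

Since MIME does not define a mail server syntax, the sketch makes no attempt to interpret the commands; it copies the trailing area into the request body unchanged.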

The embedded message header fields which appear in the body of the message/external-body data can be used to declare the Content-type of the external body. Thus a complete message/external-body message, referring to a document in PostScript format, might look like this:

     From: Whomever
     Subject: whatever
     MIME-Version: 1.0
     Message-ID: id1@host.com
     Content-Type: multipart/alternative; boundary=42

     --42
     Content-Type: message/external-body;
          name="BodyFormats.ps";
          site="thumper.bellcore.com";
          access-type=ANON-FTP;
          directory="pub";
          mode="image";
          expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

     Content-type: application/postscript

     --42
     Content-Type: message/external-body;
          name="/u/nsb/writing/rfcs/RFC-XXXX.ps";
          site="thumper.bellcore.com";
          access-type=AFS;
          expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

     Content-type: application/postscript

     --42
     Content-Type: message/external-body;
          access-type=mail-server;
          server="listserv@bogus.bitnet";
          expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

     Content-type: application/postscript

     get rfc-xxxx doc

     --42--

Like the message/partial type, the message/external-body type is intended to be transparent, that is, to convey the data type in the external body rather than to convey a message with a body of that type. Thus the headers on the outer and inner parts must be merged using the same rules as for message/partial. In particular, this means that the Content-type header is overridden, but the From and Subject headers are preserved.

Note that since the external bodies are not transported as mail, they need not conform to the 7-bit and line length requirements, but might in fact be binary files. Thus a Content-Transfer-Encoding is not generally necessary, though it is permitted.

Note that the body of a message of type "message/external-body" is governed by the basic syntax for an RFC 822 message. In particular, anything before the first consecutive pair of CRLFs is header information, while anything after it is body information, which is ignored for most access-types.