W3C

Describing Media Content of Binary Data in XML

W3C Working Group Note 2 May 2005

This version:
http://www.w3.org/TR/2005/NOTE-xml-media-types-20050502
Latest version:
http://www.w3.org/TR/xml-media-types
Previous version:
http://www.w3.org/TR/2004/WD-xml-media-types-20041102
Editors:
Anish Karmarkar, Oracle
Ümit Yalçınalp, SAP (formerly of Oracle)

Abstract

This document addresses the need to indicate the content-type associated with binary element content in an XML document and the need to specify, in XML Schema, the expected content-type(s) associated with binary element content. It is expected that the additional information about the content-type will be used for optimizing the handling of binary data that is part of a Web services message.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is a W3C Working Group Note. This document includes the resolution of the comments received on the Last Call Working Draft previously published. The comments on this document and their resolution can be found in the Web Services Description Working Group's issues list and in the section C Change Log. A diff-marked version against the previous version of this document is available.

It has been produced jointly by the XML Protocol Working Group and the Web Services Description Working Group, which are part of the Web Services Activity.

No further work on this topic is planned at this point. Errors in this document can be reported to the public public-ws-media-types@w3.org mailing list (public archive).

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced under the 24 January 2002 Current Patent Practice as amended by the W3C Patent Policy Transition Procedure. Patent disclosures relevant to this specification may be found on the Web Services Description Working Group patent disclosure page and on the XML Protocol Working Group patent disclosure page. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Introduction
    1.1 Notational Conventions
    1.2 Requirements
2 Attributes for Declaring Content-Type
    2.1 xmime:contentType Attribute
    2.2 xmime:expectedContentTypes Attribute
3 Declaring Content-Type for Binary Data
    3.1 Role of xmime:expectedContentTypes Schema Annotation Attribute
4 Examples
    4.1 Binary Data with Known Media Type
    4.2 Binary Data with Preferred Media Type
5 Normative References
6 Informative References

Appendices

A Acknowledgements
B Schema
C Change Log (Non-Normative)


1 Introduction

Data sent and received over the Web typically uses MIME media types, as defined by [IETF RFC 2046], as its type system; examples include "image/jpeg" and "application/pdf". There is a need to indicate the content-type of XML element content, for example, in messages sent and received by Web services. There is also a need to express the content-type information using XML Schema ([XML Schema: Structures] and [XML Schema: Datatypes]), which is the type system used by [WSDL 2.0 Part 1]. This would allow XML-based applications, such as Web services, to utilize the widely deployed and supported MIME media type infrastructure.

[XOP] and [MTOM] enable one to serialize binary content (element content that is in a canonical lexical representation of the xs:base64Binary type) in an optimized way using MIME packaging. There is a desire to specify the content-type information of such binary element content in a standard way in the [XML Information Set] and not just in the optimized serialization of that Infoset.

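This document specifies:

  • An attribute, xmime:contentType, that indicates the content-type of binary element content, that is, XML element content whose type is xs:base64Binary or xs:hexBinary.

  • An XML Schema annotation attribute, xmime:expectedContentTypes, that specifies the expected range of values for the xmime:contentType attribute and the expected range of content-types for the binary element content.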

Note that the use of this mechanism, in particular the xmime:contentType attribute, does not require the implementation, in whole or in part, of XML Schema. In the absence of XML Schema the type information (xs:base64Binary or xs:hexBinary) may have to be provided via other mechanisms; for example, using xsi:type.
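
As an illustration of the last point, the following sketch shows a document instance that carries both the type information and the content-type without relying on a schema. The element name Picture and the namespace "http://example.com/no-schema" are assumptions made for this illustration only; the xs, xsi and xmime prefixes are those listed in Table 1, and the element content is a short placeholder rather than real image data.

```xml
<?xml version="1.0" ?>
<!-- Hypothetical element: xsi:type supplies the binary type,
     xmime:contentType supplies the content-type of the content -->
<Picture xmlns="http://example.com/no-schema"
         xmlns:xs="http://www.w3.org/2001/XMLSchema"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
         xsi:type="xs:base64Binary"
         xmime:contentType="image/png">/aWKKapGGyQ=</Picture>
```

A schema-validating processor would treat xs:base64Binary as a simple type, which does not permit additional attributes; where validation is a concern, the xmime:base64Binary type defined in B Schema could be named in xsi:type instead.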

1.1 Notational Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119].

This specification uses properties from the XML Information Set (see [XML Information Set]). Such properties are denoted by square brackets, e.g. [namespace name].

This specification uses namespace prefixes that are listed in Table 1. Note that the choice of any namespace prefix is arbitrary and not semantically significant (see [XML Information Set]).

Table 1. Prefixes and Namespaces used in this specification
| Prefix | Namespace | Definition |
|--------|-----------|------------|
| xmime | "http://www.w3.org/2005/05/xmlmime" | Defined by this specification. |
| xs | "http://www.w3.org/2001/XMLSchema" | Defined in the W3C XML Schema specifications [XML Schema: Structures], [XML Schema: Datatypes]. |
| xsi | "http://www.w3.org/2001/XMLSchema-instance" | Defined in the W3C XML Schema specification [XML Schema: Structures]. |

Namespace names of the general form "http://example.org/..." and "http://example.com/..." represent application or context-dependent URIs (see [IETF RFC 3986]).

All parts of this specification are normative, with the exception of examples and sections explicitly marked as "Non-Normative".

1.2 Requirements

This section describes the set of requirements that this document addresses.

  1. Define how to indicate the content-type of XML element content whose type is xs:base64Binary or xs:hexBinary. This is meta-data that may be, but is not required to be, used by tools to infer the specific content-type of binary data.

  2. Define how to indicate, in XML Schema, the expected content-type(s) of XML element content whose type is xs:base64Binary or xs:hexBinary. This information is needed to define the set of content-types that binary data may have. For example, a Web services application may be willing to indicate that the binary data represents an image, but leave it to a document instance to further specify whether it is "jpeg" or "gif". This meta-data is not required to be present.

  3. Define the acceptable format of content-type values.

  4. Define the relationship between the expected and the actual value of the content-type declared for binary data in XML documents.

2 Attributes for Declaring Content-Type

This section defines two global attribute information items for declaring the content-type of binary data and expected content-type(s) of binary data in XML Schema to address requirements (1) and (2) above. Their usage is addressed in Section 3 Declaring Content-Type for Binary Data.

2.1 xmime:contentType Attribute

The xmime:contentType attribute information item has the following Infoset properties:

  • A [local name] of contentType.

  • A [namespace name] of "http://www.w3.org/2005/05/xmlmime".

The type of the xmime:contentType attribute information item is xs:string with a minimum length of three; all leading and trailing white space characters are ignored.

The [normalized value] of the xmime:contentType attribute information item MUST be a valid Content-Type string, e.g., "image/png" or "text/xml; charset=utf-16", as defined by [IETF RFC 2045], and indicates the content-type of the [owner element]. Note that the [normalized value] is the normalized attribute value as defined by [XML Information Set]; it does not imply that two equivalent values of xmime:contentType will necessarily be equal.

The xmime:contentType attribute information item allows Web services applications to optimize the handling of the binary data defined by a binary element information item and should be considered as meta-data. The presence of the xmime:contentType attribute does not change the value of the element content.

2.2 xmime:expectedContentTypes Attribute

The xmime:expectedContentTypes attribute information item has the following Infoset properties:

  • A [local name] of expectedContentTypes.

  • A [namespace name] of http://www.w3.org/2005/05/xmlmime.

The type of the xmime:expectedContentTypes attribute information item is xs:string.

The value and the meaning of the xmime:expectedContentTypes attribute are similar to those of the 'Accept' HTTP header defined by the HTTP 1.1 specification, Section 14.1 (see [IETF RFC 2616]). The value MUST follow the production rules defined in that section except for the following:

  1. The prefix "Accept:" MUST NOT be used.

  2. The rule qdtext is changed from:

         qdtext = <any TEXT except <">>

     to:

         qdtext = <any CHAR except <">>

     This change is made to disallow non-US-ASCII OCTETs.

The xmime:expectedContentTypes attribute information item is intended to be used as part of XML Schema annotation for a binary element information item declaration (see 3 Declaring Content-Type for Binary Data). This attribute information item is meant to allow XML Schema authors to indicate the range of media types and/or associated parameters that are acceptable for the binary data. It serves as a static constraint on the xmime:contentType attribute. Users of this attribute information item are urged to avoid using wild cards (for example, "image/*") as they may lead to interoperability problems. If the set of expected media types is not known, the use of xmime:expectedContentTypes is NOT RECOMMENDED.
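
To make the "media types and/or associated parameters" wording concrete, the following sketch annotates a type with an explicit list of media types that each carry a parameter, instead of a wild card. The namespace "http://example.com/text-doc" and the names XMLTextType and Doc are hypothetical and chosen only for this illustration; the annotation value follows the media-range syntax of the 'Accept' header without the "Accept:" prefix.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/text-doc"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/text-doc">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
                schemaLocation="http://www.w3.org/2005/05/xmlmime"/>

    <!-- Hypothetical type: an explicit list of two acceptable
         content-types, each with a charset parameter -->
    <xs:complexType name="XMLTextType"
            xmime:expectedContentTypes="text/xml; charset=utf-8, text/xml; charset=utf-16">
        <xs:simpleContent>
            <xs:restriction base="xmime:base64Binary">
                <xs:attribute ref="xmime:contentType" use="required"/>
            </xs:restriction>
        </xs:simpleContent>
    </xs:complexType>

    <xs:element name="Doc" type="tns:XMLTextType"/>

</xs:schema>
```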

3 Declaring Content-Type for Binary Data

Documents that want to specify additional content-type information for binary data SHOULD denote this by using a binary element information item. A binary element information item is an element information item defined with the following additional constraints.

If the media type identified by the value of an xmime:contentType attribute information item is a text based media type then the value of the xmime:contentType attribute information item SHOULD include a charset parameter.
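
A minimal sketch of the charset recommendation above: the document instance below labels base64-encoded text with a text-based media type and a charset parameter. The element name Note and the namespace "http://example.com/notes" are hypothetical; the content is the base64 encoding of the UTF-8 string "hello".

```xml
<?xml version="1.0" ?>
<!-- Hypothetical element carrying base64-encoded UTF-8 text -->
<Note xmlns="http://example.com/notes"
      xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
      xmime:contentType="text/plain; charset=utf-8">aGVsbG8=</Note>
```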

For authoring convenience, two types, xmime:base64Binary and xmime:hexBinary, are defined in B Schema.

Example 1: Element with binary content and xmime:contentType attribute
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/ct-required"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/ct-required">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
                schemaLocation="http://www.w3.org/2005/05/xmlmime"/>

    <!-- This element has binary content and requires the xmime:contentType
         attribute that indicates the content-type of the binary element -->
    <xs:element name="MyBinaryData">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:base64Binary" >
            <xs:attribute ref="xmime:contentType" use="required"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>

</xs:schema>
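
For illustration, a document instance that conforms to the MyBinaryData declaration above might look like the following sketch. The media type "application/pdf" is an arbitrary choice made here, and the element content is a short base64 placeholder rather than a real PDF document.

```xml
<?xml version="1.0" ?>
<!-- xmime:contentType is present because the schema declares it
     with use="required" -->
<MyBinaryData xmlns="http://example.com/ct-required"
              xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
              xmime:contentType="application/pdf">/aWKKapGGyQ=</MyBinaryData>
```

An instance that omitted the xmime:contentType attribute would not be valid against this schema.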
        

3.1 Role of xmime:expectedContentTypes Schema Annotation Attribute

The xmime:expectedContentTypes attribute is used for annotating XML Schema to indicate the expected range of content-types of the binary element content and the expected range of values for the xmime:contentType attribute.

The value of the xmime:contentType attribute, if present, SHOULD be within the range specified by the xmime:expectedContentTypes annotation attribute, if specified in the schema. See Section 14.1 of [IETF RFC 2616] on how to interpret content-type ranges that may be defined with respect to actual content. When the xmime:expectedContentTypes annotation attribute contains a wild card ("*") or a list of acceptable content-types separated by commas (","), the schema SHOULD require the xmime:contentType attribute to be present.

Applications that need to specify expected content-type SHOULD use the schema annotation to declare the range of expected values. The xmime:expectedContentTypes annotation attribute MAY be used in conjunction with the declaration of binary element information items or with complex type definitions that are derived from xs:base64Binary or xs:hexBinary in XML Schema. If the xmime:expectedContentTypes annotation attribute is used both in the binary element information item declaration and in the definition of the complex type to which the binary element information item belongs, then the expected range of values defined for the binary element information item MUST be a subset of the expected range of values defined for the complex type.

The xmime:expectedContentTypes annotation can be used in conjunction with either type or element declarations. Certain data-binding frameworks which use static type mappings can be more specific if the xmime:expectedContentTypes annotation is applied to the complexType declarations instead of the element declarations using those types. For this reason, the use of expectedContentTypes on element declarations using named complex types is not recommended. An example is provided in Example 6.
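
The subset rule described above can be sketched as follows. The namespace "http://example.com/subset" and the names ImageType and Thumbnail are hypothetical. The sketch annotates both the type and the element only to show how the two ranges relate; as noted above, annotating element declarations that use named complex types is not recommended for schemas consumed by data-binding frameworks.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/subset"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/subset">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
                schemaLocation="http://www.w3.org/2005/05/xmlmime"/>

    <!-- Type-level annotation: the widest expected range -->
    <xs:complexType name="ImageType"
            xmime:expectedContentTypes="image/jpeg, image/png, image/gif">
        <xs:simpleContent>
            <xs:restriction base="xmime:base64Binary">
                <xs:attribute ref="xmime:contentType" use="required"/>
            </xs:restriction>
        </xs:simpleContent>
    </xs:complexType>

    <!-- Element-level annotation: a subset of the range on ImageType -->
    <xs:element name="Thumbnail" type="tns:ImageType"
            xmime:expectedContentTypes="image/jpeg, image/png"/>

</xs:schema>
```
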

The example below consists of a type definition, PictureType, and an element declaration, Picture. The xmime:contentType attribute is required to be present and specifies the content-type of the binary content. The schema annotation attribute xmime:expectedContentTypes specifies that the media type of the binary content is 'image', and the subtype name is either 'jpeg' or 'png'.

Example 2: Schema declaring an element with binary content and expected media type of "image/jpeg" or "image/png"
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/wildcard"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/wildcard">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
                schemaLocation="http://www.w3.org/2005/05/xmlmime"/>


    <xs:complexType name="PictureType">
       <xs:simpleContent>
           <xs:restriction base="xmime:base64Binary" >
               <xs:attribute ref="xmime:contentType" use="required" />
           </xs:restriction>
       </xs:simpleContent>
    </xs:complexType>

    <!-- This element designates the range of values 
         that the element definition will accept    -->
    <xs:element name="Picture" type="tns:PictureType" 
                xmime:expectedContentTypes="image/jpeg, image/png"/>

</xs:schema>
        

The example document instance below conforms to the element declaration of Picture and specifies that the binary content is of type "image/png".

Example 3: Document instance containing element with binary content-type "image/png"
<?xml version="1.0" ?>
<Picture xmlns="http://example.com/wildcard"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           xmime:contentType="image/png">/aWKKapGGyQ=</Picture>
        

4 Examples

4.1 Binary Data with Known Media Type

The examples in this section consist of binary elements whose media type is known in advance to be "image/jpeg".

In Example 4, a fixed media type is specified by declaring it with an annotation in conjunction with the type definition. The attribute xmime:contentType is not used, as the media type of the binary data is known in advance.

Example 4: Element with binary content, known media type and no xmime:contentType attribute
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/know-type"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/know-type">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
            schemaLocation="http://www.w3.org/2005/05/xmlmime"/>

    <xs:simpleType name="JPEGPictureType" 
            xmime:expectedContentTypes="image/jpeg"> 
        <xs:restriction base="xs:base64Binary"/>
    </xs:simpleType>

    <xs:element name="JPEGPicture" type="tns:JPEGPictureType"/>

</xs:schema>
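
A document instance for this schema could look like the sketch below; the element content is a short base64 placeholder, not real JPEG data. Because JPEGPictureType is a simple type, the element carries no xmime:contentType attribute, and a consumer would rely on the schema annotation to learn that the content is "image/jpeg".

```xml
<?xml version="1.0" ?>
<!-- No xmime:contentType attribute: the media type is fixed by the
     xmime:expectedContentTypes annotation on JPEGPictureType -->
<JPEGPicture xmlns="http://example.com/know-type">/aWKKapGGyQ=</JPEGPicture>
```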
        

In Example 5, a fixed media type is specified by declaring it with an annotation in conjunction with the element declaration. The attribute xmime:contentType is optionally used in document instances to indicate the media type of the binary data.

Example 5: Element with binary content, known media type and optional xmime:contentType attribute
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/know-type"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/know-type">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
            schemaLocation="http://www.w3.org/2005/05/xmlmime"/>

    <xs:element name="JPEGPicture" type="xmime:base64Binary"
            xmime:expectedContentTypes="image/jpeg" />

</xs:schema>
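
As a sketch of the optional use described above, the instance below states the media type explicitly; an instance without the xmime:contentType attribute would also conform, since xmime:base64Binary declares the attribute as optional. The element content is a short base64 placeholder, not real JPEG data.

```xml
<?xml version="1.0" ?>
<!-- xmime:contentType is optional here and repeats the single
     expected content-type from the schema annotation -->
<JPEGPicture xmlns="http://example.com/know-type"
             xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
             xmime:contentType="image/jpeg">/aWKKapGGyQ=</JPEGPicture>
```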
        

4.2 Binary Data with Preferred Media Type

This example illustrates that binary data with media type 'image/jpeg' is preferred but binary data with media type of 'image/tiff' is also allowed (with a lower preference).

Example 6: Element with binary content and preferred media type
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://example.com/preferred-type"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://example.com/preferred-type">

    <xs:import namespace="http://www.w3.org/2005/05/xmlmime"
            schemaLocation="http://www.w3.org/2005/05/xmlmime"/>


    <xs:complexType name="JPEGPreferredPictureType"
            xmime:expectedContentTypes="image/jpeg;q=1.0, image/tiff;q=0.8"> 
        <xs:simpleContent>
            <xs:restriction base="xmime:base64Binary" >
                <xs:attribute ref="xmime:contentType" use="required" />
            </xs:restriction>
        </xs:simpleContent>
    </xs:complexType>

    <xs:element name="JPEGPreferredPicture"
            type="tns:JPEGPreferredPictureType"/> 

</xs:schema>
        

5 Normative References

XML Schema: Structures
XML Schema Part 1: Structures Second Edition, H. Thompson, D. Beech, M. Maloney, and N. Mendelsohn, Editors. World Wide Web Consortium Recommendation, 28 October 2004. (See http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/.)
XML Schema: Datatypes
XML Schema Part 2: Datatypes Second Edition, P. Byron and A. Malhotra, Editors. World Wide Web Consortium Recommendation, 28 October 2004. (See http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/.)
IETF RFC 3986
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter, January 2005. (See http://www.ietf.org/rfc/rfc3986.txt.)
IETF RFC 2045
RFC 2045 - Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, N. Freed, N. Borenstein, November 1996. (See http://www.ietf.org/rfc/rfc2045.txt.)
IETF RFC 2046
RFC 2046 - Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, N. Freed, N. Borenstein, November 1996. (See http://www.ietf.org/rfc/rfc2046.txt.)
IETF RFC 2616
Hypertext Transfer Protocol--HTTP 1.1, R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, June 1999. (See http://www.w3.org/Protocols/rfc2616/rfc2616.html.)
IETF RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, Author. Internet Engineering Task Force, March 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)
XML Information Set
XML Information Set (Second Edition), J. Cowan and R. Tobin, Editors. World Wide Web Consortium Recommendation, 4 February 2004. (See http://www.w3.org/TR/2004/REC-xml-infoset-20040204/.)

6 Informative References

WSDL 2.0 Part 1
Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language, Roberto Chinnici, Martin Gudgin, Jean-Jacques Moreau, Jeffrey Schlimmer, Sanjiva Weerawarana, World Wide Web Consortium Working Draft 3 August 2004 (See http://www.w3.org/TR/2004/WD-wsdl20-20040803/.)
XOP
XML-binary Optimized Packaging, Martin Gudgin, Noah Mendelsohn, Mark Nottingham, Herve Ruellan, W3C Recommendation, 25 January 2005 (See http://www.w3.org/TR/2005/REC-xop10-20050125/.)
MTOM
SOAP Message Transmission Optimization Mechanism, Martin Gudgin, Noah Mendelsohn, Mark Nottingham, Herve Ruellan, W3C Recommendation, 25 January 2005 (See http://www.w3.org/TR/2005/REC-soap12-mtom-20050125/.)

A Acknowledgements

This document was developed by the participants of the joint media types task force formed by the Web Services Description and XML Protocol Working Groups. Participants of the task force, specifically Martin Gudgin and Mark Nottingham, are gratefully acknowledged.

B Schema

<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
           targetNamespace="http://www.w3.org/2005/05/xmlmime" >

  <xs:attribute name="contentType">
    <xs:simpleType>
      <xs:restriction base="xs:string" >
      <xs:minLength value="3" />
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>

  <xs:attribute name="expectedContentTypes" type="xs:string" />

  <xs:complexType name="base64Binary" >
    <xs:simpleContent>
        <xs:extension base="xs:base64Binary" >
            <xs:attribute ref="xmime:contentType" />
        </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:complexType name="hexBinary" >
    <xs:simpleContent>
        <xs:extension base="xs:hexBinary" >
            <xs:attribute ref="xmime:contentType" />
        </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

</xs:schema>

C Change Log (Non-Normative)

Changes since publication of Last Call WD.
| Who | When | What |
|-----|------|------|
| ASK | 20050223 | Incorporated resolution for issue 260. In section 2.2 removed the sentence starting with "The 'q' parameter defined ..." and the editorial note seeking feedback. |
| ASK | 20050223 | Incorporated resolution for issue 259. Section 2.2 is modified to disallow the prefix "Accept:" and the rule qdtext is modified to disallow non-US-ASCII octets. |
| ASK | 20050223 | Incorporated resolution for issue 266. At the end of the section the following is added: In the absence of XML schema the type information (xs:base64Binary or xs:hexBinary) may have to be provided via other mechanisms; for example, xsi:type. |
| ASK | 20050223 | Incorporated resolution for issue 261 specified in http://lists.w3.org/Archives/Public/www-ws-desc/2005Jan/0013.html |
| ASK | 20050223 | Incorporated resolution for issue 262 specified in http://lists.w3.org/Archives/Public/www-ws-desc/2005Jan/0014.html |
| ASK | 20050223 | Incorporated resolution for issue 263. Changed schema to make contentType have a minLength of 3 and added the statement " ... with a minimum length of three and all leading and trailing white space characters are ignored." in section 2.1. |
| ASK | 20050223 | Resolved issue 253. In section 1.1 added: All parts of this specification are normative, with the exception of examples and sections explicitly marked as "Non-Normative". Created two sections, normative references and informative references, and moved the WSDL 2.0, XOP and MTOM references to the informative references section. |
| ASK | 20050223 | Removed the namespace prefix for wsdl in the table. |
| ASK | 20050223 | Resolved issue 254 by removing all editorial notes. |
| ASK | 20050223 | Resolved issue 255 by updating the infoset reference to the 2nd edition. |
| ASK | 20050223 | Resolved issue 264 by accepting the proposal in the issue email (used the prefix 'xmlmime' for all occurrences of the expectedMediaType and contentType attributes). |
| ASK | 20050223 | Resolved issue 265 by updating the MTOM and XOP references to REC. |
| ASK | 20050223 | Resolved issue 269 by implementing all the suggestions: 1) s/name of the IANA media type token/a valid content-type string 2) title change 3) s/expectedMediaType/expectedContentType 4) s/Declaring media types for binary data/Declaring Content-Type for binary data. Also replaced 'media type' with 'content-type' in a number of places. |
| ASK | 20050228 | Resolved issue 270 by including language that clarifies that [normalized value] does not mean normalization of ContentType values. |
| ASK | 20050304 | Replaced the prefix 'xmlmime' with 'xmime' to address i18n comment. |
| ASK | 20050304 | Added a 'SHOULD' for 'charset' params for textual types to address i18n comment. |
| ASK | 20050308 | Added a recommendation to use a list of media types over wild cards. |
| ASK | 20050309 | Misc. editorial changes (indentation, capitalization etc.), changes to References (updated schema refs to 2nd edition, updated 2396 to 3986), fixed examples so that image/* is not an expectedContentType. |
| ASK | 20050309 | Added another example in section 4.1 to address Kevin's concern. |
| ASK | 20050309 | Moved 2nd and 3rd para from section 3 to section 2.1. |
| ASK | 20050310 | Per WG decision, changed expectedContentType to expectedContentTypes. |
| ASK | 20050310 | Fixed example 1 bug: s/restriction/extension. |
| ASK | 20050310 | Added ref to RFC 2045. |
| ASK | 20050316 | Fixed example 4 bug. |
| ASK | 20050422 | Added the agreed upon note about the issue with existing tools that bind to programming languages when the annotation is on the element declaration rather than on a named complex type. |
| ASK | 20050422 | Modified example 6 to move the annotation from the element declaration to the type definition. |
| ASK | 20050422 | Included suggestions at http://lists.w3.org/Archives/Public/public-ws-media-types/2005Mar/0021.html |