This is a collection of notes from chatting with TimBL and Sandro. It will probably evolve into something genuinely useful.
HTTP and SMTP provide a mechanism for a document source to associate a mime type with transmitted data. This allows downstream processors to interpret the data correctly without inspecting the data and guessing the contents. In a common conventional HTTP scenario, an agent requests a document with a preference for image/*. The document server takes this preference into account when serving, for instance, an image/png document.
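The negotiation scenario above can be sketched in a few lines. This is a minimal illustration, not a complete HTTP implementation: the function names (`parse_accept`, `matches`, `quality`, `choose`) are hypothetical, and the sketch assumes only `q` parameters matter and that the most specific matching range decides the quality of a candidate type.

```python
def parse_accept(header):
    """Parse an Accept header value into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # a range without an explicit q counts as fully preferred
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def matches(media_range, media_type):
    """True if a range such as image/* or */* covers the media type."""
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.lower().partition("/")
    return r_type in ("*", m_type) and r_sub in ("*", m_sub)


def specificity(media_range):
    """0 for */*, 1 for type/*, 2 for type/subtype."""
    r_type, _, r_sub = media_range.partition("/")
    return (r_type != "*") + (r_sub != "*")


def quality(accept_header, media_type):
    """q of the most specific range matching media_type, or 0.0 if none does."""
    best = None
    for media_range, q in parse_accept(accept_header):
        if matches(media_range, media_type):
            key = specificity(media_range)
            if best is None or key > best[0]:
                best = (key, q)
    return best[1] if best else 0.0


def choose(accept_header, available):
    """Pick the available type with the highest q; None if nothing is acceptable.

    Ties are broken in favour of the type listed first in `available`.
    """
    scored = [(quality(accept_header, t), t) for t in available]
    scored = [(q, t) for q, t in scored if q > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]
```

With an agent preference of `image/*`, a server holding `text/html` and `image/png` variants would pick `image/png` under this sketch: `choose("image/*", ["text/html", "image/png"])` returns `"image/png"`.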
A growing number of data formats, for instance SVG and MathML, are encoded in XML with specified mime types. This mime type is used by an agent requesting image/* or image/svg+xml forms of the document. Many XML processors could act on this document if the mime type text/xml was associated with it. This leads to a conflict in defining the media types for emerging XML data formats. Following is a list of mime tree segments that apply to such a document:
- text/: encoded in us-ascii or utf-8, and could be presented directly to the user without causing undue stomach upset.
- text/xml: … text/ mime tree.
- application/xml
- image/
- image/svg

text/xml and application/xml

There are no rules beyond encoding restrictions for deciding between text/xml and application/xml; both are registered in RFC 3023. In practice, the majority of the XML [I've seen] satisfies the requirements for text/.
Labeling XML as application/xml conveys the preference that this data not be presented to the user. An application not designed to handle application/xml should default to the application/octet-stream handler. For data with no more precise mime type, the data source may choose text/xml if the data consumer (or intervening proxies) may benefit from treating it as a text document. For instance, an example docbook document starts with
<articleinfo> <title>XML From Your Palm</title> <pubdate>11 Oct 2000</pubdate> <releaseinfo role="meta"> $Id: 06-XML-document-interpretation.html,v 1.22 2001/04/11 04:47:00 eric Exp $ </releaseinfo>
which is certainly helpful to the reader with the naive browser.
+xml mime type modifier

RFC 3023, section 7 attempts to make XML data available to generic XML processors, as well as to negotiation schemes for higher-level data formats, by appending +xml to the end of the mime type. This introduces another level of mime hierarchy to address the common scenario where there is one more desired level of mime type above XML, e.g. SVG and MathML.
The arguments supporting the Network Working Group's decision on +xml are documented in RFC 3023 Appendix A.
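A generic XML processor can use the +xml suffix to recognise XML data even when it knows nothing about the more specific format. The following is a minimal sketch of that check; the function names and the handler arguments are hypothetical, and the sketch assumes that media type parameters such as `charset` can be ignored for dispatch.

```python
def essence(media_type):
    """Strip parameters (e.g. '; charset=utf-8') and normalise case."""
    return media_type.split(";", 1)[0].strip().lower()


def is_xml_media_type(media_type):
    """True for text/xml, application/xml, and any type with a +xml suffix."""
    name = essence(media_type)
    return name in ("text/xml", "application/xml") or name.endswith("+xml")


def pick_handler(media_type, specific_handlers, generic_xml_handler, fallback):
    """Prefer a handler for the exact type, then generic XML, then the fallback.

    specific_handlers maps lowercase media type names to handlers.
    """
    name = essence(media_type)
    if name in specific_handlers:
        return specific_handlers[name]
    if is_xml_media_type(name):
        return generic_xml_handler
    return fallback
```

Under this sketch an agent with no SVG renderer still routes `image/svg+xml` to its generic XML handler, while `image/png` falls through to the fallback.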
XML namespaces provide a reliable mechanism for identifying XML data formats. This enables multiple data formats to be embedded in a single XML document. For instance, an XHTML document may come from a document source with a mime type of text/html. The agent processing this document can further distinguish the document as XHTML by encountering a root element identified by the tuple (http://www.w3.org/1999/xhtml, html). The agent may be able to handle other data formats embedded in the document. An agent with the facilities for rendering SVG will know what to do if it encounters the tuple (http://www.w3.org/2000/svg, svg) in the document. MathML may be embedded in XHTML via a similar mechanism.
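Dispatching on the (namespace, local name) tuple can be sketched with Python's standard `xml.etree.ElementTree` parser, which reports element names as `{namespace}local`. The helper names, the `KNOWN` table, and the sample document are assumptions made for illustration; note that the SVG root element's local name is the lower-case `svg`.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Tuples this hypothetical agent knows how to handle.
KNOWN = {
    (XHTML_NS, "html"): "XHTML",
    (SVG_NS, "svg"): "SVG",
    (MATHML_NS, "math"): "MathML",
}


def name_tuple(element):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return (namespace, local)
    return ("", tag)  # element in no namespace


def find_formats(xml_text, known):
    """Labels of the root and embedded elements whose tuple is known, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():  # iter() yields the root first, then descendants
        label = known.get(name_tuple(element))
        if label is not None:
            found.append(label)
    return found


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <svg xmlns="http://www.w3.org/2000/svg"/>
    <math xmlns="http://www.w3.org/1998/Math/MathML"/>
  </body>
</html>"""
```

Here `find_formats(SAMPLE, KNOWN)` returns `["XHTML", "SVG", "MathML"]`: the root tuple identifies the document as XHTML, and the embedded tuples mark the points where an SVG or MathML handler would take over.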
text/xml model

An alternative to the +xml mime types is to assert that all data formats with an XML encoding use text/xml for that encoding. The document's root node then clarifies the data format. Naturally, this leaves the more precise data format unavailable to metadata queries and content negotiation.
One of the decision points in deciding whether to use special mime types has been whether the media introduces media-specific fragment identifiers (exhibit A and exhibit B). Another has, naturally, been compatibility with the application category in the registered tree (exhibit C).
The two most popular browsers leave content providers with a choice of supporting one or the other, but not both. Here is the behavior for different content types:
| browser | mime type | behaviour |
|---|---|---|
| Netscape | text/xml | dispatches on namespace; for example, renders MathML upon finding a (http://www.w3.org/1998/Math/MathML, math) tuple. |
| Netscape | text/html | does not invoke namespace handler. MathML markup is rendered as un-understood tags. |
| IE | text/xml | will render XHTML delivered as text/xml if it has a <?xml-stylesheet ""?> processing instruction. |
| IE | text/html | has MathML tags patched into html machine. |
text/xml and application/xml media types
+xml suffix convention (which also updates the RFC 2048 registration process), and (3) the discussion of "utf-16le" and "utf-16be".
Accept to help the document server return the most appropriate resource.

Referenced texts that were not available by http URL:
Binary (two-level) media types are tight shoes for SVG (text, xml, image, svg), and even tighter for an XHTML document with MathML and SVG embedded. There will be growing discomfort and lost interoperability opportunities as long as there is no n-ary document description in the (conventional) metadata. One route out is to make unconventional metadata conventional (GET-META and accept recombination) or to use an extended header: 13-Alternative-Content-Type
The mime tree is already overloaded with the non-IETF subtrees: vendor (*/vnd.*) and personal (*/prs.*). A */xml.* subtree (for instance, image/xml.svg) could collide with these subtrees.
CVS revision: $Id: 06-XML-document-interpretation.html,v 1.22 2001/04/11 04:47:00 eric Exp $
Last modified: Wed Apr 11 00:37:47 EDT 2001