
Content-Negotiation Techniques to serve XHTML as text/html and application/xhtml+xml

Abstract and status

This document gathers the known techniques for serving XHTML documents that follow the backwards compatibility guidelines as both text/html and application/xhtml+xml using content-negotiation, so that browsers that do not understand the newer MIME-type still get a version with a MIME-type they understand.

This is a work in progress. Please send your comments and suggestions to the publicly archived mailing list <www-qa@w3.org>, or if you don't want your email to be public, to the editor of this document, Dominique Hazaël-Massieux at <dom@w3.org>. We are especially interested in techniques for other popular web servers.

What is Content-Negotiation?

Content-Negotiation is a mechanism defined in the HTTP specification that makes it possible to serve different "versions" of a document (or more generally of a resource) at the same URL, so that user agents can choose which version best fits their capabilities.

A classic use of this mechanism is to serve an image as both GIF and PNG, so that browsers that don't understand PNG still get the GIF version.

To summarize how this works, it's enough to say that user agents are supposed to send an HTTP header (Accept) listing the various MIME-types they understand, with indications of how well they understand each of them. The server then replies with the version of the resource that best fits the user agent's needs.
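The selection step described above can be sketched in a few lines of PHP. This is only an illustration of the idea, not how any particular server implements it: the function names parse_accept(), quality_for() and choose_variant() are hypothetical, and the sketch ignores media type parameters other than q.

```php
<?php
// Illustrative sketch only: these helpers are hypothetical and are not
// part of any server or of the contentNegotiation class.

// Parse an Accept header into an array of media range => q-value.
function parse_accept($header)
{
    $ranges = array();
    foreach (explode(',', $header) as $item) {
        $params = array_map('trim', explode(';', $item));
        $range = strtolower(array_shift($params));
        if ($range === '') {
            continue;
        }
        $q = 1.0; // q defaults to 1 when it is omitted
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $ranges[$range] = $q;
    }
    return $ranges;
}

// Return the q-value the client gives to one concrete media type,
// using the most specific matching range: type/subtype, type/*, */*.
function quality_for($type, $ranges)
{
    list($major) = explode('/', $type);
    foreach (array($type, $major . '/*', '*/*') as $candidate) {
        if (isset($ranges[$candidate])) {
            return $ranges[$candidate];
        }
    }
    return 0.0; // no matching range: not acceptable
}

// Pick the offered type with the highest q-value; earlier offers win ties.
function choose_variant($header, $offered)
{
    $ranges = parse_accept($header);
    $best = null;
    $bestQ = 0.0;
    foreach ($offered as $type) {
        $q = quality_for(strtolower($type), $ranges);
        if ($q > $bestQ) {
            $best = $type;
            $bestQ = $q;
        }
    }
    return $best; // null when nothing offered is acceptable
}
```

With the header "application/xhtml+xml,text/html;q=0.9,*/*;q=0.5" and the two offers application/xhtml+xml and text/html, choose_variant() returns application/xhtml+xml; with "text/html,image/gif" it returns text/html. Note that a client sending only "*/*" rates both offers equally, so the order of the offers decides the outcome.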

How does it apply to XHTML 1.0?

XHTML 1.0 (like all versions of XHTML) is supposed to be served as application/xhtml+xml. But some browsers, among them Internet Explorer, do not recognize this MIME-type. While XHTML 1.0 may be served as text/html when using the backwards compatibility guidelines, it would be nice to serve it as application/xhtml+xml to those browsers that understand this MIME-type.

One of the best ways to do that is to use content-negotiation.

Techniques for Apache

There are several ways to do this in Apache.

Duplicate contents and classical content-negotiation

The traditional way to use content-negotiation in Apache is to have two files with the same name but different extensions, and the MultiViews option set for the directory where these files reside.

For instance, if you have http://example.com/foo/bar.xhtml and http://example.com/foo/bar.html (assuming both extensions xhtml and html are defined with the relevant MIME-types), the URL http://example.com/foo/bar will offer the two resources through content-negotiation.
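A minimal configuration for this setup might look like the fragment below. It is a sketch under assumptions: the directory path is invented for illustration, and your server may already define the text/html mapping elsewhere.

```apache
# Assumed path of the directory that holds bar.html and bar.xhtml
<Directory "/var/www/example/foo">
    # Let Apache negotiate among files sharing the requested basename
    Options +MultiViews
</Directory>

# Associate each extension with its MIME-type
AddType text/html .html
AddType application/xhtml+xml .xhtml
```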

Note the following tweaks that are usually necessary:

(Note that this is how the W3C home page is currently served.)

Can I serve one resource with two distinct MIME-types?

While it's theoretically possible, I don't know of any way to do it without breaking some important aspects of HTTP (such as proxying, or the HTTP PUT method): the one method I know, which uses RewriteRules, doesn't set headers such as ETag as it should.

Techniques for PHP scripts

For a page served through PHP scripts, it is possible to have the page served both as text/html and application/xhtml+xml depending on the user-agent that requested the page.

To do so in an HTTP-friendly way, the contentNegotiation class can be used to parse the Accept header reliably.

The following code will serve a page as application/xhtml+xml in preference to text/html, unless the Accept header prefers text/html or the user-agent is identified as MSIE.


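require_once("http://www.w3.org/2005/04/conneg.phi"); // copy the class to your server
$conneg = new contentNegotiation();
// The User-Agent header may be absent, so fall back to an empty string
$uastring = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
// Tell caches that the response depends on these request headers
header("Vary: Accept, User-Agent");
// strpos() returns 0 for a match at offset 0, so compare with false explicitly
if ($conneg->compareQ("application/xhtml+xml,text/html") == "application/xhtml+xml" && strpos($uastring, "MSIE") === false) {
  header("Content-Type: application/xhtml+xml");
} else {
  header("Content-Type: text/html; charset=utf-8");
}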

Note how this code sets the Vary header to make it explicit that content negotiation happened; the ETag header would need to be set similarly if your server sets it automatically.
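As a sketch of what varying the ETag along with the MIME-type could look like, the snippet below derives a different entity tag for each variant. The variant_etag() helper, the "v1" base tag and the use of md5() are assumptions made for illustration; none of them comes from the contentNegotiation class.

```php
<?php
// Hypothetical helper, not part of the contentNegotiation class:
// builds a quoted entity tag that differs for each media type.
function variant_etag($baseTag, $mediaType)
{
    return '"' . $baseTag . '-' . md5($mediaType) . '"';
}

// Assumed to hold the media type chosen by the negotiation code above.
$mediaType = "application/xhtml+xml";
header("ETag: " . variant_etag("v1", $mediaType));
```

Deriving the tag from the media type means a cache cannot mistake the text/html variant for the application/xhtml+xml one, even though both live at the same URL.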

Techniques for Jigsaw

By default, Jigsaw handles files with the same basename and different extensions as content-negotiated resources (the same way Apache does when the MultiViews option is set). You can also set a preferred variant for each resource by modifying its quality setting (which varies in increments of 0.01).


Dominique Hazaël-Massieux <dom@w3.org>
Created 2003-01-20, Last Modified: $Date: 2007/11/12 07:26:56 $