Authoritative Metadata

1 Summary of key points

The following are the key architectural points of this finding:

Representation metadata received in an encapsulating container, such as within the header fields of a message, is authoritative in defining the nature of the representation received.
Inconsistency between representation data and metadata is an error that should be discovered and corrected rather than silently ignored.
It is an error for an agent to ignore or override authoritative metadata without the consent of the party the agent represents.
Specifications MUST NOT work against the Web architecture by requiring or suggesting that a recipient override authoritative metadata without user consent.

2 Defining authoritative metadata

The sequence of numbers "324033" might be a license plate number in the state of Arkansas or an old-style telephone number in Italy. Although there do exist some self-descriptive data formats, we generally rely on context to define the purpose, format, and meaning of data. One way to provide a context for interpretation is metadata.

Metadata is simply defined as data about other data. Metadata can be expressed by referencing data externally, by encapsulating data in a container, and by embedding metadata within the data being described. The following table provides examples of how various forms of metadata can be expressed during Web interactions:

metadata describes	how	where	example
resource	external reference	message fields	HTTP's "Allow" header field in a response message describes the request methods allowed by the resource for which the response was generated.
		data format	Link relationship values (rel/rev attributes) are often used to describe metadata relationships between resources.
		other sources	RDF can associate metadata with a resource by reference to its URI.
message	encapsulating	layers	Protocols are often implemented as a stack of layered protocols, with each lower-layer protocol providing context for higher layers.
	embedded	message syntax	HTTP's response messages begin with "HTTP/" and a version number.
	embedded	message fields	HTTP's "Date" header field describes the clock time at the origin when the message was generated.
representation	external reference	identifiers	Schemes based on old (non-metadata) protocols, such as gopher and ftp, include or imply metadata information about the representation as part of the identifier.
	external reference	data format	Type attributes are sometimes used to express expectations about representation types for pre-access content selection.
	encapsulating	message fields	HTTP and MIME use the value of the "Content-Type" header field to indicate the representation's media type.
	encapsulating	archival formats	Archives often include catalog data that associates metadata with parts of the archive.
	embedded	data format	Magic numbers, DOCTYPEs, and XML namespaces are all means for making data formats self-descriptive. HTML's "META" elements and RDF/XML assertions can describe metadata about the enclosing representation.

The table above demonstrates that the same metadata may be expressed in various forms. The representation media type [RFC2046], in particular, plays such an important role in the Web architecture that its value can be described in many different locations. Given multiple sources of metadata and the possibility that those sources may be inconsistent, the architect must decide which source of metadata has the highest priority and thus shall be considered authoritative in determining the desired behavior of the recipient. Furthermore, given the presence of self-descriptive data formats, a decision must be made on whether to respect the declared metadata over whatever might be learned by inspecting the data itself.

For Web architecture, a design choice has been made that metadata received in an encapsulating container MUST be considered authoritative and used in preference to metadata found by inspection of the data, declared by embedded metadata, or provided by external reference. Although this design choice is generally applicable to any container format, including archival formats that encapsulate other data, the most significant interpretation for Web architecture is that representation metadata found within the header fields of a received message shall be considered authoritative for the representation encapsulated within that message.
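
The precedence described above can be illustrated with a minimal sketch. The function name and its arguments are assumptions made for this illustration only. This finding does not rank the non-authoritative sources against each other, so the sketch simply uses whatever order the caller supplies.

```python
from typing import Optional, Sequence, Tuple


def select_media_type(container: Optional[str],
                      other_sources: Sequence[Optional[str]]) -> Tuple[Optional[str], bool]:
    """Return (media type, is_authoritative).

    `container` is the value found in the encapsulating container (for
    example, a message header field). `other_sources` holds values from
    embedded metadata, external references, or inspection of the data.
    """
    if container is not None:
        # Container metadata wins over every other source.
        return container, True
    for candidate in other_sources:
        if candidate is not None:
            # Anything found here is a guess, not authoritative metadata.
            return candidate, False
    return None, False
```

For instance, `select_media_type("text/plain", ["text/html"])` yields `("text/plain", True)`: the embedded or inspected value is never consulted once the container has spoken.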

Representation metadata does not constrain the receiving agent to process the representation data in one particular way. What it does is allow the sender of a representation to express its intentions regarding how the data should be interpreted by a recipient. A recipient can then choose, based on its own purpose, design, and configuration, how it will react to those intentions on behalf of the party employing the agent. For example, a browser traversing a link may behave differently depending on how the link was selected, a maintenance spider may ignore a data format's rendering instructions, and an editor may treat every representation as a source for editing rather than display.

This treatment of authoritative metadata applies equally to clients, servers, and intermediaries. A server receiving a representation MUST respect the client's expressed intentions regarding the metadata for that representation and either act in accord with those intentions or respond with an appropriate redirection or error message.

3 Why metadata from an encapsulating container is authoritative

The rationale for our choice of authoritative metadata is difficult to describe using abstractions. Let's consider a specific example of the media type of a received representation and explain why each of the other sources of metadata is not considered authoritative.

3.1 Role of Internet Media Types

An Internet media type [RFC2046] is a short name, such as "text/html", that is associated with a data format specification and processing model through registration in the IANA media type registry. For example, "text/html" in the IANA registry is associated with [RFC2854], which in turn states that:

The text/html media type is now defined by W3C Recommendations; the latest published version is [HTML401].

The media type indicates the intended processing model for a representation, including such issues as whether the data should be rendered, stored, or executed. In practice, media types are thus usable for selecting handlers to implement those functions. A media type, therefore, is not simply an indication of data format; it also refers to a standardized interpretation of that data format. In fact, many different media types share a single data format, while others represent a superset of formats.

HTTP/1.1 [RFC2616], section 7.2.1, states:

"If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource."

In other words, when there is no authoritative metadata, the receiving agent MAY attempt to guess the appropriate metadata based on inspection of the data and/or the reference, though such guessing should be limited to media types that are safe to use in that context.
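
A rough sketch of that rule follows. Everything named here is hypothetical: the tiny extension and magic-number tables, the set of types treated as safe to guess, and the function itself are assumptions chosen for illustration. A real agent would decide for its own context which guesses are safe.

```python
from typing import Mapping, Optional

# Assumption: the media types this agent considers safe to guess.
SAFE_GUESSES = {"text/plain", "text/html", "image/png"}

# Hypothetical, deliberately tiny lookup tables.
EXTENSION_TYPES = {".html": "text/html", ".txt": "text/plain",
                   ".png": "image/png", ".sh": "application/x-sh"}
MAGIC_NUMBERS = {b"\x89PNG\r\n\x1a\n": "image/png"}


def effective_media_type(headers: Mapping[str, str], body: bytes, uri: str) -> Optional[str]:
    # Header field names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-type")
    if declared is not None:
        # Authoritative metadata is present: use it and do not guess.
        return declared.split(";", 1)[0].strip().lower()

    guess = None
    for magic, media_type in MAGIC_NUMBERS.items():
        if body.startswith(magic):
            guess = media_type
            break
    if guess is None:
        # Drop any query or fragment before looking at the name extension.
        path = uri.split("?", 1)[0].split("#", 1)[0].lower()
        for extension, media_type in EXTENSION_TYPES.items():
            if path.endswith(extension):
                guess = media_type
                break
    # Guesses outside the safe set are discarded rather than acted upon.
    return guess if guess in SAFE_GUESSES else None
```

Note that the data and the URI are examined only in the branch where no "Content-Type" field was received; a declared "text/plain" is returned even when the body begins with markup.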

4 Overriding authoritative metadata

Recognition of authoritative metadata is important because it influences the default processing behavior for Web interactions. However, representation metadata is also susceptible to misconfiguration, and user agents frequently try to "simplify" the Web by automatically "correcting" perceived "errors" in those configurations.

Choosing to ignore or override authoritative metadata is only allowed within the Web architecture when the user has given consent. Recipients SHOULD detect inconsistencies between representation data and metadata but MUST NOT resolve them without the consent of the user.

4.1 Inconsistency between representation data and metadata

Although there are benefits to separating representation metadata from data, there are risks as well. In particular, the resource owner may create inconsistencies by misconfiguring resources or by failing to reassign metadata after a change of representation. Inconsistency between representation data and metadata is an error. Examples of inconsistencies between metadata and representation data that are frequently observed on the Web include:

The character encoding of text-based content being inconsistent with metadata about the character encoding. For some formats, such as XML, such inconsistencies can be quickly detected.
Server-wide default metadata being incorrectly assigned to new or rarely-used media types or content encodings.
Superset media types being used when a more specific media type is intended, such as the use of "application/xml" when there exists a more specific media type corresponding to the root element.
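
The first kind of inconsistency listed above can sometimes be detected mechanically. The following is a minimal sketch, with a hypothetical function name, that reports a mismatch between a declared character encoding and the bytes received. It only detects; it does not attempt any recovery.

```python
from typing import Optional


def charset_inconsistency(declared_charset: str, data: bytes) -> Optional[str]:
    """Return a description of the inconsistency, or None if none is detected.

    A clean decode does not prove that the declaration is right, because
    many byte sequences are valid in several encodings. A failed decode
    does show that the data and the metadata disagree.
    """
    try:
        data.decode(declared_charset)
    except LookupError:
        return f"unknown character encoding {declared_charset!r}"
    except UnicodeDecodeError as exc:
        return (f"data is not valid {declared_charset}: "
                f"byte offset {exc.start}: {exc.reason}")
    return None
```

An agent could pass the returned description to whatever error display or log it offers, leaving the decision about recovery to the user.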

4.2 Reducing inconsistency

Web software developers, webmasters, and resource owners can help reduce inconsistency through careful assignment of representation metadata. In particular:

Server software designers SHOULD NOT specify default representation metadata, such as media type, character encoding, or content language, within the standard configuration shipped with the server.
Server software designers SHOULD provide a means to set representation metadata at the same level of granularity and permission that is needed to author those representations.
Server managers SHOULD NOT specify an arbitrary Internet media type (e.g., "text/plain" or "application/octet-stream") when the representation media type is unknown.
Server managers SHOULD provide each author with the means and permission to set the configuration of metadata for any representations under the author's control.
Resource owners SHOULD test for correct metadata and inform server managers of metadata misconfigurations.
Authoritative metadata SHOULD NOT be provided external to the representation if it does not add clarity to that communication. For example, the character encoding of XML data formats is self-descriptive within the data and SHOULD NOT be included in a charset parameter of the media type unless that distinction is significant to the resource (e.g., for comparison during content negotiation of multiple XML representations that differ only by character encoding).
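
As one possible reading of the recommendations on defaults, a server might send a "Content-Type" field only when a media type was deliberately assigned. The sketch below assumes a hypothetical `assigned` mapping from resource paths to media types; it is not drawn from any particular server's configuration model.

```python
from typing import Dict, Mapping, Optional


def representation_headers(path: str, assigned: Mapping[str, str]) -> Dict[str, str]:
    """Build representation header fields for a stored resource.

    `assigned` maps resource paths to media types that an author or server
    manager set on purpose. When nothing was assigned, no Content-Type is
    sent at all, rather than an arbitrary default such as "text/plain".
    """
    headers: Dict[str, str] = {}
    media_type: Optional[str] = assigned.get(path)
    if media_type is not None:
        headers["Content-Type"] = media_type
    return headers
```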

4.3 Avoiding silent recovery

As described above, inconsistency between representation data and metadata is an error. However, the tendency for some agents to attempt silent recovery from such errors is also an error. Silent recovery perpetuates errors that could be easily fixed if the resource owner were simply informed of them during their own testing of the resource.

Web agents SHOULD have a configuration option that enables the display or logging of detected errors. Such a display need not be disruptive of the user experience; for example, a graphical browser might display a small "bug" button in the user interface to indicate a detected error so that an interested user (i.e., the resource owner) can select the button, inspect the error, and perhaps modify the agent's choice on how to recover from that error.

Some applications of the Web cannot tolerate error. For example, medical information systems must be designed so as to detect errors that might cause relevant information to be rendered invisible. In general, it is better to design Web systems that are capable of fulfilling more stringent requirements, even if their default configuration is to be lenient.

4.4 Obtaining user consent

A user agent represents the user for protocol-level interactions with resource providers. A user agent that does not respect the Web protocol specifications can violate user privacy, introduce security holes, and otherwise create confusion. For example, a broken user agent could trigger a security failure by ignoring a received "Content-Type" header with value "text/plain", guessing that representation data is a shell script, and then executing the script on the user's machine without the user's awareness. The other agents in the system (origin server and intermediaries) have sent or forwarded the message with the expectation that the user agent will not attempt to execute the script, at least not without some additional action deliberately chosen by the user. If the user agent violates those expectations, it violates the protections that may have been put in place for the user's self-protection.

Because of those risks, it is an error for an agent to ignore or override authoritative metadata without the consent of the party employing the agent.

Consent does not imply that the receiving agent must interrupt the user and require selection of one option or another. User consent may be achieved in the form of pre-selected configuration options, modes, or selectable user interface toggles, with appropriate reporting to the user when the agent detects an error. Naturally, the appropriate consent mechanism will be unique to each type of receiving agent and application context. It is therefore beyond the scope of this finding to anticipate the range of possible errors and ways in which interface designers might obtain user feedback to address them.
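
One way to picture consent through pre-selected configuration is sketched below. The policy names, the `ask_user` callback, and the logger name are all assumptions made for this illustration; actual agents will differ in how they obtain and record consent.

```python
import logging
from enum import Enum
from typing import Callable, Optional

log = logging.getLogger("agent.metadata")


class OverridePolicy(Enum):
    NEVER = "never"              # always follow the authoritative metadata
    ASK = "ask"                  # ask the user when an inconsistency is found
    PRECONSENTED = "preconsent"  # user opted in beforehand via configuration


def choose_media_type(declared: str, detected: Optional[str], policy: OverridePolicy,
                      ask_user: Callable[[str, str], bool] = lambda d, g: False) -> str:
    """Pick the media type to act on, overriding only with consent."""
    if detected is None or detected == declared:
        return declared
    # The inconsistency is reported in every case, never silently absorbed.
    log.warning("declared %s but data looks like %s", declared, detected)
    if policy is OverridePolicy.PRECONSENTED:
        return detected
    if policy is OverridePolicy.ASK and ask_user(declared, detected):
        return detected
    return declared
```

With `OverridePolicy.NEVER`, a representation declared as "text/plain" is treated as plain text even when it looks like a script, which is the behavior the other agents in the system expect by default.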

5 Metadata hints in specifications

Some format specifications allow content authors to provide metadata hints for servers and clients. For instance, the http-equiv attribute of the HTML meta element was intended for servers (not clients). In HTML 2.0 [RFC1866], section 5.2.5, the attribute is specified as follows:

HTTP servers may read the content of the document <head> to generate header fields corresponding to any elements defining a value for the attribute HTTP-EQUIV.

The HTML 4.01 link element has an attribute type that gives clients a hint about the likely media type if one were to retrieve a representation of the identified resource.

Example: Format specifications cannot redefine authoritative metadata

The MyFormat specification specifies a type attribute with external references that supposedly takes precedence over any other media type received as authoritative metadata. When type is present, receiving agents are instructed to use its value and ignore any conflicting metadata provided by the sender.

The MyFormat specification designers' rationale for this departure from Web architecture is that such a definition of the type attribute allows content authors to work around misconfigured servers. They contend that this is necessary because, in many environments, content authors may not have sufficient access to the server configuration to assign the correct media type where it belongs.

Should the MyFormat specification designers be allowed to ignore a principle of Web architecture and define type in this way just to remedy a potential configuration problem?

Answer: Errors involving inconsistent metadata cannot be "fixed" by adding metadata to external references, because the metadata is inconsistent for all recipients of the message, not just the user agent. An agent that silently overrides server-provided metadata can create security risks and prevent errors from being detected and corrected.

A format specification that includes metadata hints for clients must make clear that, when these hints interact with server metadata, they are advisory only. Format specifications MUST NOT include requirements for clients to override server metadata without user consent.

An architecturally sound description of an advisory attribute might read:

The author may provide a hint to the client about the likely Internet media type of representations of the designated resource. Although the client MUST treat server metadata (including that provided by the file system) as authoritative, the client MAY use the hint in a number of ways, including as a preference when negotiating with the server, as input to a decision to retrieve a representation, or to recover from a misconfigured server. However, the client MUST NOT override the server's authoritative metadata without the consent of the user.

A good example of such a description can be found in the W3C Recommendation Speech Recognition Grammar Specification Version 1.0 [SRGS10], which describes agent behavior that is consistent with this finding in section 2.2.2.

In contrast, the W3C Recommendation Synchronized Multimedia Integration Language (SMIL 2.0) [SMIL20] is inconsistent with this finding. The definition of the type attribute in section 7.3.1 specifies that the value of type takes precedence over authoritative metadata for some protocols. The specification is in error. Under no circumstances can a format specification change the meaning of protocol interaction on the Web. Implementers MUST disregard that statement in SMIL 2.0 and treat the type attribute as merely a means for content selection or for when authoritative metadata is unavailable.

6 Scenarios

The scenarios in this section illustrate some issues that arise when the architectural points described in this finding are ignored.

6.1 Bad server configuration

Stuart runs his own Web server at "http://www.example.org/". He creates an HTML page and means to serve it as "text/html", but misconfigures the Web server so that the content is served via HTTP/1.1 [RFC2616] as "text/plain". Janet's browser retrieves the page and displays the content as plain text. Tim's browser retrieves the page, detects some markup that suggests it is an HTML document (e.g., a <!DOCTYPE declaration or <html> element) and, without informing Tim, proceeds as though the content was declared to be "text/html", rendering it according to the HTML and CSS specifications.

Which party has neglected a principle of Web architecture: Stuart for the server misconfiguration, Tim's browser for silently overriding the HTTP headers from the server, or Janet's browser for not detecting that the content looked like HTML?

Answer: By silently overriding metadata from the representation provider in the HTTP headers, Tim's browser did not respect Web architecture principles that promote shared understanding and security.

Misconfiguration of the server is a fixable error. If Stuart were using Janet's browser, he would see that error immediately and fix it. However, if Stuart uses the same browser as Tim for his testing, Stuart would not be informed of the error. Tim's browser is the culprit here because it misrepresents the resource owner by ignoring the authoritative metadata without Tim's consent. Janet's browser respects the "Content-Type" header field and, in doing so, helps Janet detect a server misconfiguration.

6.2 Good server configuration

Stuart runs his own Web server at "http://www.example.org/". He creates a text page that describes an example of a security vulnerability in a client-side scripting language using sample code. Since Stuart wants users to read the code, not execute it, he assigns the media type "text/plain" to the representation. Janet's browser retrieves the page and displays the content as plain text. Tim's browser retrieves the page, detects the script language, and executes it, promptly sending a rude message to everyone on Tim's address list (including Tim's mom).

Which party has neglected a principle of Web architecture: Stuart for serving content about a vulnerability or Tim's browser for silently overriding the HTTP headers from the server?

Answer: By silently overriding metadata from the representation provider in the HTTP headers, Tim's browser did not respect Web architecture principles that promote shared understanding and security.

Authoritative metadata is an important aspect of Web architecture. Agents that ignore authoritative metadata are broken, sometimes dangerously so, and should not be used. Software cannot assume that a configuration is wrong just because it is unusual.

6.3 Misconfiguration and metadata hints

Norm publishes an XHTML document that includes this link:

<link href="cool-style" type="text/css" rel="stylesheet"/>

Although the link refers to an XSLT style sheet, Norm has set the type attribute to "text/css". Stuart has configured the Web server so that the style sheet is served via HTTP/1.1 as "application/xslt+xml". With a user agent that understands XSLT but not CSS, Janet requests the content that includes this link. As it interprets the representation data, Janet's user agent reads the type hint and does not fetch the style sheet.

Which party is responsible for the fact that Janet did not receive content she should have: Stuart for the server configuration, Norm for stating that the style sheet is served as "text/css" when in fact it's served with a different media type, or Janet's user agent for not double-checking the media type with the server?

Answer: Norm's mislabeling of content deprived Janet of content she should have received.

Norm is responsible for Janet not having access to representation data she was meant to receive. The HTML 4.01 Recommendation states that "Authors who use [the type] attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address." Janet's client could have done more than merely read the type hint and decide to skip the style sheet. Users benefit from clients that allow different configurations for handling hints, including:

Query the server, and when there is an inconsistency, choose the authoritative metadata, or
Query the server, and when there is an inconsistency, prompt the user for instructions on how to proceed.
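
The first of these configurations might look roughly like the sketch below. The `query_server` argument is a hypothetical callable (for example, one wrapping an HTTP HEAD request) that returns the media type declared by the server, or None if the server declares none. Falling back on the hint in that last case is an assumption of this sketch, not a requirement of this finding.

```python
from typing import Callable, Optional, Set


def should_fetch(hint: Optional[str], supported: Set[str],
                 query_server: Callable[[], Optional[str]]) -> bool:
    """Decide whether to retrieve a linked resource."""
    if hint is None or hint in supported:
        # Nothing argues against fetching; the response's own metadata
        # will govern how the representation is processed.
        return True
    # The hint names an unsupported type, but a hint is only advisory:
    # check the authoritative metadata before skipping the resource.
    declared = query_server()
    if declared is None:
        return False  # no authoritative metadata, so the hint is all we have
    return declared in supported
```

In Janet's case, the hint is "text/css", the agent supports only "application/xslt+xml", and the server declares "application/xslt+xml", so the function returns True and the style sheet is fetched.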

6.4 Conflicting metadata during distributed authoring

[unfinished]

The meaning of any HTTP message is defined by the contents of that message as interpreted according to the HTTP standard. If a client requests that a server store a representation at a given URI and the server's configuration states that the given URI implies metadata inconsistent with what has been provided by the client, then the server should reject the request using an appropriate HTTP status code.

In other words, if a WebDAV client performs a

   PUT /something.html HTTP/1.1
   Host: example.org
   Content-type: application/pdf
   ...

and example.org knows that it has been configured such that all resources with identifiers ending in ".html" are represented in the "text/html" format, then the server has four choices:

(1) ignore the "application/pdf" metadata provided by the client, store the representation as-is, and serve it later as "text/html".
(2) change the configuration such that future 200 responses to GET /something.html will be served as "application/pdf", thus preserving the client's stated intent.
(3) accept the request only in the sense of it being a requested change of resource state, resulting in the PDF representation being "converted" to HTML for later responses.
(4) respond with "415 Unsupported Media Type" and a message stating why the request is inconsistent with the resource.

(1) is clearly a bad idea because the inconsistency is an error and failing to report an error is bad design.

(2) may be feasible on some HTTP servers that combine configuration for both authoring and read-only services, but most production HTTP servers do not work that way, and automatically overriding a server configuration is more likely to hide pilot error than to do what the user actually wants.

(3) is a complicated option that preserves REST semantics but not those of a dumb filesystem. It is one of those server-side magic tricks that tends to annoy people who think HTTP is a file protocol, which suits me just fine provided that it isn't mandatory.

(4) properly informs the user of the inconsistency (enabling them to choose the right workaround), works in all cases, but wastes some bandwidth.

Answer: (1) is a bug, (2) is bad implementation, (3) is a nifty feature when the user is making an informed request, and (4) is the right answer in all other cases.
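
Choice (4) can be sketched as a simple check performed before the representation is stored. The `suffix_types` mapping is a hypothetical stand-in for the server configuration described above, and the 201 status assumes the resource did not exist before the request.

```python
from typing import Mapping, Tuple


def check_put(path: str, content_type: str, suffix_types: Mapping[str, str]) -> Tuple[int, str]:
    """Return (status code, reason) for a PUT request under choice (4).

    `suffix_types` maps URI suffixes to the media type the server is
    configured to declare for resources whose identifiers end that way.
    """
    # Compare media types without parameters and without regard to case.
    sent = content_type.split(";", 1)[0].strip().lower()
    for suffix, configured in suffix_types.items():
        if path.endswith(suffix) and configured != sent:
            # Report the inconsistency instead of storing the data silently.
            return 415, (f"{path} is configured as {configured}, "
                         f"but the request declared {sent}")
    return 201, "Created"
```

For the request shown earlier, `check_put("/something.html", "application/pdf", {".html": "text/html"})` returns status 415 together with a reason the client can show to the user.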

7 Future Work

The TAG is working with the authors of [RFC3023] to revise section 7.1 of that RFC, which suggests behavior regarding character encoding metadata that is inconsistent with this finding.

8 References

IANA: Internet Assigned Numbers Authority (IANA) (See http://www.iana.org/.)
RFC1866: T. Berners-Lee, D. Connolly. Hypertext Markup Language - 2.0, RFC1866, November 1995. (See http://www.ietf.org/rfc/rfc1866.)
RFC2046: N. Freed, N. Borenstein. Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, RFC2046, November 1996. (See http://www.ietf.org/rfc/rfc2046.txt.)
RFC2119: S. Bradner. Key words for use in RFCs to Indicate Requirement Levels, RFC2119, March 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)
RFC2616: R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1, RFC2616, June 1999. (See http://www.ietf.org/rfc/rfc2616.txt.)
RFC2854: D. Connolly, L. Masinter. The 'text/html' Media Type, RFC2854, June 2000. (See http://www.ietf.org/rfc/rfc2854.txt.)
RFC3023: M. Murata, S. St. Laurent, D. Kohn. XML Media Types, RFC3023, January 2001. (See http://www.ietf.org/rfc/rfc3023.txt.)
SMIL20: J. Ayars et al. Synchronized Multimedia Integration Language (SMIL 2.0), Second Edition, W3C Recommendation, 7 January 2005. (See http://www.w3.org/TR/2005/REC-SMIL2-20050107/.)
SRGS10: A. Hunt, S. McGlashan eds. Speech Recognition Grammar Specification Version 1.0, W3C Recommendation, 16 March 2004. (See http://www.w3.org/TR/2004/REC-speech-grammar-20040316/.)

9 Acknowledgments

The first edition of this finding was edited by Ian Jacobs and included substantial input from Roy T. Fielding, Stuart Williams, and Dan Connolly. Martin Dürst, Philipp Hoschka, Rob Lanphier, and Norman Walsh provided reviews of prior drafts that improved this finding. This second edition has additionally benefited from the comments of Noah Mendelsohn.