RE: ACTION-308 (part 2) Updates to 'The Self-Describing Web' from Larry Masinter on 2010-01-06 (www-tag@w3.org from January 2010)

From: Larry Masinter <masinter@adobe.com>
Date: Wed, 6 Jan 2010 13:33:07 -0800
To: John Kemp <john@jkemp.net>, "www-tag@w3.org WG" <www-tag@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D30956C@nambxv01a.corp.adobe.com>

I am strongly opposed to promoting content-type sniffing to be
an architectural principle.

I find it only marginally acceptable to ALLOW content-type sniffing
by conforming receiving agents, when there is clear, compelling and
overwhelming evidence that there is a significant amount
of content that *needs* sniffing, and in that case, the "sniffing"
specification should not *mandate* sniffing but merely allow it,
and discourage its use.

However, no future design, context, application, W3C recommendation
or other specification should be encouraged to "sniff" content
and interpret message content based on unreliable heuristics
overriding unambiguous content labels.

Larry
--
http://larry.masinter.net


-----Original Message-----
From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of John Kemp
Sent: Monday, January 04, 2010 12:47 PM
To: www-tag@w3.org WG
Subject: ACTION-308 (part 2) Updates to 'The Self-Describing Web'

Hello,

As the second part of ACTION-308, I propose the following updates to 'The Self-Describing Web' finding [SelfDescWeb], to acknowledge the reality of content-type sniffing. I shall now mark ACTION-308 to be 'pending review'.

Regards,

- johnk

[SelfDescWeb] - http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
[ACTION-308] - http://www.w3.org/2001/tag/group/track/actions/308
[F2FMinutesSep2009] - http://www.w3.org/2001/tag/2009/09/24-minutes#item03

(begin proposed changes)

1.

Section 1: Introduction

After bullet point:

Each representation should include standard machine-readable indications, such as HTTP Content-type headers, XML encoding declarations, etc., of the standards and conventions used to encode it. 

Add:

... and every effort should be made to ensure that the intentions of the content author and publisher regarding interpretation of the content are accurately conveyed in such indications.

2.

Section 2: The Web's Standard Retrieval Algorithm

After paragraph:

Consider instead a different example, in which Bob clicks on a link to ftp://example.com/todaysnews. Although Bob's browser can easily open an FTP connection to retrieve a file, there is no way for the browser to reliably determine the nature of the information received. Even if the URI were ftp://example.com/todaysnews.html the browser would be guessing if it assumed that the file's contents were HTML, since no normative specification ensures that data from ftp URIs ending in .html is in any particular format. 

Add:

As noted above, and for other reasons (such as content aggregation), it may not be possible for a browser to reliably determine, via inspection of a Content-Type HTTP header or other external metadata alone, the intended interpretation of Web content. In such cases, a browser may inspect the content directly (commonly known as "sniffing"). The consequences of such an action are described in [AuthoritativeMetadata]. In particular, sniffing Web content should only be done using an accepted and secure algorithm, such as [BarthSniff].
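
As a rough illustration of the fallback described above, the following Python sketch consults the Content-Type label first and inspects the bytes only when no label is present at all. The function names, the tiny signature table, and the 512-byte inspection window are assumptions made for illustration; this is not the algorithm specified in [BarthSniff], and a real implementation would need that document's security considerations.

```python
from typing import Optional

# Hypothetical, deliberately small signature table (illustrative only;
# NOT the table defined in [BarthSniff]).
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff(body: bytes) -> str:
    """Guess a media type from the leading bytes of the content."""
    for magic, media_type in _SIGNATURES:
        if body.startswith(magic):
            return media_type
    # Assumed 512-byte window; skip leading whitespace, compare case-insensitively.
    head = body[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "text/html"
    # Nothing recognized: fall back to an opaque type rather than guessing further.
    return "application/octet-stream"


def resolve_media_type(content_type: Optional[str], body: bytes) -> str:
    """Prefer the explicit label; sniff only when no label was supplied."""
    if content_type and content_type.strip():
        # Drop parameters such as "; charset=utf-8" and normalize case.
        return content_type.split(";", 1)[0].strip().lower()
    return sniff(body)
```

In this sketch an explicit label is never overridden: a body that looks like HTML but is labeled text/plain is still reported as text/plain, and sniffing applies only to unlabeled content.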

3.

References:

Add:

[BarthSniff] http://tools.ietf.org/html/draft-abarth-mime-sniff-03

Received on Wednesday, 6 January 2010 21:33:49 UTC