Re: ISSUE-4: Versioning, namespace URIs and MIME types from Maciej Stachowiak on 2009-02-19 (public-html@w3.org from February 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Thu, 19 Feb 2009 07:32:41 -0800
To: Sam Ruby <rubys@intertwingly.net>
Cc: Ian Hickson <ian@hixie.ch>, Larry Masinter <masinter@adobe.com>, HTML WG <public-html@w3.org>
Message-id: <875872DC-4FE1-49D2-AD98-E643F5C863EB@apple.com>

On Feb 19, 2009, at 4:58 AM, Sam Ruby wrote:

> Ian Hickson wrote:
>> On Thu, 19 Feb 2009, Sam Ruby wrote:
>>> The only way forward in situations like this is to start over with
>>> a new format.
>> That's one way forward, but not the only one. HTML5's approach has  
>> been to embrace the reality of implementations, and replace  
>> previous specifications with a definitive comprehensive set of  
>> requirements that matches implementations, including their really  
>> strange behaviour (such as "quirks mode" in HTML).
>> It's a whole hell of a lot more work than writing a new language  
>> from scratch, but it has the advantages of not requiring consumers  
>> to support two languages (one of which is effectively undefined)  
>> instead of just one, and of not requiring producers to start from  
>> scratch when updating to the new technologies (the latter is not a  
>> big problem for feed formats, where the feeds typically are  
>> generated from source material, but is quite a big deal for  
>> original-form formats like HTML, where the content exists only in  
>> the form of the "legacy" language).
>
> By snipping in the way you did, you made it appear as if there is a
> disagreement between you and me on this subject, where in fact,
> there is none; at least not at the broad brush level.

It seemed to me that you and Ian were largely in agreement as well.  
Atom is an example of starting over in the face of the mess that is  
RSS. It has been successful on a technical level, and has even seen  
some adoption. But RSS is still quite widespread compared to Atom.  
HTML5 is taking the (in some ways) more ambitious approach of  
specifying the mess. These both seem like reasonable strategies for  
their respective problem domains. As Ian said, for feeds, the ability  
to adopt new features incrementally is less essential than for HTML.

....

> For completeness, I feel compelled to state that none of these  
> principles can ever be cleanly applied.  GIF and JPEG can both be  
> used to produce similar effects.  For a number of reasons, there was  
> need for something more, and PNG was created.  How do browsers tell  
> them apart? By looking at the data.  Whether this is a flagrant  
> violation of web architecture (Authoritative Metadata[1]) or is a
> prime example of Extending and Versioning Languages[2] is a matter
> of perspective.

I would argue that binary formats are different from text formats.  
Because binary formats make extensive use of magic numbers, generally  
detection by content-sniffing is more reliable than detection via an  
external label like a MIME type. Also, it is rarely the case that a  
single piece of binary data could validly be interpreted according to  
one of several formats. With text formats, one may well have a single
file that could be served as text/plain, text/html, or
application/xhtml+xml, with different intended processing in each case.
With binary formats, this just doesn't come up.
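
To make the magic-number point concrete, here is a minimal sketch in Python of sniffing the three image formats mentioned above by their leading bytes. The byte signatures are the well-known ones for PNG, GIF, and JPEG; the table name MAGIC_NUMBERS and the function sniff_image_type are hypothetical names chosen for illustration, not taken from any browser or specification.

```python
from typing import Optional

# Hypothetical lookup table: leading "magic number" bytes -> MIME type.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),   # 8-byte PNG signature
    (b"GIF87a", "image/gif"),              # GIF, 1987 revision
    (b"GIF89a", "image/gif"),              # GIF, 1989 revision
    (b"\xff\xd8\xff", "image/jpeg"),       # JPEG start-of-image marker
]


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a MIME type if the data starts with a known signature, else None."""
    for magic, mime in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime
    return None


# The signatures are mutually exclusive, so a given byte stream matches
# at most one of them.
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
gif_bytes = b"GIF89a" + b"\x00" * 16
jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 16

# A text document carries no such signature; these same bytes could be
# served as text/plain, text/html, or application/xhtml+xml, and only an
# external label says which processing is intended.
text_bytes = b"<html><body>Hello</body></html>"
```

Here the three binary samples each resolve to exactly one type, while the text sample yields None from the sniffer and has to rely on its label.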

Regards,
Maciej

Received on Thursday, 19 February 2009 15:33:25 UTC