[XMLHttpRequest] responseText decoding from Anne van Kesteren on 2007-03-21 (public-webapi@w3.org from March 2007)

From: Anne van Kesteren <annevk@opera.com>
Date: Wed, 21 Mar 2007 12:29:09 +0100
To: "Web API WG (public)" <public-webapi@w3.org>
Message-ID: <op.tpjd6vf264w2qv@id-c0020>

For responseXML, implementations just follow (or have to follow) the rules
set by the XML specification. That said, I suppose the text/xml default of
US-ASCII is probably not followed there, and can't be, but that's just a
bug with text/xml in my opinion.


However, for responseText the situation is a bit trickier. Implementations
currently implement something along the lines of the following algorithm
(for compatibility with content):

   1. If Content-Type has a charset parameter, use that.
   2. Otherwise, if the response is XML, follow the application/xml rules.
   3. Otherwise, if Content-Type is not specified or empty, follow the
      application/xml rules.
   4. Otherwise, use UTF-8.
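
To make the four steps concrete, here is a rough sketch in Python. It is not
taken from any implementation: the function names are made up for
illustration, and the "application/xml rules" are reduced to a BOM check plus
the encoding declaration, with UTF-8 as the fallback. It ignores UTF-32 and
other details.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start
# of the body, e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def parse_content_type(header):
    """Split a Content-Type value into (media_type, charset or None)."""
    if header is None or not header.strip():
        return "", None
    parts = header.split(";")
    media_type = parts[0].strip().lower()
    charset = None
    for param in parts[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            charset = value.strip().strip("\"'")
            break
    return media_type, charset


def is_xml(media_type):
    """Assumed definition of an XML response: text/xml, application/xml, +xml."""
    return media_type in ("text/xml", "application/xml") or media_type.endswith("+xml")


def xml_rules(body):
    """Simplified application/xml detection: BOM, then declaration, then UTF-8."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith(b"\xff\xfe") or body.startswith(b"\xfe\xff"):
        return "utf-16"
    match = XML_DECL_ENCODING.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


def response_text_encoding(content_type, body):
    """Pick the encoding for responseText following steps 1 to 4."""
    media_type, charset = parse_content_type(content_type)
    if charset:                      # step 1: explicit charset parameter
        return charset.lower()
    if is_xml(media_type):           # step 2: XML response
        return xml_rules(body)
    if not media_type:               # step 3: Content-Type missing or empty
        return xml_rules(body)
    return "utf-8"                   # step 4: everything else
```

For example, response_text_encoding("text/plain", b"hi") falls through to
step 4, while a response with no Content-Type at all is sniffed as if it were
XML.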

This violates the rules for determining the encoding of text/css,
text/html, and text/plain documents, and probably other formats. The
text/css rules can probably be followed, but text/html and text/plain do
not fall back to UTF-8 by default, which appears to be a problem for
deployed content.
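
As a small illustration of why the choice of fallback matters, the Python
snippet below decodes the same bytes twice. ISO-8859-1 is used here only as
an assumed example of a non-UTF-8 default; the sample string is made up.

```python
# "café" encoded as UTF-8, as a server that omits the charset might send it.
body = "café".encode("utf-8")

# Decoded with the UTF-8 fallback from step 4 of the algorithm.
as_utf8 = body.decode("utf-8")

# Decoded with a hypothetical ISO-8859-1 default instead.
as_latin1 = body.decode("iso-8859-1")

print(as_utf8)
print(as_latin1)
```

Content written against the UTF-8 fallback reads correctly in the first case
and turns into mojibake in the second.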

Personally, I would like XMLHttpRequest to handle encodings the same way
as content not fetched through XMLHttpRequest, but it seems this might be
tricky to get right.

Suggestions welcome.


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Received on Wednesday, 21 March 2007 11:29:14 UTC