[XMLHttpRequest] responseText decoding

For responseXML, implementations simply follow (or have to follow) the
rules set by the XML specification. I suspect the rule that text/xml
defaults to US-ASCII is not followed there, and probably cannot be, but
that is just a bug in text/xml in my opinion.


However, for responseText the situation is a bit trickier. Implementations
currently implement something along the lines of the following algorithm  
(for compatibility with content):

   1. If Content-Type has a charset parameter use that.
   2. Otherwise, if the response is XML follow the application/xml rules.
   3. Otherwise, if Content-Type is not specified or empty follow the
      application/xml rules.
   4. Otherwise, use UTF-8.
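
To make the four steps concrete, here is a rough sketch in Python. It is
only an illustration, not what any implementation actually does: the
function names are hypothetical, and xml_rules() is a simplified stand-in
for the application/xml rules (byte order mark, then the encoding in the
XML declaration, then UTF-8).

```python
import re

# All names below are hypothetical and exist only for this illustration.
_CHARSET_RE = re.compile(r';\s*charset\s*=\s*"?([^";\s]+)"?', re.IGNORECASE)
_XML_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._\-]+)["\']'
)


def xml_rules(body: bytes) -> str:
    """Simplified approximation of the application/xml rules (assumption)."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _XML_DECL_RE.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


def is_xml_type(mime: str) -> bool:
    """Treat text/xml, application/xml and any +xml type as XML."""
    return mime in ("text/xml", "application/xml") or mime.endswith("+xml")


def response_text_encoding(content_type, body: bytes) -> str:
    """Pick the encoding for responseText following the four steps above."""
    header = (content_type or "").strip()

    # Step 1: an explicit charset parameter wins.
    match = _CHARSET_RE.search(header)
    if match:
        return match.group(1).lower()

    mime = header.split(";", 1)[0].strip().lower()

    # Step 2: XML responses follow the application/xml rules.
    if is_xml_type(mime):
        return xml_rules(body)

    # Step 3: a missing or empty Content-Type is treated the same way.
    if not mime:
        return xml_rules(body)

    # Step 4: everything else is decoded as UTF-8.
    return "utf-8"
```

For example, response_text_encoding("text/html", body) falls through to
step 4 regardless of what the body contains, which is exactly where the
behaviour departs from the normal text/html rules.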

This violates the rules for determining the encoding of text/css,
text/html, and text/plain documents, and probably those of other formats
as well. The text/css rules can probably be followed, but the text/html
and text/plain rules do not fall back to UTF-8, which appears to be a
problem for deployed content.

Personally, I would like XMLHttpRequest to handle content the same way
content not fetched through XMLHttpRequest is handled, but it seems this
might be tricky to get right.

Suggestions welcome.


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Received on Wednesday, 21 March 2007 11:29:14 UTC