Re: text/* types and charset defaults [i20]

On 20/01/2008, at 10:28 PM, Julian Reschke wrote:

> <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-01.html#rfc.section.2.3.1.p.2>
> currently says:
>
> "When in canonical form, media subtypes of the "text" type use CRLF  
> as the text line break. HTTP relaxes this requirement and allows the  
> transport of text media with plain CR or LF alone representing a  
> line break when it is done consistently for an entire entity-body.  
> HTTP applications MUST accept CRLF, bare CR, and bare LF as being  
> representative of a line break in text media received via HTTP. In  
> addition, if the text is represented in a character set that does  
> not use octets 13 and 10 for CR and LF respectively, as is the case  
> for some multi-byte character sets, HTTP allows the use of whatever  
> octet sequences are defined by that character set to represent the  
> equivalent of CR and LF for line breaks. This flexibility regarding  
> line breaks applies only to text media in the entity-body; a bare CR  
> or LF MUST NOT be substituted for CRLF within any of the HTTP  
> control structures (such as header fields and multipart boundaries)."
>
> Does this need fixing?

If it does, that can be a separate issue (this text is lifted directly  
from RFC 2616).
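
For illustration only, the receiving side of that paragraph (accept CRLF,
bare CR, and bare LF as line breaks in text media) could be sketched
roughly as below. The function name is made up for this example, and it
assumes the entity-body has already been decoded to a string using its
labeled charset, so charsets that do not use octets 13 and 10 for CR and
LF are handled by the decoder rather than here.

```python
import re

# CRLF is listed first so it is consumed as one break, not as CR then LF.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_text_lines(text):
    """Split decoded text media on CRLF, bare CR, or bare LF.

    Hypothetical helper: operates on an already-decoded str, not on raw
    octets, and does not apply to HTTP control structures such as header
    fields or multipart boundaries.
    """
    return _LINE_BREAK.split(text)
```

This deliberately says nothing about header fields or multipart
boundaries, where CRLF remains mandatory.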


> So my proposal would be:
>
> - drop paragraph 4 (ISO-8859-1),
>
> - add a note covering Larry's points 1) and 2), and
>
> - mention this is a normative change in
>   <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-01.html#changes.from.rfc.2616>.


To be clear, we're talking about removing the entire fourth paragraph
(i.e., the last one) of section 2.3.1 in
<http://tools.ietf.org/id/draft-ietf-httpbis-p3-payload-01.txt>.
This includes removing both the ISO-8859-1 default and the MUST-level
requirement for labeling text/* in a charset other than ISO-8859-1.

What should happen to section 2.1.1? Given the changes above, the only  
still-relevant part of it seems to be:

> HTTP/1.1 recipients MUST respect the charset label provided by the  
> sender; and those user agents that have a provision to "guess" a  
> charset MUST use the charset from the content-type field if they  
> support that charset, rather than the recipient's preference, when  
> initially displaying a document.

The most straightforward thing to do may be to extract the text above  
and put it at the end of 2.3.1, removing the rest of 2.1.1.
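
As a rough sketch of what the retained requirement asks of a recipient
(not proposed spec text): use the charset from the Content-Type field
when it is present and supported, and only otherwise fall back to
whatever the user agent guessed. The function and parameter names are
hypothetical, and "supported" is approximated here by Python's codec
registry. Note that no ISO-8859-1 default is applied, in line with the
removal discussed above.

```python
import codecs
from email.message import Message


def choose_charset(content_type, guessed):
    """Pick the charset to use when initially displaying a document.

    content_type: value of the Content-Type header field.
    guessed: the recipient's own guess or preference (an assumption of
             this sketch; how it is produced is out of scope).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    labeled = msg.get_content_charset()  # None if no charset parameter
    if labeled is not None:
        try:
            codecs.lookup(labeled)  # raises LookupError if unsupported
            return labeled
        except LookupError:
            pass
    return guessed
```

For example, choose_charset("text/html; charset=UTF-8", "windows-1252")
picks the labeled charset, while an unlabeled "text/html" falls through
to the guess.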



--
Mark Nottingham     http://www.mnot.net/

Received on Tuesday, 22 January 2008 12:22:02 UTC