Re: PROPOSAL: i74: Encoding for non-ASCII headers from Roy T. Fielding on 2008-04-03 (ietf-http-wg@w3.org from April to June 2008)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 3 Apr 2008 16:40:43 -0700
To: Mark Nottingham <mnot@mnot.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <ADE32394-C6A9-4A0B-84B7-3F7589506B35@gbiv.com>

On Apr 3, 2008, at 3:58 PM, Mark Nottingham wrote:
>
> Yes, that's one path we can take, but we need to make that decision.

Not really.  I'll reiterate -- it makes absolutely no sense
whatsoever to be micromanaging the unused details of octet encoding
when the specification parts that define that very thing are already
on the hook to be removed/rewritten in accordance with the draft
partitioning.  This is not the low-hanging fruit we are looking for.

The parsing algorithm will not say anything about C1 controls because
no known implementation of HTTP checks for C1 controls.  HTTP cares
about ASCII ":" and CRLF.  This means that the message parser will
carve the message into envelope, headers, and body before we can even
think about other encodings.  We are then left with four places in
which encodings can be defined: request/response line, defined header
fields, the grammar for extension header fields, and the message body.
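
To make the carving step concrete, here is a minimal sketch of a parser
that looks only at ASCII ":" and CRLF and passes every other octet
through untouched.  It is an illustration under stated assumptions, not
code from any implementation: the name carve_message is hypothetical,
and header line folding and other edge cases are deliberately ignored.

```python
def carve_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, header fields, body.

    Assumption: no folded (continuation) header lines are present.
    Only CRLF and ":" are inspected; no character encoding is applied.
    """
    # The first empty line (CRLF CRLF) ends the header section.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no CRLF CRLF terminating the header section")

    lines = head.split(b"\r\n")
    start_line = lines[0]  # request line or status line

    fields = []
    for line in lines[1:]:
        # The first ":" separates field name from field value.
        name, colon, value = line.partition(b":")
        if not colon:
            raise ValueError("header line without ':'")
        # Trim only surrounding SP / HTAB; the value stays raw octets.
        fields.append((name, value.strip(b" \t")))

    return start_line, fields, body


raw = (b"GET / HTTP/1.1\r\n"
       b"Host: example.com\r\n"
       b"X-Note: caf\xc3\xa9 \x85\r\n"
       b"\r\n"
       b"body")
start_line, fields, body = carve_message(raw)
```

Note that the X-Note value above contains both UTF-8 octets and a C1
control octet (0x85); the carving step never examines either, so any
decision about encodings is left to whatever later interprets that
specific field.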

Each of those can be dealt with in isolation, when we've reached the
point where we can make decisions based on what has been implemented.

It is important to understand that there are no generic field-content
parsers for HTTP, so specifying generic requirements that are not
backed up by implementations is a bad idea.  The specification can be
dramatically simplified (and made more future-compatible with UTF-8)
by removing the unused requirements that are not found in
implementations.  Adding more generic field encoding requirements,
which are neither needed nor backed by current practice, is equally unwise.

....Roy

Received on Thursday, 3 April 2008 23:41:20 UTC