Re: Review of Content-Encoding: value token

On Fri, 23 Jan 2009, Julian Reschke wrote:

> Mark Nottingham wrote:
>> 
>> Yes. If it doesn't preserve characters, all sorts of mess can result, e.g., 
>> with ETag comparison, range retrieval, etc.
>> ...
>
> I was looking at 
> <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.3.5>...:
>
> "Content coding values indicate an encoding transformation that has been or 
> can be applied to an entity. Content codings are primarily used to allow a 
> document to be compressed or otherwise usefully transformed without losing 
> the identity of its underlying media type and without loss of information."
>
> ...and was asking myself: is perfect reconstruction of the original payload 
> really required? Is there something we need to fix here?

Exact reconstruction is always safest. But it seems to me that a new 
encoding for an HTML (or XML) document which stripped trailing blanks and 
tabs might satisfy the above constraints. Since ETag creation is the 
responsibility of the origin server, I don't think the transformation I
describe would matter. The ETag would need to change if the pre-encoding 
content changed.
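
To make the first example concrete, here is a minimal sketch in Python. It
is illustrative only: strip_trailing_ws and make_etag are hypothetical names,
and hashing the pre-encoding content is just one assumed way an origin
server might derive an ETag. Nothing here comes from RFC 2616.

```python
import hashlib


def strip_trailing_ws(text: str) -> str:
    """Hypothetical lossy coding: drop trailing blanks and tabs on each line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def make_etag(text: str) -> str:
    """Hypothetical ETag: quoted, truncated hash of the pre-encoding content."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return '"' + digest[:16] + '"'


original = "<p>hello</p>  \n<p>world</p>\t\n"
encoded = strip_trailing_ws(original)

# The recipient cannot recover the stripped whitespace, so the coding is
# not exactly reversible, yet the markup itself is unchanged.
etag = make_etag(original)
```

The ETag is computed from the original, so it changes only when the
pre-encoding content changes, regardless of what the coding discards.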

On the principle that encoding performed by the content-owning server
needs to satisfy the semantics of "without loss of information" as judged 
by the content owner, I can think of cases where the content owner would
find automatic image scaling to match HTML-specified height and width an
appropriate encoding ...

I'm not advocating either of the above as anything more than examples of 
new encodings which shouldn't be excluded by a restriction for exact 
reconstruction of the original object.

Dave Morris

Received on Friday, 23 January 2009 06:41:39 UTC