HTTP Vary header

http://weeble.lut.ac.uk/lists/http-caching/0235.html (in the mail archive) points to

http://www.ast.cam.ac.uk/%7Edrtr/vary-header-02.ps - PostScript
http://www.ast.cam.ac.uk/%7Edrtr/vary-header-02.txt - Plain text
http://www.ast.cam.ac.uk/%7Edrtr/vary-header-02.html - HTML

Daniel DuBois and David Robinson perhaps should revise this to allow the Vary: header to express the concept "this response does NOT vary with the XXXX: header".

>Suppose //www.myhost.com/page.html has two variants: Content-Language: 
>fr and Content-Language: en.
>[...]
>Suppose I made a request to my friendly proxy-cache
>	GET //www.myhost.com/page.html HTTP/1.1
>	Accept-Language: en;q=1,fr;q=.5
>
>and got back the english version, and it had a vary header
>	Vary: Content-Language

The parameter to the Vary: response header is the name of a request header,
so that would be "Vary: Accept-Language".

>which it cached along with page.html.
>Now, someone makes a request to the same cache:
>	GET //www.myhost.com/page.html HTTP/1.1
>	Accept-Language: en
>Can I serve it from the cache?  I think the answer is NO. Because I 

You are correct.
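
One way to see the "NO": a cache that only knows "Vary: Accept-Language" can
do little more than remember the Accept-Language value of the request that
produced the stored response and compare it, byte for byte, with the value on
each later request. Here is a minimal Python sketch of that comparison. The
names vary_key and can_serve_from_cache are made up for illustration and do
not come from the Vary draft.

```python
def vary_key(vary_header, request_headers):
    """Build a comparison key from the request headers named in Vary:."""
    names = [n.strip().lower() for n in vary_header.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # A header that is absent from the request is recorded as None.
    return tuple((n, lowered.get(n)) for n in sorted(names))


def can_serve_from_cache(vary_header, stored_request, new_request):
    """Strict rule: every header named in Vary: must match exactly."""
    return vary_key(vary_header, stored_request) == vary_key(vary_header, new_request)


stored = {"Accept-Language": "en;q=1,fr;q=.5"}
incoming = {"Accept-Language": "en"}
print(can_serve_from_cache("Accept-Language", stored, incoming))  # False
```

Under this strict rule the second request in the example is a cache miss,
simply because the two Accept-Language values are different strings.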

>because the qs multiplied by the higher q for "fr" could make it be the 
>choice in this case.
>[...]
>I think that I can serve up a cached page if every Accept* header in 
>the request has the same choices in the same order with the same or 
>lower q values for all but the first choice, which must have an equal 

I've never worked out the math to figure out what assumptions can be made
about the qs values based on any particular set of {q,ql,qe,qc,Q} tuples.
Seems like too much work.  It would be much better to make no assumptions
about qs and to have the server tell you the qs values - and that's exactly
what the URI header does.

>Based on these examples, am I correctly understanding how Vary: is 
>supposed to work?

Yes and no.  For the request/variant scenario you listed, a server SHOULD NOT
BE USING VARY.  Sorry to shout, but I want it to be real clear.  Vary: is
strictly for those cases where it's hopeless or excessively complicated for
a proxy to replicate what the server would do (other than storing the request
headers and doing strict equality comparisons against the headers of
subsequent requests).  Here's the parallel example for what you described
above, using the superior method, the URI: scheme:

Request to proxy-cache:
	GET http://www.myhost.com/page HTTP/1.1
	Accept-Language: en;q=1,fr;q=.5

Response from origin server:
	HTTP/1.0 200 OK
	[date, allow, server, last-mod, blah, blah]
	Content-Type: text/html
	Content-Language: fr
	Location: http://www.myhost.com/page.fr.html
	URI: {variant "page.en.html" 0.001 {type "text/html"} {language "en"}}
	     {variant "page.fr.html" 1.0 {type "text/html"} {language "fr"}}

Now when a proxy gets a subsequent request with
	GET http://www.myhost.com/page HTTP/1.1
	Accept-Language: en;q=1,fr;q=.99

the proxy can now compute qs*ql*qe*qc*q for each variant, and so knows which
variant to serve, whether it has that variant in cache, etc.
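
As a rough illustration of that computation, here is a Python sketch under
some loud assumptions: the variant list is transcribed by hand from the URI:
header in the example response rather than parsed, only the qs*ql part of the
product is computed (qe, qc and q are taken to be 1.0), and a language that
the request does not list is given ql = 0. The names parse_accept_language,
VARIANTS and choose_variant are hypothetical.

```python
def parse_accept_language(value):
    """Return {language: ql} from an Accept-Language value; ql defaults to 1.0."""
    prefs = {}
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val.strip())
        prefs[parts[0].lower()] = q
    return prefs


# Hand-transcribed from the URI: header above as (uri, qs, language).
VARIANTS = [
    ("page.en.html", 0.001, "en"),
    ("page.fr.html", 1.0, "fr"),
]


def choose_variant(accept_language, variants=VARIANTS):
    """Pick the variant with the highest qs * ql.

    Assumption: qe, qc and q are all 1.0, and an unlisted language gets ql = 0.
    """
    prefs = parse_accept_language(accept_language)
    scored = [(qs * prefs.get(lang, 0.0), uri) for uri, qs, lang in variants]
    return max(scored)[1]


print(choose_variant("en;q=1,fr;q=.5"))   # page.fr.html
print(choose_variant("en;q=1,fr;q=.99"))  # page.fr.html
print(choose_variant("en"))               # page.en.html
```

With the server's qs values in hand, both requests in the example resolve to
the French variant (0.5 and 0.99 against 0.001), while a request for "en"
alone resolves to the English one.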

>If so, then I think that for completeness the Vary: proposal  needs to 
>specify what "semantically identical" means for each of the header 
>values that can vary, because for at least some of them, it isn't 
>trivial, as I hope the above example shows.

I would think "semantically identical" would normally be implemented as
strcmp().  Or maybe a homegrown strcmp() that ignores LWS (linear
whitespace).  An excessively courageous proxy could special-case the Accept-*
headers, and actually go to the trouble of matching "Accept-Language:
fr ; q = 0.9, en" to "Accept-Language: en,fr;q=.90" I suppose.
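
To show the difference between the two approaches, here is a small Python
sketch. strip_lws stands in for the homegrown strcmp() that ignores LWS, and
accept_key is one possible "courageous" normalization of an Accept-* value
that ignores item order and the spelling of q. Both names are invented for
this example, and treating item order as insignificant is an assumption.

```python
def strip_lws(value):
    """Comparison key for a strcmp() that ignores all whitespace."""
    return "".join(value.split())


def accept_key(value):
    """Order-insensitive key for an Accept-* header value.

    Each item becomes (token, other parameters, q); q defaults to 1.0.
    """
    items = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        others = []
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val.strip())
            else:
                others.append("".join(param.split()).lower())
        items.append((parts[0].lower(), tuple(sorted(others)), q))
    return tuple(sorted(items))


a = "fr ; q = 0.9, en"
b = "en,fr;q=.90"
print(strip_lws(a) == strip_lws(b))    # False
print(accept_key(a) == accept_key(b))  # True
```

The whitespace-insensitive comparison still treats the two values as
different, so only the Accept-aware key would let the proxy reuse the cached
response here.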

You are right though, the addition of other headers into the mix makes the
issue non-trivial.  If a server says "Vary: Host", does "Host:
www.myhost.com" equal "Host:    www.myhost.com:80"?  (Ignore that clients
aren't supposed to send the port number.)  If a server says "Vary: From", does
"From: root@spyglass.com (Doug)" equal "From: root@spyglass.com (Doug Brooks)"?

Does anyone know if the issue of "semantically equal" headers (for all
headers, not just Accept-*) has been addressed elsewhere so we can leverage
that discussion?

Roy wonders why we need Vary headers at all.
> I'm way behind on my mail, so I don't know if this has been covered,
> but have you gotten a sufficient explanation of why it is that
> requiring URI headers for previously unnamed variations of a resource
> is a bad idea, but why having identifiers for those variations
> exchanged as part of proxy<->origin exchange is a good idea?

Nope, but I only sent it out yesterday.  I'm about 15 days behind
in some of my mail, so I'm not in too much of a hurry to get more. ;-)

......Roy
