Server Push and Caching from Mark Nottingham on 2016-08-24 (ietf-http-wg@w3.org from July to September 2016)

From: Mark Nottingham <mnot@mnot.net>
Date: Wed, 24 Aug 2016 14:50:47 +1000
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <3904FEC0-4362-47A0-886A-B97FB97E2515@mnot.net>

Exploring how Server Push interacts with HTTP caching below. Feedback welcome.

# Pushing Uncacheable Content

RFC7540, Section 8.2 says:

> Pushed responses that are not cacheable MUST NOT be stored by any HTTP cache. They MAY be made available to the application separately.

As a result, a receiving HTTP cache cannot store any pushed response that fails the storage rules in RFC7234, Section 3 <http://httpwg.org/specs/rfc7234.html#response.cacheability>.

"Being made available to the application separately" could mean many things. One reading is that a truly uncacheable response (e.g., with `Cache-Control: no-store`) would bypass the HTTP cache but then be stored by the application in anticipation of a future request. That might lead to some surprising results for Web developers, because it effectively specifies yet another kind of browser caching (see separate thread).

However, uncacheable pushed responses might still be usable if a browser API for Server Push emerges. See <https://github.com/whatwg/fetch/issues/51>.

Do we think that an uncacheable response should have any side effects on a client other than through such a dedicated push-focused API (e.g., triggering an event)? See also the thread on "Scope of Server Push."
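
For reference, here is a rough sketch of the kind of storability test this section relies on. It covers only a simplified subset of RFC7234, Section 3 (it ignores the request method, request directives, `Authorization`, and cache-control extensions), and the function names and the lowercase-keyed `headers` dict are assumptions made for illustration only.

```python
# Status codes that RFC 7231 lists as cacheable by default.
CACHEABLE_BY_DEFAULT = {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}


def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def cache_may_store(status, headers, shared=False):
    """Simplified subset of the RFC 7234 Section 3 storage rules.

    `headers` is assumed to be a dict with lowercase field names.
    """
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-store" in cc:
        return False  # a pushed no-store response must bypass the cache
    if shared and "private" in cc:
        return False
    has_explicit_info = (
        "expires" in headers
        or "max-age" in cc
        or "public" in cc
        or (shared and "s-maxage" in cc)
    )
    return has_explicit_info or status in CACHEABLE_BY_DEFAULT
```

Under this sketch, a pushed response for which `cache_may_store(...)` returns `False` could only ever reach the application through some separate path, such as a push API.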

# Pushing and Freshness

RFC7540, Section 8.2 says:

> Pushed responses are considered successfully validated on the origin server (e.g., if the "no-cache" cache response directive is present (RFC7234, Section 5.2.2)) while the stream identified by the promised stream ID is still open.

This implies that, while that stream is open, the pushed response can be used by the cache, even when it contains any (or all) of the following cache directives:

* max-age=0
* no-cache
* s-maxage=0 (for shared caches)

The underlying principle here is that while the response stream is still open, it's semantically equivalent to a "normal" response to a just-issued request; it would be senseless to require it to be immediately revalidated before handing it to the application for use.

The cache can also store the response, but once the stream is closed, if that response is stale (either because of the presence of one of the directives above, or because of some combination of `Expires`, `Age`, `Date`, and `Cache-Control`), it will need to be revalidated before use.

Interestingly, cache freshness (as defined in RFC7234, Section 4.2) is also related to the concept of "successful validation on the origin server":

> A response's age is the time that has passed since it was generated by, or successfully validated with, the origin server.

We could read this as meaning that the pushed response is treated as having `Age: 0`, no matter what comes across the wire.

So, in summary -- any pushed response can be passed on to the application through a Server Push API. If it's cacheable, it can be stored in the cache, and even if it's stale or it requires validation, it can be served from cache while the response stream is still open (with some wiggle room).
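
One possible reading of that summary, as a sketch rather than anything normative: the `PushedResponse` type, its field names, and `cache_can_serve` are all hypothetical, directives are assumed to be pre-parsed into a dict, and `Expires` and heuristic freshness are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PushedResponse:
    # Parsed Cache-Control directives, e.g. {"no-cache": None, "max-age": "0"}.
    directives: dict = field(default_factory=dict)
    # True while the stream identified by the promised stream ID is open.
    stream_open: bool = True
    # Current age in seconds, as the cache would calculate it.
    current_age: int = 0


def freshness_lifetime(directives, shared=False):
    """Freshness lifetime from s-maxage/max-age only (Expires is ignored here)."""
    if shared and "s-maxage" in directives:
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    return 0


def cache_can_serve(resp, shared=False):
    """Can the cache hand this pushed response out without revalidating it?"""
    if "no-store" in resp.directives:
        return False  # never stored by the cache in the first place
    if resp.stream_open:
        return True  # counts as successfully validated on the origin server
    if "no-cache" in resp.directives:
        return False  # stream closed: must revalidate before use
    return freshness_lifetime(resp.directives, shared) > resp.current_age
```

With this reading, a response carrying `no-cache, max-age=0` is servable while `stream_open` is true and requires revalidation as soon as it flips to false.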

Note that HTTP does not put constraints on _how_ the application uses that response after it comes through the API or the cache; it might use it multiple times (e.g., an image might occur more than once on a page, or more than one downstream client might have made the request). It's just that this reuse isn't in the context of an HTTP cache's operation.

Does this agree with other people's understanding?

# Pushing and Invalidation

When a server wants to remove the contents of a client's cache for a given URL, but doesn't know what it's to be replaced with yet, it needs to invalidate. The only native HTTP mechanism for cache invalidation is described in RFC7234, Section 4.4:

> A cache MUST invalidate the effective Request URI (Section 5.5 of RFC7230) as well as the URI(s) in the Location and Content-Location response header fields (if present) when a non-error status code is received in response to an unsafe request method.

Since it is triggered by unsafe request methods (like POST), this can't be used in Server Push.

We _could_ use this loophole a bit further down:

> A cache MUST invalidate the effective request URI (Section 5.5 of RFC7230) when it receives a non-error response to a request with a method whose safety is unknown.

... by defining a method whose safety is deliberately left unknown (since if its safety is defined, it either won't be pushable, or won't trigger invalidation). E.g.

~~~
:method: INVALIDATE
:scheme: https
:authority: www.example.com
:path: /thing
~~~

However, doing that might cause problems with IANA, since we'd have to pick a safety value to register.
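
To make the two quoted invalidation rules concrete, here is a minimal sketch of the trigger condition in RFC7234, Section 4.4. The function name is made up for illustration, the method set lists only the safe methods from RFC 7231, and `INVALIDATE` is the hypothetical method from the example above, not a registered one.

```python
# Methods that RFC 7231 defines as safe.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def is_non_error(status):
    """RFC 7234 Section 4.4 counts 2xx and 3xx as non-error responses."""
    return 200 <= status < 400


def triggers_invalidation(method, status):
    """True if a cache must invalidate the effective request URI.

    Both unsafe methods and methods whose safety is unknown to the cache
    trigger invalidation; only known-safe methods do not.
    """
    if not is_non_error(status):
        return False
    return method.upper() not in SAFE_METHODS


print(triggers_invalidation("POST", 200))        # unsafe method
print(triggers_invalidation("GET", 200))         # safe method
print(triggers_invalidation("INVALIDATE", 204))  # hypothetical, safety unknown
```

In this sketch a cache that has never heard of `INVALIDATE` treats it the same way as POST for invalidation purposes, which is the loophole being described.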

Another approach would be to push a 404 (Not Found) or 410 (Gone) to trigger invalidation. Such a push would need to be uncacheable (e.g., with `Cache-Control: no-store`) to ensure that the error response itself isn't stored and returned for later requests; however, this runs afoul of HTTP/2's requirement that uncacheable responses not interact with the HTTP cache.

So, we've kind of built ourselves a nice trap here.

If invalidation is an important use case, we'll need to change one of these specifications, or invent a new protocol mechanism. Maybe a `CACHE_INVALIDATE` frame?

--
Mark Nottingham https://www.mnot.net/

Received on Wednesday, 24 August 2016 04:51:16 UTC