Steve Souders Offline Web Apps Workshop Position Paper

Author: Steve Souders

SUMMARY: App cache is complicated and frequently produces an unexpected user experience. It's also being (ab)used as a workaround for the fact that the browser's cache does not cache in an effective way - this is just an arms race for finite resources.

DETAILS: I've spoken at many mobile-specific conferences and meetups in the last few months. When I explain the way app cache actually works, developers come up afterward and say "now I finally understand what was happening with my offline app." These are the leading mobile developers in the world - and they don't get it.

John Allsopp does a good job of outlining the gotchas, and I've added some (slides 50 & 51):

HTML pages that declare the manifest attribute are stored in app cache by default, even if they're not listed in the CACHE: section of the manifest file.

If a CACHE: resource 404s then none of the resources are cached.

The manifest file must be changed in order for changed CACHE: resources to be updated.

Modified CACHE: resources aren't seen by the user until the second time they load the app - even if they're online.
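The last two gotchas are easier to see in a toy model. The sketch below is not a browser API: the Server and AppCache types and the loadPage function are invented for illustration, and the model assumes the update check runs after the page has already been rendered from cache.

```typescript
// Toy model of the app cache update cycle. All names here are hypothetical.
interface Server {
  manifestText: string;             // current bytes of the manifest file
  resources: Map<string, string>;   // URL -> current content on the server
}

interface AppCache {
  manifestText: string;             // manifest bytes seen at the last update
  resources: Map<string, string>;   // URL -> cached content
}

// Simulates one online page load and returns what the user sees for `url`.
function loadPage(server: Server, cache: AppCache, url: string): string {
  // 1. The page renders from the cache first, even when the user is online.
  const shown = cache.resources.get(url) ?? "";
  // 2. The manifest is checked afterwards. Only a change to the manifest
  //    itself triggers a re-download of the CACHE: resources.
  if (server.manifestText !== cache.manifestText) {
    cache.manifestText = server.manifestText;
    cache.resources = new Map(server.resources);
  }
  return shown;
}

const server: Server = {
  manifestText: "CACHE MANIFEST\n# v1\napp.js",
  resources: new Map([["app.js", "v1"]]),
};
const cache: AppCache = {
  manifestText: server.manifestText,
  resources: new Map(server.resources),
};

server.resources.set("app.js", "v2");                   // new script deployed
const afterDeploy = loadPage(server, cache, "app.js");  // manifest untouched: still old
server.manifestText = "CACHE MANIFEST\n# v2\napp.js";   // manifest bumped
const firstLoad = loadPage(server, cache, "app.js");    // update happens after render
const secondLoad = loadPage(server, cache, "app.js");   // new version finally shown
```

In this model the user keeps seeing the old script until the manifest changes, and then for one more load after that.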

It's easy to point out problems - you folks have the more difficult job of finding solutions. But I'll make a few suggestions:

If the user is online, always do a conditional GET (If-Modified-Since or If-None-Match) request for the manifest file - Having to modify or touch the manifest file when an image or script changes is easy for developers to forget.
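For reference, the server-side decision behind a conditional GET can be sketched as follows. This is a simplified illustration rather than a complete HTTP implementation: the type and function names are made up for the example, ETags are compared by exact string match, and ETag lists and the * wildcard are not handled.

```typescript
// Validators the server currently holds for a resource (hypothetical shape).
interface Validators {
  etag?: string;
  lastModified?: Date;
}

// Conditional headers sent by the browser (hypothetical shape).
interface ConditionalHeaders {
  ifNoneMatch?: string;
  ifModifiedSince?: Date;
}

// Returns 304 when the browser's copy is still valid, otherwise 200.
function respondToConditionalGet(
  current: Validators,
  req: ConditionalHeaders
): 200 | 304 {
  // If-None-Match takes precedence over If-Modified-Since when both are sent.
  if (req.ifNoneMatch !== undefined) {
    return current.etag !== undefined && req.ifNoneMatch === current.etag
      ? 304
      : 200;
  }
  if (req.ifModifiedSince !== undefined && current.lastModified !== undefined) {
    // Not modified if the resource is no newer than the browser's copy.
    return current.lastModified.getTime() <= req.ifModifiedSince.getTime()
      ? 304
      : 200;
  }
  // No usable validators: send the full response.
  return 200;
}
```

A 304 response carries no body, so checking every time the user is online costs one small round trip when nothing has changed.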

Use updated resources on first load - The developer needs a way to say "if the user is online, then fetch (some/all) of the CACHE: resources that have changed before rendering the app". I would vote to make this the default behavior, and provide a way to toggle it (in the manifest file or HTML attribute). Perhaps this should also be done at the individual resource level - "I want updated scripts to block the initial rendering, but nothing else". The manifest file could have an indicator of which resources to check & download before doing the initial rendering.
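One way to picture the per-resource version of this suggestion is sketched below. Nothing here exists in any manifest format today: the blockRender flag, the ManifestEntry type, and the planFirstLoad function are all hypothetical, and the sketch assumes the browser already knows which resources have changed.

```typescript
// A CACHE: entry with a hypothetical per-resource "block rendering" indicator.
interface ManifestEntry {
  url: string;
  blockRender: boolean; // hypothetical flag, not part of the manifest format
}

interface FirstLoadPlan {
  fetchBeforeRender: string[];  // changed resources that must block rendering
  fetchInBackground: string[];  // changed resources that can update later
}

// Decides what to fetch before the initial render and what to defer.
function planFirstLoad(
  entries: ManifestEntry[],
  changed: Set<string>,
  online: boolean
): FirstLoadPlan {
  const plan: FirstLoadPlan = { fetchBeforeRender: [], fetchInBackground: [] };
  for (const entry of entries) {
    // Offline, or unchanged: serve the cached copy and fetch nothing.
    if (!online || !changed.has(entry.url)) continue;
    if (entry.blockRender) {
      plan.fetchBeforeRender.push(entry.url);
    } else {
      plan.fetchInBackground.push(entry.url);
    }
  }
  return plan;
}
```

With scripts marked as blocking and images left unmarked, a changed script would hold up the first render while a changed image would not.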

404s - I haven't tested this myself, but discarding everything because one resource 404s seems like overkill. Every response in the CACHE: section should be cached, independent of the other responses. Perhaps this is browser-specific?
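The difference between the two policies can be sketched in a few lines. This is an illustration only: the FetchResult type and both function names are invented, and only a 200 status is treated as success.

```typescript
// Outcome of downloading one CACHE: resource (hypothetical shape).
interface FetchResult {
  url: string;
  status: number;
}

// All-or-nothing: a single failure means no resource is cached.
function cacheAllOrNothing(results: FetchResult[]): string[] {
  const allOk = results.every((r) => r.status === 200);
  return allOk ? results.map((r) => r.url) : [];
}

// Independent: each successful response is cached regardless of the others.
function cacheIndependently(results: FetchResult[]): string[] {
  return results.filter((r) => r.status === 200).map((r) => r.url);
}
```

Given three resources where the second returns a 404, the first policy caches nothing and the second caches the other two.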

updateReady flag - It's great that developers can use the updateready event to prompt the user to reload the app if any CACHE: resources have changed underneath them, but the bar is too high. In addition, there should be a flag that tells the browser to prompt the user automatically if any CACHE: resources were updated.
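For context, the handler a developer has to write today looks roughly like the sketch below. To keep it runnable outside a browser, it is written against a small hand-rolled interface (AppCacheLike) that covers only the members used; promptOnUpdate and FakeAppCache are names invented for this example.

```typescript
// The subset of window.applicationCache that this sketch relies on.
interface AppCacheLike {
  readonly UPDATEREADY: number;
  status: number;
  swapCache(): void;
  addEventListener(type: "updateready", listener: () => void): void;
}

// Registers a handler that swaps in the new cache and asks the user to reload.
function promptOnUpdate(
  cache: AppCacheLike,
  confirmReload: () => boolean,
  reload: () => void
): void {
  cache.addEventListener("updateready", () => {
    if (cache.status !== cache.UPDATEREADY) return;
    cache.swapCache();                 // switch to the newly downloaded cache
    if (confirmReload()) reload();     // reload only if the user agrees
  });
}

// Stand-in used to exercise the handler outside a browser.
class FakeAppCache implements AppCacheLike {
  readonly UPDATEREADY = 4;
  status = 0;
  swapped = false;
  private listeners: Array<() => void> = [];

  swapCache(): void {
    this.swapped = true;
  }
  addEventListener(_type: "updateready", listener: () => void): void {
    this.listeners.push(listener);
  }
  // Simulates the browser finishing an update download.
  fireUpdateReady(): void {
    this.status = this.UPDATEREADY;
    this.listeners.forEach((listener) => listener());
  }
}
```

In a browser that exposes app cache, the same function would be called with window.applicationCache, a wrapper around window.confirm, and a wrapper around location.reload. A flag of the kind proposed above would make this boilerplate unnecessary.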

Finally, on the topic of the arms race, I know of many websites that are using app cache as a way to store images, scripts, and stylesheets. Why? Because the browser's disk cache (in all browsers, AFAIK) is poorly implemented. App cache provides a dedicated amount of space for a specific website (as opposed to a common shared space). App cache also allows for prioritization - if I have 10 MB of resources, I can put the scripts in the CACHE: section so they don't get purged to make room for images, which are less painful to lose.
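The appeal of a dedicated space is easy to show with a toy shared cache. The sketch below assumes a size-bounded cache that evicts the least recently stored entry first. That eviction policy is an assumption made for illustration, not a description of any particular browser, and the class name and sizes are invented.

```typescript
// Toy size-bounded cache that evicts the least recently stored entry first.
class SharedDiskCache {
  private entries = new Map<string, number>(); // url -> size in KB, oldest first
  private usedKb = 0;
  private capacityKb: number;

  constructor(capacityKb: number) {
    this.capacityKb = capacityKb;
  }

  put(url: string, sizeKb: number): void {
    // Re-storing a URL replaces the old copy and refreshes its position.
    const existing = this.entries.get(url);
    if (existing !== undefined) {
      this.usedKb -= existing;
      this.entries.delete(url);
    }
    // Evict oldest entries until the new one fits.
    while (this.usedKb + sizeKb > this.capacityKb && this.entries.size > 0) {
      const oldest = this.entries.keys().next().value as string;
      this.usedKb -= this.entries.get(oldest)!;
      this.entries.delete(oldest);
    }
    if (sizeKb <= this.capacityKb) {
      this.entries.set(url, sizeKb);
      this.usedKb += sizeKb;
    }
  }

  has(url: string): boolean {
    return this.entries.has(url);
  }
}

const shared = new SharedDiskCache(100);
shared.put("mysite/app.js", 40);      // the script I care about
shared.put("other/photo1.jpg", 30);
shared.put("other/photo2.jpg", 30);   // cache is now full
shared.put("other/photo3.jpg", 30);   // pushes out the oldest entry
const scriptSurvived = shared.has("mysite/app.js");
```

In this model the script is evicted by another site's images, which is exactly what a site avoids by listing the script in its own CACHE: section.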

Certainly a better solution would be for the browsers to have improved the behavior of disk cache 5 years ago. But given where we are, an increasing number of websites are consuming the user's disk space. In most cases the user has no way to clear app cache, or doesn't know how. I just tried to clear my app cache in Chrome & Firefox without success - chrome://appcache-internals continues to list the resources even after clearing cache. This is amazing - even I, as a user, can't figure out how to clear my app cache for other websites!

Obviously, better user control over app cache is needed. I suggest that clearing "data" clears both the disk cache and app cache. Alternatively, we extend the browser UI to have an obvious "clear app cache" entry.