14364 – appcache: Add an API to make appcache support caching specific URLs dynamically

This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Robin Berjon
QA Contact:	HTML WG Bugzilla archive list

Reported:	2011-10-03 14:00 UTC by Louis-R
Modified:	2013-04-24 16:20 UTC
CC List:	14 users

Description Louis-R 2011-10-03 14:00:00 UTC

A simple way to make the appcache dynamic would be to allow data: URIs as manifests, so that scripts could request that new resources be cached without server round-trips.

This is of course not an ideal solution to make the appcache dynamic, but it is one that is easy to implement and to get out the door quickly.

Comment 1 Ian 'Hixie' Hickson 2011-10-03 22:35:53 UTC

We're not going to add sub-optimal solutions just so we can get something out one year earlier, when the Web is going to last decades. :-)

What we need here is a clear understanding of the use cases and requirements. What are the cases where you're wishing you could add URLs to the appcache dynamically?

Comment 2 Philipp Hagemeister 2011-10-07 12:35:38 UTC

Wouldn't that allow anyone to hijack a website forever?

1. Attacker temporarily gains control over the content of http://example.com/ , and writes

<html manifest="data:text/cache-manifest;base64,Q0FDSEUgTUFOSUZFU1QK">
example.com defaced!
</html>

2. User visits http://example.com/, puts the page in appcache.

3. Rightful owner of example.com regains control (or domain ownership changes if the domain was hijacked, ...).

4. User visits http://example.com/, still sees defacement.

How can the rightful owner of example.com ever serve the user anything?
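
To make the scenario above concrete, the following sketch decodes the data: URI used in the example. The helper name `decodeBase64DataUri` is an assumption made for illustration. The payload turns out to be nothing more than the manifest header line, and because the manifest lives inside the URI there is no manifest URL on the server that the owner could later change or remove.

```javascript
// The manifest attribute value from the example above.
const manifestUri = "data:text/cache-manifest;base64,Q0FDSEUgTUFOSUZFU1QK";

// Hypothetical helper: extract and decode the payload of a base64 data: URI.
function decodeBase64DataUri(uri) {
  const marker = ";base64,";
  const index = uri.indexOf(marker);
  if (!uri.startsWith("data:") || index === -1) {
    throw new Error("not a base64 data: URI");
  }
  // atob is available in browsers and as a global in recent Node.js versions.
  return atob(uri.slice(index + marker.length));
}

const manifestText = decodeBase64DataUri(manifestUri);
console.log(JSON.stringify(manifestText)); // prints "CACHE MANIFEST\n"
```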


On the other hand, locking the content (and scripts) of a website forever could also provide benefits to a carefully-engineered project. JavaScript on the page could somehow download the new version, cryptographically verify it (beyond SSL, which may be compromised by .gov actors, like google.com in Iran recently), and only then update to the new version.

Comment 3 Ian 'Hixie' Hickson 2011-10-21 22:43:39 UTC

Yeah we're definitely not using data: for this.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: What are the use cases for making appcache dynamic? (I'm not saying there aren't any, I just need to know what they are to design the solution for them.)

Comment 4 Louis-R 2011-10-24 19:52:37 UTC

(In reply to comment #3)
> Yeah we're definitely not using data: for this.
> 
> EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
> satisfied with this response, please change the state of this bug to CLOSED. If
> you have additional information and would like the editor to reconsider, please
> reopen this bug. If you would like to escalate the issue to the full HTML
> Working Group, please add the TrackerRequest keyword to this bug, and suggest
> title and text for the tracker issue; or you may create a tracker issue
> yourself, if you are able to do so. For more details, see this document:
>    http://dev.w3.org/html5/decision-policy/decision-policy.html
> 
> Status: Did Not Understand Request
> Change Description: no spec change
> Rationale: What are the use cases for making appcache dynamic? (I'm not saying
> there aren't any, I just need to know what they are to design the solution for
> them.)

Granted, using data isn't the best option.

I've written an extensive blog post about the use cases for a dynamic appcache: http://www.louisremi.com/2011/10/07/offline-web-applications-were-not-there-yet/

tl;dr: if you build an RSS reader with a checkbox to make articles available offline, it's easy to store/delete the text content of the article at will using localStorage or IndexedDB, but it's impossible to store/delete the associated images (and sounds/videos). You could dynamically generate a cache manifest for all "offline enabled" articles, but the client would have to re-download all resources every time the manifest is updated, as you know. (And you can't store images as data: URIs, since they come from different origins.)
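
The manifest-regeneration workaround mentioned above might look roughly like the sketch below. It is only an illustration: `buildManifest`, the article objects, and the URLs are hypothetical, and the function would run on the server (or in Node.js), not inside the appcache. It shows that enabling one more article changes the manifest text, and any change to the manifest triggers a cache update in which, as noted above, the client fetches the listed resources again.

```javascript
// Hypothetical generator: one manifest covering every "offline enabled" article.
function buildManifest(offlineArticles, revision) {
  const urls = new Set(); // de-duplicate resources shared between articles
  for (const article of offlineArticles) {
    for (const url of article.resources) urls.add(url);
  }
  return [
    "CACHE MANIFEST",
    `# revision ${revision}`,
    "",
    "CACHE:",
    ...urls,
    "",
    "NETWORK:",
    "*",
    "",
  ].join("\n");
}

const before = buildManifest(
  [{ resources: ["http://images.example.org/a.png"] }],
  1
);
const after = buildManifest(
  [
    { resources: ["http://images.example.org/a.png"] },
    { resources: ["http://media.example.net/b.ogg"] },
  ],
  2
);

// The text differs, so the browser would treat this as a new manifest.
console.log(before !== after); // prints true
```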

Mozilla implemented a simple "OfflineResourceList" API which solves that problem by enhancing applicationCache with "add()" and "remove()" methods.
This is the kind of solution I am looking for, although "add" is a confusing name, since it should be able to update a particular resource too.

There is a risk that this API could cause confusion amongst web developers. Should they use a cache manifest or abandon it completely in favor of the JS API? I believe the cache manifest should be advocated for the application structure+presentation+logic (HTML, CSS, JS), while the dynamic API should be used for the application *content* (media, XML, JSON).

Comment 5 Ian 'Hixie' Hickson 2011-10-25 02:26:46 UTC

Thanks, will investigate.

Comment 6 Ian 'Hixie' Hickson 2011-10-27 00:15:14 UTC

So the problem is that you write an application that, while online, downloads a bunch of data from the server, and this data includes references to cross-origin images, and you want to make sure that those immediately get cached too, so that when the user later goes offline and tries to use that data, the browser won't otherwise be able to show the images?

You can work around that today using the FALLBACK section, no? (List the foreign image sites as fallback namespaces that fall back to a "broken image" icon, say, and then when you fetch all the data from your server, quickly also create <img> elements for all those foreign images. They'll then be cached.)

Still, I could see how that wouldn't be satisfactory. So for this use case, we'd need an API to add a URL to the cache manually, an API to remove a URL from the cache manually, and an API to list all the files that have been added manually? That seems easy enough to support.
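
The three operations listed above (add a URL, remove a URL, list the manually added URLs) can be modelled with a small in-memory sketch. Everything here is hypothetical: `ManualCacheList` and its method names are not part of any specification or browser, and `fetchResource` is a synchronous stand-in for a real network fetch. Calling `add` on a URL that is already present refreshes it, which covers the "update a particular resource" case raised in comment 4.

```javascript
// Hypothetical in-memory model of a manually managed cache list.
class ManualCacheList {
  constructor(fetchResource) {
    this.fetchResource = fetchResource; // stand-in for a real network fetch
    this.entries = new Map(); // url -> cached body
  }

  // Fetch and store a URL; re-adding an existing URL refreshes its body.
  add(url) {
    this.entries.set(url, this.fetchResource(url));
  }

  // Drop a URL; returns true if it was present.
  remove(url) {
    return this.entries.delete(url);
  }

  // List every URL that was added manually.
  list() {
    return [...this.entries.keys()];
  }

  // Look up a cached body, or null when the URL was never added.
  match(url) {
    return this.entries.has(url) ? this.entries.get(url) : null;
  }
}

let fetchCount = 0;
const cache = new ManualCacheList(
  (url) => `body of ${url} (fetch #${++fetchCount})`
);
cache.add("http://images.example.org/a.png");
cache.add("http://images.example.org/b.png");
cache.add("http://images.example.org/a.png"); // refresh an existing entry
cache.remove("http://images.example.org/b.png");
console.log(cache.list()); // only a.png remains
```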

Comment 7 Ian 'Hixie' Hickson 2011-11-03 16:03:26 UTC

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: none yet
Rationale: The use case described in comment 6 seems reasonable. I have marked this LATER so that we can look at this once browsers have caught up with what we've specified so far.

Comment 8 Simon Pieters 2011-11-04 06:16:57 UTC

(In reply to comment #7)
> I have marked
> this LATER so that we can look at this once browsers have caught up with what
> we've specified so far.

I believe this has already happened.

Comment 9 Ian 'Hixie' Hickson 2011-11-04 17:08:04 UTC

I didn't mean just with appcache.

Do I take it from your comment that there is implementation interest in adding this now?

Comment 10 Anne 2011-11-15 12:18:52 UTC

It seems both developers and implementors want this, yes.

Comment 11 michaeln 2011-11-15 22:48:10 UTC

I think this request makes sense and would be a great convenience, but it is not the most pressing issue to resolve.

But tweaking the model for loading pages from, associating pages with, and updating caches so that it works for a wider variety of use cases is more of a priority (imo). I'd like to see that get in better shape prior to mixing in support for ad-hoc resources.

Comment 12 Ian 'Hixie' Hickson 2012-05-03 18:12:24 UTC

An idea I was kicking around would be to instead have just a way to declare a JS file as being a local interceptor, and then have that JS file be automatically launched in a worker thread, and then every network request gets proxied through that worker in some well-defined manner. The worker could then either say "do whatever you would normally do for that URL", or "redirect to this URL and try again", or "here's the data for that URL".

That would allow authors to implement the above add/remove functionality themselves just by pushing the data into a blob store (Filesystem API, IndexedDB), which would be just a few lines of code, while also allowing much more flexible approaches.

Any opinions?
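
A minimal sketch of the interceptor idea, under stated assumptions: the decision objects (`default`, `redirect`, `respond`), the `createInterceptor` function, and the use of plain `Map` objects in place of a real blob store are all hypothetical and only illustrate the three outcomes described above. "Add" and "remove" then reduce to writing to and deleting from the store.

```javascript
// Hypothetical interceptor: every request URL is passed through onRequest,
// which answers with one of three decisions.
function createInterceptor(blobStore, redirects) {
  return function onRequest(url) {
    if (blobStore.has(url)) {
      // "here's the data for that URL"
      return { action: "respond", body: blobStore.get(url) };
    }
    if (redirects.has(url)) {
      // "redirect to this URL and try again"
      return { action: "redirect", url: redirects.get(url) };
    }
    // "do whatever you would normally do for that URL"
    return { action: "default" };
  };
}

const blobStore = new Map(); // stand-in for a real blob store
const redirects = new Map([
  ["http://example.com/old.png", "http://example.com/new.png"],
]);
const onRequest = createInterceptor(blobStore, redirects);

// "add": push the data into the store.
blobStore.set("http://images.example.org/a.png", "<image bytes>");
console.log(onRequest("http://images.example.org/a.png").action); // prints respond

// "remove": delete it again, and the request falls through to the network.
blobStore.delete("http://images.example.org/a.png");
console.log(onRequest("http://images.example.org/a.png").action); // prints default
```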

Comment 13 Philipp Hagemeister 2012-05-03 21:05:53 UTC

The JavaScript redirector sounds fantastic, but it sounds complicated to implement in the current state.

Wouldn't it be way simpler to just load a defined fallback HTML document? For example, given the following appcache:

CACHE MANIFEST
ALIAS:
/x.html /serve-file.html
/files/* /serve-file.html
# serve-file.html is automatically included in the appcache

A request to /files/test.html would just render serve-file.html, but under the original (window.)location (just like FALLBACK does). In fact, ALIAS would be exactly like a FALLBACK entry that always fails to load. Additionally, the * placeholder would allow marking multiple URLs at once as belonging to the manifest.

On review, this seems very easy to implement, both for user agent and web application authors.

As a downside, it doesn't allow embedding of non-HTML resources like images. It does allow downloads via window.location.replace(dataUri). To me, that doesn't seem like a big deal, since any dynamically generated page should be using data URIs for dynamically generated images/scripts/styles in the first place.
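
The ALIAS lookup proposed in this comment could be sketched as follows. The ALIAS section is only a proposal in this thread, so the parsing and matching rules here are assumptions: a pattern ending in `*` is treated as a prefix match, any other pattern as an exact match, and the first matching entry wins. The function names are hypothetical.

```javascript
// Hypothetical parser for the proposed ALIAS section of a manifest.
function parseAliasSection(manifestText) {
  const aliases = [];
  let inAlias = false;
  for (const raw of manifestText.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    if (/^[A-Z-]+:$/.test(line)) {
      inAlias = line === "ALIAS:"; // track which section we are in
      continue;
    }
    if (!inAlias) continue;
    const [pattern, target] = line.split(/\s+/);
    aliases.push({ pattern, target });
  }
  return aliases;
}

// Return the page that should be rendered for a path, or null if none applies.
function resolveAlias(path, aliases) {
  for (const { pattern, target } of aliases) {
    const matches = pattern.endsWith("*")
      ? path.startsWith(pattern.slice(0, -1)) // assumed: trailing * means prefix
      : path === pattern;
    if (matches) return target;
  }
  return null;
}

const aliases = parseAliasSection(
  [
    "CACHE MANIFEST",
    "ALIAS:",
    "/x.html /serve-file.html",
    "/files/* /serve-file.html",
    "# serve-file.html is automatically included in the appcache",
  ].join("\n")
);

console.log(resolveAlias("/files/test.html", aliases)); // prints /serve-file.html
console.log(resolveAlias("/other.html", aliases)); // prints null
```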

Comment 14 Ian 'Hixie' Hickson 2012-05-04 18:10:01 UTC

The idea would be to render pages, images, etc. from data in IndexedDB, not just to hardcode aliases. (This is in the context of wanting to add and remove URLs from the appcache, which would be easily implementable using a worker as described above.)

Comment 15 michaeln 2012-05-04 22:54:11 UTC

> Wouldn't it be way simpler to just load a defined fallback HTML document? For
> example, given the following appcache:
> 
> CACHE MANIFEST
> ALIAS:
> /x.html /serve-file.html
> /files/* /serve-file.html
> # serve-file.html is automatically included in the appcache

Chromium's appcache actually has a feature that's very close to what's described here, with a slightly different syntax. The URL in the first column is considered a namespace prefix, just like entries in the FALLBACK section.

CHROMIUM-INTERCEPT:
/Bugs/Public/show_bug.cgi?id= return /Bugs/Public/bug_shower_page.html

http://code.google.com/p/chromium/issues/detail?id=101565
http://codereview.chromium.org/8396013/

I don't think this addresses what this particular W3C issue is about.

Comment 16 contributor 2012-07-18 07:26:48 UTC

This bug was cloned to create bug 17974 as part of operation convergence.

Comment 17 Robin Berjon 2013-01-21 16:00:22 UTC

Mass move to "HTML WG"

Comment 18 Robin Berjon 2013-01-21 16:03:06 UTC

Mass move to "HTML WG"

Comment 19 Robin Berjon 2013-04-24 16:20:59 UTC

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:

   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: none
Rationale: This will be addressed by NavCon.