This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 20677 - Unable to detect whether a file is in cache or not
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Robin Berjon
QA Contact: HTML WG Bugzilla archive list
Reported: 2013-01-15 18:11 UTC by François REMY
Modified: 2016-04-25 21:07 UTC

Description François REMY 2013-01-15 18:11:01 UTC
Some offline web applications need a way to detect whether a file is in cache or not. Currently, it's not possible to do so without downloading the file. Here are a few use cases where this ability would be useful:

1. Using an already cached version of the file if it's available, downloading the best version otherwise.
	
	// if we need a low-resolution image
	if(window.devicePixelRatio <= 1) {
		
		// if the high resolution image has already been downloaded
		if(applicationCache.isCached("./images/abc-123.2x.png")) {
			// use the high resolution image straightaway
		} else {
			// use or download a standard resolution image (faster)
		}
		
	} else {
		
		// use or download the high-resolution image
		
	}
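
The proposed applicationCache.isCached() does not exist in any browser. As a rough sketch only, a similar check for same-origin resources can be approximated with the "only-if-cached" cache mode of fetch(). The helper names isInHttpCache and pickImageUrl are hypothetical, and the behaviour on a cache miss (a rejected promise or a 504 response) is assumed to vary between browsers, so both are treated as "not cached".

```javascript
// Hypothetical helper: resolves to true when `url` can be answered from the
// HTTP cache alone. `fetchImpl` is injectable so the logic can be exercised
// outside a browser.
async function isInHttpCache(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url, {
      cache: "only-if-cached", // never go to the network
      mode: "same-origin",     // required together with "only-if-cached"
    });
    return response.ok;        // a miss may show up as a non-2xx response
  } catch (err) {
    return false;              // or as a rejected promise
  }
}

// Hypothetical helper mirroring the branching in the use case above.
async function pickImageUrl(baseName, pixelRatio, isCached = isInHttpCache) {
  const standard = `./images/${baseName}.png`;
  const highRes = `./images/${baseName}.2x.png`;
  if (pixelRatio > 1) {
    return highRes; // high-density screen: always use the 2x image
  }
  // Low-density screen: reuse the 2x image only if it is already cached.
  return (await isCached(highRes)) ? highRes : standard;
}
```

A caller would then do something like `img.src = await pickImageUrl("abc-123", window.devicePixelRatio)`. This only works for same-origin URLs, so it does not cover the cross-origin dictionary case.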
	
2. Using an index if it was already downloaded

Let's suppose you have a list of scientific documents stored locally. The user types in a query to search the content of the documents. You want to help him find the results he wants by expanding his query (i.e. adding synonyms of the typed words to the query). You can do so by querying a third-party server that will give you the results based on a dictionary (<http://syn.org/word/{WORD}>).

However, the user may want to be able to use the functionality offline and download the dictionary <http://syn.org/all.json> as part of an appcache (this may result from an action done in another app using the same service, which you don't know about). If he does so, you want your application to use this dictionary directly and not issue a request to the server. We assume the CORS settings of the syn.org website are set up so that you can access all the data you need.

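	if(dict) {
		// the dictionary is already in memory
		task.done(dict[word]);
		
	} else if(applicationCache.isCached("http://syn.org/all.json")) {
		// the whole dictionary is cached: load it without hitting the network
		var xhr = ...
		xhr.open("GET", "http://syn.org/all.json", true);
		xhr.onload = function() {
			// assumes xhr.responseType was set to "json" in the elided setup
			dict = xhr.response;
			task.done(dict[word]);
		};
		...
		
	} else {
		// not cached: ask the server for this single word only
		var xhr = ...
		xhr.open("GET", "http://syn.org/word/"+encodeURIComponent(word), true);
		xhr.onload = function() {
			task.done(xhr.response);
		};
		
	}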

3. Evaluating the time needed to make a feature available offline.

Imagine a web application similar to Microsoft Excel. By default, the appcache will include the most basic features and functions, and a bit of formatting (the rest is being done in the cloud). However, if a user wants to edit graphics offline, he needs to download an API file for that. When the user is offline, you may want to know whether the graphing library is available (i.e. cached) so that you can put the "insert graph" button in its disabled state if it is not.

Now, the user returns online because he really needed the graphing library. The user wants to see how much content will need to be downloaded if he wants to "install" the graphing library for offline use next time. If the graphing library depends on some APIs used by other components, and you want to tell the user how many files will need to be downloaded to make the functionality available offline, you need to know how many of those components are already cached.

You can either try to persist this yourself or rely on applicationCache.isCached(...) to get accurate results.
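
As an illustration of the third use case, here is a minimal sketch that works out which files still have to be downloaded. The function name missingComponents and the file names in the usage note are hypothetical, and isCached stands for whatever check is available (the proposed API, or bookkeeping done by the application itself).

```javascript
// Hypothetical helper: given the files a feature depends on and an async
// `isCached(url)` predicate, return the files that are not cached yet.
async function missingComponents(urls, isCached) {
  // Query the cache state of every dependency in parallel.
  const flags = await Promise.all(urls.map((url) => isCached(url)));
  // Keep only the URLs whose flag is falsy, preserving the input order.
  return urls.filter((url, index) => !flags[index]);
}
```

For example, `(await missingComponents(["graph-lib.js", "shared-api.js"], isCached)).length` would give the number of files to announce to the user before "installing" the feature.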
Comment 1 Ian 'Hixie' Hickson 2013-01-15 19:28:39 UTC
What I'm not following is how you're getting the file into the cache in the first place to have the question of whether or not it's there to answer.

BTW, this bug is filed on the HTML working group; if you want the feature added to the WHATWG spec (which is the one I edit) then please file a new bug in, or move this bug to, the WHATWG product's HTML component. The form at the following address has all the fields filled in accordingly: 
   http://whatwg.org/newbug
Comment 2 François REMY 2013-01-15 20:16:10 UTC
OK, I'm going to create a duplicate. It would be easier if we could report a bug to more than one component at the same time, but I didn't see this option anywhere. Annoying if you want my opinion.

Now, to answer your question.

In the first case, the image could have been downloaded using <img src="abc-123.png" srcset="abc-123.2x.png 2x" /> if the browser was first launched on a 'retina' screen. You have no control over whether the browser will choose the first or the second one, but if later on you want to use the exact same image on the same device on a canvas, you want to use whatever version was already downloaded.

In the second case, the dictionary may have been cached because the user agreed to use an appcache for the synonym web service (which involves caching the dict). However, you don't want to use the file if it was not cached because it's probably way too huge to be used for just one lookup.

In the last case, I already explained why some files may be in cache: the appcache is minimalist by default but may be growing at the user request to include more components. As you want to add more components to the appcache, you may want to know which files are already cached. Some files may also have been downloaded without being in the appcache, on a per-use basis. With a bit of luck, they may still be in the cache and will not need to be downloaded again.
Comment 3 Ian 'Hixie' Hickson 2013-01-15 20:40:28 UTC
(In reply to comment #2)
> 
> In the first case, the image could have been downloaded using <img
> src="abc-123.png" srcset="abc-123.2x.png 2x" /> if the browser was first
> launched on a 'retina' screen. You have no control over whether the browser
> will choose the first or the second one, but if later on you want to use the
> exact same image on the same device on a canvas, you want to use whatever
> version was already downloaded.

How would that file get cached if it's not in the manifest?


> In the second case, the dictionary may have been cached because the user
> agreed to use an appcache for the synonym web service (which involves
> caching the dict). However, you don't want to use the file if it was not
> cached because it's probably way too huge to be used for just one lookup.

Oh this is not for testing if the file is in the _current_ appcache but for testing if the resource is in _any_ appcache? Interesting.

That would have some pretty serious freshness complications... (what if the cache we have the file in is a year old? Should we still think we have it for the purposes of this API? What if it's more up to date than the things in the current appcache?)


> In the last case, I already explained why some files may be in cache: the
> appcache is minimalist by default but may be growing at the user request to
> include more components.

Only if you use the fallback feature and visit the file directly, right? Or do you mean some other way?
Comment 4 François REMY 2013-01-15 21:33:04 UTC
(In reply to comment #3)
> How would that file get cached if it's not in the manifest?

Browser cache? A file may end up in the browser cache just by "viewing" it, even if that "cached" state is not persisted. The browser may use some heuristic to say whether the cached file is likely up to date or not.

Now that I think about it, maybe it would be great to have a second parameter that expresses the maximum age at which the cached version is still considered valid.
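
To make the "maximum age" parameter concrete, here is a small sketch of the "persist this yourself" alternative from the description: the application records when it stored each URL and answers isCached(url, maxAgeMs) from that record. The class name CacheLedger and its methods are hypothetical, and the ledger lives in memory only; saving it across sessions is left out.

```javascript
// Hypothetical bookkeeping class: remembers when each URL was stored.
class CacheLedger {
  constructor() {
    this.storedAt = new Map(); // url -> timestamp in milliseconds
  }

  // Call this whenever the application has stored a copy of `url`.
  record(url, now = Date.now()) {
    this.storedAt.set(url, now);
  }

  // True when a copy was recorded and is at most `maxAgeMs` old.
  // Without a second argument, any recorded copy counts as cached.
  isCached(url, maxAgeMs = Infinity, now = Date.now()) {
    const storedAt = this.storedAt.get(url);
    if (storedAt === undefined) {
      return false; // never recorded
    }
    return now - storedAt <= maxAgeMs;
  }
}
```

The ledger only knows about downloads the application recorded itself, so it can drift from the real cache if the browser evicts a file.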
 

> Oh this is not for testing if the file is in the _current_ appcache but for
> testing if the resource is in _any_ appcache? Interesting.
> 
> That would have some pretty serious freshness complications... (what if the
> cache we have the file in is a year old? Should we still think we have it
> for the purposes of this API? What if it's more up to date than the things
> in the current appcache?)

On this one, I'll be the "user" and somebody else will be the brain :-) I guess we should have a look at how the browser cache works and try to replicate this behavior. 

Maybe we can make the function asynchronous and it could issue a HEAD ... If-Modified-Since request to see if the server returns a 304 response. It's just an idea, maybe it would just be better to return false in case of an old cache. Or maybe this could also be a parameter. We should have a look at the use cases to find out... 

Because anything related to IO is async anyway, transforming this into an async function doesn't bother me too much.
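
The revalidation idea can be sketched with fetch(). This is only an illustration under assumptions: the function name isStillValid is hypothetical, the caller is assumed to know when its copy was cached, and a cross-origin request would additionally depend on the server's CORS configuration because If-Modified-Since is not a simple request header.

```javascript
// Hypothetical helper: asks the server whether the copy cached at `cachedAt`
// (a Date) is still current. `fetchImpl` is injectable for testing.
async function isStillValid(url, cachedAt, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, {
    method: "HEAD", // headers only, no body is transferred
    headers: {
      // toUTCString() yields the HTTP date format, e.g. "... GMT"
      "If-Modified-Since": cachedAt.toUTCString(),
    },
  });
  // 304 Not Modified means the cached copy can still be used.
  return response.status === 304;
}
```

Any other status (for example 200) is treated as "the cached copy is out of date", which matches the "return false in case of an old cache" option mentioned above.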
Comment 5 Ian 'Hixie' Hickson 2013-01-15 23:01:07 UTC
Oh this is about the regular HTTP cache! I thought you meant the app cache! So sorry, I was completely confused.
Comment 6 François REMY 2013-01-15 23:25:12 UTC
(In reply to comment #5)
> Oh this is about the regular HTTP cache! I thought you meant the app cache!
> So sorry, I was completely confused.

Is there a strong difference between the two? I mean, the appcache is just an override over the 'native' browser cache, right? It's not clear in the spec which technologies it uses but it seems to me it's based on the same principles, right?
Comment 7 Ian 'Hixie' Hickson 2013-01-15 23:56:31 UTC
The appcache system is a way to create a new cache that is bound to the current browsing context, with a manifest that is atomically downloaded and its files cached, and that then cuts off the page from the network, using only the cache.

So yes, it's quite different. :-)
Comment 8 François REMY 2013-01-16 10:34:23 UTC
(In reply to comment #7)
> The appcache system is a way to create a new cache that is bound to the
> current browsing context, with a manifest that is atomically downloaded and
> its files cached, and that then cuts off the page from the network, using
> only the cache.

Then yes, we are talking about the 'classical' browser cache (even if a file is in the appcache of the _current_ application, the function should also report it as cached, because it won't be loaded over the network).

I would also expect the appcache of other apps to be used in a 'conditional' way. If 'http://syn.org/all.json' is appcached by 'syn.org' then the browser should issue an HTTP HEAD If-Modified-Since request to see if it's still up-to-date and call back whether the data is still valid or not.

The goal of this API is really to work over NETWORK resources to know if you can use them without triggering their download.
Comment 9 Robin Berjon 2013-01-21 16:00:46 UTC
Mass move to "HTML WG"
Comment 10 Robin Berjon 2013-01-21 16:03:23 UTC
Mass move to "HTML WG"
Comment 11 Travis Leithead [MSFT] 2016-04-25 21:07:48 UTC
HTML5.1 Bugzilla Bug Triage: I think this use case is now pretty well handled by Service Workers :)

If this resolution is not satisfactory, please copy the relevant bug details/proposal into a new issue at the W3C HTML5 Issue tracker: https://github.com/w3c/html/issues/new where it will be re-triaged. Thanks!
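
For readers wondering what the Service Workers answer looks like in practice, here is a minimal sketch of the dictionary use case on top of the Cache API that ships with Service Workers. The function name lookupSynonyms is hypothetical, the two URLs come from the description above, and both responses are assumed to be JSON. Note that caches.match() only sees entries that the page's own origin stored explicitly; it does not inspect the HTTP cache or caches owned by other sites.

```javascript
const DICT_URL = "http://syn.org/all.json";
const WORD_URL = "http://syn.org/word/";

// Hypothetical helper: `cacheStorage` and `fetchImpl` default to the browser
// globals and are injectable so the logic can be exercised elsewhere.
async function lookupSynonyms(
  word,
  cacheStorage = globalThis.caches,
  fetchImpl = globalThis.fetch
) {
  // caches.match() resolves to a Response, or undefined when nothing matches.
  const cached = await cacheStorage.match(DICT_URL);
  if (cached) {
    // The full dictionary is available locally: no network request needed.
    const dict = await cached.json();
    return dict[word];
  }
  // Otherwise ask the server for this single word only.
  const response = await fetchImpl(WORD_URL + encodeURIComponent(word));
  return response.json();
}
```

The dictionary would have to be placed in the cache beforehand, for instance with cache.add(DICT_URL) after the user opts in to offline use.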