This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 25302 - Blob objects should have a keepalive list of objects
Summary: Blob objects should have a keepalive list of objects
Status: RESOLVED INVALID
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: File API
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Arun
QA Contact: public-webapps-bugzilla
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-09 19:41 UTC by Arun
Modified: 2014-05-07 03:09 UTC
CC: 3 users

See Also:


Attachments

Description Arun 2014-04-09 19:41:21 UTC
Bug 25081 Comment 6 suggests giving Blob a keepalive list of objects that need a reference to the Blob in order to stay active, whether or not the Blob is closed.

The behavior might be:

1. That the keepalive list allows objects to add themselves to it, but only if the internal status marker of the Blob is OPENED.

2. That when CLOSED, the Blob no longer lets objects add themselves to its keepalive list, but does not purge objects already in it. The CLOSED Blob effectively becomes a "ghost": still readable by objects (e.g. FormData) that need it, despite being neutered, but not accessible to any operation that follows the .close() call.

3. Objects remove themselves from the Blob's keepalive list.

4. When all objects have removed themselves from it, it can be gc'd (at UA discretion).
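
A minimal TypeScript sketch of the keepalive-list model in points 1-4 above; the KeepaliveBlob class and the acquire/release/reclaimStorage names are illustrative only and are not part of any spec or implementation:

    // Illustrative model only: none of these names exist in the File API.
    type BlobState = "OPENED" | "CLOSED";

    class KeepaliveBlob {
      private state: BlobState = "OPENED";
      private keepalive = new Set<object>();  // objects that still need this Blob's data

      // Point 1: consumers may register only while the Blob is OPENED.
      acquire(consumer: object): boolean {
        if (this.state !== "OPENED") return false;
        this.keepalive.add(consumer);
        return true;
      }

      // Point 2: close() flips the state but does not purge existing consumers,
      // so in-flight readers (e.g. FormData serialization) keep working.
      close(): void {
        this.state = "CLOSED";
      }

      // Point 3: consumers remove themselves when they finish reading.
      release(consumer: object): void {
        this.keepalive.delete(consumer);
        // Point 4: once CLOSED and the list is empty, the UA is free to
        // reclaim the underlying storage (modelled here as a no-op hook).
        if (this.state === "CLOSED" && this.keepalive.size === 0) {
          this.reclaimStorage();
        }
      }

      private reclaimStorage(): void {
        // UA-internal: release the backing bytes.
      }
    }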
Comment 1 Glenn Maynard 2014-04-09 20:29:41 UTC
When a fetch is started, it can effectively slice() the blob it was told to fetch, and close() its slice when it's done.  This can probably be part of the logic that prevents fetches from being affected by blob URLs being revoked.  In pseudocode:

fetch(resource):
    if resource is a blob URL:
        // Get the blob, to isolate us from the user revoking the blob URL.
        resource = getBlobFromUrl(resource)

    try:
        if resource is a blob:
            // We were given either a blob URL or a blob.  Slice the blob,
            // to isolate us from the user closing it.
            resource = resource.slice()
        // do the fetch ...
    finally:
        if resource is a blob:
            // Close the blob we sliced, so we don't prevent the underlying
            // storage from being reclaimed.
            resource.close()
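
The same isolation idea can be illustrated at the script level with standard APIs: slicing a Blob yields a new Blob over the same underlying bytes, so an asynchronous read of the slice is unaffected by anything the caller later does to the original. A hedged TypeScript sketch (Blob.close() was only a proposal in the File API draft of the time and never shipped, so it is left commented out; arrayBuffer() is a later standard API, used here only to keep the sketch short):

    async function readIsolated(blob: Blob): Promise<ArrayBuffer> {
      // Take a slice up front; it references the same underlying bytes, so a
      // later close() (or revocation of a blob URL) on the original can't
      // affect this read.
      const snapshot = blob.slice();
      return snapshot.arrayBuffer();
    }

    // Usage: the caller can drop or (per the proposal) close the original
    // immediately; the pending read still completes against the slice.
    const original = new Blob(["hello, keepalive"]);
    const pending = readIsolated(original);
    // (original as any).close?.();  // proposed API only, never shipped
    pending.then(buf => console.log(buf.byteLength)); // logs 16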
Comment 2 Anne 2014-04-10 11:39:42 UTC
Glenn, this is not for fetch, this is for FormData.

Arun, your summary sounds good. I don't think you need to put actual constraints in the specification as to when objects can get added to that list, but it would be good to point them out in a note.
Comment 3 Glenn Maynard 2014-04-10 13:34:30 UTC
That doesn't matter.  You do this around any algorithm that reads a blob asynchronously, so the blob the async algorithm is reading is never one that scripts have the ability to close.
Comment 4 Anne 2014-04-10 14:20:45 UTC
I suppose we could create structured clones for FormData upon serialization. That's what fetch gets handed, too. Not a bad idea.
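
A hypothetical TypeScript sketch of "snapshot the Blob entries when the FormData is serialized"; snapshotFormDataBlobs is an invented helper, and slice() stands in for the structured clone step the spec text would use:

    // Invented helper for illustration: once entries are snapshotted, closing
    // the original Blob cannot affect the serialized form.
    function snapshotFormDataBlobs(form: FormData): Array<[string, string | Blob]> {
      const snapshot: Array<[string, string | Blob]> = [];
      form.forEach((value, name) => {
        // File extends Blob; slice() yields a new Blob over the same bytes,
        // standing in for the structured clone operation.
        snapshot.push([name, value instanceof Blob ? value.slice() : value]);
      });
      return snapshot;
    }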
Comment 5 Glenn Maynard 2014-04-10 14:39:17 UTC
I guess the "finally: close" portion doesn't actually need to be specced, since the effects aren't detectable by script.  That simplifies things a bit further.
Comment 6 Arun 2014-04-10 14:45:21 UTC
(In reply to Glenn Maynard from comment #1)
> When a fetch is started, it can effectively slice() the blob it was told to
> fetch, and close() its slice when it's done.  This can probably be part of
> the logic that prevents fetches from being affected by blob URLs being
> revoked.  In pseudocode:
> 
> fetch(resource):
>     if resource is a blob URL:
>         // Get the blob, to isolate us from the user revoking the blob URL.
>         resource = getBlobFromUrl(resource)
> 
>     try:
>         if resource is a blob:
>             // We were given either a blob URL or a blob.  Slice the blob,
>             // to isolate us from the user closing it.
>             resource = resource.slice()
>         // do the fetch ...
>     finally:
>         if resource is a blob:
>             // Close the blob we sliced, so we don't prevent the underlying
>             // storage from being reclaimed.
>             resource.close()


This seems like a good proposal. But now my questions are:

1. Should the File API specify this? In bug 25081 Comment 7 I ask if this is an implementation detail, but now I think it should be spec'd as part of the FormData, Fetch, or URL specs.

2. Should this idea supplant the idea of a keepalive list of objects that read from the Blob? Anne?
Comment 7 Glenn Maynard 2014-04-10 14:54:31 UTC
(In reply to Arun from comment #6)
> This seems like a good proposal. But now my questions are:
> 
> 1. Should the File API specify this? In bug 25081 Comment 7 I ask if this is
> an implementation detail, but now I think it should be spec'd as part of the
> FormData, Fetch, or URL specs.

It's not an implementation detail.  The real issue this solves is that the "closing blobs doesn't affect async operations" behavior should be specified algorithmically, not descriptively as it is now.

(Browsers don't need to actually call blob.slice(), or invoke structured clone as Anne suggests--probably a better choice--as long as the end result is the same.  They might just take a reference to the underlying storage, or something along those lines.  That's the implementation detail part.)
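
One way to picture that implementation choice is a reference-counted backing store: every Blob and every in-flight async read holds its own reference, and close() drops only the Blob's own reference. A hypothetical TypeScript sketch; BackingStore and InternalBlob are invented names, not real engine internals:

    // Hypothetical UA-internal model; real engines track blob storage their own way.
    class BackingStore {
      private refs = 0;
      constructor(private bytes: Uint8Array | null) {}

      addRef(): void { this.refs++; }

      release(): void {
        if (--this.refs === 0) {
          this.bytes = null;             // the storage may now be reclaimed
        }
      }

      read(): Uint8Array {
        if (this.bytes === null) throw new Error("storage already reclaimed");
        return this.bytes;
      }
    }

    class InternalBlob {
      private closed = false;
      constructor(private store: BackingStore) { store.addRef(); }

      // An async read takes its own reference before starting, so a close()
      // on this Blob while the read is pending can't reclaim the bytes.
      async readAll(): Promise<Uint8Array> {
        this.store.addRef();
        try {
          await Promise.resolve();       // stand-in for the actual async I/O
          return this.store.read();
        } finally {
          this.store.release();
        }
      }

      close(): void {
        if (this.closed) return;
        this.closed = true;
        this.store.release();            // drop only this Blob's reference
      }
    }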
Comment 8 Anne 2014-04-10 14:59:43 UTC
Yeah, defining this as a structured clone is what http://url.spec.whatwg.org/ does today and should work for FormData with some hassle.

It's not an implementation detail. It's important.

It's still unclear from the specification how operations need to deal with neutered blobs...
Comment 9 Arun 2014-04-15 23:17:13 UTC
(In reply to Anne from comment #8)
> Yeah, defining this as a structured clone is what
> http://url.spec.whatwg.org/ does today and should work for FormData with
> some hassle.
> 
> It's not an implementation detail. It's important.
> 
> It's still unclear from the specification how operations need to deal with
> neutered blobs...


I think we're moving towards a Blob closure model where:

1. Objects requiring asynchronous access to Blobs work on structured clones of the Blob in question. This obviates the need for a keepalive list of objects. 

2. Read operation(s) return failure on closed Blobs, not 0 bytes, along with a failure reason. Methods that use the read operation should handle the failure asynchronously, or throw if they invoke the read operation synchronously (very rare: synchronous reading of Blobs happens only on worker threads now). I've come around to agreeing that 0 bytes isn't the right "test" for a closed Blob, and is too ambiguous. There will be a boolean property exposed on Blobs called isClosed, in case in-script checks are needed (this is Bug 25343 -- please submit use cases for the property there).

3. Nothing else fails on a closed Blob. You can still generate a Blob URL through URL.createFor or URL.createObjectURL, but dereferencing it will result in a network error. So the FormData case works at serialization time, but fails when an actual read operation is attempted.

Point 3 above may mean that structured cloning of a closed Blob actually works (contrary to http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#safe-passing-of-structured-data).

In short, the *only* point of failure on a closed Blob is a read operation. Would this work?

The biggest "non spec" problem is determining a sufficient cue to GC a closed Blob, since if some operations work (but read doesn't), then they may still need to be around. But close still helps with memory management, because a read operation won't transfer bytes to memory on closure.
Comment 10 Glenn Maynard 2014-04-16 00:06:27 UTC
(In reply to Arun from comment #9)
> The biggest "non spec" problem is determining a sufficient cue to GC a
> closed Blob, since if some operations work (but read doesn't), then they may
> still need to be around. But close still helps with memory management,
> because a read operation won't transfer bytes to memory on closure.

It seems like GC should be unchanged.  Closing a blob may delete its data (whether it's on disk or in memory), but it doesn't reduce the lifetime of the Blob container itself.
Comment 11 Arun 2014-05-07 03:09:37 UTC
(We no longer need this, so marking it invalid).