This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 23479 - File Constructor and Blob Constructor should have same signature
Summary: File Constructor and Blob Constructor should have same signature
Status: RESOLVED FIXED
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: File API (show other bugs)
Version: unspecified
Hardware: PC All
: P2 normal
Target Milestone: ---
Assignee: Arun
QA Contact: public-webapps-bugzilla
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-10 13:19 UTC by Arun
Modified: 2013-10-31 00:54 UTC (History)
4 users (show)

See Also:


Attachments

Description Arun 2013-10-10 13:19:02 UTC
File and Blob constructors currently have different signatures.  Feedback in off-list email was:

Shouldn't the File constructor have the same signature as the Blob
constructor, but also allow a 'name' parameter to be passed in the
dictionary. Possibly even throwing if no name is provided.

Having the Blob constructor and the File constructor be so different
when they are really doing the same thing seems surprising. And
forcing authors to create a temporary Blob just so that they can
create a File seems wasteful.

Please consider this implementation feedback.
Comment 1 Jonas Sicking (Not reading bugmail) 2013-10-10 19:05:44 UTC
Sorry, sent original email to the wrong email list.

The waste I'm referring to is both in terms of code size and in terms of memory.

The extra JS object isn't a big deal, but it will keep the contained data alive even if the File object is GCed or .close()ed.
Comment 2 Glenn Maynard 2013-10-11 00:53:19 UTC
Dictionary arguments are for optional parameters, so fileName doesn't belong in there unless you want to make it optional.  It does seem like it should be optional, if only because it's harmless and will make more sense if File gains other properties.

It's a little weird that you have to specify File.type by setting it on another Blob first, and the first thing I tried with File's ctor was new File(["foo"]), so it's definitely a bit odd.  I guess it's so you don't have to manually copy every property on the Blob if all you're doing is taking a Blob and attaching a filename to it: new File([blob], {type: blob.type}) is pretty awkward, especially as more properties are added to Blob.

You'd have the same problem if all you were doing was modifying a property, though, at least if there were any properties to replace.  For example, if it was possible to specify lastModifiedDate (it's odd that it's not), you'd have to specify every other property on the old File, eg. new File([oldFile], oldFile.name, {every property ..., lastModifiedDate: date}]).  That wouldn't be good.

I think there's something off about the constructor pattern here that won't really surface until Blob and/or File have one or two more properties, since you shouldn't need to carefully respecify every property to create a Blob or File with just one property changed.  I don't know off-hand what a good fix would be, though.

(FWIW, I think the File class shouldn't exist, and its properties should just be on Blob.  The separation doesn't help anything since there's no *functional* difference between the two, so it just makes things more complicated.)

(In reply to Jonas Sicking from comment #1)
> Sorry, sent original email to the wrong email list.
> 
> The waste I'm referring to is both in terms of code size and in terms of
> memory.
> 
> The extra JS object isn't a big deal, but it will keep the contained data
> alive even if the File object is GCed or .close()ed.

There shouldn't be a memory penalty.  If you do this:

var b = new Blob(["hello");
for(var i = 0; i < 1000000; ++i)
    b = new File(b, "hello.txt");

then the result (after GC) shouldn't take more memory than going through the loop just once.  Creating each new File should reference the underlying data, which isn't changing--this shouldn't result in a million nested File objects.  In other words, the resulting File should internally look like File(["hello"]), and not File([File([File([File([Blob(["hello"])])])])]).

Data staying alive after .close() is nothing new.  If you want to be able to close a blob, you explicitly need to close every copy you've ever made, since .close() explicitly doesn't apply to slices.  That probably makes .slice() very hard to use reliably, but that's a separate discussion...
Comment 3 Jonas Sicking (Not reading bugmail) 2013-10-11 16:49:08 UTC
"after GC" is key here. GC doesn't always collect all garbage. So for something like

var tempBlob = new Blob(...);
var file = new File(tempBlob, "name");
tempBlob = null;

the 'tempBlob' might get GCed after the 'file', hence retaining the data longer than needed.
Comment 4 Jonas Sicking (Not reading bugmail) 2013-10-11 16:54:14 UTC
There is of course also the problem of what happens if someone forgets to null out the tempBlob variable in the code above?

Put it another way: Requiring the creation of temporary objects that retain large amounts of data is a bad idea.
Comment 5 Glenn Maynard 2013-10-11 17:14:23 UTC
(In reply to Jonas Sicking from comment #3)
> "after GC" is key here. GC doesn't always collect all garbage. So for
> something like
> 
> var tempBlob = new Blob(...);
> var file = new File(tempBlob, "name");
> tempBlob = null;
> 
> the 'tempBlob' might get GCed after the 'file', hence retaining the data
> longer than needed.

This is an optimization, so the usual question applies: is this an actual problem?  If the data is taking so many resources that it matters if it's held a little longer, then it's a problem regardless of this, and the author needs to be using .close().

(In reply to Jonas Sicking from comment #4)
> There is of course also the problem of what happens if someone forgets to
> null out the tempBlob variable in the code above?
> 
> Put it another way: Requiring the creation of temporary objects that retain
> large amounts of data is a bad idea.

Blob is fundamentally designed around making large temporaries (eg. slice(128) to remove a header), so whack-a-moling a single case isn't a great idea.  (Anyway, as long as tempBlob is local, it'll go out of scope at the same time file does.)

I do think there's a simpler issue of the API being consistent; it's just a bit weird to have to create a temporary Blob here, and for some properties to be created when you create the File and some to be created when you create the Blob.  Something like this *might* make sense:

[Constructor(sequence<(ArrayBuffer or ArrayBufferView or Blob or DOMString)> blobParts, optional BlobPropertyBag options)],
Constructor(Blob fileBits, [EnsureUTF16] DOMString fileName)
]
interface File : Blob { }

To serve two different cases: creating a File from scratch, which you can now do in the same way you create a Blob; and taking a Blob and making a File out of it, which you can do the way it's specced today.

This feels messy, though, and like it would get messier as new Blob and File subclasses are added (eg. ZipFile).  This needs more thought...
Comment 6 Jonas Sicking (Not reading bugmail) 2013-10-11 17:48:30 UTC
I think optimizing "taking a Blob and making a File out of it" from

new File([blob], { type: blob.type, name: name });

to

new File(blob, name);


is premature. Lets keep things simple and consistent instead.
Comment 7 Glenn Maynard 2013-10-11 19:04:00 UTC
It's not an optimization, since it allows something you can't otherwise do: propagate all properties of the object, without knowing in advance what all of them are (since new properties may be added).  I'd be fine with what's in the spec currently being removed for now (so people don't implement it) until more thought is put into this, though.

new File([data], {name: "file.txt"}) is a good idea in any case.  "name" should be optional, like all dictionary arguments.  In order to avoid changing File itself, the default could be "" rather than null (that could be argued both ways).

(I do think the better solution is to merge File into Blob.)
Comment 8 Jonas Sicking (Not reading bugmail) 2013-10-12 00:14:26 UTC
One option would be to use the syntax:

new File([data], name, { options });

That way you could even wrap a named File around a Blob while forwarding all options by doing

myFile = new File([myBlob], "filename", myBlob);
Comment 9 Arun 2013-10-28 19:51:20 UTC
(In reply to Jonas Sicking from comment #8)
> One option would be to use the syntax:
> 
> new File([data], name, { options });
> 
> That way you could even wrap a named File around a Blob while forwarding all
> options by doing
> 
> myFile = new File([myBlob], "filename", myBlob);

I like this syntax the best, because I don't think name should be optional.  This seems like what the spec. should say.
Comment 10 Arun 2013-10-31 00:54:13 UTC
I'm marking this fixed along the lines of Comment 8, but you can additionally set lastModifiedDate.

http://dev.w3.org/2006/webapi/FileAPI/#dfn-file