Bug 16790 - The term microtask should be defined
Status: RESOLVED WONTFIX
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC All
Importance: P2 normal
Assigned To: Erika Doyle Navara
Blocks: 17758
Reported: 2012-04-18 19:20 UTC by Arun
Modified: 2012-10-16 15:18 UTC
Description Arun 2012-04-18 19:20:06 UTC
The HTML specification currently defines, and gives an algorithm for, how to perform a microtask checkpoint.  It should also define what exactly constitutes a microtask.

This could in turn be used in other specifications, such as the File API, which could use the concept of a microtask to describe when to revoke an object URL (Blob URL), and in IndexedDB.
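
As a rough illustration of where a checkpoint sits relative to tasks, here is a toy model in JavaScript. It is a sketch, not the HTML specification's algorithm: `ToyEventLoop`, `queueTask`, and `addCheckpointHook` are names invented for this example, and the model assumes that one checkpoint runs after every task.

```javascript
// Toy model only: not the HTML specification's event loop.
class ToyEventLoop {
  constructor() {
    this.tasks = [];            // queued tasks (plain functions)
    this.checkpointHooks = [];  // work to run at the next checkpoint
    this.log = [];              // records the order in which things ran
  }

  queueTask(name, fn) {
    this.tasks.push({ name, fn });
  }

  // Hypothetical hook: register work for the next checkpoint only.
  addCheckpointHook(name, fn) {
    this.checkpointHooks.push({ name, fn });
  }

  performCheckpoint() {
    // Swap the list out first, so hooks added during this checkpoint
    // wait for the next one.
    const hooks = this.checkpointHooks;
    this.checkpointHooks = [];
    for (const { name, fn } of hooks) {
      this.log.push(`checkpoint:${name}`);
      fn();
    }
  }

  run() {
    while (this.tasks.length > 0) {
      const { name, fn } = this.tasks.shift();
      this.log.push(`task:${name}`);
      fn();
      this.performCheckpoint(); // assumed: one checkpoint after every task
    }
    return this.log;
  }
}

const loop = new ToyEventLoop();
loop.queueTask("first", () => {
  // Stands in for "revoke this object URL at the next checkpoint".
  loop.addCheckpointHook("revoke-url", () => {});
});
loop.queueTask("second", () => {});
console.log(loop.run());
```

In this model the hook registered during the first task runs before the second task starts, which is the ordering the File API use case would rely on.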
Comment 1 Arun 2012-04-18 19:25:27 UTC
In #whatwg on Freenode, the following was suggested:

smaug____: arunranga: yeah, HTML spec could perhaps specify better what is a microtask. (The outermost script execution of the innermost task)

The context of this discussion can be found at: 

http://krijnhoetmer.nl/irc-logs/whatwg/20120418#l-801
Comment 2 Ian 'Hixie' Hickson 2012-04-18 19:30:50 UTC
I don't understand what this would be useful for. Why would we do anything but mutation observers in a microtask?

If we have to do anything else, which doesn't seem clear to me, then we just need to add it to the "perform a microtask checkpoint" algorithm, either explicitly or via a hook.
Comment 3 Anne 2012-04-18 19:43:21 UTC
This is basically about defining when dereferencing certain URLs starts to fail. See http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html in particular for the relation to microtasks.

See also http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0085.html and http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0092.html

Glenn can probably explain more.
Comment 4 Glenn Maynard 2012-04-18 19:55:04 UTC
Actually, I don't think microtasks are quite what auto-revoking URLs need.  Instead, they should use stable states: start an asynchronous task, await a stable state, and then revoke the URL.  I think this is a better fit for that feature because, for example, stable states happen between synchronously executed <script> elements, where microtask checkpoints don't.

That is, in

<script>url = URL.createObjectURL(blob, {autoRevoke:true})</script><script>performXHR(url)</script>

the XHR should fail, with the URL being revoked after the first script exits and before the second one starts.  With microtasks, that may not happen.

I could be wrong; I'm working from memory, and there are probably subtle side-effects of either approach which I'm missing.  The discussion got sidetracked, so I don't know if anyone has fully reviewed this approach.  http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1316.html
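
The timing difference described in this comment can be sketched with a toy model. Everything here is hypothetical (`runParserTask`, `createAutoRevokingURL`, and the two policy names are invented for illustration), and the model simply assumes the behavior the comment describes: a stable state is reached between two synchronously executed scripts, while a microtask checkpoint is only reached once the whole parser task has finished.

```javascript
// Toy model of the timing described in comment 4; all names are hypothetical.
function runParserTask(scripts, policy) {
  const liveUrls = new Set();   // URLs that still dereference
  const pending = [];           // auto-revoking URLs awaiting revocation
  const results = [];

  // Revoke everything that has been queued so far.
  const flush = () => {
    for (const url of pending.splice(0)) liveUrls.delete(url);
  };

  const env = {
    createAutoRevokingURL(id) {
      const url = `blob:toy/${id}`;
      liveUrls.add(url);
      pending.push(url);
      return url;
    },
    fetch(url) {
      results.push(liveUrls.has(url) ? "ok" : "revoked");
    },
  };

  for (const script of scripts) {
    script(env);
    // Assumed: a stable state is reached between synchronous scripts.
    if (policy === "stable-state") flush();
  }
  // Assumed: the checkpoint happens only after the whole parser task.
  if (policy === "checkpoint") flush();
  return results;
}

let url;
const scripts = [
  (env) => { url = env.createAutoRevokingURL(1); },
  (env) => { env.fetch(url); },
];

const stableStateResult = runParserTask(scripts, "stable-state");
const checkpointResult = runParserTask(scripts, "checkpoint");
console.log(stableStateResult); // the second script sees a revoked URL
console.log(checkpointResult);  // the second script can still use the URL
```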
Comment 5 Ian 'Hixie' Hickson 2012-04-18 19:57:40 UTC
I don't think it makes sense for URLs to ever start to fail while the page that created them is still up. Lots of things on the platform use URLs in a lazy fashion. For example, <img src=""> doesn't necessarily load the image until the user asks the browser to do so. All kinds of stuff in the HTML spec in fact does fetching asynchronously, typically much later than the end of the current task, let alone microtask.
Comment 6 Glenn Maynard 2012-04-18 20:08:57 UTC
(In reply to comment #5)
> I don't think it makes sense for URLs to ever start to fail while the page that
> created them is still up.

This is arguing that URL.revokeObjectURL should be removed (or made into a no-op).

This would mean that creating a blob URL would cause the blob's storage to be permanently unreleasable for the lifetime of the page.  Blobs tend to be used for nontrivially large objects, so for any page that uses blob URLs in anything but a singleton fashion, this would cause significant memory leaks.

This would make blob URLs essentially unusable.

> Lots of things on the platform use URLs in a lazy
> fashion. For example, <img src=""> doesn't necessarily load the image until the
> user asks the browser to do so. All kinds of stuff in the HTML spec in fact
> does fetching asynchronously, typically much later than the end of the current
> task, let alone microtask.

This has been discussed on the list: each API that performs fetches asynchronously will need to take a reference to the underlying blob data.

For example, upon assigning a blob URL to img.src, HTMLImageElement would need to synchronously store the underlying data associated with the blob.  It can then perform the fetch whenever it wants, without being affected by the URL being freed, or the Blob being neutered due to transfers or Blob.close.
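
A minimal sketch of that idea, assuming a toy URL registry rather than any real browser internals: `registry`, `ToyImage`, and `load` are hypothetical names, and the only point is that the element resolves the URL synchronously at assignment time and fetches from its own stored reference later.

```javascript
// Toy model: a blob URL registry plus an element that captures the blob
// when its src is assigned. None of this is a real browser API.
const registry = new Map(); // url -> blob data
let nextId = 0;

function createObjectURL(blob) {
  const url = `blob:toy/${nextId++}`;
  registry.set(url, blob);
  return url;
}

function revokeObjectURL(url) {
  registry.delete(url);
}

class ToyImage {
  #data = null;

  // Assumed behavior: resolve the URL synchronously, at assignment time.
  set src(url) {
    this.#data = registry.has(url) ? registry.get(url) : null;
  }

  // The deferred "fetch" reads the stored reference, not the registry.
  load() {
    if (this.#data === null) throw new Error("network error");
    return this.#data;
  }
}

const img = new ToyImage();
const url = createObjectURL("PNG bytes");
img.src = url;
revokeObjectURL(url);          // the URL no longer dereferences
const loaded = img.load();     // but the element still has its data
console.log(loaded);
```

An element that is assigned the same URL after the revocation would fail to load in this model, since the lookup happens at assignment.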
Comment 7 Ian 'Hixie' Hickson 2012-04-18 21:30:46 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > I don't think it makes sense for URLs to ever start to fail while the page that
> > created them is still up.
> 
> This is arguing that URL.revokeObjectURL should be removed (or made into a
> no-op).

If the author knows that the URL isn't being used anymore, I've nothing against the author revoking it. But the browser doesn't know, so the browser shouldn't do it automatically.


> This would mean that creating a blob URL would cause the blob's storage to be
> permanently unreleasable for the lifetime of the page.  Blobs tend to be used
> for nontrivially large objects, so for any page that uses blob URLs in anything
> but a singleton fashion, this would cause significant memory leaks.

Paging the data to disk is sufficient, and cheap.


> > Lots of things on the platform use URLs in a lazy
> > fashion. For example, <img src=""> doesn't necessarily load the image until the
> > user asks the browser to do so. All kinds of stuff in the HTML spec in fact
> > does fetching asynchronously, typically much later than the end of the current
> > task, let alone microtask.
> 
> This has been discussed on the list: each API that performs fetches
> asynchronously will need to take a reference to the underlying blob data.
> 
> For example, upon assigning a blob URL to img.src, HTMLImageElement would need
> to synchronously store the underlying data associated with the blob.  It can
> then perform the fetch whenever it wants, without being affected by the URL
> being freed, or the Blob being neutered due to transfers or Blob.close.

Are we going to be specifying all the different places that can happen? I guess if we specify it in detail that wouldn't be so bad.
Comment 8 Glenn Maynard 2012-04-18 21:44:42 UTC
(In reply to comment #7)
> If the author knows that the URL isn't being used anymore, I've nothing against
> the author revoking it. But the browser doesn't know, so the browser shouldn't
> do it automatically.

The browser does know, because the user has requested it explicitly:

var url = URL.createObjectURL(myBlob, {autoRevoke: true});
img.src = url;

This avoids the glaring problem with object URLs, by making it impossible to accidentally leak myBlob.  (It also avoids the problems with previous "revoke on first use" proposals, by eliminating any possible dependencies on task order across task queues.)
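
To make the leak argument concrete, here is a small sketch that counts live registry entries. `ToyUrlRegistry` and `flushAutoRevoked` are hypothetical, and `autoRevoke` mirrors the option proposed in this thread rather than a shipped API; when the flush actually happens is exactly what this bug is trying to pin down.

```javascript
// Toy registry; `autoRevoke` mirrors the option proposed in this thread
// and is an assumption, not an existing API.
class ToyUrlRegistry {
  constructor() {
    this.entries = new Map();   // url -> blob
    this.autoRevoked = [];      // URLs to drop at the next flush point
    this.nextId = 0;
  }

  createObjectURL(blob, { autoRevoke = false } = {}) {
    const url = `blob:toy/${this.nextId++}`;
    this.entries.set(url, blob);
    if (autoRevoke) this.autoRevoked.push(url);
    return url;
  }

  revokeObjectURL(url) {
    this.entries.delete(url);
  }

  // Called by the (hypothetical) host at whatever point the platform defines.
  flushAutoRevoked() {
    for (const url of this.autoRevoked.splice(0)) this.entries.delete(url);
  }
}

const manual = new ToyUrlRegistry();
const auto = new ToyUrlRegistry();
for (let i = 0; i < 3; i++) {
  manual.createObjectURL(`blob ${i}`);                      // author forgets to revoke
  auto.createObjectURL(`blob ${i}`, { autoRevoke: true });  // revoked by the host
}
manual.flushAutoRevoked();
auto.flushAutoRevoked();
console.log(manual.entries.size, auto.entries.size);
```

In this model the registry without `autoRevoke` keeps all three blobs alive until someone calls `revokeObjectURL`, while the auto-revoking one holds none after the flush.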

> Paging the data to disk is sufficient, and cheap.

You're arguing that never freeing memory, and letting everything page to disk forever, is an acceptable memory management design?  That's crazy.  If I have an image viewer that receives high-resolution PNGs as blobs (for example, for examining or editing print-resolution images), and then hands them off to HTMLImageElement.src as object URLs to be viewed (or loaded into WebGL), that will quickly explode into gigabytes of data that has to be written to disk and then kept around for the lifetime of the page.

Also, my phone doesn't have a disk drive, and internal storage is very limited.

> Are we going to be specifying all the different places that can happen? I guess
> if we specify it in detail that wouldn't be so bad.

Hopefully it can be done in a way that minimizes the amount of work high-level specs like img.src and XHR need to do.
Comment 9 Arun 2012-04-18 22:05:51 UTC
Past discussion on the matter of oneTimeOnly or autoRevoke may have used "microtask" and "stable state" interchangeably (see http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html).  "Microtask" as an isolated concept doesn't exist; it is merely a magic word for algorithm invocation.

This bug was spawned with the idea of defining microtask more rigorously, BUT mainly as a solution to the autoRevoke/oneTimeOnly conundrum.  IF stable state is a better solution for that, then I'm not sure we need to define microtask; it remains a magic word for mutation observers.  

Unless anyone disagrees with Comment 4 or finds reason why this bug is still useful, I'm happy to move on.
Comment 10 Glenn Maynard 2012-04-18 22:19:44 UTC
(In reply to comment #9)
> Past discussion on the matter of oneTimeOnly or autoRevoke may have used
> "microtask" and "stable state" interchangeably (vis
> http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html). 
> "Microtask" as an isolated concept doesn't exist; it is merely a magic word for
> algorithm invocation.

It refers to "perform a microtask checkpoint": http://www.whatwg.org/specs/web-apps/current-work/#perform-a-microtask-checkpoint

The idea was to perform the revoke at the next microtask checkpoint (though I didn't put it in those terms, since I hadn't seen that definition at the time).

> Unless anyone disagrees with Comment 4 or finds reason why this bug is still
> useful, I'm happy to move on.

(It does need review by somebody more familiar with the bigger picture, but we can always come back to this if it turns out that microtasks really are what it needs.)
Comment 11 Arun 2012-07-13 21:18:20 UTC
It seems prudent to reopen this, following the discussion here:

https://bugzilla.mozilla.org/show_bug.cgi?id=773132
Comment 12 Arun 2012-07-13 21:24:36 UTC
Can we maybe cement what a microtask is in HTML5, perhaps along the lines of what https://bugzilla.mozilla.org/show_bug.cgi?id=773132#c10 suggests, namely:

"Microtask is the outermost script execution.
MutationObserver callbacks get called at the end of the microtask or end of task, whichever is first."
Comment 13 Glenn Maynard 2012-07-13 21:56:02 UTC
(redirecting from https://bugzilla.mozilla.org/show_bug.cgi?id=773132)

> Script (event listener etc) can't know what all scripts will run before the next stable state.
> Microtask doesn't really have that problem.

I was thinking that with stable states,

<script>url = URL.createObjectURL(blob, {autoRevoke: true});</script>
<script>/* use url */</script>

would free the URL before the second script runs.  Actually, the stable state happens when </script> is parsed, before the script is run.  (That might give the same effect, since that's still before the second script runs, but it might not work in all cases.)

Stable states are definitely wrong for:

url = URL.createObjectURL(blob, {autoRevoke: true});
document.write("<script>console.log('y');</scr" + "ipt>");

where url should *not* be revoked by "</script>" being parsed.

Microtask checkpoints seem nondeterministic for this, though.  If the whole block is parsed in a single invocation of the parser, then there's no checkpoint until the parser returns, so the URL isn't revoked for the second script.  If there's a delay due to incremental parsing and we return to the event loop, the event loop gives us a checkpoint, so the URL *is* revoked between scripts.

Maybe this just needs its own concept: something that's always invoked when exiting the outermost script, including from parser-invoked scripts.
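
One way to picture such a concept is a nesting counter that fires hooks only when the outermost script returns. This is a sketch under that assumption, not proposed spec text; `ScriptTracker`, `runScript`, and `onOutermostExit` are invented names, and the nested call stands in for a script run synchronously by document.write().

```javascript
// Toy model of "run hooks when the outermost script exits"; all names are hypothetical.
class ScriptTracker {
  constructor() {
    this.depth = 0;       // how many scripts are currently on the stack
    this.exitHooks = [];  // work deferred until the outermost script exits
    this.log = [];
  }

  onOutermostExit(name) {
    this.exitHooks.push(name);
  }

  runScript(name, body) {
    this.depth++;
    this.log.push(`enter:${name}`);
    try {
      body(this);
    } finally {
      this.log.push(`exit:${name}`);
      this.depth--;
      // Only the outermost exit drains the hooks; nested exits do not.
      if (this.depth === 0) {
        for (const hook of this.exitHooks.splice(0)) this.log.push(`hook:${hook}`);
      }
    }
  }
}

const tracker = new ScriptTracker();
tracker.runScript("outer", (t) => {
  t.onOutermostExit("revoke-url");
  // Stands in for document.write() running a nested script synchronously.
  t.runScript("written", () => {});
});
tracker.runScript("next", () => {});
console.log(tracker.log);
```

Here the nested script's exit does not trigger the revocation, but the hook still runs before the next top-level script starts, regardless of how the parser was invoked.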
Comment 14 contributor 2012-07-18 07:29:40 UTC
This bug was cloned to create bug 17988 as part of operation convergence.
Comment 15 Arun 2012-07-18 16:26:50 UTC
(In reply to comment #14)
> This bug was cloned to create bug 17988 as part of operation convergence.

I'm really confused.  Is "operation convergence" documented somewhere, or is it merely the "new old name" for the endeavor to unify the W3C HTML specification with the live specification at WHATWG?

I'm confused why it was necessary to "clone" it, but so be it.  My ideal end goal is to have something along the lines of Comment 12 in the HTML specification, so that it can be used as a concept in various APIs, notably the File API's autoRevoke behavior.
Comment 16 Edward O'Connor 2012-08-29 19:51:13 UTC
Arun, the point of the clone is so that we have a copy of the bug in each component (the HTML WG's HTML5 spec component and the WHATWG's HTML component).

I'm moving this one back to the HTMLWG component accordingly.
Comment 17 Arun 2012-10-16 15:18:26 UTC
See Bug 19554 for a successor conversation.