17988 – The term microtask should be defined

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	Other other

Importance:	P3 normal
Target Milestone:	Unsorted
Assignee:	Ian 'Hixie' Hickson
QA Contact:	contributor

Reported:	2012-07-18 07:29 UTC by contributor
Modified:	2012-10-16 15:15 UTC
CC List:	5 users

Description contributor 2012-07-18 07:29:36 UTC

This was cloned from bug 16790 as part of operation convergence.
Originally filed: 2012-04-18 19:20:00 +0000
Original reporter: Arun <arun@mozilla.com>

================================================================================
 #0   Arun                                            2012-04-18 19:20:06 +0000 
--------------------------------------------------------------------------------
Currently in the HTML specification, a definition and algorithmic description of how to perform a microtask checkpoint exists.  Additionally, we should define what exactly constitutes a microtask.  

This could in turn be used in other specifications, such as the File API, which could use the concept of a microtask to describe when to revoke an object URL (Blob URL), and in IndexedDB.
================================================================================
 #1   Arun                                            2012-04-18 19:25:27 +0000 
--------------------------------------------------------------------------------
In #whatwg on Freenode, the following was suggested:

smaug____: arunranga: yeah, HTML spec could perhaps specify better what is a microtask. (The outermost script execution of the innermost task)

The context of this discussion can be found at: 

http://krijnhoetmer.nl/irc-logs/whatwg/20120418#l-801
================================================================================
 #2   Ian 'Hixie' Hickson                             2012-04-18 19:30:50 +0000 
--------------------------------------------------------------------------------
I don't understand what this would be useful for. Why would we do anything but mutation observers in a microtask?

If we have to do anything else, which doesn't seem clear to me, then we just need to add it to the "perform a microtask checkpoint" algorithm, either explicitly or via a hook.
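For illustration only (not part of the original comment), here is a minimal sketch of what "adding a hook" to the checkpoint algorithm could look like. Everything in it is a hypothetical model: `performMicrotaskCheckpoint`, `checkpointHooks`, `createAutoRevokedURL`, and `runTask` are invented names, the re-entrancy flag is an assumption of the sketch, and a plain `Set` stands in for the browser's blob URL store.

```javascript
// Hypothetical model of an event loop whose checkpoint runs registered hooks.
const checkpointHooks = [];   // callbacks to run at the next checkpoint
const liveUrls = new Set();   // stands in for the blob URL store
let performingCheckpoint = false;

function performMicrotaskCheckpoint() {
  // Assumed re-entrancy guard: a nested call does nothing.
  if (performingCheckpoint) return;
  performingCheckpoint = true;
  // (Mutation observers would be notified here in a real user agent.)
  while (checkpointHooks.length > 0) {
    const hook = checkpointHooks.shift();
    hook();
  }
  performingCheckpoint = false;
}

// Hypothetical stand-in for createObjectURL(blob, {autoRevoke: true}).
function createAutoRevokedURL(name) {
  const url = `blob:example/${name}`;
  liveUrls.add(url);
  checkpointHooks.push(() => liveUrls.delete(url));
  return url;
}

// A task runs its script, then performs a checkpoint.
function runTask(script) {
  script();
  performMicrotaskCheckpoint();
}

const observed = [];
let savedUrl;
runTask(() => {
  savedUrl = createAutoRevokedURL("a");
  observed.push(liveUrls.has(savedUrl)); // still usable inside the task
});
observed.push(liveUrls.has(savedUrl));   // revoked by the checkpoint
console.log(observed); // [ true, false ]
```

In this model the URL is valid for the rest of the script that created it and is gone once the checkpoint has run, which is the behaviour the hook approach is meant to give without defining "a microtask" as a standalone term.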
================================================================================
 #3   Anne                                            2012-04-18 19:43:21 +0000 
--------------------------------------------------------------------------------
This is basically about defining when dereferencing certain URLs starts to fail. See http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html in particular for the relation to microtasks.

See also http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0085.html and http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0092.html

Glenn can probably explain more.
================================================================================
 #4   Glenn Maynard                                   2012-04-18 19:55:04 +0000 
--------------------------------------------------------------------------------
Actually, I don't think microtasks are quite what auto-revoking URLs need.  Instead, they should use stable states.  Start an asynchronous task, await a stable state, and then revoke the URL.  I think this is a better fit for that feature because, for example, stable states occur between synchronously executed <script> elements, where microtask checkpoints don't.

That is, in

<script>url = URL.createObjectURL(blob, {autoRevoke:true})</script><script>performXHR(url)</script>

the XHR should fail, with the URL being revoked after the first script exits and before the second one starts.  With microtasks, that may not happen.

I could be wrong; I'm working from memory, and there are probably subtle side-effects of either approach which I'm missing.  The discussion got sidetracked, so I don't know if anyone has fully reviewed this approach.  http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1316.html
================================================================================
 #5   Ian 'Hixie' Hickson                             2012-04-18 19:57:40 +0000 
--------------------------------------------------------------------------------
I don't think it makes sense for URLs to ever start failing while the page that created them is still up. Lots of things on the platform use URLs in a lazy fashion. For example, <img src=""> doesn't necessarily load the image until the user asks the browser to do so. All kinds of stuff in the HTML spec in fact does fetching asynchronously, typically much later than the end of the current task, let alone microtask.
================================================================================
 #6   Glenn Maynard                                   2012-04-18 20:08:57 +0000 
--------------------------------------------------------------------------------
(In reply to comment #5)
> I don't think it makes sense for URLs to ever start failing while the page that
> created them is still up.

This is arguing that URL.revokeObjectURL should be removed (or made into a no-op).

This would mean that creating a blob URL would cause the blob's storage to be permanently unreleasable for the lifetime of the page.  Blobs tend to be used for nontrivially large objects, so for any page that uses blob URLs in anything but a singleton fashion, this would cause significant memory leaks.

This would make blob URLs essentially unusable.

> Lots of things on the platform use URLs in a lazy
> fashion. For example, <img src=""> doesn't necessarily load the image until the
> user asks the browser to do so. All kinds of stuff in the HTML spec in fact
> does fetching asynchronously, typically much later than the end of the current
> task, let alone microtask.

This has been discussed on the list: each API that performs fetches asynchronously will need to take a reference to the underlying blob data.

For example, upon assigning a blob URL to img.src, HTMLImageElement would need to synchronously store the underlying data associated with the blob.  It can then perform the fetch whenever it wants, without being affected by the URL being freed, or the Blob being neutered due to transfers or Blob.close.
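For illustration only (not part of the original comment), the "take a reference synchronously" idea can be sketched as follows. This is a toy model under stated assumptions: `blobRegistry`, `registerBlobUrl`, `revokeBlobUrl`, and `LazyImage` are hypothetical names, and a `Map` of strings stands in for real blob storage and real fetching.

```javascript
// Hypothetical registry mapping blob URL strings to their underlying data.
const blobRegistry = new Map();
let nextBlobId = 0;

function registerBlobUrl(data) {
  const url = `blob:example/${nextBlobId++}`;
  blobRegistry.set(url, data);
  return url;
}

function revokeBlobUrl(url) {
  blobRegistry.delete(url);
}

// Hypothetical image-like consumer that loads lazily.
class LazyImage {
  constructor() {
    this.capturedData = null;
  }
  set src(url) {
    // Capture the underlying data synchronously, at assignment time,
    // so a later revocation of the URL cannot affect the load.
    this.capturedData = blobRegistry.has(url) ? blobRegistry.get(url) : null;
  }
  // Stands in for a fetch performed much later than the assignment.
  load() {
    if (this.capturedData === null) throw new Error("network error");
    return this.capturedData;
  }
}

const img = new LazyImage();
const blobUrl = registerBlobUrl("png bytes");
img.src = blobUrl;       // data reference taken here
revokeBlobUrl(blobUrl);  // URL string no longer resolves
const loaded = img.load();
console.log(loaded); // prints "png bytes"
```

The design point is that the URL string and the data have separate lifetimes: revoking the URL only removes the lookup entry, while any consumer that already resolved it keeps its own reference.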
================================================================================
 #7   Ian 'Hixie' Hickson                             2012-04-18 21:30:46 +0000 
--------------------------------------------------------------------------------
(In reply to comment #6)
> (In reply to comment #5)
> > I don't think it makes sense for URLs to ever start failing while the page that
> > created them is still up.
> 
> This is arguing that URL.revokeObjectURL should be removed (or made into a
> no-op).

If the author knows that the URL isn't being used anymore, I've nothing against the author revoking it. But the browser doesn't know, so the browser shouldn't do it automatically.


> This would mean that creating a blob URL would cause the blob's storage to be
> permanently unreleasable for the lifetime of the page.  Blobs tend to be used
> for nontrivially large objects, so for any page that uses blob URLs in anything
> but a singleton fashion, this would cause significant memory leaks.

Paging the data to disk is sufficient, and cheap.


> > Lots of things on the platform use URLs in a lazy
> > fashion. For example, <img src=""> doesn't necessarily load the image until the
> > user asks the browser to do so. All kinds of stuff in the HTML spec in fact
> > does fetching asynchronously, typically much later than the end of the current
> > task, let alone microtask.
> 
> This has been discussed on the list: each API that performs fetches
> asynchronously will need to take a reference to the underlying blob data.
> 
> For example, upon assigning a blob URL to img.src, HTMLImageElement would need
> to synchronously store the underlying data associated with the blob.  It can
> then perform the fetch whenever it wants, without being affected by the URL
> being freed, or the Blob being neutered due to transfers or Blob.close.

Are we going to be specifying all the different places that can happen? I guess if we specify it in detail that wouldn't be so bad.
================================================================================
 #8   Glenn Maynard                                   2012-04-18 21:44:42 +0000 
--------------------------------------------------------------------------------
(In reply to comment #7)
> If the author knows that the URL isn't being used anymore, I've nothing against
> the author revoking it. But the browser doesn't know, so the browser shouldn't
> do it automatically.

The browser does know, because the user has requested it explicitly:

var url = URL.createObjectURL(myBlob, {autoRevoke: true});
img.src = url;

This avoids the glaring problem with object URLs, by making it impossible to accidentally leak myBlob.  (It also avoids the problems with previous "revoke on first use" proposals, by eliminating any possible dependencies on task order across task queues.)

> Paging the data to disk is sufficient, and cheap.

You're arguing that never freeing memory, and letting everything page to disk forever, is an acceptable memory management design?  That's crazy.  If I have an image viewer that receives high-resolution PNGs as blobs (for example, for examining or editing print-resolution images), and then hands them off to HTMLImageElement.src as object URLs to be viewed (or loaded into WebGL), that will quickly explode into gigabytes of data that has to be written to disk and then kept around for the lifetime of the page.

Also, my phone doesn't have a disk drive, and internal storage is very limited.

> Are we going to be specifying all the different places that can happen? I guess
> if we specify it in detail that wouldn't be so bad.

Hopefully it can be done in a way that minimizes the amount of work high-level specs like img.src and XHR need to do.
================================================================================
 #9   Arun                                            2012-04-18 22:05:51 +0000 
--------------------------------------------------------------------------------
Past discussion on the matter of oneTimeOnly or autoRevoke may have used "microtask" and "stable state" interchangeably (vis http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html).  "Microtask" as an isolated concept doesn't exist; it is merely a magic word for algorithm invocation.

This bug was spawned with the idea of defining microtask more rigorously, BUT mainly as a solution to the autoRevoke/oneTimeOnly conundrum.  IF stable state is a better solution for that, then I'm not sure we need to define microtask; it remains a magic word for mutation observers.  

Unless anyone disagrees with Comment 4 or finds reason why this bug is still useful, I'm happy to move on.
================================================================================
 #10  Glenn Maynard                                   2012-04-18 22:19:44 +0000 
--------------------------------------------------------------------------------
(In reply to comment #9)
> Past discussion on the matter of oneTimeOnly or autoRevoke may have used
> "microtask" and "stable state" interchangeably (vis
> http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/1306.html). 
> "Microtask" as an isolated concept doesn't exist; it is merely a magic word for
> algorithm invocation.

It refers to "perform a microtask checkpoint": http://www.whatwg.org/specs/web-apps/current-work/#perform-a-microtask-checkpoint

The idea was to perform the revoke at the next microtask checkpoint (but wasn't in those terms since I hadn't seen that definition yet back then).

> Unless anyone disagrees with Comment 4 or finds reason why this bug is still
> useful, I'm happy to move on.

(It does need review by somebody more familiar with the bigger picture, but we can always come back to this if it turns out that microtasks really are what it needs.)
================================================================================
 #11  Arun                                            2012-07-13 21:18:20 +0000 
--------------------------------------------------------------------------------
It would seem that this is prudent to reopen, following discussion here:

https://bugzilla.mozilla.org/show_bug.cgi?id=773132
================================================================================
 #12  Arun                                            2012-07-13 21:24:36 +0000 
--------------------------------------------------------------------------------
Can we maybe cement what a microtask is in HTML5, perhaps along the lines of what https://bugzilla.mozilla.org/show_bug.cgi?id=773132#c10 suggests, namely:

"Microtask is the outermost script execution.
MutationObserver callbacks get called at the end of the microtask or end of task, whichever is first."
================================================================================
 #13  Glenn Maynard                                   2012-07-13 21:56:02 +0000 
--------------------------------------------------------------------------------
(redirecting from https://bugzilla.mozilla.org/show_bug.cgi?id=773132)

> Script (event listener etc) can't know what all scripts will run before the next stable state.
> Microtask doesn't really have that problem.

I was thinking that with stable states,

<script>url = URL.createObjectURL(blob, {autoRevoke: true});</script>
<script>/* use url */</script>

would free the URL before the second script runs.  Actually, the stable state happens when </script> is parsed, before the script is run.  (That might give the same effect, since that's still before the second script runs, but it might not work in all cases.)

Stable states are definitely wrong for:

url = URL.createObjectURL(blob, {autoRevoke: true});
document.write("<script>console.log('y');</scr" + "ipt>");

where url should *not* be revoked by "</script>" being parsed.

Microtask checkpoints seem nondeterministic for this, though.  If the whole block is parsed in a single invocation of the parser, then there's no checkpoint until the parser returns, so the URL isn't revoked for the second script.  If there's a delay due to incremental parsing and we return to the event loop, the event loop gives us a checkpoint, so the URL *is* revoked between scripts.

Maybe this just needs its own concept: something that's always invoked when exiting the outermost script, including from parser-invoked scripts.
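For illustration only (not part of the original comment), a concept of "exiting the outermost script" could be modelled with a nesting counter. This is a sketch under assumptions: `runScript`, `scriptDepth`, `pendingRevocations`, and `createAutoRevokedUrl` are hypothetical names, and a nested `runScript` call stands in for a script executed via document.write.

```javascript
// Hypothetical model: revoke auto-revoked URLs when the outermost script exits.
const pendingRevocations = [];
const activeUrls = new Set();
let scriptDepth = 0;

function runScript(body) {
  scriptDepth++;
  try {
    body();
  } finally {
    scriptDepth--;
    // Only the outermost script exit triggers revocation.
    if (scriptDepth === 0) {
      while (pendingRevocations.length > 0) {
        const revoke = pendingRevocations.pop();
        revoke();
      }
    }
  }
}

// Hypothetical stand-in for URL.createObjectURL(blob, {autoRevoke: true}).
function createAutoRevokedUrl(name) {
  const url = `blob:example/${name}`;
  activeUrls.add(url);
  pendingRevocations.push(() => activeUrls.delete(url));
  return url;
}

const results = [];

// Case 1: two sibling scripts run back to back by the parser.
let firstUrl;
runScript(() => { firstUrl = createAutoRevokedUrl("first"); });
runScript(() => { results.push(activeUrls.has(firstUrl)); }); // already revoked

// Case 2: a nested script (as with document.write) must not revoke the URL.
runScript(() => {
  const secondUrl = createAutoRevokedUrl("second");
  runScript(() => { /* nested script body */ });
  results.push(activeUrls.has(secondUrl)); // still live after the nested script
});

console.log(results); // [ false, true ]
```

Because the counter does not depend on whether the parser returned to the event loop between scripts, this model gives the same answer for both cases regardless of how the markup was chunked.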
================================================================================

Comment 1 Glenn Maynard 2012-07-18 14:32:10 UTC

(Whoever thought doing this to the bug tracker was a good idea needs to be removed from the decision-making process.)

Comment 2 Ian 'Hixie' Hickson 2012-09-28 23:24:57 UTC

Mutation observers are also going to be fired (microtask checkpoint will fire) for </script>.

I'm not going to define "a microtask" in the spec, because that's not a good way to write specs.

If you want a hook to trigger a particular behaviour in some algorithm somewhere, file a separate bug, we can do that.

Comment 3 Arun 2012-10-15 18:16:48 UTC

I don't fully understand what you mean by "not a good way to write specs."  Bugs like Bug 17758 were filed because Mozilla engineers treat microtasks like a part of the HTML5 specification -- see https://bugzilla.mozilla.org/show_bug.cgi?id=773132#c10

You'd be doing implementors a favor by defining "outermost script execution" which has been mistakenly called a microtask. 

The use case would be for resources such as string representation for URLs coined using createObjectURL to be culled at the next microtask.  Neither "stable state" nor "performing a microtask checkpoint" is sufficient.

Comment 4 Arun 2012-10-16 15:15:21 UTC

See Bug 19554 for a successor conversation.