This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12067 - Improve error handling in workers
Summary: Improve error handling in workers
Status: RESOLVED FIXED
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: Web Workers (editor: Ian Hickson)
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: public-webapps-bugzilla
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-14 22:10 UTC by Jonas Sicking (Not reading bugmail)
Modified: 2011-06-21 21:21 UTC

Description Jonas Sicking (Not reading bugmail) 2011-02-14 22:10:57 UTC
The error handling for workers is currently somewhat lacking. Consider a scenario where window A creates dedicated worker B, which creates dedicated worker C. Currently, if an error occurs in C, the following error handling occurs:

1. Fire "error" event on the global object for worker C.
2. Fire "error" event on the worker object associated with C.
3. Fire "error" event on the worker object associated with B.
(and so on for each "parent" dedicated worker)
4. Report error to the user.

If at any point an error is canceled (by calling event.preventDefault while the event is being fired), the steps are aborted and no further events are fired.

There are two, related, problems here:

First, window.onerror isn't involved. It generally works as a choke point for catching any programming error, but currently it doesn't catch errors in workers. Instead, developers are required to add an event listener to every worker they create and have that listener forward the errors to whatever code handles window.onerror.

The second problem is that the same issue occurs within a worker. That is, if a worker creates several workers and wants to catch all errors occurring within itself or its "sub workers", it needs to listen both to globalscope.onerror and to the error event on every worker it creates.
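
As a rough illustration of the forwarding boilerplate described above, here is a minimal sketch. The helper name createForwardingWorker and its WorkerCtor parameter are hypothetical and are not part of any spec; the constructor is injectable only so that the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper (not a spec API): wraps worker creation so that the
// "error" event of every worker created through it reaches one shared handler.
// WorkerCtor defaults to the browser's Worker but can be swapped for a stand-in.
function createForwardingWorker(url, onAnyError, WorkerCtor = globalThis.Worker) {
  const worker = new WorkerCtor(url);
  // Under the current spec text this listener is needed on every worker,
  // because worker errors never reach window.onerror on their own.
  worker.addEventListener("error", (event) => {
    onAnyError(event);
  });
  return worker;
}
```

Every call site has to remember to go through such a wrapper, which is the extra work the two problems above refer to.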


To fix this, I suggest the following error propagation:

1. Fire "error" event on the global object for worker C.
2. Fire "error" event on the worker object associated with C.
3. Fire "error" event on the global object for worker B.
4. Fire "error" event on the worker object associated with B.
(and so on for each "parent" dedicated worker)
5. Fire "error" event on window A.
6. Report error to the user.

As before, if an error is canceled, the steps are aborted and no further events are fired.
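
The proposed order can be sketched as a toy simulation. This is not a real Worker API: the scope objects, the onGlobalError and onWorkerObjectError fields, and the convention that a handler returns true to stand in for event.preventDefault are all assumptions made for illustration.

```javascript
// Toy model of the proposed propagation. "chain" is ordered innermost first,
// e.g. [C, B, A], and its last entry plays the role of the window.
// A handler returning true stands in for calling event.preventDefault().
function propagateError(chain, error) {
  const log = [];
  for (let i = 0; i < chain.length; i++) {
    const scope = chain[i];
    // First the global object of the scope itself...
    log.push(`global:${scope.name}`);
    if (scope.onGlobalError && scope.onGlobalError(error)) {
      return { log, reported: false };
    }
    // ...then, for workers only, the Worker object held by the parent.
    const isWindow = i === chain.length - 1;
    if (!isWindow) {
      log.push(`worker-object:${scope.name}`);
      if (scope.onWorkerObjectError && scope.onWorkerObjectError(error)) {
        return { log, reported: false };
      }
    }
  }
  // Nobody canceled the error, so it would be reported to the user.
  return { log, reported: true };
}
```

With three handler-less scopes named C, B and A, the log reads global:C, worker-object:C, global:B, worker-object:B, global:A, and reported is true. If B's global handler returns true, the log stops at global:B and nothing is reported.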

This allows any worker to discover, and handle, programming errors within itself by simply listening to the error events on its scope. It also allows catching all unhandled programming errors in the context of a page by simply listening to window.onerror.

This is what we have implemented in Firefox for as long as we've had workers, so it's unlikely to break the world.


For shared workers, the propagation should stop after firing the "error" event on the global object for the shared worker. This is similar to how error handling works on shared workers today.
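
A small sketch of where the event targets would end under this rule. The "kind" field and the errorTargets function are hypothetical names, and treating a dedicated worker nested inside a shared worker this way is an assumption extrapolated from the rule above rather than something stated in it.

```javascript
// Toy model: lists the targets an uncanceled error would visit.
// "chain" is ordered innermost first; each entry has a name and a kind of
// "dedicated", "shared" or "window" (all names here are illustrative).
function errorTargets(chain) {
  const targets = [];
  for (const scope of chain) {
    targets.push(`global:${scope.name}`);
    // Propagation stops once the shared worker's global object has been hit.
    if (scope.kind === "shared") break;
    // Only dedicated workers have a parent-side Worker object to fire at next.
    if (scope.kind === "dedicated") targets.push(`worker-object:${scope.name}`);
  }
  return targets;
}
```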
Comment 1 Jonas Sicking (Not reading bugmail) 2011-02-14 23:18:50 UTC
As an additional aside, if a change is made to the spec here (which I of course hope will happen), it would be nice if the "fire a worker error event" algorithm was written in such a way that it starts by firing an error on a given worker-global-scope. That way the algorithm can more easily be reused by other specs, such as IndexedDB, that want to fire errors at a given global scope.
Comment 2 Ian 'Hixie' Hickson 2011-02-15 01:57:37 UTC
Please file the request in comment 1 as a separate bug. I should be able to provide appropriate hooks for other specs to use regardless of how we fix this bug.
Comment 3 Ian 'Hixie' Hickson 2011-02-15 02:07:51 UTC
When I wrote this, my assumption was that (globalscope).onerror was the way to catch a local error, and then you might propagate it, in the case of workers, to the parent.

What you describe is a different model, where there's no way to catch a local error separate from a child's error, and that (globalscope).onerror is a generic mechanism for catching any error.

I can see value in both. I'm not sure which is most useful, off hand. One advantage of the first model is that you can implement either model using it, whereas with the second model you can't catch local errors. (To implement the second model, you just wrap the creation of a worker in a method that adds the onerror handler and runs the same one as for the global scope which catches local errors.)

What do implementations do so far?
Comment 4 Jonas Sicking (Not reading bugmail) 2011-02-26 01:08:49 UTC
(In reply to comment #3)
> When I wrote this, my assumption was that (globalscope).onerror was the way to
> catch a local error, and then you might propagate it, in the case of workers,
> to the parent.

How would you propagate it to a parent? Using postMessage and some onmessage machinery on the parent side? I agree that works, but is a fair amount of work. One lesson learned over the years for me is that people are lazy about error handling, so making it easy is a very important goal if we want people to do it.

> What you describe is a different model, where there's no way to catch a local
> error separate from a child's error, and that (globalscope).onerror is a
> generic mechanism for catching any error.
> 
> I can see value in both. I'm not sure which is most useful, off hand. One
> advantage of the first model is that you can implement either model using it,
> whereas with the second model you can't catch local errors. (To implement the
> second model, you just wrap the creation of a worker in a method that adds the
> onerror handler and runs the same one as for the global scope which catches
> local errors.)

That is not true. If you want to catch only local errors you can add an onerror handler to all created workers which calls .preventDefault on the error event.
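
A minimal sketch of that opt-out, assuming the propagation from comment 0. The helper name createIsolatedWorker and the injectable WorkerCtor parameter are hypothetical and exist only so the sketch can run outside a browser.

```javascript
// Hypothetical helper: creates a child worker whose errors are canceled at
// the Worker object, so they never continue on to the creating scope's
// own onerror. That scope then sees only its local errors.
function createIsolatedWorker(url, WorkerCtor = globalThis.Worker) {
  const worker = new WorkerCtor(url);
  worker.addEventListener("error", (event) => {
    event.preventDefault(); // cancel: stop further propagation of this error
  });
  return worker;
}
```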

As far as I can see both models support both behaviors, it's just a question of which is the default behavior.

(Also, I'd argue that in the model suggested in comment 0, it's easier to get "only local errors", compared to the amount of work required under the current spec model to get "all errors")

> What do implementations do so far?

Gecko does what comment 0 suggests.

I'm not sure what WebKit supports. It doesn't have window.onerror at all, so I'm not sure whether it has onerror in worker global scopes.
Comment 5 Ian 'Hixie' Hickson 2011-06-21 21:18:58 UTC
> That is not true. If you want to catch only local errors you can add an onerror
> handler to all created workers which calls .preventDefault on the error event.

That's a good point.

Ok, I've done what comment 0 suggests.

If comment 1 is still an issue please catch me on IRC or file another bug. I'm happy to make the spec have whatever hooks are useful for other specs, but it's hard to know exactly what other specs need without a concrete case in mind.
Comment 6 contributor 2011-06-21 21:21:02 UTC
Checked in as WHATWG revision r6263.
Check-in comment: Make worker errors propagate all the way up to the Document, hitting the global scope .onerror each step of the way. This isn't a great way to specify it but I couldn't work out a cleaner way that didn't involve major (potentially risky) surgery and inventing new terms. If it turns out that there are other things that'd need the parent/child infrastructure to be better defined I'll do it then.
http://html5.org/tools/web-apps-tracker?from=6262&to=6263