Bug 15986 - Specify exactly how and when ECMAScript arguments are evaluated
Status: RESOLVED FIXED
Product: WebAppsWG
Classification: Unclassified
Component: WebIDL
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assigned To: Cameron McCormack
QA Contact: public-webapps-bugzilla
Depends on: 16040
Reported: 2012-02-14 16:52 UTC by Aryeh Gregor
Modified: 2012-03-19 08:19 UTC (History)
Description Aryeh Gregor 2012-02-14 16:52:19 UTC
Consider the following:

data:text/html,<!doctype html>
<script>
var x = 'unchanged';
try {
document.createRange().setStart(window,
{ valueOf: function() {x = 'changed'; return 0;} } );
} catch (e) {}
document.documentElement.textContent = x;
</script>

It outputs "unchanged" in Firefox 13.0a1, and "changed" in IE10 Developer Preview, Chrome 18 dev, and Opera Next 12.00 alpha.  This outputs "changed" in all four browsers:

data:text/html,<!doctype html>
<script>
var x = 'unchanged';
try {
document.createRange().compareBoundaryPoints(
{ valueOf: function() {x = 'changed'; return 0;} },
window);
} catch (e) {}
document.documentElement.textContent = x;
</script>

So it seems like Gecko evaluates the arguments one by one, and throws when it hits the first bad argument.  Other browsers seem to evaluate all at once.  Which is correct?

Also, with things like this that have side effects, it's important that the number of evaluations is defined.  Given this:

data:text/html,<!doctype html>
<script>
var cnt = 0;
document.createRange().setStart(
document.head,
{ valueOf: function() {cnt++; return 1;} });
document.documentElement.textContent = cnt;
</script>

Chrome outputs "2", and the other three output "1".  Presumably we want "1".

The spec should say exactly how all this works.  I'm not sure what the right answer is, actually.  We need to evaluate the arguments before we can run the overload resolution algorithm, but to evaluate the arguments we want to know what types we're trying to evaluate them as, right?
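The difference between the two observed behaviours can be sketched outside a browser. This is only an illustration: FakeNode, setStartLeftToRight and setStartCoerceFirst are hypothetical stand-ins, not real DOM or engine code, and the assumption is simply that one strategy checks arguments in order while the other coerces primitives before type-checking objects.

```javascript
// Hypothetical stand-in for a DOM Node; not a real DOM class.
function FakeNode() {}

// Strategy 1: handle arguments one by one, throwing at the first bad one.
function setStartLeftToRight(node, offset) {
  if (!(node instanceof FakeNode)) {
    throw new TypeError("argument 1 is not a node");
  }
  return Number(offset); // ToNumber calls offset.valueOf()
}

// Strategy 2: coerce primitive arguments first, then check object types.
function setStartCoerceFirst(node, offset) {
  var converted = Number(offset); // valueOf() runs before the type check
  if (!(node instanceof FakeNode)) {
    throw new TypeError("argument 1 is not a node");
  }
  return converted;
}

// Mirrors the first test case: a bad first argument plus a valueOf side effect.
function run(binding) {
  var x = "unchanged";
  try {
    binding({}, { valueOf: function () { x = "changed"; return 0; } });
  } catch (e) {}
  return x;
}

console.log(run(setStartLeftToRight)); // unchanged
console.log(run(setStartCoerceFirst)); // changed
```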
Comment 1 Boris Zbarsky 2012-02-14 17:35:30 UTC
There are two places that evaluate arguments in WebIDL.  One is the argument resolution algorithm at http://dev.w3.org/2006/webapi/WebIDL/#dfn-argument-resolution-algorithm and is defined to go one by one.

The other is the overload resolution algorithm, and the order it examines the arguments in depends on the precise overload set right now (because some IDL argument types require that the actual passed-in argument be examined and some do not, as far as I can tell).  It doesn't evaluate things per se, but it can return a null overload set after examining only some of the arguments, so can prevent entry into the argument resolution algorithm altogether.

So this is all specified already.  The question is whether the specification makes sense.  For example, does it make sense to run the overload algorithm on overload sets of size 1 to start with?
Comment 2 Allen Wirfs-Brock 2012-02-14 18:02:51 UTC
A clarification, because the title of this bug may be misleading.

The evaluation order of the arguments to a function by the caller is fully specified by the ECMAScript specification.  The issue here seems to be the order in which the formal parameters (the values of the arguments after they have been evaluated as part of performing the ECMAScript call) are accessed by the callee in a manner that possibly produces observable side-effects. 

From an ECMAScript perspective, the callee includes both intermediary layers such as the Web IDL overload resolution process and the actual code of the resolved function.

To get a precise ordering of side-effects it is necessary that both the overload resolution algorithm and the individual function specifications provide a full and complete ordering of all potentially side-effect-producing operations (including ToString and ToNumber coercions). ECMAScript uses algorithmic specifications to ensure such an ordering for its built-in library functions.

I believe that the Web IDL overload resolution specification does define a precise ordering for such side-effects.

However, for the actual web API functions it isn't clear how you would define the ordering of observable side-effects if you don't want to use an algorithmic specification.

In summary, ECMAScript already does what it needs to do.  I believe (but this needs confirmation) that Web IDL does what it needs to do. However, implementations may not yet conform to the current Web IDL specification. Once that occurs, it still leaves every function defined using Web IDL as being responsible for specifying observable side-effect order.
Comment 3 Aryeh Gregor 2012-02-14 19:23:16 UTC
(In reply to comment #1)
> So this is all specified already.  The question is whether the specification
> makes sense.  For example, does it make sense to run the overload algorithm on
> overload sets of size 1 to start with?

Hmm, okay.  I do think that one desirable property of whatever we wind up with is that nothing should be evaluated more than once.  By "evaluated" I guess I mean "toString() or valueOf() is called" -- anything that might have side effects or be nondeterministic.

This is currently the case per spec, right?  The only thing that will call those is conversion to IDL values, and that's only in the argument resolution algorithm?  If so, the current spec seems reasonable enough.

The second test-case from comment #0 indicates that Gecko doesn't follow the spec, doesn't it?  The overload resolution algorithm should throw before any arguments are converted to IDL types, so it should output "unchanged".


Observation: in practice, almost no operations are overloaded, and the few that are could be converted to have optional arguments or use union types.  What's an example of a place where we need overloading at all?  For instance, in HTML, the following canvas operations

  void drawImage(HTMLImageElement image, double dx, double dy);
  void drawImage(HTMLImageElement image, double dx, double dy, double dw, double dh);
  void drawImage(HTMLImageElement image, double sx, double sy, double sw, double sh, double dx, double dy, double dw, double dh);
  void drawImage(HTMLCanvasElement image, double dx, double dy);
  void drawImage(HTMLCanvasElement image, double dx, double dy, double dw, double dh);
  void drawImage(HTMLCanvasElement image, double sx, double sy, double sw, double sh, double dx, double dy, double dw, double dh);
  void drawImage(HTMLVideoElement image, double dx, double dy);
  void drawImage(HTMLVideoElement image, double dx, double dy, double dw, double dh);
  void drawImage(HTMLVideoElement image, double sx, double sy, double sw, double sh, double dx, double dy, double dw, double dh);

could become

 void drawImage(
   (HTMLImageElement or HTMLCanvasElement or HTMLVideoElement) image,
   double x1, double y1,
   optional double w1, optional double h1,
   optional double x2, optional double y2,
   optional double w2, optional double h2);

with an extra couple lines of prose to handle the corner cases.  I don't suppose we could just get rid of overloading?

(In reply to comment #2)
> The evaluation order of the arguments to a function by the caller is fully
> specified by the ECMAScript specification.  The issue here seems to be the
> order in which the formal parameters (the values of the arguments after they
> have been evaluated as part of performing the ECMAScript call) are accessed by
> the callee in a manner that possibly produces observable side-effects. 

Do you mean this?  http://es5.github.com/#x11.2.4  That calls GetValue() on the arguments.  If I'm reading things correctly, calling GetValue() on {valueOf: function() { return 'foo'; }} will not invoke the valueOf function.  So any calls of that function will be only because of WebIDL.

> However, for the actual web api functions it isn't clear how you would define
> the ordering of observable side-effect if you don't want to use an algorithmic
> specification.

Operations in WebIDL operate on IDL values, and accessing IDL values can't have side effects, I don't think.  E.g., if you have an operation that accepts a long and pass {valueOf: function() { return 7 }}, WebIDL will call ToNumber() on it, and then pass the IDL value '7' to the actual operation.  So I think we only have to worry about WebIDL here.
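As a rough sketch of that separation: the binding layer converts the ECMAScript value once, and the operation only ever sees the resulting number. The names toIdlLong and makeBinding are made up for this example, and the conversion is simplified to ToNumber followed by ToInt32.

```javascript
// Simplified stand-in for the Web IDL "long" conversion: ToNumber, then ToInt32.
function toIdlLong(value) {
  return Number(value) | 0;
}

// Hypothetical binding wrapper: convert once, then hand the IDL value over.
function makeBinding(impl) {
  return function (offset) {
    return impl(toIdlLong(offset));
  };
}

var cnt = 0;
var setOffset = makeBinding(function (offset) {
  // The implementation reads the converted value twice; valueOf() is not re-run.
  return offset + offset;
});

var result = setOffset({ valueOf: function () { cnt++; return 7; } });
console.log(result, cnt); // 14 1
```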

> In summary, ECMAScript already does what it needs to do.  I believe (but this
> needs confirmation) that Web IDL does what it needs to do. However,
> implementations may not yet conform to the current Web IDL specification. Once
> that occurs, it still leaves every function defined using Web IDL as being
> responsible for specifying observable side-effect order.

It seems to me that Boris is right: everything is well-defined, but it's not clear that WebIDL's definitions are what we actually want.
Comment 4 Boris Zbarsky 2012-02-14 19:34:47 UTC
> This is currently the case per spec, right? 

I believe so, yes.

> What's an example of a place where we need overloading at all?

Good question.  Overloading predates union types...

> The second test-case from comment #0 indicates that Gecko doesn't follow the
> spec, doesn't it?

Well, since the spec postdates the relevant Gecko code, it couldn't exactly "follow" it.  ;)

We're working on WebIDL-compliant binding code right now, though.  And again, I think that for the one-overload case it's not clear that running overload resolution as the spec does makes sense.
Comment 5 Allen Wirfs-Brock 2012-02-14 20:52:41 UTC
(In reply to comment #3)
> (In reply to comment #1)
...
> could become
> 
>  void drawImage(
>    (HTMLImageElement or HTMLCanvasElement or HTMLVideoElement) image,
>    double x1, double y1,
>    optional double w1, optional double h1,
>    optional double x2, optional double y2,
>    optional double w2, optional double h2);
> 
> with an extra couple lines of prose to handle the corner cases.  I don't
> suppose we could just get rid of overloading?

Note this is pretty much exactly what an ECMAScript programmer would do to implement a similar set of operations using pure ECMAScript.  The discrimination of the first argument type and the optional arguments would be done in the logic of the function body. If ECMAScript is used to implement operations defined using Web IDL, then a single function like this would still be used.  However, the Web IDL spec requires that the function internally perform the Web IDL overload resolution algorithm, even if it isn't the most efficient or effective way to implement this specific combination of logical operations.

Overload resolution is a one-size-fits-all solution.
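For illustration, a hand-written function of the kind described above might look roughly like this. It is a hypothetical sketch, not the canvas API: it returns a description of the call instead of drawing, and it accepts any object as the image.

```javascript
// Hypothetical pure-ECMAScript drawImage; the dispatch lives in the function body.
function drawImage(image, a, b, c, d, e, f, g, h) {
  if (image === null || typeof image !== "object") {
    throw new TypeError("image must be an object");
  }
  // Discriminate on the number of arguments actually passed.
  switch (arguments.length) {
    case 3:
      return { dx: +a, dy: +b };
    case 5:
      return { dx: +a, dy: +b, dw: +c, dh: +d };
    case 9:
      return { sx: +a, sy: +b, sw: +c, sh: +d, dx: +e, dy: +f, dw: +g, dh: +h };
    default:
      throw new TypeError("drawImage takes 3, 5 or 9 arguments");
  }
}

console.log(drawImage({}, 10, 20));         // { dx: 10, dy: 20 }
console.log(drawImage({}, 10, 20, 30, 40)); // { dx: 10, dy: 20, dw: 30, dh: 40 }
```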


> 
> (In reply to comment #2)
> 
> Do you mean this?  http://es5.github.com/#x11.2.4  That calls GetValue() on the
> arguments.  If I'm reading things correctly, calling GetValue() on {valueOf:
> function() { return 'foo'; }} will not invoke the valueOf function.  So any
> calls of that function will be only because of WebIDL.

yes, that is correct.

You can think of GetValue as the internal specification operation that ECMAScript uses to force evaluation of an expression.  In the case of a call like:

foo(1+2, {valueOf: function(){return 4}}, 3+{valueOf: function(){return 4}})

GetValue is applied in left to right order to each of the argument expressions, producing 3, an object with a valueOf method, and 7.  These are the values that are initially assigned to the formal parameters of foo. If evaluating any of the argument expressions had side-effects, they would have occurred in left to right order of the expressions.

This GetValue has no direct relationship with the user-level valueOf property defined by the object in the second argument expression. However, evaluating the third argument includes evaluating the + operator, and evaluating + applies the ToPrimitive internal operation to objects, and ToPrimitive calls the valueOf method.
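The ordering described above can be observed directly. In this small sketch, tag and log are helper names invented for the example; tag records when each argument expression has finished evaluating.

```javascript
var log = [];

// Records that an argument expression has been fully evaluated.
function tag(name, value) {
  log.push(name);
  return value;
}

function foo(a, b, c) {
  return [a, typeof b, c];
}

var result = foo(
  tag("first", 1 + 2),
  tag("second", { valueOf: function () { log.push("second.valueOf"); return 4; } }),
  tag("third", 3 + { valueOf: function () { log.push("third.valueOf"); return 4; } })
);

console.log(result);         // [ 3, 'object', 7 ]
console.log(log.join(", ")); // first, second, third.valueOf, third
```

The valueOf method of the second argument is never called, because nothing in the caller or in foo coerces that object to a primitive.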

> ...
> Operations in WebIDL operate on IDL values, and accessing IDL values can't have
> side effects, I don't think.

I wonder, might an object defined in ECMAScript meet all the type requirements of a Web IDL interface while still producing side-effects on certain property accesses?  This probably starts to approach the controversial topic of whether duck typed ECMAScript objects are allowable as implementations of Web IDL interfaces or is some form of nominal typing required.
Comment 6 Boris Zbarsky 2012-02-14 21:13:58 UTC
> I wonder, might a object defined in ECMAScript meet all the type requirements
> of an Web IDL interface while still producing side-effects on certain property
> accesses.

Only for callback interfaces and dictionaries (for which the spec using them would in fact have to define exact behavior).

> or is some form of nominal typing required.

Right now the spec requires it, yes.  There's all sorts of state that DOM objects of various sorts store internally but don't expose via public APIs that DOM objects need to be able to get to, such that in most cases duck typing would at best lead to exceptions thrown early on (just like nominal typing) and at worst lead to data structures in undefined states...

This does complicate a pure ES implementation of the DOM, of course.
Comment 7 Cameron McCormack 2012-02-14 23:55:48 UTC
Yes, argument values are consistently evaluated/converted left to right.  Regarding invoking the overload resolution algorithm even when there is only a single operation, consider the following.  Assume we didn't call the overload resolution algorithm with a single operation, and start with:

  void f(long x, Node n);

Call f({ valueOf: function() { throw "hi" }}, window).  This will result in "hi" being thrown, since we just do the argument conversion from left to right without checking the values that were passed in before doing the conversions.

If we add an overload:

  void f(long x, Node n);
  void f();

then calling f({ valueOf: function() { throw "hi" }}, window) will now throw a TypeError due to there being no appropriate overload (window didn't match Node).

I think it is preferable to avoid this kind of behaviour change for existing calls when introducing a new overload.
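A rough simulation of that behaviour change, runnable outside a browser. FakeNode, fSingle and fOverloaded are hypothetical names; fOverloaded only approximates overload resolution by inspecting the arguments before converting any of them.

```javascript
function FakeNode() {} // hypothetical stand-in for the Node interface

// Single operation, no overload resolution: plain left-to-right conversion.
function fSingle(x, n) {
  var convertedX = Number(x) | 0; // may run user code
  if (!(n instanceof FakeNode)) throw new TypeError("n is not a Node");
  return convertedX;
}

// Two overloads: resolution inspects the arguments before any conversion.
function fOverloaded(x, n) {
  if (arguments.length === 0) return undefined;          // void f();
  if (arguments.length === 2 && n instanceof FakeNode) { // void f(long x, Node n);
    return Number(x) | 0;
  }
  throw new TypeError("no matching overload");
}

// Calls fn the way the example does and reports what was thrown.
function thrownBy(fn) {
  try {
    fn({ valueOf: function () { throw "hi"; } }, {});
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : e;
  }
  return "nothing";
}

console.log(thrownBy(fSingle));     // hi
console.log(thrownBy(fOverloaded)); // TypeError
```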
Comment 8 Aryeh Gregor 2012-02-15 15:27:46 UTC
Now that union types exist, is there any reason to continue supporting overloading at all?  It would be a considerable simplification to get rid of it.
Comment 9 Aryeh Gregor 2012-02-15 15:28:36 UTC
(Alternatively, I'd say get rid of union types.  One way or another, I don't think we need both.)
Comment 10 Boris Zbarsky 2012-02-15 15:43:34 UTC
I recall what the argument for keeping overloads was now.

With just unions and optional args, there is no way to express this overload set:

  void f(long x, long y);
  void f(long x, long y, long width, long height);

which some specs definitely wanted...
Comment 11 Aryeh Gregor 2012-02-15 15:53:16 UTC
That only requires a very special case of overloading -- where one variant's arguments are a prefix of the other's.  If that's all we need, then replace the overload resolution algorithm with just throwing an exception if the number of arguments is wrong.  No type-checking would be needed at all.  So HTML's drawImage would become

  void drawImage(
    (HTMLImageElement or HTMLCanvasElement or HTMLVideoElement) image,
    double dx, double dy);
  void drawImage(
    (HTMLImageElement or HTMLCanvasElement or HTMLVideoElement) image,
    double dx, double dy, double dw, double dh);
  void drawImage(
    (HTMLImageElement or HTMLCanvasElement or HTMLVideoElement) image,
    double sx, double sy, double sw, double sh,
    double dx, double dy, double dw, double dh);

(Although this seems like not a great API anyway -- too many arguments that can mean different things depending on how you call it.  It doesn't seem like the end of the world if we require extra prose to support this case.)
Comment 12 Boris Zbarsky 2012-02-15 16:00:10 UTC
> If that's all we need, then replace the overload resolution algorithm with just
> throwing an exception if the number of arguments is wrong.

I would personally be fine with that.  Can we survey existing usage of overloads in specs?
Comment 13 Cameron McCormack 2012-02-20 02:02:25 UTC
(In reply to comment #12)
> I would personally be fine with that.  Can we survey existing usage of
> overloads in specs?

All the existing uses of overloads that I could find are below.  Many of them can be replaced with uses of optional and union types.  The ones that cannot be rewritten to use only overloading with different argument list lengths (where shorter ones are all prefixes of the longer ones) are BlobBuilder.append, CanvasRenderingContext2D.createImageData and DataTransferItemList.add.


File API:

  [Constructor,
   Constructor(any[] blobParts, optional BlobPropertyBag options)]
  interface Blob {
    ...
  };

File Writer:

  interface BlobBuilder {
    void append(in DOMString text, optional in DOMString endings);
    void append(in Blob data);
    void append(in ArrayBuffer data);
  };

HTML:

  interface HTMLOptionsCollection : HTMLCollection {
    void add(HTMLOptionElement element, optional HTMLElement? before);
    void add(HTMLOptGroupElement element, optional HTMLElement? before);
    void add(HTMLOptionElement element, long before);
    void add(HTMLOptGroupElement element, long before);
  };

  partial interface Document {
    Document open(optional DOMString type, optional DOMString replace);
    WindowProxy open(DOMString url, DOMString name, DOMString features,
                     optional boolean replace);

    boolean execCommand(DOMString commandId);
    boolean execCommand(DOMString commandId, boolean showUI);
    boolean execCommand(DOMString commandId, boolean showUI, DOMString value);
  };

  [NamedConstructor=Image(),
   NamedConstructor=Image(unsigned long width),
   NamedConstructor=Image(unsigned long width, unsigned long height)]
  interface HTMLImageElement : HTMLElement {
    ...
  };

  [NamedConstructor=Audio(),
   NamedConstructor=Audio(DOMString src)]
  interface HTMLAudioElement : HTMLMediaElement {
  };

  interface CanvasRenderingContext2D {
    CanvasPattern createPattern(HTMLImageElement image, DOMString repetition);
    CanvasPattern createPattern(HTMLCanvasElement image, DOMString repetition);
    CanvasPattern createPattern(HTMLVideoElement image, DOMString repetition);

    void drawImage(HTMLImageElement image,
                   double dx, double dy);
    void drawImage(HTMLImageElement image,
                   double dx, double dy, double dw, double dh);
    void drawImage(HTMLImageElement image,
                   double sx, double sy, double sw, double sh,
                   double dx, double dy, double dw, double dh);
    void drawImage(HTMLCanvasElement image,
                   double dx, double dy);
    void drawImage(HTMLCanvasElement image,
                   double dx, double dy, double dw, double dh);
    void drawImage(HTMLCanvasElement image,
                   double sx, double sy, double sw, double sh,
                   double dx, double dy, double dw, double dh);
    void drawImage(HTMLVideoElement image,
                   double dx, double dy);
    void drawImage(HTMLVideoElement image,
                   double dx, double dy, double dw, double dh);
    void drawImage(HTMLVideoElement image,
                   double sx, double sy, double sw, double sh,
                   double dx, double dy, double dw, double dh);

    ImageData createImageData(double sw, double sh);
    ImageData createImageData(ImageData imagedata);

    void putImageData(ImageData imagedata, double dx, double dy);
    void putImageData(ImageData imagedata, double dx, double dy,
                      double dirtyX, double dirtyY,
                      double dirtyWidth, double dirtyHeight);
  };

  interface HTMLSelectElement : HTMLElement {
    void add(HTMLOptionElement element, optional HTMLElement? before);
    void add(HTMLOptGroupElement element, optional HTMLElement? before);
    void add(HTMLOptionElement element, long before);
    void add(HTMLOptGroupElement element, long before);
  };

  [NamedConstructor=Option(),
   NamedConstructor=Option(DOMString text),
   NamedConstructor=Option(DOMString text, DOMString value),
   NamedConstructor=Option(DOMString text, DOMString value,
                           boolean defaultSelected),
   NamedConstructor=Option(DOMString text, DOMString value,
                           boolean defaultSelected, boolean selected)]
  interface HTMLOptionElement : HTMLElement {
    ...
  };

  interface WindowTimers {
    long setTimeout(Function handler, optional long timeout, any... args);
    long setTimeout(DOMString handler, optional long timeout, any... args);

    long setInterval(Function handler, optional long timeout, any... args);
    long setInterval(DOMString handler, optional long timeout, any... args);
  };

  interface DataTransferItemList {
    DataTransferItem? add(DOMString data, DOMString type);
    DataTransferItem? add(File data);
  };

  [Constructor(DOMString url, optional DOMString protocols),
   Constructor(DOMString url, optional DOMString[] protocols)]
  interface WebSocket : EventTarget {
    ...
  };

XHR:

  interface XMLHttpRequest : XMLHttpRequestEventTarget {
    void send();
    void send(ArrayBuffer data);
    void send(Blob data);
    void send(Document data);
    void send(DOMString? data);
    void send(FormData data);
  };
Comment 14 Allen Wirfs-Brock 2012-02-20 02:40:05 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > I would personally be fine with that.  Can we survey existing usage of
> > overloads in specs?
> 
> All the existing uses of overloads that I could find are below.  Many of them
> can be replaced with uses of optional and union types.  The ones that cannot
> be rewritten to use only overloading with different argument list lengths
> (where shorter ones are all prefixes of the longer ones) are
> BlobBuilder.append, CanvasRenderingContext2D.createImageData and
> DataTransferItemList.add.
> 

I would think that if there are only these three APIs that require overloads then it would be much more economical to treat them as special cases than it would be to provide the full complexity of the overload mechanism.

If necessary, those three could be described as (any ... args) with a prose description of the legal argument combinations.

However, some of them could be more specific, for example the arguments for DataTransferItemList.add might be expressed as:
((DOMString or File) data, optional DOMString type)

with the prose specification saying that the second argument is only legal when the first argument is a DOMString.


Implementations for languages with overloads could always manually map these few prose descriptions back into actual language specific overloads.

Using this technique and eliminating overloading would  probably also discourage the creation of future APIs that are difficult to
Comment 15 Cameron McCormack 2012-02-20 03:49:49 UTC
(In reply to comment #14)
> However, some of them could be more specific, for example the arguments for
> DataTransferItemList.add might be expressed as:
> ((DOMString or File) data, optional DOMString type)
> 
> with the prose specification saying that the second argument is only legal when
> the first argument is a DOMString.

Yes, I think we should encourage spec writers to be as specific as possible here.

The specific changes we would make then are:

  interface BlobBuilder {
    void append(in (Blob or ArrayBuffer or DOMString) data,
                in optional DOMString endings);

    // Prose for append says to throw a TypeError if endings is
    // specified when data is not a DOMString.
  };

  interface CanvasRenderingContext2D {
    ImageData createImageData((ImageData or double) arg1, optional double arg2);

    // Looks strange, and obscures the two valid invocations of the operation.
    // Prose for createImageData says to throw a TypeError if arg2 is specified
    // and arg1 is not a double.  The alternative would be:
    //
    //   ImageData createImageData(any... args);
    //
    // and write more prose to say how to convert non-Number non-ImageData
    // values that were passed as the first argument, and how to convert
    // non-Number values passed as the second argument.
  };

  interface DataTransferItemList {
    DataTransferItem? add((File or DOMString) data, optional DOMString type);

    // Prose for add says to throw a TypeError if type is specified
    // when data is not a DOMString.
  };

CCing Ian for his take on the latter two, since they're in his spec.
Comment 16 Boris Zbarsky 2012-02-20 06:06:51 UTC
> The ones that cannot be rewritten to use only overloading with different
> argument list lengths (where shorter ones are all prefixes of the longer ones)

Why is the parenthetical a requirement?  For example, for createImageData it seems like the arg count is enough to identify the overload.  Same for DataTransferItemList.add.

Even BlobBuilder.append could be rewritten as follows:

  void append((DOMString or Blob or ArrayBuffer) data);
  void append(DOMString text, DOMString endings);

which would allow using the arg count to select the correct one of the two overloads....

Or is the proposal to get rid of argcount-based overloading altogether (except for the special case optional arguments provide)?  But then you also can't express the case from comment 10.
Comment 17 Cameron McCormack 2012-02-20 06:29:32 UTC
(In reply to comment #16)
> > The ones that cannot be rewritten to use only overloading with different
> > argument list lengths (where shorter ones are all prefixes of the longer ones)
> 
> Why is the parenthetical a requirement?  For example, for createImageData seems
> like the arg count is enough to identify the overload.  Same for
> DataTransferItemList.add.
> 
> Even BlobBuilder.append could be rewritten as follows:
> 
>   void append((DOMString or Blob or ArrayBuffer) data);
>   void append(DOMString text, DOMString endings);
> 
> which would allow using the arg count to select the correct one of the two
> overloads....
> 
> Or is the proposal to get rid of argcount-based overloading altogether (except
> for the special case optional arguments provide)?  But then you also can't
> express the case from comment 10.

I guess I was still thinking that we would want to avoid different types in the same position in the argument list for some reason, but if we are looking at argcount solely, then you're right it doesn't matter.  I'm happy with this.

As a reminder, if we have cases like

  ImageData createImageData(double sw, double sh);
  ImageData createImageData(ImageData imagedata);

then calling createImageData(0) will throw a TypeError (0 is not an ImageData object) and calling createImageData(anImageData, 0) will be equivalent to calling createImageData(NaN, 0).  That's the same as the behaviour required today.
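A small sketch of selecting purely on argument count, to make those two outcomes concrete. FakeImageData and this createImageData are hypothetical stand-ins; the conversion to double is approximated with Number().

```javascript
function FakeImageData() {} // hypothetical stand-in for the ImageData interface

// Dispatch on the number of arguments only, then convert.
function createImageData() {
  if (arguments.length === 1) {
    // ImageData createImageData(ImageData imagedata);
    if (!(arguments[0] instanceof FakeImageData)) {
      throw new TypeError("argument 1 is not an ImageData");
    }
    return { from: "imagedata" };
  }
  if (arguments.length === 2) {
    // ImageData createImageData(double sw, double sh);
    return { sw: Number(arguments[0]), sh: Number(arguments[1]) };
  }
  throw new TypeError("wrong number of arguments");
}

try {
  createImageData(0);
} catch (e) {
  console.log(e instanceof TypeError); // true
}

// An object in the first position is converted to a double, giving NaN.
console.log(createImageData(new FakeImageData(), 0)); // { sw: NaN, sh: 0 }
```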
Comment 18 Aryeh Gregor 2012-02-20 18:51:20 UTC
So basically we'd require that any pair of operations on the same interface with the same name must require a different number of arguments.  For instance, foo(int a, optional int b) cannot coexist with any other operation named "foo" that takes one or two arguments, but could coexist with one that takes zero, or three.  That would mean the overload resolution algorithm could be scrapped, and it looks like no new prose would be needed anywhere.  Then this bug is easy to resolve.  Sounds like a win-win situation to me.
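The restriction could be checked mechanically. The following is a sketch under the assumption that each overload is described only by how many required and optional arguments it has; acceptedCounts and countsAreDisjoint are names invented for this example, and variadic arguments are ignored.

```javascript
// Lists every argument count an overload accepts.
function acceptedCounts(overload) {
  var counts = [];
  var max = overload.required + overload.optional;
  for (var n = overload.required; n <= max; n++) {
    counts.push(n);
  }
  return counts;
}

// Returns true when no argument count is accepted by two overloads.
function countsAreDisjoint(overloads) {
  var seen = {};
  for (var i = 0; i < overloads.length; i++) {
    var counts = acceptedCounts(overloads[i]);
    for (var j = 0; j < counts.length; j++) {
      if (seen[counts[j]]) return false;
      seen[counts[j]] = true;
    }
  }
  return true;
}

var fooAB = { required: 1, optional: 1 }; // foo(int a, optional int b)
console.log(countsAreDisjoint([fooAB, { required: 0, optional: 0 }])); // true
console.log(countsAreDisjoint([fooAB, { required: 3, optional: 0 }])); // true
console.log(countsAreDisjoint([fooAB, { required: 2, optional: 0 }])); // false
```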
Comment 19 Cameron McCormack 2012-03-06 04:08:58 UTC
Dave Herman pointed out another current use of overloads in the Typed Arrays spec, which defines interfaces like this:

  [Constructor(unsigned long length),
   Constructor(Uint32Array array),
   Constructor(unsigned long[] array),
   Constructor(ArrayBuffer buffer,
                 optional unsigned long byteOffset,
                 optional unsigned long length)]
  interface Uint32Array ...

which would need to be rewritten as:

  [Constructor((unsigned long or Uint32Array or
                unsigned long[] or ArrayBuffer) init),
   Constructor(ArrayBuffer buffer, unsigned long byteOffset,
               optional unsigned long length)]
  interface Uint32Array ...

so we have to use a less useful argument name there.  It's also a fair bit less clear to someone reading the IDL what the different allowable invocations are.  Is that acceptable?  I find the top one much more readable, anyway, and if possible I think it would be good to keep allowing it.

I should point out that if we make overload resolution look only at argument list length that we still need to do much of the same inspections of JS values for union types.  Step 10 and onwards of the overload resolution algorithm is pretty much the same as the steps in http://dev.w3.org/2006/webapi/WebIDL/#es-union.

What are the concrete bad things about the current overload resolution algorithm?

Boris brought up the fact that it is currently invoked even when there is only a single operation.  In off-bug discussion, I talked about the difference between running the overload resolution algorithm and not and that I wanted to remove the difference between these two cases so that we didn't have different exception throwing behaviour if we introduce an overload later on.  The example I had was that you start with this:

  void f(long x, Node n);

and say content does this call:

  f({ valueOf: function() { throw "hi" } }, window);

If we do not call the overload resolution algorithm, since there's only a single IDL operation, the result will be "hi" being thrown, since we just do left-to-right argument conversion.

If we then later introduce an overload,

  void f(long x, Node n);
  void f();

this means the above call would first invoke the overload resolution algorithm, which would throw TypeError because there was no match (window isn't a Node).  


Here's another option off the top of my head: we could make it so that arguments to the left of the one that is used to determine which overload is selected always get converted first.  So if we start off with:

  void g(long x, Node y, Node z);

then calling

  g({ valueOf: function() { throw "hi" } }, 0, 0);

would throw "hi", and if we introduce another overload later:

  void g(long x, Node y, Node z);
  void g(long x, Window y, Window z);

then it would still throw "hi" because the steps would be:

  1. Eliminate the overloads whose arg count doesn't match.
  2. Convert the first argument.
  3. Inspect the second argument value, and use it to select the overload (or
       throw TypeError if none are appropriate).
  4. Convert the third argument according to which overload we selected.

We would need to have a restriction that all arguments to the left of the one used to discriminate the overloads are all the same type, for a given arg count.

This would allow us to keep the more readable overloads.
Comment 20 Aryeh Gregor 2012-03-13 16:57:32 UTC
(In reply to comment #19)
> What are the concrete bad things about the current overload resolution
> algorithm?

I think the major bad thing is that arguments are evaluated an unpredictable number of times.  If I call a WebIDL operation on an object with valueOf, I expect it to be called either zero times or once, and predictably so.  The simplest thing would be to evaluate all arguments in advance, but I guess we can't do that because it would require knowing in advance which arguments we need to convert to primitives and which we want to leave as-is.

But frankly this is a relatively marginal concern.  At this point I think we've spent about as much effort on it as it deserves, if not more.  If the current spec is unambiguous and makes enough sense in common cases, I don't have a problem with leaving it alone.  The problem is a lot hairier than I had initially assumed, because I hadn't thought about the fact that identifying the type of an object can have side-effects in JS.
Comment 21 Boris Zbarsky 2012-03-13 19:02:50 UTC
Identifying the type of an object should not have side-effects.  I don't believe it does in the current spec.
Comment 22 Ms2ger 2012-03-14 09:53:24 UTC
I think it makes sense to only allow one overload for each number of arguments, and handle the current overloads with unions. For the typed array case, that would make the argument name a little less useful, but *shrug*. OTOH, it would rather simplify the overload resolution algorithm (maybe even to the point where I actually understand it). There is still some complexity in the conversion to the IDL union type, but at least we can convert those one at a time (left-to-right).
Comment 23 Aryeh Gregor 2012-03-14 17:56:08 UTC
(In reply to comment #21)
> Identifying the type of an object should not have side-effects.  I don't
> believe it does in the current spec.

Right, so I'm still confused.  :)  The problem was that the overload resolution algorithm could have confusing effects on which arguments are evaluated.  Every argument is still always evaluated either zero times or once per current spec.  It's just confusing which ones are evaluated.
Comment 24 Cameron McCormack 2012-03-14 22:49:58 UTC
I think then my suggestion at the bottom of comment 19 will help with that.  After selecting based on argcount, it will always proceed converting arguments from left to right until it finds a type that makes the call unviable.
Comment 25 Cameron McCormack 2012-03-19 08:19:06 UTC
I have gone ahead with that solution.

http://dev.w3.org/cvsweb/2006/webapi/WebIDL/Overview.xml.diff?r1=1.480;r2=1.481;f=h

The effect is that operation invocation is handled as follows:

  * Get the effective overload set.
  * Remove items whose length does not match exactly the number of JS arguments.
  * Convert arguments from JS to IDL values from left to right; when the
      argument being converted is the _distinguishing index_, the JS value
      is first inspected to select which overload we'll be invoking.
      (Any arguments to the left of the distinguishing index are required to
       have the same type in all overloads of this length.)
  * Add any argument default values from the IDL.

So for example if you had

  /* f1 */ void f();
  /* f2 */ void f(long a, float b, Node c, optional Node? d);
  /* f3 */ void f(long a, float b, DOMString c, optional Window? d);

and you call it as

  f(1, 2, document, window)

then the effective overload set is:

{ <f1, ()>,
  <f2, (long, float, Node)>,
  <f2, (long, float, Node, Node?)>,
  <f3, (long, float, DOMString)>,
  <f3, (long, float, DOMString, Window?)> }

You'd select the entries of length 4 (since that's how many JS values you are passed):

{ <f2, (long, float, Node, Node?)>,
  <f3, (long, float, DOMString, Window?)> }
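
The first two steps can be sketched as follows. This is only an approximation: the overloads are described by a made-up data structure, optional arguments are assumed to come last, and variadic arguments are not handled.

```javascript
// Hypothetical description of the three overloads of f.
var overloads = [
  { name: "f1", args: [] },
  { name: "f2", args: [{ type: "long" }, { type: "float" }, { type: "Node" },
                       { type: "Node?", optional: true }] },
  { name: "f3", args: [{ type: "long" }, { type: "float" }, { type: "DOMString" },
                       { type: "Window?", optional: true }] }
];

// One entry per operation and per allowed argument count.
function effectiveOverloadSet(ops) {
  var entries = [];
  ops.forEach(function (op) {
    var required = op.args.filter(function (a) { return !a.optional; }).length;
    for (var n = required; n <= op.args.length; n++) {
      entries.push({
        name: op.name,
        types: op.args.slice(0, n).map(function (a) { return a.type; })
      });
    }
  });
  return entries;
}

// Keep only the entries whose length matches the number of JS arguments.
function entriesOfLength(entries, argCount) {
  return entries.filter(function (e) { return e.types.length === argCount; });
}

var all = effectiveOverloadSet(overloads);
console.log(all.length); // 5
console.log(entriesOfLength(all, 4).map(function (e) { return e.name; })); // [ 'f2', 'f3' ]
```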

Then we go from left to right:

  * Convert JS Number 1 to a long.
  * Convert JS Number 2 to a float.

We're up to the distinguishing index, so we inspect the JS value document.  It is a platform object, and there is an argument at index 2 with an interface type that matches (step 14.4 of the algorithm) so we select that entry:

  <f2, (long, float, Node, Node?)>

Now we continue with argument conversion:

  * Convert JS Object reference document to Node.
  * Convert JS Object reference window to Node? -- this throws TypeError.

So in the end an exception is thrown and we fail.
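
The walk-through above can be approximated in plain JavaScript. This is a simplified sketch, not the spec algorithm: FakeNode and FakeWindow are hypothetical stand-ins for the real interfaces, float conversion is reduced to Number(), and overload selection only distinguishes interface types from DOMString.

```javascript
function FakeNode() {}   // hypothetical stand-in for Node
function FakeWindow() {} // hypothetical stand-in for Window

function nullable(ctor, name) {
  return function (v) {
    if (v === null || v === undefined) return null;
    if (v instanceof ctor) return v;
    throw new TypeError("not a " + name);
  };
}

var converters = {
  "long": function (v) { return Number(v) | 0; },
  "float": function (v) { return Number(v); }, // simplification
  "DOMString": function (v) { return String(v); },
  "Node": function (v) {
    if (v instanceof FakeNode) return v;
    throw new TypeError("not a Node");
  },
  "Node?": nullable(FakeNode, "Node"),
  "Window?": nullable(FakeWindow, "Window")
};
var interfaces = { "Node": FakeNode, "Node?": FakeNode, "Window?": FakeWindow };

// First index at which the candidate entries disagree on the type.
function distinguishingIndex(candidates) {
  for (var i = 0; i < candidates[0].types.length; i++) {
    for (var j = 1; j < candidates.length; j++) {
      if (candidates[j].types[i] !== candidates[0].types[i]) return i;
    }
  }
  return -1;
}

// Inspect (without converting) the value at the distinguishing index.
function selectOverload(candidates, value, index) {
  var fallback = null;
  for (var i = 0; i < candidates.length; i++) {
    var type = candidates[i].types[index];
    if (interfaces[type] && value instanceof interfaces[type]) return candidates[i];
    if (type === "DOMString") fallback = candidates[i];
  }
  if (fallback) return fallback;
  throw new TypeError("no matching overload");
}

// Convert left to right, selecting the overload at the distinguishing index.
function invoke(candidates, args, log) {
  var d = candidates.length > 1 ? distinguishingIndex(candidates) : -1;
  var chosen = candidates[0];
  var converted = [];
  for (var i = 0; i < args.length; i++) {
    if (i === d) chosen = selectOverload(candidates, args[i], d);
    log.push("arg " + i + " as " + chosen.types[i]);
    converted.push(converters[chosen.types[i]](args[i]));
  }
  return { name: chosen.name, values: converted };
}

var candidates = [
  { name: "f2", types: ["long", "float", "Node", "Node?"] },
  { name: "f3", types: ["long", "float", "DOMString", "Window?"] }
];
var log = [];
var outcome = "ok";
try {
  invoke(candidates, [1, 2, new FakeNode(), new FakeWindow()], log);
} catch (e) {
  outcome = e instanceof TypeError ? "TypeError" : "other";
}
console.log(log.join("; ")); // arg 0 as long; arg 1 as float; arg 2 as Node; arg 3 as Node?
console.log(outcome);        // TypeError
```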

Let me know if this is acceptable, thanks.