This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12248 - Make objects first-class API citizens
Summary: Make objects first-class API citizens
Status: RESOLVED FIXED
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: WebIDL
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Cameron McCormack
QA Contact: public-webapps-bugzilla
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 12101
Reported: 2011-03-05 10:28 UTC by Anne
Modified: 2011-05-31 02:42 UTC
CC List: 13 users


Description Anne 2011-03-05 10:28:45 UTC
To be able to use objects in APIs throughout the web platform we need to have objects that do not allow getters. http://lists.w3.org/Archives/Public/public-webapps/2011JanMar/0756.html

For initializing events we want objects where you can retrieve values of keys without side effects but the values of those keys can be pretty much anything. Including what Web IDL currently calls "object".

I am not sure what the requirements for Indexed DB are.
Comment 1 Boris Zbarsky 2011-03-05 14:15:59 UTC
> but the values of those keys can be pretty much anything. Including what Web
> IDL currently calls "object"

Note that this still introduces issues similar to getters.  Consider this object:

  { clientX: 
    { valueOf: 
      function() {
        /* Do evil stuff */ 
      } 
    } 
  }
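
A minimal sketch of this point, assuming a hypothetical API (`initEventLike` is an invented name, not a real DOM method) that coerces `clientX` to a number. Even though `clientX` is a plain data property, the coercion runs script in the middle of the call:

```javascript
// Hypothetical stand-in for an API that coerces clientX to a number.
const log = [];
function initEventLike(dict) {
  log.push('api: start');
  const x = Number(dict.clientX); // coercion calls valueOf on the value
  log.push('api: done');
  return x;
}

const result = initEventLike({
  clientX: {
    valueOf() {
      log.push('valueOf ran'); // arbitrary script runs mid-call
      return 7;
    },
  },
});
```

Here `log` records the `valueOf` call between the API's start and end markers, which is the same kind of reentrancy a getter would introduce.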
Comment 2 Brendan Eich 2011-03-05 16:22:18 UTC
Why do we need such an unnaturally restricted object?

I've read the public-webapps thread and no one has made the case. Ojan et al. cite precedent for allowing objects as keyword parameter sets. ES5 itself does this, and ES5 specifies getters. Sicking listed running getters in a well-defined order as a possible solution but no one said anything about this idea.

The trick (such as it is) is to process parameters left to right, and normalize object parameters by something like ES5's ToPropertyDescriptor abstract internal operation (8.10.5 in ES5.1). If someone wants to write a getter and do unusual things from it, that's their funeral.

/be
Comment 3 Boris Zbarsky 2011-03-05 16:32:13 UTC
Brendan, as long as evaluation order is defined, and as long as all DOM APIs that take such objects specify exactly what happens in "evil" getter cases (closes window, spins event loop, reenters this API, whatever), it's fine.  But that's a pretty noticeable specification and implementation burden...

The other option is to leave behavior in such cases undefined, except implementors still have to worry about it.  We've certainly had at least one security issue in Gecko so far due to this sort of thing.
Comment 4 Allen Wirfs-Brock 2011-03-05 17:19:47 UTC
(In reply to comment #3)
> Brendan, as long as evaluation order is defined, and as long as all DOM APIs
> that take such objects specify exactly what happens in "evil" getter cases
> (closes window, spins event loop, reenters this API, whatever), it's fine.  But
> that's a pretty noticeable specification and implementation burden...

As Brendan alluded to in Comment 2, the trick is to perform all [[Get]] accesses and cache values of the properties of concern prior to starting the core semantic processing of the API routine. Any side-effects that may occur  cannot interact with the API implementation because it hasn't started yet. It is essentially as if the caller had done the accesses prior to the call and passed the retrieved values.

When done in this manner the access order really shouldn't make any difference to the implementation of the API functionality.  However, for maximal interoperability you probably want to specify it, as well as any coercions that are done on the retrieved values. That's how we do it in the ES5 spec.

The specification burden seems minor if appropriate boilerplate language is defined:

"Let a,b,c,x,y,z be the values of the like-named properties of obj."   

If you are using some sort of "typed" description of the expected properties the coercions can be implicit. Concerning implementation burden: all ECMAScript built-ins already have to worry about such things. I don't see how web-app APIs can avoid it. The burden isn't excessive, it is an inherent tax for supporting a dynamically typed language.
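
The "read and cache first" pattern described in this comment can be sketched roughly as follows. The `moveTo` function, its `x`/`y` options, and the `state.busy` flag are all assumptions made up for illustration:

```javascript
// Hypothetical API: reads and coerces its keyword parameters first,
// then runs its core logic against the cached values only.
const state = { busy: false };

function moveTo(options) {
  // Step 1: all property reads and coercions, in a fixed order.
  const x = Number(options.x);
  const y = Number(options.y);
  // Step 2: core processing; no user code can run from here on.
  state.busy = true;
  const sum = x + y;
  state.busy = false;
  return sum;
}

let sawBusy = null;
const total = moveTo({
  get x() {
    sawBusy = state.busy; // the getter observes the API before it starts
    return '3';
  },
  y: 4,
});
```

The getter only ever sees `state.busy === false`, because every access happens before the core steps begin; it is as if the caller had read the values and passed them in.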
Comment 5 Boris Zbarsky 2011-03-05 18:26:25 UTC
Allen, the last time we tried to specify something like this the spec writer gave up (querySelector namespace resolver).  Though in that case the problem was that we needed to look up properties at different times and couldn't look them up up front...

Also, the one place where this is currently an issue in HTML5 (structured cloning) totally fails to address this issue.  And so do the IndexedDB bits that use this pattern, last I checked.

I suppose we can try to get the editors to rewrite those bits in a way that would prevent problems, and see whether they succeed.
Comment 6 Ian 'Hixie' Hickson 2011-03-07 19:33:39 UTC
The bug for the "structured clone" spec is bug 12101.

I'm all in favour of supporting getters, I just have no idea how to do so. I'm happy to spec what happens in all the edge cases listed in comment 3, if someone can tell me what should actually happen in those cases. I'm actually more worried about the infinite case (where the getter unconditionally returns an object with a property with the same getter). Currently I'm leaning towards just not fetching data from properties that have getters.

(The issue of making sure the getters are fetched in the same order each time, and only fetched once each time, is far less of a problem than the above issues.)
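
For concreteness, the "infinite case" can be sketched like this. The depth limit in `walkDepth` is purely an assumption for illustration; nothing in this thread specifies such a limit:

```javascript
// An object whose getter always returns a fresh object with the same getter.
function makeEndless() {
  return {
    get next() {
      return makeEndless();
    },
  };
}

// Hypothetical deep walk with an explicit depth limit. Without the limit,
// following `next` would never terminate.
function walkDepth(obj, limit) {
  let depth = 0;
  let current = obj;
  while (depth < limit && current !== null &&
         typeof current === 'object' && 'next' in current) {
    current = current.next; // each access runs the getter once
    depth += 1;
  }
  return depth;
}

const reached = walkDepth(makeEndless(), 1000);
```

Any algorithm that recursively clones such a structure has to stop somewhere, whether by a specified limit or by running out of resources.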
Comment 7 Allen Wirfs-Brock 2011-03-07 19:34:38 UTC
(In reply to comment #5)

I'm not familiar with the querySelector namespace issues so I really can't comment on them yet.

I did look at structured cloning and other than not being a true clone and being rather underspecified I didn't see any insurmountable issues with making the algorithm explicitly deal with getter properties. The web storage use of structured clone appears to be one where the actual cloning can occur before any irreversible operation takes place. I suspect other uses could also be given such a formulation.

It seems like a root issue that needs careful consideration is identification of where serialization boundaries must or must not exist. (By a serialization boundary I mean a point where direct object sharing is not possible or allowed; object identity and behavioral elements of objects are lost, and object graphs are essentially converted to graphs of structs with primitive-valued data properties.)

Storing into IndexedDB is clearly such a boundary.  But I don't think it would be reasonable to say that every call from a script into a web API crosses such a boundary. That would seem to require that essentially all interface specifications have to give some consideration to these issues.  They individually need to either be identified as a serialization boundary that explicitly ensures that the serialization takes place or they will have to explicitly deal with the possibility of accessor side-effects.  You simply can't pretend that these issues don't exist.

> I suppose we can try to get the editors to rewrite those bits in a way that
> would prevent problems, and see whether they succeed.

It would probably be a good idea to provide some guide-lines and examples for how to do this. I'd be happy to help develop those.
Comment 8 Cameron McCormack 2011-03-07 21:28:32 UTC
(In reply to comment #7)
> Storing into IndexDB is clearing such a boundary.  But I don't think it would
> be reasonable to say that every call from a script into a web API crosses such
> a boundary. That would seem to require that essentially all interface
> specification have to give some consideration to these issues.  They
> individually need to either be identified as a serialization boundary that
> explicitly ensures that the serialization takes place or they will have to
> explicitly deal with the possibility of accessor side-effects.  You simply
> can't pretend that these issues don't exist.

What use cases do we have for that boundary not existing?  Is it unreasonable for all instances where UAs inspect JS objects passed to them to be able to handle exceptions, side effects and resource exhaustion?

If we can get away with always serializing so that we can deal with access side-effects up front, and not have to ignore accessor properties, I'd like to do that.
Comment 9 Brendan Eich 2011-03-08 02:55:48 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Storing into IndexDB is clearing such a boundary.  But I don't think it would
> > be reasonable to say that every call from a script into a web API crosses such
> > a boundary. That would seem to require that essentially all interface
> > specification have to give some consideration to these issues.  They
> > individually need to either be identified as a serialization boundary that
> > explicitly ensures that the serialization takes place or they will have to
> > explicitly deal with the possibility of accessor side-effects.  You simply
> > can't pretend that these issues don't exist.
> 
> What use cases do we have for that boundary not existing?

ES5 and a bunch of JS libraries use objects as keyword parameter sets, with no serialization. You can define getters on the keyword parameter objects and they will be called, "in process", with whatever effects you contrive. This is part of JavaScript.
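
As an illustration of the ES5 precedent, `Object.defineProperty` takes a descriptor object as a keyword parameter set and will happily call getters defined on it. The sketch below (variable names are arbitrary) records the order in which the built-in reads the descriptor's fields:

```javascript
const order = [];
// The getters are deliberately declared in a scrambled order.
const descriptor = {
  get writable() { order.push('writable'); return true; },
  get value() { order.push('value'); return 42; },
  get configurable() { order.push('configurable'); return true; },
  get enumerable() { order.push('enumerable'); return true; },
};

const target = Object.defineProperty({}, 'answer', descriptor);
// order is now: enumerable, configurable, value, writable
```

The reads follow the order fixed by ToPropertyDescriptor in the specification, not the order in which the properties were declared.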

> Is it unreasonable
> for all instances where UAs inspect JS objects passed to them to be able to
> handle exceptions, side effects and resource exhaustion?

Yes.

JS engines have to handle runaway recursion, iloops, and OOM. This is currently not normatively specified with quotas or lower bounds. It is left to the market to deal with as a quality of implementation issue. That is not "ideal" but it is not a high priority to "fix" with more (and more complex) spec machinery.

> If we can get away with always serializing so that we can deal with access
> side-effects up front, and not have to ignore accessor properties, I'd like to
> do that.

Why not do what ES5 and all the JS libraries already do, and avoid unnatural (to JS) getter-free serialization semantics?

/be
Comment 10 Boris Zbarsky 2011-03-08 03:10:08 UTC
> This is currently not normatively specified

We would like to avoid that situation with HTML5.
Comment 11 Brendan Eich 2011-03-08 03:16:48 UTC
(In reply to comment #10)
> > This is currently not normatively specified
> 
> We would like to avoid that situation with HTML5.

Why?

Anyway, if you want to avoid solving the halting problem in spec code, you could still require time-based iloop/etc. policing. That can be done with prose. So do that and address Hixie's infinite getter case.

Great; now what?

This is a tempest in a teapot!

/be
Comment 12 Boris Zbarsky 2011-03-08 03:19:17 UTC
No, it's an attempt to avoid creating unspecifiable ratholes.
Comment 13 Brendan Eich 2011-03-08 03:25:27 UTC
(In reply to comment #12)
> No, it's an attempt to avoid creating unspecifiable ratholes.

Boris, I sense hostility. What's up?

No spec is complete. You know that. Economic law still applies. We do not specify everything, completely. Allen already observed that structured cloning is underspecified.

The rathole is in the minds of spec writers and compleatists here. I write this with no bad feelings. It's not a problem in practice for implementors or web developers, so long as the [[Get]] internal operations used to access "keyword parameters" are done before any other steps in the given method's spec, and in a fixed order.

Please respond to this without changing the subject to runaway recursion or other iloop equivalents. This was the first objection: that getters could have effects that would undermine the spec's integrity. I don't think it's true if you do what ES5 does. The integrity problem is solvable.

The "availability problem" is harder, and it need not be specified. Indeed it is not, to my knowledge. E.g. Gecko has limits on iframe nesting, not matching any spec, and sometimes biting (e.g. WebSphere) content. This is not a real-world high priority, or even a rathole with implementors or developers.

/be
Comment 14 Cameron McCormack 2011-03-08 03:33:14 UTC
(In reply to comment #13)
> It's not a problem in practice for implementors or web
> developers, so long as the [[Get]] internal operations used to access "keyword
> parameters" are done before any other steps in the given method's spec, and in
> a fixed order.

Just so that I am clear on the two choices here:

(A) if serializing, you would end up calling every getter on the object (and on all of the objects referenced by its properties, and so on) in a particular order, before actually doing the work in the method

(B) if not serializing, but handling accessors, the UA would just grab properties in whatever order it liked, at any time during the method

The former means spec writers don't need to think about when properties are accessed (since Web IDL defines it), but can lead to wasted work (and side effects!) calling getters for properties that the UA would never care about.

The latter means the UA can get away with less work, but spec writers need to specify when and in what order properties would be accessed.  (And I think there's a reasonable chance that not all spec writers will do this correctly.)

I was saying "let's always do (A)", and you are saying "(B) is more natural".  Both options are well defined and could be interoperably implemented.  (Correct me if I am wrong.)

(Option (B) also means that specification writers probably need to lower themselves to talking about the ECMAScript binding specifically, rather than whatever higher level IDL type this functionality would correspond to.)
Comment 15 Boris Zbarsky 2011-03-08 03:41:51 UTC
Brendan, there's no hostility.  I think there's just a difference in assumptions.

> No spec is complete.

Yes, but that may not be a good thing, given past web experience.

> Allen already observed that structured cloning is underspecified.

That's not a good thing either.

> It's not a problem in practice for implementors or web developers, so long as
> the [[Get]] internal operations used to access "keyword parameters" are done
> before any other steps in the given method's spec, and in a fixed order.

I could live with that, probably.  I said so in comment 3.

Repeating an earlier argument from private mail (with some irrelevant parts snipped):

Some of the people in this discussion have as a high priority preserving JS semantics in all cases, so they think losing non-data properties on objects passed as a way of grouping some data is a worse deal than the possibility of underspecified behavior due to weird getters.  I'm not entirely sure I agree myself; I see the use of getters here as a definite edge case, and at that point the question becomes how much that edge case is worth catering to.  Is it worth the fact that behavior will have to be explicitly undefined in some cases (and that this set of cases may not even be adequately describable)?

Maybe it is.  Maybe it's not.
Comment 16 Brendan Eich 2011-03-08 03:50:55 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > It's not a problem in practice for implementors or web
> > developers, so long as the [[Get]] internal operations used to access "keyword
> > parameters" are done before any other steps in the given method's spec, and in
> > a fixed order.
> 
> Just so that I am clear on the two choices here:
> 
> (A) if serializing, you would end up calling every getter on the object (and on
> all of the objects referenced by its properties, and so on) in a particular
> order, before actually doing the work in the method
> 
> (B) if not serializing, but handling accessors, the UA would just grab
> properties in whatever order it liked, at any time during the method


We have mentioned ES5's ToPropertyDescriptor internal abstract operation here. It does (A). It "serializes" from a JS object into an internal data structure that is not visible to the programming language user.

The objections I saw were:

(1) What if the getter runs away?
(2) What if the getter mutates relevant objects (including the "keyword parameter set" object)?

The runaway or general availability property enforcement, we leave to quality of implementation.

The getter mutating any object is a matter of the developer fouling its own nest. The native method spec won't have its integrity compromised since it will not sample any effects prematurely.

> The former means spec writers don't need to think about when properties are
> accessed (since Web IDL defines it), but can lead to wasted work (and side
> effects!) calling getters for properties that the UA would never care about.

Why is the method in question getting a property (remember, for an optional keyword parameter) whose value it does not use?

> The latter means the UA can get away with less work, but spec writers need to
> specify when and in what order properties would be accessed.  (And I think
> there's a reasonable chance that not all spec writers will do this correctly.)

(B) is a straw man. Neither Allen nor I have proposed it. We're agreeing on (A) with getters invoked. It seems Hixie does not agree.

> I was saying "let's always do (A)", and you are saying "(B) is more natural". 

No, see above.

> Both options are well defined and could be interoperably implemented.  (Correct
> me if I am wrong.)

I don't think (B) is well-defined. "[W]hatever order it liked"?

> (Option (B) also means that specification writers probably need to lower
> themselves to talking about the ECMAScript binding specifically, rather than
> whatever higher level IDL type this functionality would correspond to.)

Option (B) is bogus. Let's get back to (A) since at least you, Allen, and I seem to agree on it!

/be
Comment 17 Brendan Eich 2011-03-08 04:01:21 UTC
(In reply to comment #15)
> > No spec is complete.
> 
> Yes, but that may not be a good thing, given past web experience.

My point about economics still applying stands. We don't have infinite funds to try to specify everything, and at a formal but quite real level, we have Goedel's Incompleteness Theorem to contend with.

All specs are incomplete. Choosing one's spec-battles is important, and basing the choice on real-world pain points instead of only-in-spec-writers'-minds ratholes is crucial.

> Some of the people in this discussion have as a high priority preserving JS
> semantics in all cases, so they think losing non-data properties on objects
> passed as a way of grouping some data is a worse deal than the possibility of
> underspecified behavior due to weird getters.  I'm not entirely sure I agree
> myself; I see the use of getters here as a definite edge case, and at that
> point the question becomes how much that edge case is worth catering to.  Is it
> worth the fact that behavior will have to be explicitly undefined in some cases
> (and that this set of cases may not even be describably adequately)? 
> 
> Maybe it is.  Maybe it's not.

This is the wrong debate to have.

First, because we can't guess at good future uses of accessors, but they are part of JS, and developers are using them and will use them more in the future.

Second, because self-hosting, a native vs. JS implementation substitution principle, wants DOM methods to process their parameters as if they were written in JS [*], as much as possible. Every time we make some ad-hoc (or even, eventually, systematic) crippling of JS semantics, we make the DOM that much less self-hostable or virtualizable in JS.

If you don't care about these, then we can revert the recent WebIDL changes to make JS the primary binding language and to better match its semantics. We can have a "Java-like DOM" binding for all languages. But that is bad for web developers, and it is not the agreement we reached between w3c and Ecma folks trying to collaborate here.

The right argument to have IMHO is not some imponderable one about whether we should chip away at JS semantics in full, demanding proofs of negatives before restoring accessor support. Instead, I contend we should consider exactly what properties (integrity, availability) to uphold and examine the threads posed by getters and setters.

If we find, as I believe, that we can uphold these properties without too much spec complexity and any loss of "JS generality", then we should stay the course of making WebIDL bindings in JS more "as if the method were written in JS".

/be

[*] It's true with ES5 one could write a method that interrogates an object "keyword parameter set" for all of its property names and skips accessors, but this is not done, either by JS libraries or by any built-in in ES5.
Comment 18 Cameron McCormack 2011-03-08 04:03:28 UTC
(In reply to comment #16)
> We have mentioned ES5's ToPropertyDescriptor internal abstract operation here.
> It does (A). It "serializes" from a JS object into an internal data structure
> that is not visible to the programming language user.
> 
> The objections I saw were:
> 
> (1) What if the getter runs away?
> (2) What if the getter mutates relevant objects (including the "keyword
> parameter set" object)?
> 
> The runaway or general availability property enforcement, we leave to quality
> of implementation.

Agreed.

> The getter mutating any object is a matter of the developer fouling its own
> nest. The native method spec won't have its integrity compromised since it will
> not sample any effects prematurely.
> 
> > The former means spec writers don't need to think about when properties are
> > accessed (since Web IDL defines it), but can lead to wasted work (and side
> > effects!) calling getters for properties that the UA would never care about.
> 
> Why is the method in question getting a property (remember, for an optional
> keyword parameter) whose value it does not use?

Well, not the *method* getting it, but the binding glue when marshalling arguments from JS to the DOM implementation.

> > The latter means the UA can get away with less work, but spec writers need to
> > specify when and in what order properties would be accessed.  (And I think
> > there's a reasonable chance that not all spec writers will do this correctly.)
> 
> (B) is a straw man. Neither Allen nor I have proposed it. We're agreeing on (A)
> with getters invoked. It seems Hixie does not agree.

Not a deliberate straw mind mind you.

> > I was saying "let's always do (A)", and you are saying "(B) is more natural". 
> 
> No, see above.

Understood now.

> > Both options are well defined and could be interoperably implemented.  (Correct
> > me if I am wrong.)
> 
> I don't think (B) is well-defined. "[W]hatever order it liked"?

Whatever order the specification defined.

> > (Option (B) also means that specification writers probably need to lower
> > themselves to talking about the ECMAScript binding specifically, rather than
> > whatever higher level IDL type this functionality would correspond to.)
> 
> Option (B) is bogus. Let's get back to (A) since at least you, Allen, and I
> seem to agree on it!

I'm not sure we do agree on it. :-)

The generalized version of ToPropertyDescriptor that would be used here would serialize the whole object graph starting from its argument.  (At least, that's how I imagined the serialization to work.)  That could well result in getting a property "whose value it [the method] does not use".

ToPropertyDescriptor itself doesn't look at properties other than "enumerable", "configurable", etc.  My (A) would look at every property on the object (and its objects (and its objects...)).
Comment 19 Brendan Eich 2011-03-08 04:03:37 UTC
> properties (integrity, availability) to uphold and examine the threads posed by

s/threads/threats/

/be
Comment 20 Brendan Eich 2011-03-08 04:05:11 UTC
> Not a deliberate straw mind mind you.

Yes, nor straw man (I presume -- you don't mean the Scarecrow from "The Wizard of Oz" :-P).

I didn't mean to say that was an intentional strawman fallacy -- sorry for the hint of that!

/be
Comment 21 Brendan Eich 2011-03-08 04:10:16 UTC
(In reply to comment #18)
> Well, not the *method* getting it, but the binding glue when marshalling
> arguments from JS to the DOM implementation.

[snip...]

> The generalized version of ToPropertyDescriptor that would be used here would
> serialize the whole object graph starting from its argument.  (At least, that's
> how I imagined the serialization to work.)  That could well result in getting a
> property "whose value it [the method] does not use".

How can you write a generalized serializer when only certain keyword parameters, specific to the method being specified, are wanted?

You can abstract a helper that looks for a list of keys, and returns a list of values, say. But the processing for a given method-being-spec'ed will not first blindly clone (by serialization) the object passed as keyword-parameter set.

> ToPropertyDescriptor itself doesn't look at properties other than "enumerable",
> "configurable", etc.  My (A) would look at every property on the object (and
> its objects (and its objects...)).

Why? Can you show any DOM or WebAPI/Apps methods being proposed or already spec'ed that use such a "deep" keyword-parameter-set object?

What problem is being solved here? I thought the issue was how to spec a method M called like so (say on a DOM node N):

  N.M(arg1, arg2, {key1:val1,  ... keyN:valN});

where M interprets the final positional parameter by looking for a fixed set of keyword parameter property names.

/be
Comment 22 Ian 'Hixie' Hickson 2011-03-08 04:19:59 UTC
Just to be clear, I have no opinion on what we should do here or in bug 12101, so long as what we define is interoperably implementable and secure. The only situation I want to avoid is one in which today we allow a variety of different implementations, and a few years down the line we find that only one is viable on the Web due to market forces, and now we're forced to implement (and specify) the particular option that the market thus forced on us. If we can make this work with getters then so much the better.

I have no problem with comment 14's (A); my problem is just that I don't know how to specify that such that it is interoperably implementable. This is a practical problem, the simplest solution for which is just for someone to tell me what to write, not a philosophical one and not a disagreement on my part.
Comment 23 Ian 'Hixie' Hickson 2011-03-08 04:21:46 UTC
Incidentally, in reply to comment #7:
> I did look at structured cloning and other than not being a true clone and
> being rather underspecified [...]

Please file bugs for any specific issues of underspecification, I'm eager to solve them. Thanks.
Comment 24 Allen Wirfs-Brock 2011-03-08 05:55:26 UTC
comment #15:
> Some of the people in this discussion have as a high priority preserving JS
> semantics in all cases, so they think losing non-data properties on objects
> passed as a way of grouping some data is a worse deal than the possibility of
> underspecified behavior due to weird getters.  I'm not entirely sure I agree
> myself; I see the use of getters here as a definite edge case, and at that
> point the question becomes how much that edge case is worth catering to.  Is it
> worth the fact that behavior will have to be explicitly undefined in some cases
> (and that this set of cases may not even be describably adequately)? 
> 
> Maybe it is.  Maybe it's not.

One of the first hard lessons I learned when I started working in this area was that the average web developer does not make a distinction between JavaScript objects/JavaScript libraries and DOM objects/Web APIs.  They are all just JavaScript objects to the developer.  When DOM/Web App objects behave differently from other JavaScript objects or lack JavaScript object functionality or restrict normal use cases this just creates confusion for developers and makes the entire browser platform harder to learn and use.

Also, as time goes on you are going to have more of these situations. What about objects implemented via the JavaScript Proxy mechanism?  Will any Proxy based object be forbidden in web APIs?

Also, all the issues that are being brought up here already occur and are dealt with in the ECMAScript specification and JavaScript implementations.  Why would they be tractable in JavaScript but intractable in other parts of the web platform?

comment 14:
> Just so that I am clear on the two choices here:
> 
> (A) if serializing, you would end up calling every getter on the object (and on
> all of the objects referenced by its properties, and so on) in a particular
> order, before actually doing the work in the method


No, that's not how you do it, at least for the descriptor parameter case. You don't just arbitrarily get the value of every property.  A descriptor parameter typically has a "syntax": properties that must or must not appear together, must have specific types or ranges of values, etc.  You process these just like you would process a syntax-driven variable argument list (e.g., printf). You process it according to whatever rule you define for each particular situation.

comment #18:
> The generalized version of ToPropertyDescriptor that would be used here would
> serialize the whole object graph starting from its argument.  (At least, that's
> how I imagined the serialization to work.)  That could well result in getting a
> property "whose value it [the method] does not use".
>
> ToPropertyDescriptor itself doesn't look at properties other than "enumerable",
> "configurable", etc.  My (A) would look at every property on the object (and
> its objects (and its objects...)).

I think you need to distinguish the special case of processing arguments (especially descriptor arguments) to a specific API from the more general case of serializing an arbitrary object structure for storage or communication purposes.  The second case requires that you look at every property.  The first does not, and is just likely to get you in trouble if you access properties that are not explicitly used by the API.
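
The difference between the two cases can be sketched by counting getter invocations. Everything here (`readNamed`, `readAll`, the `bubbles`/`extra`/`unused` property names) is hypothetical and only meant to contrast targeted argument processing with a deep walk:

```javascript
let reads = 0;
function counted(value) {
  // Returns an accessor descriptor that counts how often it is read.
  return { get() { reads += 1; return value; }, enumerable: true };
}

function makeOptions() {
  const inner = Object.defineProperties({}, { unused: counted(1) });
  return Object.defineProperties({}, {
    bubbles: counted(true),
    extra: counted(inner),
  });
}

// Argument processing: read only the names this hypothetical API defines.
function readNamed(obj, names) {
  return names.map((name) => obj[name]);
}

// General serialization: visit every enumerable property, recursively.
function readAll(obj) {
  const out = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    out[key] = value !== null && typeof value === 'object' ? readAll(value) : value;
  }
  return out;
}

reads = 0;
readNamed(makeOptions(), ['bubbles']);
const targetedReads = reads; // only the getter for `bubbles` ran

reads = 0;
readAll(makeOptions());
const deepReads = reads; // every getter ran, including the nested one
```

The targeted reader triggers one getter; the deep walk triggers all three, including ones the API would never have looked at.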

Processing arguments is just part of specifying and implementing an API.  In many cases the order in which the arguments and argument properties are accessed could be arbitrary.  This is the case for some of the properties in ToPropertyDescriptor. However, the specification chooses an explicit order and interoperability requires that implementations follow that order.

One of the classic performance bugs in dynamic object-oriented language based applications is excessive and redundant argument validation. If every method unnecessarily validates every argument, even ones that are just passed on to subsequent calls, then your performance will suck.
Comment 25 Allen Wirfs-Brock 2011-03-08 06:12:46 UTC
(In reply to comment #22)

> I have no problem with comment 14's (A), my problem is just that I don't know
> how to specify that such that it is interoperably implementable. This is a
> practical problem the simplest solution for which is just for someone to tell
> me what to write, not a philosophical one and not a disagreement on my part.

This is exactly what the JSON.stringify specification in ES5 does. It is a serialization algorithm that deals with all of these issues except those that are linked to resource consumption.  Implementations that correctly follow the specification will have identical interoperable behavior, even in the presence of things like getters that dynamically modify the structure that is being serialized.  Only resource exhaustion issues (infinite recursion, etc.) don't have totally deterministic behavior among implementations.
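
A small sketch of that determinism, using a getter that modifies the object while it is being serialized (the property names are arbitrary):

```javascript
const source = {
  get a() {
    delete this.b; // remove a property that has not been visited yet
    this.c = 3;    // add a property after the key list was taken
    return 1;
  },
  b: 2,
};

const json = JSON.stringify(source);
// json is '{"a":1}'
```

The list of keys is taken once, before any value is read, so the late-added `c` is never visited, and `b` reads as `undefined` after deletion and is therefore omitted.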

JSON.stringify would be a good starting point for a more detailed structured clone algorithm. When I was still at Microsoft, I believe that Travis Leithead and I actually talked about doing this, but we had higher priority issues to deal with at that time.
Comment 26 Brendan Eich 2011-03-08 06:47:00 UTC
(In reply to comment #16)
> (In reply to comment #14)
> > (A) if serializing, you would end up calling every getter on the object (and on
> > all of the objects referenced by its properties, and so on) in a particular
> > order, before actually doing the work in the method
> > 
> > (B) if not serializing, but handling accessors, the UA would just grab
> > properties in whatever order it liked, at any time during the method
> 
> We have mentioned ES5's ToPropertyDescriptor internal abstract operation here.
> It does (A). It "serializes" from a JS object into an internal data structure
> that is not visible to the programming language user.

Sorry, I read and then promptly forgot the part of (A) cited above that calls every getter (or gets every value, if a data property) on the object, and so on recursively.

Allen straightened this out in comment 24.

I still don't know where this idea comes from. You are not mixing serialization or remoting with objects-as-keyword-parameter-sets, I hope. Remoting cannot be blind and "deep" over object parameters without consulting WebIDL types and in vs. inout vs. out.

(B) is still not proposed. So let's say (A) as above, vs. (C):

(C) if keyword-parameter-passing, call [[Get]] in a certain order on each name in the set of keyword-parameter names that is appropriate for the positional formal parameter in which this object actual parameter was passed, in the context of the method being specified. These calls to [[Get]] happen before anything else in the method's spec, apart from processing of keyword-parameter objects passed to lower (left of the current) position formal parameters.
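
A minimal sketch of what (C) could look like from script, assuming a hypothetical helper named readKeywordParams and made-up key names. It reads only the names the method cares about, in the order the method's spec would list them, and never touches anything else on the object.

```javascript
// Hypothetical helper: perform one property read per wanted name,
// in the listed order, before the method does any other work.
function readKeywordParams(options, names) {
  const values = new Map();
  for (const name of names) {
    values.set(name, options[name]); // corresponds to one [[Get]]
  }
  return values;
}

// Getters log themselves so the access order is observable.
const log = [];
const opts = {
  get keyB() { log.push("keyB"); return 2; },
  get keyA() { log.push("keyA"); return 1; },
  get unrelated() { log.push("unrelated"); return 3; },
};

const params = readKeywordParams(opts, ["keyA", "keyB"]);

console.log(log);                // [ 'keyA', 'keyB' ]
console.log(params.get("keyA")); // 1
```

The order comes from the list of names, not from the order in which the caller happened to define the properties, and the unrelated getter is never run.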

Allen also mentioned Harmony Proxies. That reminds me: we intend to implement the Gecko DOM using Proxies in future, with handlers and traps that may or may not be written in JS.

A proxy client cannot tell what implements the handler, so there is no way for a method specified to reject getters to reject proxies with traps written in JS. All proxies would have to be rejected, including (in our future implementation) all DOM objects. This seems untenable if any object parameter could be a DOM object or a JS object.
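
To make the point concrete, here is a small sketch using the Proxy API as it was later standardized in ECMAScript (an assumption for this example; the Harmony proposal under discussion at the time had a different shape). The function readClientX and the property name are illustrative only.

```javascript
// A plain object and a proxy whose traps are written in JS.
const plain = { clientX: 10 };
const proxied = new Proxy({}, {
  has(target, key) { return key === "clientX"; },
  get(target, key) { return key === "clientX" ? 10 : undefined; },
});

// Hypothetical consumer of a keyword-parameter object. It has no
// standard way to ask whether its argument is a proxy.
function readClientX(init) {
  return "clientX" in init ? init.clientX : 0;
}

console.log(typeof plain, typeof proxied); // object object
console.log(readClientX(plain));           // 10
console.log(readClientX(proxied));         // 10
```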

/be
Comment 27 Cameron McCormack 2011-03-08 22:18:42 UTC
(In reply to comment #21)
> How can you write a generalized serializer when only certain keyword parameters
> specific to the method being specified, are wanted?
> 
> You can abstract a helper that looks for a list of keys, and returns a list of
> values, say. But the processing for a given method-being-spec'ed will not first
> blindly clone (by serialization) the object passed as keyword-parameter set.

I was thinking more of the structured clone use case (where you would serialize everything), and then extending it to work for this keyword parameter use case.

> > ToPropertyDescriptor itself doesn't look at properties other than "enumerable",
> > "configurable", etc.  My (A) would look at every property on the object (and
> > its objects (and its objects...)).
> 
> Why? Can you show any DOM or WebAPI/Apps methods being proposed or already
> spec'ed that use such a "deep" keyword-parameter-set object?

For the keyword parameter use case, no.

> What problem is being solved here? I thought the issue was how to spec a method
> M called like so (say on a DOM node N):
> 
>   N.M(arg1, arg2, {key1:val1,  ... keyN:valN});

I didn't realise that was all we were trying to solve.

Some specs (such as ones from the DAP WG) are currently allowing keyword parameter objects like this just by defining an interface:

  interface N {
    void M(in long arg1, in long arg2, in KeywordParams kwparams);
  };

  interface KeywordParams {
    attribute long key1;
    attribute DOMString key2;
    attribute object key3;
    ...
  };

So to do what comment 26 suggests, we can add some steps in here:

  http://dev.w3.org/2006/webapi/WebIDL/#es-interface

under step 4, which would do a [[Get]] for each property that corresponds to an IDL attribute in the order that they appear in the IDL.  We'd also need to state that whenever the implementation is getting the value of the IDL attribute of such an object, it uses the value that was [[Got]] up front.
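
The reason for reusing the value obtained up front can be sketched as follows. The helper name snapshotAttributes is hypothetical, and the attribute names are taken from the KeywordParams example; the getter deliberately returns a different value on every read.

```javascript
// A keyword-params object whose getter is not stable across reads.
let calls = 0;
const kwparams = {
  get key1() { calls += 1; return calls; },
  key2: "two",
};

// Hypothetical helper: read each IDL attribute once, in IDL order,
// and keep the results in a frozen snapshot.
function snapshotAttributes(obj, attributeNames) {
  const snapshot = {};
  for (const name of attributeNames) {
    snapshot[name] = obj[name];
  }
  return Object.freeze(snapshot);
}

const snap = snapshotAttributes(kwparams, ["key1", "key2", "key3"]);

// Every later use consults the snapshot, so the getter runs exactly once.
console.log(snap.key1, snap.key1); // 1 1
console.log(calls);                // 1
```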

We could consider introducing new syntax for keyword parameters, instead of forcing it through an interface.  That might be better, as (in theory) bindings for languages that have native support for named parameters would be nicer.

(In reply to comment #24)
> I think you need to distinguish the special case of processing arguments
> (especially descriptor arguments) to a specific API from the more general case
> of serializing an arbitrary object structure for storage or communication
> purposes.  The second case requires that you look at every property.  The first
> does not and is just likely to get you in trouble if you access properties that
> are explicitly used by the API.

OK.  Is it worth having an IDL type that means "the kinds of values you can get out of the structured clone algorithm"?  ("StructuredData"?)  That makes sense to me for plain objects, arrays and primitives, but the structured clone algorithm deals with various other types too: Date, RegExp, ImageData, File, Blob, and FileList.
Comment 28 Brendan Eich 2011-03-08 22:58:47 UTC
(In reply to comment #27)
> (In reply to comment #21)
> > How can you write a generalized serializer when only certain keyword parameters
> > specific to the method being specified, are wanted?
> > 
> > You can abstract a helper that looks for a list of keys, and returns a list of
> > values, say. But the processing for a given method-being-spec'ed will not first
> > blindly clone (by serialization) the object passed as keyword-parameter set.
> 
> I was thinking more of the structured clone use case (where you would serialize
> everything), and then extending it to work for this keyword parameter use case.

That is mixing unrelated use cases. Why?

Remember we are not trying to make every API "remoteable" -- even if we were, there would be a proxy on the sending side, which would still run JS due to implicit type conversions per WebIDL. There is no reason to cripple objects as keyword parameter sets passed to such a proxy's method.

Structured cloning is used by IndexedDB in Gecko, where we want to throw on functions and other non-enumerated type cases. Indeed this says to throw on getters and setters, not skip them.
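
A sketch of the "throw rather than skip" behaviour described here. This is not the structured clone algorithm itself and not Gecko's code; assertCloneable is a hypothetical pre-check written only to show the difference from keyword-parameter processing, which would happily run a getter.

```javascript
// Hypothetical pre-check: reject functions and accessor properties
// instead of silently skipping or invoking them.
function assertCloneable(value, seen = new Set()) {
  if (typeof value === "function") {
    throw new TypeError("functions cannot be cloned");
  }
  if (value === null || typeof value !== "object") {
    return; // primitives are fine
  }
  if (seen.has(value)) {
    return; // already checked; avoids looping on cycles
  }
  seen.add(value);
  for (const key of Object.keys(value)) {
    const desc = Object.getOwnPropertyDescriptor(value, key);
    if (desc.get || desc.set) {
      throw new TypeError('accessor property "' + key + '" cannot be cloned');
    }
    assertCloneable(desc.value, seen);
  }
}

assertCloneable({ a: 1, nested: { b: [1, 2, 3] } }); // passes silently

let rejected = false;
try {
  assertCloneable({ get evil() { return 1; } });
} catch (e) {
  rejected = e instanceof TypeError;
}
console.log(rejected); // true
```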

These are quite different use-cases which should not be conflated.

/be
Comment 29 Cameron McCormack 2011-03-16 01:42:44 UTC
Another requirement: the Geolocation spec wants to distinguish whether or not a property was specified on its keyword-params object (step 1.4 of http://dev.w3.org/geo/api/spec-source.html#get-current-position), something which isn't possible currently with interfaces-implemented-by-JS.
Comment 30 Allen Wirfs-Brock 2011-03-16 02:47:38 UTC
(In reply to comment #29)
What do you mean by "isn't possible currently with interfaces-implemented-by-JS"?  In JavaScript, this is a property existence test that can be performed using Object.prototype.hasOwnProperty or Object.getOwnPropertyDescriptor.  This exact pattern is used quite extensively by the ES5 Object.defineProperty to process its "options parameter".

Are you saying that host object interfacing layers in browser don't provide the capability for such tests or is something else the issue?
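
For reference, the existence tests mentioned above look like this in script. The option names are borrowed from the Geolocation options object purely as an illustration. Note that reading the value alone cannot separate "absent" from "present but undefined".

```javascript
// One option is present with the value undefined; the other is absent.
const options = { timeout: undefined };

// Reading the values cannot tell the two cases apart.
console.log(options.timeout === undefined);    // true
console.log(options.maximumAge === undefined); // true

// An explicit existence test can.
const hasOwn = Object.prototype.hasOwnProperty;
console.log(hasOwn.call(options, "timeout"));    // true
console.log(hasOwn.call(options, "maximumAge")); // false

// Object.getOwnPropertyDescriptor gives the same answer and also
// exposes whether the property is a data or accessor property.
const present = Object.getOwnPropertyDescriptor(options, "timeout");
const absent = Object.getOwnPropertyDescriptor(options, "maximumAge");
console.log(present !== undefined, absent !== undefined); // true false
```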
Comment 31 Cameron McCormack 2011-03-16 02:56:19 UTC
(In reply to comment #30)
> What do you mean by "isn't possible currently with
> interfaces-implemented-by-JS"?  In JavaScript, this is a property existence
> test that can be performed using Object.prototype.hasOwnProperty or
> Object.getOwnPropertyDescriptor.  This exact pattern is used quite extensively by the ES5
> Object.defineProperty to process its "options parameter".
> 
> Are you saying that host object interfacing layers in browser don't provide the
> capability for such tests or is something else the issue?

I just mean that at the abstract IDL level, there is no concept of whether an attribute was specified or not -- all attributes of a [Callback] interface implemented by JS have a value, just by virtue of the wording currently in http://dev.w3.org/2006/webapi/WebIDL/#native-objects.

So Web IDL, as part of fixing this bug, needs to map that JS-property-present-or-not to something at the IDL level, so that specifications can reference it in a language independent manner.

(This doesn't prevent specifications from writing JS-specific requirements if they want to, but having Web IDL know about omitted keyword-param object properties will make it easier for binding generators.)
Comment 32 Allen Wirfs-Brock 2011-03-16 17:37:11 UTC
(In reply to comment #31)
> 
> I just mean that at the abstract IDL level, there is no concept of whether an
> attribute was specified or not -- all attributes of a [Callback] interface
> implemented by JS have a value, just by virtue of the wording currently in
> http://dev.w3.org/2006/webapi/WebIDL/#native-objects.

"Some interfaces can be implemented in script by an ECMAScript native object. Only interfaces with the following characteristics can have native object implementations: "

This continues to be a problematic restriction as we eventually intend to implement all DOM interfaces using ECMAScript native objects.
 
> 
> So Web IDL, as part of fixing this bug, needs to map that
> JS-property-present-or-not to something at the IDL level, so that
> specifications can reference it in a language independent manner.

It would presumably be easy enough to add some sort of "optional" modifier to IDL attributes and ECMAScript would have no trouble dealing with this concept. I'd be more concerned about the mapping to more static languages that expect all objects/structures to have a fixed set of fields.

One way to deal with this would be to specify all such option parameter objects using a single common "OptionBag" interface. That interface would represent a dynamic set of key/value pairs and the interface would expose hasOption and getOptionValue methods. Any method that uses an OptionBag parameter would specify (perhaps tabularly) the actual option keys and values that it accepted.  This could then be mapped into any language.  For ECMAScript, the binding could just be to a native object with hasOption/getOptionValue mapping to hasOwnProperty and regular property access.
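
A minimal sketch of how such an ECMAScript binding might wrap a native object. Only the names OptionBag, hasOption and getOptionValue come from the suggestion above; the class shape and the describeRequest consumer are assumptions made for the example.

```javascript
// Hypothetical binding-side wrapper around a plain options object.
class OptionBag {
  constructor(source) {
    this.source = source;
  }
  // Maps to an own-property existence test.
  hasOption(key) {
    return Object.prototype.hasOwnProperty.call(this.source, key);
  }
  // Maps to regular property access.
  getOptionValue(key) {
    return this.source[key];
  }
}

// Hypothetical method body written only against the OptionBag interface.
function describeRequest(bag) {
  const timeout = bag.hasOption("timeout") ? bag.getOptionValue("timeout") : "none";
  return "timeout=" + timeout;
}

console.log(describeRequest(new OptionBag({ timeout: 500 }))); // timeout=500
console.log(describeRequest(new OptionBag({})));               // timeout=none
```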

Alternatively, you might want to look at the primary reason for the existence of these option objects. They're there to support an ECMAScript idiom for passing optional keyword arguments.  Some languages have direct support for optional keyword arguments.  Someone doing a WebIDL language binding for such a language would probably want to directly use that language mechanism rather than passing an options object.  However, specifying a method signature with an explicit parameter object typed with some interface (even OptionBag) would seem to preclude using such native optional argument techniques.

What you could do instead is simply permit WebIDL method signatures to have an arbitrary number of optional keyword arguments. In an interface definition these would all be specified as discrete arguments.  They would not be specified as being grouped together into an object.  The specification for each method would also define its behavior simply in terms of the presence or absence of various arguments. It also would not be expressed in terms of an options object.

The mapping of optional keyword arguments into a single options object would only be discussed as part of the ECMAScript language binding or the bindings of other languages that wanted to use that technique.  The language bindings for other languages that directly support optional keyword arguments would map such arguments using the native language facility.
Comment 33 Boris Zbarsky 2011-03-16 17:43:43 UTC
> This continues to be a problematic restriction as we eventually intend to
> implement all DOM interfaces using ECMAScript native objects.

I think that would require some pretty major changes to the way all the various web specs are being written.

For example, right now if interface Foo defines that member x is unsigned long, then callees who take instances of Foo expect that the x they get is in fact nonnegative.  If the implementation of the callee extracts the underlying C++ object from the ES object it has and calls into it directly, it has such a guarantee.  If it needs to somehow deal with arbitrary props on the ES object instead, then it needs to define a bunch of edge cases which are not defined in any of the specs involved right now.
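
As an illustration of the kind of edge case involved, the sketch below coerces arbitrary script values to the unsigned long range using the ECMAScript ToUint32 operation (the `>>>` operator with a zero shift). Treating this as the conversion a binding would apply is an assumption of the example; the helper name toUnsignedLong is made up.

```javascript
// Hypothetical coercion: ToNumber followed by ToUint32 (wraps modulo 2^32).
function toUnsignedLong(value) {
  return value >>> 0;
}

console.log(toUnsignedLong(42));   // 42
console.log(toUnsignedLong(-1));   // 4294967295
console.log(toUnsignedLong("7"));  // 7
console.log(toUnsignedLong(NaN));  // 0

// An arbitrary object can run script during the conversion.
const sneaky = { valueOf() { return 3.9; } };
console.log(toUnsignedLong(sneaky)); // 3
```

A callee that reads x straight off a script object gets none of this for free, which is why each such read would need a defined conversion step.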
Comment 34 Allen Wirfs-Brock 2011-03-16 19:14:57 UTC
(In reply to comment #33)
WebIDL is a specification language, not an implementation language.  The types used in WebIDL are specification constraints.  In the case of unsigned long, the type is a constraint on the range of acceptable values.

How that constraint is satisfied should be a matter for the language binding and when multiple languages are used a matter for the foreign function call protocols and mechanisms used to enable such calls.

The restriction I'm quoting is part of section 4 of the WebIDL spec.  This is the ECMAScript binding definition. In that context I think the restriction needs to be interpreted as meaning "this language binding is incomplete because it does not define everything that is needed to fully implement WebIDL interface specifications using ECMAScript."

This may be adequate to describe current browser implementations, but it won't be adequate for the future.

I opened bug #12320 concerning this specific issue.
Comment 35 Jonas Sicking (Not reading bugmail) 2011-03-16 22:40:18 UTC
Possibly what we want to do here is word things in terms of a dictionary, where ECMAScript bindings can use a plain JS object to implement such a dictionary. The set of keys in the dictionary would be the set of names iterated when enumerating the object, and the values for those keys would be the values returned when "getting" those properties from the object.
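
Read literally, that wording corresponds to something like the following sketch. The helper name toDictionary is hypothetical, and using for...in (which also visits inherited enumerable properties) is one possible reading of "enumerating the object".

```javascript
// Hypothetical conversion: keys are whatever enumeration yields,
// values are whatever "getting" each key returns.
function toDictionary(obj) {
  const dict = new Map();
  for (const key in obj) {
    dict.set(key, obj[key]);
  }
  return dict;
}

const base = { inherited: 1 };
const arg = Object.create(base);
arg.own = 2;
Object.defineProperty(arg, "hidden", { value: 3, enumerable: false });

const dict = toDictionary(arg);
console.log([...dict.keys()]); // [ 'own', 'inherited' ]
```

Under this reading the non-enumerable property is not a dictionary key, while the inherited enumerable one is.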

This might not cover the structured-clone algorithm, but that might be worth handling separately. Maybe.
Comment 36 Cameron McCormack 2011-03-16 22:54:32 UTC
I think it's worth looking at both of the two options Allen and Jonas mention: 

* either allow IDL operations to be declared to take named arguments, so that in the ECMAScript language binding that would correspond to taking an additional object argument holding the named arguments (and so that other language bindings could utilise their native mechanisms for named arguments); or

* add a dictionary type to Web IDL, which you would use directly as an argument type (similarly to how [Callback] interfaces are being used for this purpose in some specs), and which would let you define the types of individual values in it.

We might want both.  I like the first approach since it can map more easily to native mechanisms in other languages, but:

* what if a specification author wants to have an operation that takes more than one set of named arguments (for whatever reason; I don't know that we have any need for this currently, though)?

* what if a specification author wants to have the named arguments object not as the final argument (which is what I'm assuming we'd require)?
Comment 37 Jonas Sicking (Not reading bugmail) 2011-03-16 23:22:20 UTC
Note that in IndexedDB, functions take both required and optional arguments. We didn't make these functions take a single object that contains both the optional and required arguments, but rather did something like this:

db.createObjectStore("foo", { keyPath: "a_prop" });

http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#widl-IDBDatabase-createObjectStore
Comment 38 Cameron McCormack 2011-05-26 04:33:26 UTC
I'm leaning towards adding a "dictionary" definition.  To run with the IndexedDB example:

  dictionary ObjectStoreOptions {
    DOMString keyPath;
    boolean autoIncrement;
  };

  interface IDBDatabase : EventTarget {
    ...
    IDBObjectStore createObjectStore
        (in DOMString name,
         in optional ObjectStoreOptions optionalParameters);
  };

Dictionaries wouldn't have an "interface object" or "interface prototype object".  Dictionary members can be present or not present, and the prose for an interface can see this at the IDL level.  We could decide to add default values to the dictionary members if people really want it; otherwise, I'd say to just leave it to prose for now.

In JS, dictionaries are represented by plain objects: you do a [[HasProperty]] to determine if the dictionary member was specified, and a [[Get]] to get its value.  (Alternatively, we could do a [[Get]] and treat undefined as meaning not specified.)  In the implementation of createObjectStore, it would get the values for each dictionary member in the order that they appear in the IDL (so first keyPath then autoIncrement).  There'd be no restriction on which argument positions dictionaries could be used in, so it's not quite the same as adding keyword argument functionality to operations.
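
The conversion described above could be sketched in script as follows. The member names come from the ObjectStoreOptions example; the helper convertDictionary and its handling of undefined, null and non-object values are assumptions of the sketch rather than spec text.

```javascript
// Member names in the order they appear in the IDL.
const OBJECT_STORE_OPTIONS = ["keyPath", "autoIncrement"];

// Hypothetical conversion: for each member, in IDL order, test for
// presence ([[HasProperty]]) and then read the value ([[Get]]).
function convertDictionary(value, memberNames) {
  const members = new Map();
  if (value === undefined || value === null) {
    return members; // nothing specified
  }
  if (typeof value !== "object" && typeof value !== "function") {
    throw new TypeError("dictionary value must be an object");
  }
  for (const name of memberNames) {
    if (name in value) {
      members.set(name, value[name]);
    }
  }
  return members;
}

// Getters defined in the "wrong" order still run in IDL order.
const order = [];
const converted = convertDictionary({
  get autoIncrement() { order.push("autoIncrement"); return true; },
  get keyPath() { order.push("keyPath"); return "a_prop"; },
}, OBJECT_STORE_OPTIONS);

console.log(order);                    // [ 'keyPath', 'autoIncrement' ]
console.log(converted.get("keyPath")); // a_prop
```

A member that is absent from the object is absent from the result, so the prose for an operation can tell "not specified" apart from any particular value.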

As with sequences, dictionaries could be used as operation return types too, but not as the type of an attribute (or exception member).

This does not help structured cloning, but that seems like a special enough case to me to leave it to the HTML spec to define rather than including Web IDL machinery for it.
Comment 39 Jonas Sicking (Not reading bugmail) 2011-05-26 15:25:31 UTC
I think adding defaults would be nice since it's generally easier to see at a glance what the behavior should be.

In fact, should defaults even be required? Would you ever *not* want to have one?
Comment 40 Cameron McCormack 2011-05-26 21:57:14 UTC
I think there would be times when you want to know if a dictionary member wasn't specified, and where the behaviour of your operation isn't just as if a particular default value was assumed for that missing member.  For example:

  dictionary ImageDrawingOptions {
    unsigned long width;
    unsigned long height;
  };

  interface Whatever {
    void drawImage(in Image aImage, in optional ImageDrawingOptions options);
  };

Maybe you want to make it so that a call `whatever.drawImage(a, { width: 100 })` results in the height of the drawn image being determined by its aspect ratio.  You wouldn't be able to use a default value in the IDL there, and you need drawImage to be able to distinguish between present and not.  Unless you wanted to say

  dictionary ImageDrawingOptions {
    unsigned long? width = null;
    unsigned long? height = null;
  };

and then define that null means the same thing as the dictionary member not being present, but that seems less clean to me.
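
The present/not-present distinction for this example might be sketched as below. The function resolveDrawSize and the shape of the image object are hypothetical; only the width and height member names come from the IDL above.

```javascript
// Hypothetical sizing logic for drawImage: a missing dimension is
// derived from the image's aspect ratio, not from a fixed default.
function resolveDrawSize(image, options = {}) {
  const hasWidth = "width" in options;
  const hasHeight = "height" in options;
  const aspect = image.width / image.height;

  if (hasWidth && hasHeight) {
    return { width: options.width, height: options.height };
  }
  if (hasWidth) {
    return { width: options.width, height: options.width / aspect };
  }
  if (hasHeight) {
    return { width: options.height * aspect, height: options.height };
  }
  return { width: image.width, height: image.height };
}

const image = { width: 200, height: 100 };
console.log(resolveDrawSize(image, { width: 100 })); // { width: 100, height: 50 }
console.log(resolveDrawSize(image, { height: 10 })); // { width: 20, height: 10 }
console.log(resolveDrawSize(image));                 // { width: 200, height: 100 }
```

No single default for height could express the second branch, since the right value depends on which other members were supplied.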
Comment 41 Jonas Sicking (Not reading bugmail) 2011-05-27 01:17:18 UTC
I guess I can buy that that is less clean.

But it would still be lovely to be able to specify defaults. But I can also live without it.
Comment 42 Cameron McCormack 2011-05-31 02:42:02 UTC
I've added the comment 40 proposal to the spec and allowed default values to be specified for dictionary members:

http://dev.w3.org/2006/webapi/WebIDL/#idl-dictionaries
http://dev.w3.org/2006/webapi/WebIDL/#idl-dictionary
http://dev.w3.org/2006/webapi/WebIDL/#es-dictionary