Bug 21555 - Use of IDL arrays for keyPath values is underdefined
Use of IDL arrays for keyPath values is underdefined
Status: RESOLVED FIXED
Product: WebAppsWG
Classification: Unclassified
Component: Indexed Database API
unspecified
PC All
: P2 normal
: ---
Assigned To: This bug has no owner yet - up for the taking
public-webapps-bugzilla
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-04-02 18:07 UTC by Boris Zbarsky
Modified: 2013-05-09 14:13 UTC (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Boris Zbarsky 2013-04-02 18:07:48 UTC
There are several uses of IDL arrays in IndexedDB API.  Specifically, IDBObjectStore.keyPath, IDBIndex.keyPath, IDBObjectStoreSync.keyPath, and IDBIndexSync.keyPath.

All of these have the same general structure.  The IDL looks sort of like this:

    readonly    attribute (DOMString or DOMString[])? keyPath;

(except that IDBIndex is missing the "or") and prose that says:

     On getting, returns the key path of this object store. This will be either
     a DOMString, an Array of DOMStrings or null. 

(without the bits about null for the IDBIndex/IDBIndexSync, since those are not nullable).

Unfortunately:

1)  An IDL array is not in fact a JS Array, so the prose is wrong in claiming
    that an Array is returned.

2)  Simply declaring something to be an IDL array in the IDL does not define
    its behavior.  Specifically, there are three different types of IDL arrays:
    fixed length, variable length, and readonly.  Which one is being returned
    here needs to be defined in the prose.  What happens on writes if the array
    is not readonly also needs to be defined in the prose.

In terms of UA behavior, what Gecko does, based on code inspection, is that it creates an actual JS Array object when you first do .keyPath, and hands it out.  After that it keeps handing out the same Array object every time.  The script that got it can modify the Array object, and the modifications have no effect on anything other than what other .keyPath getters see.

What Chrome does is to return a DOMStringList.  Furthermore, it returns a new DOMStringList every time, so doing this:

  alert(store.keyPath == store.keyPath);

in Chrome alerts false.

Both behaviors seem fairly broken to me, for what it's worth, neither one matches the current IDL in the spec, and the current spec can't be interoperably implemented given just the information in it anyway....
Comment 1 Boris Zbarsky 2013-04-02 18:11:09 UTC
So I guess the first question is whether the return value should be a new object each time or the same object each time in cases when the keyPath is not just a string.  Generally having it be a new object each time is bad form for an attribute.
Comment 2 Joshua Bell 2013-04-02 18:43:31 UTC
(In reply to comment #1)
> So I guess the first question is whether the return value should be a new
> object each time or the same object each time in cases when the keyPath is
> not just a string.  Generally having it be a new object each time is bad
> form for an attribute.

Agreed. (I believe there was some mailing list churn about adding an IDL attribute for this?) 

IDBCursor.value has this explicitly:

"Note that if this property returns an object, it returns the same object instance every time it is inspected, until the cursor's value is changed. This means that if the object is modified, those modifications will be seen by anyone inspecting the value of the cursor. However modifying such an object does not modify the contents of the database."

Wordy; it'd be nice to have a way to express this succinctly in the IDL. (Also, since it's normative, we should probably ditch the "Note that")

(In reply to comment #0)
> (except that IDBIndex is missing the "or") and prose that says:

Fixed, thanks. https://dvcs.w3.org/hg/IndexedDB/rev/f3da35b35633

> In terms of UA behavior, what Gecko does, based on code inspection, is that
> it creates an actual JS Array object when you first do .keyPath, and hands
> it out.  After that it keeps handing out the same Array object every time. 
> The script that got it can modify the Array object, and the modifications
> have no effect on anything other than what other .keyPath getters see.

FWIW, I prefer Gecko's behavior, and I think we can switch Chrome over to that.

For reference, in IE10 createIndex("name", ["a", "b"]).keyPath yields the string "a,b". (ISTR that IE10 does not support array-type key paths.)

> Both behaviors seem fairly broken to me, for what it's worth, neither one
> matches the current IDL in the spec, and the current spec can't be
> interoperably implemented given just the information in it anyway....

Reading the WebIDL spec, it seems that the Gecko behavior could be described by altering the prose to say:

"This will be either a DOMString, a variable length DOMString array or null. If this property returns an array, it returns the same instance on every access to the property, and modifications to the array have no affect on the object store."

Is that sufficient?
Comment 3 Joshua Bell 2013-04-02 18:44:39 UTC
(In reply to comment #2)
> 
> "This will be either a DOMString, a variable length DOMString array or null.
> If this property returns an array, it returns the same instance on every
> access to the property, and modifications to the array have no affect on the
> object store."

s/property/attribute/ to match WebIDL parlance
Comment 4 Boris Zbarsky 2013-04-02 18:53:25 UTC
> it seems that the Gecko behavior could be described by altering the prose
> to say

Given the IDL, there is no way to describe the Gecko behavior.  Again, because IDL arrays are not actual JS Array objects.

For example, consider this snippet, assuming store.keyPath is an array:

  var keyPath = store.keyPath;
  keyPath[5] = { toString: { alert('gotcha'); return "xyz"; } };

In Gecko, there is no alert and keyPath[5] is an object.  For a variable-length WebIDL array, there will be an alert and keyPath[5] will be the string "xyz".

Unfortunately, IDL does not have a way to actually return the same exact actual Array object each time, short of returning "any" and then defining it all in prose or something.  That's why I cced Cameron on this bug...

One other note: if Chrome is switched to the Gecko behavior, how well would it handle this:

  store.keyPath.amIGoingToLeak = store;

?  Gecko handles that via the cycle collector, but I was given to understand that WebKit doesn't have such a thing....
Comment 5 Joshua Bell 2013-04-02 19:04:45 UTC
(In reply to comment #4)
> Given the IDL, there is no way to describe the Gecko behavior.  Again,
> because IDL arrays are not actual JS Array objects.

Hrm, yes... I was assuming the wiggle room in "An array returned from a platform object *might* also be modified by the object, and any modifications to the array performed by user code *might* be acted upon by the platform object" (emphasis mine on *might*) would allow it, but the platform array object definition precludes this. Bother.

> Unfortunately, IDL does not have a way to actually return the same exact
> actual Array object each time, short of returning "any" and then defining it
> all in prose or something.  That's why I cced Cameron on this bug...

Thanks, yes. Avoiding "any" would be nice, but we could go back to that.
Comment 6 Joshua Bell 2013-04-02 19:06:56 UTC
(In reply to comment #4)
> One other note: if Chrome is switched to the Gecko behavior, how well would
> it handle this:
> 
>   store.keyPath.amIGoingToLeak = store;
> 
> ?  Gecko handles that via the cycle collector, but I was given to understand
> that WebKit doesn't have such a thing....

I believe (but I'm still a n00b here) the way we'd generally handle this is that the persistent keyPath value ends up stashed on the store's JS wrapper object and thus the cycle only occurs in JS and hence is reclaimable; if no references to the wrappers exist they can be reclaimed. But we may have bugs in this area.
Comment 7 Boris Zbarsky 2013-04-02 19:14:14 UTC
> ends up stashed on the store's JS wrapper object and

Ah, interesting.  Wonder how well that will work once you move to accessor getters on the prototype.

Anyway, the setup is clearly fine by me, since it's not an issue for Gecko.  ;)
Comment 8 Joshua Bell 2013-04-02 19:31:48 UTC
FWIW, the change from "any" to "(DOMString or DOMString[])?" was only intended to simplify the spec and not be a normative change. Switching back to "any" is entirely feasible since that how it was in the spec originally.

Bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=17649
Change: https://dvcs.w3.org/hg/IndexedDB/rev/dffdbbe300ab

... but it would be nice to avoid "any" if we can.
Comment 9 Boris Zbarsky 2013-04-02 19:33:10 UTC
Note that the spec was underdefined before that change too, though (e.g. didn't define whether a new object was returned or not).
Comment 10 Cameron McCormack 2013-04-03 00:09:53 UTC
(In reply to comment #2)
> "This will be either a DOMString, a variable length DOMString array or null.
> If this property returns an array, it returns the same instance on every
> access to the property, and modifications to the array have no affect on the
> object store."
> 
> Is that sufficient?

Let me just ask for a clarification.  When you say "a variable length DOMString array" does that mean you are fine with it being a "platform array object", i.e. a host object that behaves a bit like a native Array, but not exactly?  To be clear, that's what "DOMString[]" means.

Web IDL specifically disallows sequence<T>, which means "a native Array object whose values are copied out and converted to type T when you pass it in to a method, and creates a new Array object each time when returned from a method", from being used as the type of an IDL attribute.  This is exactly to avoid the issue Boris mentions in comment 1, where you probably don't want to return a newly created object every time you get the attribute.

So the wording above won't get you anything similar to Gecko's behaviour, and you can't actually get this behaviour unless you write the type as "any" or "object" and then in prose describe exactly when native JS Array objects get created and returned.

My feeling is that using DOMString[] is the better option, since it is less confusing than returning an Array object that could be modified but which has no effect.  You could then say in prose that the same object is always returned from this IDL attribute (you would have the object update itself to reflect the curreny key path), and you would also declare the object returned as "read only" http://dev.w3.org/2006/webapi/WebIDL/#dfn-read-only-array.
Comment 11 Joshua Bell 2013-04-03 19:10:59 UTC
(In reply to comment #10)
> Let me just ask for a clarification.  When you say "a variable length
> DOMString array" does that mean you are fine with it being a "platform array
> object", i.e. a host object that behaves a bit like a native Array, but not
> exactly?  To be clear, that's what "DOMString[]" means.

I think we'd prefer to just return a native Array. Requiring a platform array object puts an additional burden on implementers.

> Web IDL specifically disallows sequence<T>, which means "a native Array
> object whose values are copied out and converted to type T when you pass it
> in to a method, and creates a new Array object each time when returned from
> a method", from being used as the type of an IDL attribute.  This is exactly
> to avoid the issue Boris mentions in comment 1, where you probably don't
> want to return a newly created object every time you get the attribute.

Agreed, but to be clear this is not a method that will be called very often. It's used for introspecting an immutable property of an object. i.e. given:

var store = db.transaction('store').objectStore('store');

The value of store.keyPath can never change for the lifetime of the store object.
 
> My feeling is that using DOMString[] is the better option, since it is less
> confusing than returning an Array object that could be modified but which
> has no effect.  You could then say in prose that the same object is always
> returned from this IDL attribute (you would have the object update itself to
> reflect the curreny key path), and you would also declare the object
> returned as "read only"

Since it's immutable, we don't need to worry about updates. I believe that using DOMString[] in the IDL would be satisfied by an implementation that returned a DOMStringList which is what Chrome does already (although it hands out a fresh copy on each access).

But that puts a burden on Gecko and ISTM that having a way to say "just returns a native array" without "any" would be ideal. The spec already needs to deal with the "same value each time" bit in prose for cursor.key/cursor.value.
Comment 12 Boris Zbarsky 2013-04-03 19:13:56 UTC
> I believe that using DOMString[] in the IDL would be satisfied by an
> implementation that returned a DOMStringList

No, it wouldn't.  There are specific implementation requirements on IDL arrays that DOMStringList doesn't satisfy (as a simple example, the prototype chain is wrong).

> ISTM that having a way to say "just returns a native array"

Agreed, in general.  Basically, we want something like sequence<> but that doesn't imply creating a new object each time...

On the other hand, doing keyPath that way does mean its unreliable if anyone else gets their hands on the objects.  Not sure whether that matters.
Comment 13 Cameron McCormack 2013-04-03 22:35:46 UTC
OK.  I'm a little reluctant to suggest this as it means we then have three array-looking types, but:  How about we have an IDL type named Array (and possibly Array<T>) which means "a reference to a modifiable array object" which in JS is a reference to a native Array object.  If we care to keep the distinction between abstract IDL types and language binding specific types (I'm still trying to), we could define some operations that specs could invoke on such objects, like modifying array element values, changing length, creating a new object with a bunch of values, etc.  For IndexedDB's use, we could start just with "creating a new object with a bunch of values".
Comment 14 Joshua Bell 2013-04-26 18:29:10 UTC
IIRC the tone of the F2F http://www.w3.org/wiki/Webapps/April2013Meeting correctly, we agreed to revert the spec's IDL here to "any" and define the behavior in prose.

If that sounds reasonable, I'll make the spec changes:
* IDL type of the attributes --> any
* Prose defined return type as String or Array of Strings
* Prose defines that the same object is returned on each access
Comment 15 Joshua Bell 2013-04-29 22:27:30 UTC
(In reply to comment #14)
> IIRC the tone of the F2F http://www.w3.org/wiki/Webapps/April2013Meeting
> correctly, we agreed to revert the spec's IDL here to "any" and define the
> behavior in prose.
> 
> If that sounds reasonable, I'll make the spec changes:
> * IDL type of the attributes --> any
> * Prose defined return type as String or Array of Strings
> * Prose defines that the same object is returned on each access

I've gone ahead and made this change in these revisions:

https://dvcs.w3.org/hg/IndexedDB/rev/43e59fb2e864
https://dvcs.w3.org/hg/IndexedDB/rev/6671aff10209 (copy/paste fix)

Boris - sanity check please?
Comment 16 Boris Zbarsky 2013-05-01 21:36:11 UTC
Seems reasonable, I guess.

The only remaining issue is that something should make clear that the Array is returned if the keypath is a sequence and that when initially created it contains the strings from the sequence.

In particular, the object returned from the .keyPath getter is NOT the same object that was passed in via the init dictionary.
Comment 17 Joshua Bell 2013-05-01 22:12:05 UTC
(In reply to comment #16)
> The only remaining issue is that something should make clear that the Array
> is returned if the keypath is a sequence and that when initially created it
> contains the strings from the sequence.
> 
> In particular, the object returned from the .keyPath getter is NOT the same
> object that was passed in via the init dictionary.

Thanks. I updated the text be more explicit and cover both of those points.
Comment 18 Boris Zbarsky 2013-05-09 14:13:10 UTC
Thanks, that looks great.