Bug 23332 - Support Binary Keys
Status: RESOLVED LATER
Product: WebAppsWG
Classification: Unclassified
Component: Indexed Database API
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assigned To: This bug has no owner yet - up for the taking
QA Contact: public-webapps-bugzilla
Depends on: 23369
Reported: 2013-09-23 21:33 UTC by Joshua Bell
Modified: 2014-12-23 19:55 UTC
Description Joshua Bell 2013-09-23 21:33:02 UTC
Suggested by Joran Greef <joran@ronomon.com> 

Mailing list discussion:

http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0816.html

Summary: Seems like a good idea, but not for V1. Also tracked on:

http://www.w3.org/2008/webapps/wiki/IndexedDatabaseFeatures
Comment 1 Joshua Bell 2013-09-23 21:33:39 UTC
Marking RESOLVED=LATER since we're not doing this in v1, but we can dump ideas here.
Comment 2 Joshua Bell 2013-09-23 21:45:26 UTC
Initial thoughts:

* APIs that accept keys accept any kind of ArrayBufferView
* Raw ArrayBuffer is not accepted as an input type
* The input type is not retained; i.e. it doesn't matter if you pass in a Uint8Array, a Float64Array or a DataView, it's just the backing bytes (as seen through the view's offset/byteLength) that form the key
* APIs that emit keys return a new Uint8Array backed by a new ArrayBuffer
* Binary keys sort between strings and arrays
* Binary keys are compared like strings/arrays: bytewise, and if one key is a prefix of the other, the longer key is greater
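
As a rough illustration of the comparison rule in the last bullet, the ordering could be sketched as follows. The name compareBinaryKeys is hypothetical and not part of any IndexedDB API; the sketch assumes both inputs are ArrayBufferViews and looks only at their backing bytes, ignoring the view type.

```javascript
// Hypothetical helper, not part of IndexedDB: orders two binary keys bytewise.
// Assumes a and b are ArrayBufferViews (e.g. Uint8Array, Float64Array, DataView).
function compareBinaryKeys(a, b) {
  // Reinterpret each view as unsigned bytes, honoring its offset and length.
  var x = new Uint8Array(a.buffer, a.byteOffset, a.byteLength);
  var y = new Uint8Array(b.buffer, b.byteOffset, b.byteLength);
  var n = Math.min(x.length, y.length);
  for (var i = 0; i < n; i++) {
    if (x[i] !== y[i]) return x[i] < y[i] ? -1 : 1;
  }
  // All shared bytes are equal: the longer key is greater.
  if (x.length === y.length) return 0;
  return x.length < y.length ? -1 : 1;
}
```

Because only the bytes are compared, a Float64Array key sorts by its raw byte pattern rather than by its float values, which is the surprising part of this straw-man.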

Thoughts?
Comment 3 Jonas Sicking 2013-09-23 21:50:31 UTC
I think it would be confusing to accept a Float64Array, but then not sort according to float values.

Why not restrict to ArrayBuffers and Uint8Array?

Though it also feels strange to accept one type and then return another, so maybe limit to Uint8Array?
Comment 4 Joshua Bell 2013-09-23 22:51:34 UTC
(In reply to Jonas Sicking from comment #3)
> I think it would be confusing to accept a Float64Array, but then not sort
> according to float values.

That's exactly why I tossed the straw-man up. :) It felt a little odd when I was prototyping it.

> Why not restrict to ArrayBuffers and Uint8Array?
> 
> Though it also feels strange to accept one type and then return another, so
> maybe limit to Uint8Array?

I'd be fine with that restriction.

I think for the TextDecoder API we accept any type with the semantics I described (i.e. just consider the input type a byte buffer view, ignore the actual type), and ISTR there was discussion about moving away from consuming raw ArrayBuffers. We should probably evolve some consistency here.

I'll restrict our prototype to Uint8Array since it's easy to relax that later.
Comment 5 Joran Greef 2013-10-04 07:36:58 UTC
Thanks Joshua for filing the bug and getting discussion going.

Restricting to Uint8Array sounds like a good idea to start.

If it helps, one likely scenario for people using binary keys might be storing a few gigabytes in IDB, and there might be some kind of sync process between client and server, with an initial download sync to initialize the client. On top of that the connection would be a binary WebSocket and the key and value might be streamed to the client one after the other, i.e. a fixed-size key followed by the value, within the same WebSocket message.

In this scenario, if one had to pass the key in as a standalone Uint8Array, that would mean slicing, with millions of small objects being created and released, and GC pressure, especially on mobile devices.

Therefore it would be useful to be able to pass the binary key in as an offset and size into an existing Uint8Array which might contain other data (i.e. the value itself), without forcing the end-user to slice that existing Uint8Array first. If there's no offset and size argument when passing in the binary key, then the offset would be 0 and the size would be the length of the Uint8Array.
Comment 6 Joshua Bell 2013-10-04 17:24:14 UTC
(In reply to Joran Greef from comment #5)
> 
> Therefore it would be useful to be able to pass the binary key in as an
> offset and size into an existing Uint8Array which might contain other data
> (i.e. the value itself), without forcing the end-user to slice that
> existing Uint8Array first. If there's no offset and size argument when
> passing in the binary key, then the offset would be 0 and the size would be
> the length of the Uint8Array.

A Uint8Array is already a view onto an ArrayBuffer. If you have an existing large Uint8Array called |big| you can use:

var slice = new Uint8Array(big.buffer, offset, length);

... to specify a subset without making a copy.

Behind the scenes, an IDB implementation is likely going to make a copy of the bytes of the key, but a caller should be able to do:

store.put(big, new Uint8Array(big.buffer, offset, length));
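
To illustrate the point about views, here is a small sketch of splitting a message that holds a fixed-size key followed by its value. The helper name splitKeyValue and the parameter keySize are made up for this example; only the standard Uint8Array constructor and properties are used. One detail worth noting: if the message is itself a view into a larger buffer, its own byteOffset has to be added to the offset.

```javascript
// Hypothetical helper: splits a message into key and value views without copying.
// Assumes message is a Uint8Array laid out as [key bytes][value bytes].
function splitKeyValue(message, keySize) {
  if (keySize > message.byteLength) {
    throw new RangeError("keySize exceeds message length");
  }
  // message may itself be a view into a larger buffer, so add its byteOffset.
  var key = new Uint8Array(message.buffer, message.byteOffset, keySize);
  var value = new Uint8Array(message.buffer, message.byteOffset + keySize,
                             message.byteLength - keySize);
  return { key: key, value: value };
}

var message = new Uint8Array([1, 2, 3, 4, 10, 20, 30]);
var parts = splitKeyValue(message, 4);
// Both views share the original buffer; no bytes were copied.
console.log(parts.key.buffer === message.buffer); // true
```

message.subarray(0, keySize) produces the same kind of non-copying view and handles the byteOffset bookkeeping itself. If binary keys were accepted, something like store.put(parts.value, parts.key) would then be the expected call shape.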
Comment 7 Chas Emerick 2014-05-06 20:56:08 UTC
I hope this question is not mis-placed, but: where is this enhancement proposal likely to go from here?  Chrome has an implementation hidden behind a flag (http://codepen.io/cemerick/pen/twEgi), but I haven't been able to find much motion otherwise over the last 6-8 months.

Perhaps an easier to answer process question would be: is this likely to be accepted as an "incremental" change to the spec, or will an "IDB v2" specification have to be aggregated prior to reasonably expecting browser-makers to support the capability?  I can imagine the former happening over the course of months, but the latter is likely many years away, etc.
Comment 8 Joshua Bell 2014-05-06 21:17:26 UTC
(In reply to Chas Emerick from comment #7)
> Perhaps an easier to answer process question would be: is this likely to be
> accepted as an "incremental" change to the spec, or will an "IDB v2"
> specification have to be aggregated prior to reasonably expecting
> browser-makers to support the capability?  I can imagine the former
> happening over the course of months, but the latter is likely many years
> away, etc.

We've recently kicked off the process for "v2": http://lists.w3.org/Archives/Public/public-webapps/2014AprJun/0149.html

The likely spec process will be:

(1) general multi-implementer acceptance of an idea (i.e. no-one says "we'll never implement feature X")
(2) one or more implementers prototype feature X
(3) a spec is written for feature X
[2/3 could happen in either order]
(4) multi-implementer agreement that the spec looks reasonable (i.e. it's not too specific to the implementation)
(5) repeat 1...4 for multiple features. Eventually call that "v2" and stamp it "done".

While "v2" must wait for (5), and is probably at least a year away, the emerging model is that browser implementers can "ship" a feature once (4) happens, with no need to wait for (5). So once we really start converting bugs like this one into spec text, we can progress with (3) and (4) more quickly; hopefully "months", not "years".
Comment 9 Joshua Bell 2014-12-23 19:18:05 UTC
Re: accepted/returned types:

Web IDL has grown guidance here:

http://heycam.github.io/webidl/#idl-buffer-source-types

"When designing APIs that take a buffer, it is recommended to use the BufferSource typedef rather than ArrayBuffer or any of the view types."

"When designing APIs that create and return a buffer, it is recommended to use the ArrayBuffer type rather than Uint8Array."

Where BufferSource is:

typedef (Int8Array or Int16Array or Int32Array or
         Uint8Array or Uint16Array or Uint32Array or Uint8ClampedArray or
         Float32Array or Float64Array or DataView) ArrayBufferView;
typedef (ArrayBufferView or ArrayBuffer) BufferSource;

I'll update Chrome's impl to match that.
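
One way to picture what accepting a BufferSource could mean for keys is a normalization step that copies out the bytes regardless of the input type and hands back an ArrayBuffer. This is only a sketch under that assumption; bufferSourceToKeyBytes is a hypothetical name and nothing here is taken from Chrome's implementation.

```javascript
// Hypothetical helper: copies the bytes of a BufferSource into a fresh ArrayBuffer.
function bufferSourceToKeyBytes(source) {
  if (source instanceof ArrayBuffer) {
    return source.slice(0); // copy of the whole buffer
  }
  if (ArrayBuffer.isView(source)) {
    // Copy only the range visible through the view.
    return source.buffer.slice(source.byteOffset,
                               source.byteOffset + source.byteLength);
  }
  throw new TypeError("expected an ArrayBuffer or ArrayBufferView");
}
```

Since the result is always a copy, later changes to the caller's buffer would not affect the key.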
Comment 10 Jonas Sicking 2014-12-23 19:55:25 UTC
An important question is, how do we handle the multiple different types?

For example, is Int8Array([5, 9]) a different or the same key as Int16Array([0x0905])? What about Uint8Array([5, 9])?

This question applies both to matching and to reading data back. I.e. if you enumerate an objectStore, does the key contain exactly the type that was used to store an entry, or does it contain some normalized type?
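
The ambiguity can be made concrete by looking at the backing bytes. The sketch below uses a hypothetical bytesOf helper, and a DataView with an explicit little-endian write stands in for the Int16Array, since the byte layout of an Int16Array depends on platform endianness.

```javascript
// Hypothetical helper: lists the backing bytes seen through a view.
function bytesOf(view) {
  return Array.from(new Uint8Array(view.buffer, view.byteOffset, view.byteLength));
}

var asInt8 = new Int8Array([5, 9]);
var asUint8 = new Uint8Array([5, 9]);

// Write 0x0905 as a little-endian 16-bit integer, which is how an Int16Array
// would store it on a little-endian platform.
var dv = new DataView(new ArrayBuffer(2));
dv.setInt16(0, 0x0905, true);

console.log(bytesOf(asInt8));  // bytes 5, 9
console.log(bytesOf(asUint8)); // bytes 5, 9
console.log(bytesOf(dv));      // bytes 5, 9
```

If only the bytes form the key, all three would be the same key on a little-endian platform, while a big-endian platform would store Int16Array([0x0905]) as the bytes 9, 5.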

See also comment 3 and comment 4.

I understand why WebIDL has the general recommendation that it has. But I'm still inclined to say that we should keep things simpler here.