Web Data APIs

We've been looking at the web database space here at Microsoft, trying to understand scenarios and requirements. After assessing what is out there, we are forming an opinion on it. I wanted to write to this group to share how we think about the space and what principles we try to apply, and to discuss specifics.

The short story is that we believe Nikunj's WebSimpleDB proposal, which basically describes a minimum-bar web database API and enables a whole set of diverse options to be built on top, is the right thing to do.

During the last couple of weeks we have been talking with various folks from Mozilla and Oracle and iterating over details of the WebSimpleDB draft. In the process it has become clear that we all share the same high-level expectations on the scope and capabilities of this API, and Nikunj has been hard at work making changes to the draft to keep up with them. I'll touch on a few details below, but bear in mind that several of them are already in the process of being addressed.

We would love to hear feedback, requirements, specific application scenarios, etc. We want to make progress quickly and get experimental implementations going, so that as we explore we stay grounded in things that are implementable.


Guiding principles and why we think the ISAM style proposed in WebSimpleDB is a good idea
As we tried to understand the problem space we formulated a couple of guiding principles:
- Get into the standard the key building blocks that are either impossible to build on top, or so common that it would be very redundant for every library to rebuild them
- Focus on an API that is simple enough that it can be reliably specified and that can be implemented to follow the spec in a relatively simple manner
We believe that WebSimpleDB sets the stage in this direction. An ISAM layer can be used directly or can be a building block for more elaborate layers that can be built entirely in JavaScript on top. Also, ISAM is simple enough that it can be specified in a way that should enable highly interoperable implementations.
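
To make the "building block" idea concrete, here is a rough sketch of what I mean. It is not the WebSimpleDB API: OrderedStore, scan and selectWhere are names made up for illustration, and the store is just an in-memory stand-in that assumes all keys are of one comparable type (all numbers or all strings). The point is only that ordered storage plus range scans is enough for a filtering layer to be written entirely in script on top.

```javascript
// Hypothetical in-memory stand-in for an ISAM-style store: records are kept
// in key order, with point lookups and ordered range scans.
class OrderedStore {
  constructor() {
    this.keys = [];           // keys, kept sorted
    this.records = new Map(); // key -> record
  }

  // Binary search: index of the first key >= target.
  lowerBound(target) {
    let lo = 0;
    let hi = this.keys.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.keys[mid] < target) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  put(key, record) {
    if (!this.records.has(key)) {
      this.keys.splice(this.lowerBound(key), 0, key);
    }
    this.records.set(key, record);
  }

  get(key) {
    return this.records.get(key);
  }

  delete(key) {
    if (this.records.delete(key)) {
      this.keys.splice(this.lowerBound(key), 1);
    }
  }

  // Cursor-like scan over keys in [from, to], in key order.
  *scan(from, to) {
    for (let i = this.lowerBound(from); i < this.keys.length && this.keys[i] <= to; i++) {
      yield [this.keys[i], this.records.get(this.keys[i])];
    }
  }
}

// A tiny "library layer" built only on the primitives above:
// a range scan followed by a filter.
function selectWhere(store, from, to, predicate) {
  const out = [];
  for (const [, record] of store.scan(from, to)) {
    if (predicate(record)) out.push(record);
  }
  return out;
}
```

A richer layer (joins, sorting on secondary keys, a query language) would follow the same pattern: the store only has to guarantee ordered access, and everything else lives in library code.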


Trimming down

There are a number of elements of WebSimpleDB that we can probably live without, at least for a first version, such as Queues and Sequences. This may help simplify the database API even further.

Also, there are a few simplifying assumptions we can make from the get-go. For example, we can assume that "paths", as informally mentioned in the spec, only reference JavaScript identifiers (perhaps with dot-notation), and that when used for index/primary keys they point to JavaScript primitive values and not to objects/arrays.
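
As an illustration of how small this restriction keeps things, here is one possible reading of it in code. The function name resolveKeyPath is hypothetical and the behavior is my assumption, not anything the draft specifies: each segment must look like an identifier (ASCII-only here, for brevity), missing properties resolve to undefined, and a path that lands on an object or array is rejected.

```javascript
// Hypothetical helper: resolve a dot-notation path against a record and
// accept the result only if it is a primitive value usable as a key.
// Simplification: identifiers are restricted to ASCII characters.
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function resolveKeyPath(record, path) {
  let value = record;
  for (const part of path.split(".")) {
    if (!IDENTIFIER.test(part)) {
      throw new Error(`Invalid path segment: "${part}"`);
    }
    const isObject = value !== null && typeof value === "object";
    if (!isObject || !Object.prototype.hasOwnProperty.call(value, part)) {
      return undefined; // the path does not exist on this record
    }
    value = value[part];
  }
  const t = typeof value;
  if (value !== null && (t === "object" || t === "function")) {
    throw new TypeError(`Path "${path}" does not point to a primitive value`);
  }
  return value;
}
```

With this reading, resolveKeyPath({ address: { city: "Seattle" } }, "address.city") yields the string "Seattle", while a path such as "address" on the same record is rejected because it points to an object.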


Terminology

The word "Entity" has a lot of different meanings depending on who you talk to. It would be interesting to find a simpler term, perhaps something that matches the JavaScript terminology better.


Areas where we need to dig deeper and have broader discussions to understand better

Isolation model and its implications in locking: Various isolation models lead to different failure modes. For example, regular locks mean that application code needs to be ready to deal with deadlocks, while with multi-versioning you can see optimistic concurrency violation exceptions during commit. There is a tricky balance between not dictating too much to the implementation and ensuring that observable behavior across implementations really enables interoperability.
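
To show what the second failure mode looks like from application code, here is a minimal sketch of a version-check flavor of optimistic concurrency. All names (VersionedStore, Transaction, ConflictError) are invented for this example, and it is deliberately simplified: it validates only the keys a transaction has read, does not give snapshot reads, and says nothing about how a real implementation would do it.

```javascript
// Hypothetical sketch: each key carries a version number, a transaction
// remembers the versions it read, and commit fails if any of them changed.
class ConflictError extends Error {}

class VersionedStore {
  constructor() {
    this.data = new Map(); // key -> { value, version }
  }

  begin() {
    return new Transaction(this);
  }
}

class Transaction {
  constructor(store) {
    this.store = store;
    this.readVersions = new Map(); // key -> version seen at first read
    this.writes = new Map();       // key -> pending value
  }

  get(key) {
    if (this.writes.has(key)) return this.writes.get(key);
    const entry = this.store.data.get(key);
    if (!this.readVersions.has(key)) {
      this.readVersions.set(key, entry ? entry.version : 0);
    }
    return entry ? entry.value : undefined;
  }

  put(key, value) {
    this.writes.set(key, value);
  }

  commit() {
    // Validate: every key we read must still be at the version we saw.
    for (const [key, seen] of this.readVersions) {
      const entry = this.store.data.get(key);
      if ((entry ? entry.version : 0) !== seen) {
        throw new ConflictError(`Key "${key}" changed since it was read`);
      }
    }
    // Apply pending writes, bumping each key's version.
    for (const [key, value] of this.writes) {
      const entry = this.store.data.get(key);
      this.store.data.set(key, { value, version: (entry ? entry.version : 0) + 1 });
    }
  }
}
```

If two transactions both read the same key and then both write it, the first commit succeeds and the second throws ConflictError, so the application has to be prepared to retry. With a lock-based implementation the same code would instead block or hit a deadlock error, which is exactly the kind of observable difference the spec needs to take a position on.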

What's the sweet spot for the API? Is the primary use for this API to be directly consumed by application code? Or is it a building block to create various different libraries that present a diversity of styles for query formulation and execution? We lean toward making it an API that's great for libraries to build nice layers on top of, but that is still usable directly in application code (along the lines of what happens with XMLHttpRequest, where most developers will actually use a wrapper that fits the particular scenario/library better).

Regards,
-pablo

Received on Saturday, 31 October 2009 20:18:22 UTC