JW5: Partial Results

Once upon a time, we discussed what we called "paged results".
Somehow, we dropped the ball, and nothing made it into the spec.
Oops. Here is what we were thinking. This was in an e-mail I sent
to Dale Lowry:

In section 14.36 of the HTTP/1.1 spec (page 128),
they talk about ranges. The only kind of range
they define is byte ranges. The client sends a request
header beginning "Range:" with the GET method.
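
For concreteness, here is what a standard byte-range
request looks like, as a minimal sketch using Python's
standard library (the host and path are made up):

    import http.client

    conn = http.client.HTTPConnection("www.example.com")
    # Ask for just the first 500 bytes of the resource.
    conn.request("GET", "/big-report.pdf",
                 headers={"Range": "bytes=0-499"})
    resp = conn.getresponse()
    # A range-capable server answers 206 Partial Content and
    # names the satisfied range in Content-Range.
    print(resp.status, resp.getheader("Content-Range"))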

Our thought was that we should try to fit in with
existing HTTP machinery before inventing new stuff,
and that would maximize our chances of success.
So we thought we could probably invent a new
kind of range for DASL answer sets. Byte ranges
are obviously not what we want: we want a
granularity of a "hit" or "match set entry",
so we have to define a new range unit.
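
To illustrate the idea, purely hypothetically (DASL never
standardized such a range unit, and the unit name "rows"
here is invented), a client might ask for the first ten
match set entries of a SEARCH like this:

    import http.client

    SEARCH_BODY = """<?xml version="1.0"?>
    <D:searchrequest xmlns:D="DAV:">
      <!-- query elided -->
    </D:searchrequest>
    """

    conn = http.client.HTTPConnection("dav.example.com")
    # "rows" is an invented range unit: entries 0 through 9
    # of the answer set, by analogy with "bytes".
    conn.request("SEARCH", "/docs/", body=SEARCH_BODY,
                 headers={"Content-Type": "text/xml",
                          "Range": "rows=0-9"})
    print(conn.getresponse().status)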

We think that if the server passes back a
special string as in your proposal, the
server could (a) continue the original query, as
opposed to someone else's query or a re-execution
of the query, and (b) continue returning results
from the point at which the client left off. We didn't
discuss jumping around in the result set, either
backwards or forwards. For the basic feature,
the client just needs to sweep the answer set once.
Jumping around in the results is more advanced.
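
A minimal sketch of that exchange, under the assumption of
a hypothetical X-Result-Token header carrying the server's
special string (the header name and the resumption
mechanics are invented for illustration):

    import http.client

    def sweep_results(host, path, body):
        # Sweep the whole answer set once, one page per request.
        token = None
        while True:
            headers = {"Content-Type": "text/xml"}
            if token is not None:
                # Hand the server's special string back so it can
                # resume the original query where we left off.
                headers["X-Result-Token"] = token
            conn = http.client.HTTPConnection(host)
            conn.request("SEARCH", path, body=body, headers=headers)
            resp = conn.getresponse()
            yield resp.read()
            token = resp.getheader("X-Result-Token")
            if token is None:             # no token means last page
                break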

We want to leave it open for the server to be implemented
using any of the following three implementations, as
well as other possible implementations:

(1) Redo the query each time, each time discarding enough
    initial hits to provide the illusion of progressing to
    the next page in the result set of the original query.
(2) Get all the hits at once and cache them on the server.
    Return the next page of hits from the cache upon client
    demand.
(3) Develop the next page of hits upon demand from the
    client.

Implementation (1) is the dumbest, and you may get
annoying consequences. For example, suppose you did
a query and the results are supposed to come back
in increasing order of Author. If there are concurrent
insertions or deletions, then it could happen that the
client sees a sequence of hits that slips backwards in the
ascending sequence of Author one or more times,
because some of the previous hits are repeated (or,
with deletions, some hits are skipped entirely). This
is OK as long as you tell the client that the result
set entity is different each time. (For example, ETags
can tell the client that.) The other two implementations
don't have this problem.
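
A rough sketch of implementation (1), with an invented
run_query function standing in for the real search engine;
the ETag is computed over the entire current answer set, so
the client can detect that the result set entity changed
between pages:

    import hashlib

    def page_by_requery(run_query, query, start, count):
        # Implementation (1): re-execute the query, throw away
        # the first 'start' hits, return the next 'count' hits.
        # run_query is a stand-in for the real search engine.
        hits = run_query(query)
        # An ETag over the whole current answer set; if concurrent
        # updates changed the set, this differs from the previous
        # page's ETag and warns the client.
        etag = hashlib.sha1(repr(hits).encode()).hexdigest()
        return etag, hits[start:start + count]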

Implementation (2) means the client has to connect to the 
same cache each time, and that the client has the same 
result set entity each time. The cache has to be garbage 
collected if the client dies before sweeping it. The 
implementation will probably impose some arbitrary upper 
bound on the number of hits to avoid wasting server 
resources.
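
A sketch of implementation (2), again with invented names;
the hit cap and the expiry policy are the arbitrary bounds
mentioned above:

    import time, uuid

    MAX_HITS = 10_000   # arbitrary upper bound on cached hits
    CACHE_TTL = 300     # seconds before an abandoned cache dies
    _cache = {}         # result-set id -> (expiry time, hit list)

    def start_query(run_query, query):
        # Implementation (2): run the query once, cache everything.
        # run_query stands in for the real search engine.
        hits = run_query(query)[:MAX_HITS]
        rsid = uuid.uuid4().hex       # names the result set entity
        _cache[rsid] = (time.time() + CACHE_TTL, hits)
        return rsid

    def next_page(rsid, start, count):
        expiry, hits = _cache[rsid]
        _cache[rsid] = (time.time() + CACHE_TTL, hits)  # refresh TTL
        return hits[start:start + count]

    def collect_garbage():
        # Drop caches whose clients died before sweeping them.
        now = time.time()
        for rsid in [r for r, (exp, _) in _cache.items() if exp < now]:
            del _cache[rsid]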

Implementation (3) means the client has to connect to
the same server process each time (because that process
holds a database or DMS handle needed to develop
the next page of hits). The request-handler process
has to time out and go away if the client goes away.
This implementation may impose the least drain
on server resources.
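
Finally, a sketch of implementation (3), where the server
keeps a live cursor instead of materialized hits; the
cursor here stands in for the database or DMS handle, using
the standard DB-API fetchmany call:

    import time

    CURSOR_TIMEOUT = 300   # idle seconds before the handler goes away

    class QueryCursor:
        # Implementation (3): hold the open database/DMS handle
        # and develop each page only when the client asks.
        def __init__(self, db_cursor):
            self.cursor = db_cursor       # e.g. a DB-API cursor
            self.last_used = time.time()

        def next_page(self, count):
            self.last_used = time.time()
            # fetchmany develops just the next page of hits.
            return self.cursor.fetchmany(count)

        def expired(self):
            # True once the client has gone away without finishing.
            return time.time() - self.last_used > CURSOR_TIMEOUT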

We thought that we could probably modify your proposal
in such a way as to accomplish all of the above
objectives.

Alan Babich

Received on Thursday, 24 June 1999 18:01:44 UTC