Brewster Kahle

(Brewster is the author of the WAIS system)

About UDIs

Brewster and I were in general agreement about the criteria for names and addresses (UDIs) in information space. We had come to different conclusions about a few details, and the actual syntax.

The most superficial difference is that Brewster's UDIs have the most significant bit on the left (like mail addresses), and mine have it on the right (like numbers and filenames). We didn't spend much time on this arbitrary choice, but it would be convenient to agree.

We talked about the relationship between names and addresses. (A name supports an equality operator and is unique and lasting; an address indicates a particular access method.) Brewster felt that addresses should always have a name attached to them, so that one could always verify that the thing they pointed to was the right thing. Therefore, he proposed a syntax with a (name, address) pair. I suggested that any name may correspond to many addresses, so whereas it might be reasonable to insist on the presence of the name, one should allow a list of addresses.
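A minimal sketch of the "one name, many addresses" idea, assuming a hypothetical in-memory representation. The class names, the `method` labels and the example host strings are all invented for illustration; neither of the syntaxes we discussed is implied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Address:
    """One way to reach a document: an access method plus a location."""
    method: str    # hypothetical access-method label, e.g. "wais" or "ftp"
    location: str  # hypothetical location string understood by that method


@dataclass
class UDI:
    """A name that is always present, plus zero or more addresses."""
    name: str
    addresses: List[Address] = field(default_factory=list)

    def same_document(self, other: "UDI") -> bool:
        # Equality is decided by the name alone; the addresses are only
        # hints about where a copy can be fetched.
        return self.name == other.name


# Two references to the same document, reachable in different ways.
a = UDI("example-doc-1", [Address("wais", "host-a.example/db")])
b = UDI("example-doc-1", [Address("ftp", "host-b.example/pub/doc"),
                          Address("wais", "host-c.example/db")])
print(a.same_document(b))  # True
```

Keeping the name mandatory means a client can check that whatever an address returns is the intended document, while the address list stays free to grow or change.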

The W3 idea of making the name of a result set the name of the index plus the query was new to Brewster, and he liked it. This opens the query language can of worms in the nice clean UDI camp.
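A minimal sketch of naming a result set as "name plus query". The `?` separator and the use of percent-encoding to quote the query text are assumptions made for this example, as are the function names and the sample index name; the sketch also assumes the index name itself contains no `?`.

```python
from urllib.parse import quote, unquote


def result_set_name(index_name: str, query: str) -> str:
    """Build the name of a result set from an index name and a query."""
    # Quote every reserved character so the query cannot be confused
    # with the structure of the name itself.
    return index_name + "?" + quote(query, safe="")


def split_result_set_name(name: str):
    """Recover (index_name, query); query is None for a plain name."""
    index_name, sep, quoted = name.partition("?")
    return index_name, (unquote(quoted) if sep else None)


name = result_set_name("example-index", "brewster kahle")
print(name)                         # example-index?brewster%20kahle
print(split_result_set_name(name))  # ('example-index', 'brewster kahle')
```

Because the result-set name is derived entirely from the index name and the query, two clients issuing the same query against the same index arrive at the same name.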

WAIS had originally started with common LISP notation for UDIs, a notation which it still uses for the WAIS "source" files. WAIS later moved to something looking more like a mail address. This was partly because Brewster was simply impressed with the way mail addresses work so well.

Ideally, a single simple quoting scheme should be defined for UDIs.

UDIs ideally have a clear structure in which the parts understandable by the client, server and application are distinct. The WAIS client allows relevance feedback to be made on a portion of a document only, in which case the client/application builds the anchor (in practice a range) and the server has to understand this. (In fact, the server just operates in client mode to retrieve the relevant data, so both sides are in fact clients.)

NB: Commercial world interest in the UDI format is possible from AMIX, Xanadu (?!), NeXT, Pandora.

The commercial viewpoint

It was interesting to contrast the WAIS view with the Library 2000 view. Whereas library projects are looking for naming schemes and data models which will endure thousands of years, Brewster had his eye on information as a billion-dollar business in a few years' time. People don't pay money for old information, so the persistence of naming schemes beyond the lifetimes of networks is not a concern. Servers may not be willing to provide lists of alternative sources for their own information, as they charge and make a profit per access. This difference of viewpoint is liable to lead to architectural differences between the systems, unless we can generalize at this stage.

Other random points

Points occurring during discussions, or of which I was reminded:
Freezing queries
Make it obvious from the UI which document is a "live" query and which is a frozen copy. The same UI metaphor could be used for "live" documents and archive copies. It is important to know which you're linking to.
Date information
Brewster found that the only extra functionality he had had to add to text input to queries (and relevance feedback) was date information. People want to query by date. Other (boolean, SQL, etc.) forms of query had not been necessary. This is interesting because date information is also needed for machine-machine interaction for news (a la NNTP) and cache updating. (If versioning is supported then obviously that ties in with dates.)
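A minimal sketch of why one date comparison serves both uses mentioned above: a person restricting a query by date, and a machine asking for everything newer than its last fetch (news transfer or cache updating). The function name, the record layout and the sample data are hypothetical.

```python
from datetime import date


def newer_than(items, since):
    """Return the titles of items dated strictly after `since`."""
    return [title for title, item_date in items if item_date > since]


# Hypothetical (title, date) records held by a server.
items = [
    ("old report", date(1991, 3, 1)),
    ("recent note", date(1991, 11, 20)),
    ("latest posting", date(1992, 1, 5)),
]

# A user querying by date and a cache asking "what changed since my last
# fetch?" are the same operation with a different `since` value.
print(newer_than(items, date(1991, 6, 1)))
# ['recent note', 'latest posting']
```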