W3C Distributed Indexing Workshop: RDM/SOIF

Darren Hardy <dhardy@netscape.com>
Netscape Communications Corporation
May 6, 1996

As part of the Netscape Catalog Server project, Netscape has adopted and extended the Harvest distributed indexing technology via a mechanism called Resource Description Messages (RDM), which uses SOIF as its underlying syntax and HTTP as its underlying transport protocol.

What is SOIF?

Harvest's Summary Object Interchange Format (SOIF) is a syntax for transmitting resource descriptions (RDs) and other kinds of structured objects. Each RD is represented in SOIF as a list of attribute-value pairs (e.g., Company = 'Netscape'). SOIF handles arbitrary textual and binary data as values and, with a simple extension, handles multi-valued attributes. SOIF is also a streaming format, which allows many RDs to be represented in a single, efficient stream.
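
To make the attribute-value and streaming ideas concrete, here is a minimal sketch of a SOIF-style encoder and decoder. It assumes the object layout published with Harvest (a template type and URL on the first line, then one "Name{byte-count}:<TAB>value" line per attribute, closed by "}"); the function names, the DOCUMENT template type, the example URL, and the "-1"/"-2" suffix used for a multi-valued attribute are illustrative assumptions, not something this page specifies.

```python
def encode_soif(template, url, attrs):
    """Serialize one resource description as a SOIF-style object (bytes)."""
    out = [f"@{template} {{ {url}\n".encode("utf-8")]
    for name, value in attrs:
        data = value if isinstance(value, bytes) else value.encode("utf-8")
        # The explicit byte count lets a value hold arbitrary binary data.
        out.append(f"{name}{{{len(data)}}}:\t".encode("utf-8") + data + b"\n")
    out.append(b"}\n")
    return b"".join(out)


def decode_soif(stream):
    """Yield (template, url, attrs) for each object in a SOIF-style stream."""
    pos = 0
    while True:
        start = stream.find(b"@", pos)
        if start < 0:
            return
        eol = stream.index(b"\n", start)
        template, url = stream[start + 1:eol].decode("utf-8").split(" { ", 1)
        pos = eol + 1
        attrs = []
        while not stream.startswith(b"}", pos):
            brace = stream.index(b"{", pos)
            close = stream.index(b"}", brace)
            name = stream[pos:brace].decode("utf-8")
            size = int(stream[brace + 1:close])
            value_start = close + 3              # skip "}", ":" and the tab
            attrs.append((name, stream[value_start:value_start + size]))
            pos = value_start + size + 1         # skip the trailing newline
        pos += 1                                 # step past the closing "}"
        yield template, url, attrs


# Two RDs in one stream; the second carries a binary value and a
# multi-valued attribute (hypothetical "-1"/"-2" naming).
stream = encode_soif("DOCUMENT", "http://www.example.com/",
                     [("Company", "Netscape")])
stream += encode_soif("DOCUMENT", "http://www.example.com/logo",
                      [("Author-1", "A"), ("Author-2", "B"),
                       ("Thumbnail", b"\x00}\n@")])
objects = list(decode_soif(stream))
```

Because each value is length-prefixed, the decoder never scans inside a value, so binary data containing braces, newlines, or "@" does not confuse it, and objects can simply be concatenated into one stream.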

What is RDM?

Resource Description Messages (RDM) is a mechanism to discover and access Resource Descriptions (RDs), or metadata, about network-accessible resources. RDM is implemented as a layer on top of HTTP, which lets it build on existing HTTP-based technology, and it uses Harvest's SOIF technology to exchange indexing information over the network incrementally and efficiently. In addition, RDM supports a Schema that describes the SOIF in use: attribute names, data types, content types, and other information. RDM also supports a Server Description, which reports some of the vital statistics about the RDM server and provides a brief description of the content of the server itself (i.e., some sample RDs and a human-generated description). Finally, RDM supports a flexible scoping/view mechanism to access or search the RDs in a query-language-independent fashion.

How is RDM/SOIF used?

RDM supports the Harvest Broker/Gatherer architecture. The Broker uses RDM to retrieve indexing information from a Gatherer, and an end-user search client uses RDM to send a query to a Broker and to retrieve the query's result set.

A Gatherer exports its Resource Descriptions (encoded in SOIF) via RDM to Brokers or other search engines interested in its indexing information. Typically, an automated Robot is co-located with the Gatherer to generate the indexing information for a collection of WWW servers.

Brokers or other search engines can use RDM to contact a Gatherer and incrementally download Resource Descriptions (encoded in SOIF). Also, if desired, Brokers can use RDM to download the schema or server descriptions from the Gatherer to customize their indexing algorithms.
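
The idea of incremental download can be illustrated with a small in-memory sketch. This is not the RDM wire protocol, which this page does not describe; the Gatherer and Broker classes, the changes_since method, and the integer change counter are all hypothetical stand-ins chosen only to show a Broker fetching just the RDs that changed since its last contact.

```python
class Gatherer:
    """Holds resource descriptions and remembers when each one last changed."""

    def __init__(self):
        self._rds = {}       # url -> (version at last change, attributes)
        self._version = 0    # hypothetical change counter, bumped per update

    def update(self, url, attrs):
        self._version += 1
        self._rds[url] = (self._version, dict(attrs))

    def changes_since(self, version):
        """Return the current version and only the RDs changed after `version`."""
        changed = {url: attrs
                   for url, (v, attrs) in self._rds.items() if v > version}
        return self._version, changed


class Broker:
    """Builds an index by repeatedly pulling changes from a Gatherer."""

    def __init__(self):
        self.index = {}
        self._seen = 0       # last Gatherer version this Broker has indexed

    def sync(self, gatherer):
        self._seen, changed = gatherer.changes_since(self._seen)
        self.index.update(changed)
        return len(changed)  # number of RDs transferred in this round


gatherer = Gatherer()
gatherer.update("http://www.example.com/a", {"Title": "A"})
gatherer.update("http://www.example.com/b", {"Title": "B"})

broker = Broker()
first = broker.sync(gatherer)     # initial sync transfers both RDs
gatherer.update("http://www.example.com/a", {"Title": "A, revised"})
second = broker.sync(gatherer)    # later sync transfers only the changed RD
```

In a real deployment the changed RDs would travel as a SOIF stream over HTTP rather than as Python dictionaries, and the change marker would more plausibly be a timestamp than a counter.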


This page is part of the DISW 96 workshop.
Last modified: Thu Jun 20 18:20:11 EST 1996.