Proposal for an Information Locator Service
Prepared for the Distributed Indexing/Searching Workshop at MIT, May 28-29, 1996
Information needed to locate other information takes many forms, and no single access mechanism can be optimal for all applications. This proposal is to define a simple, generalized Information Locator Service to be supported alongside other protocols such as HTTP, Whois++, gopher, and LDAP. Clients could search across compliant servers to obtain all manner of locator information, including the characteristics of other Internet information resources. Even servers that support high-performance applications such as name resolution could separately provide search access to the metadata they maintain.
The Information Locator Service would adopt existing international standards such as a minimal subset of those adopted by the Government Information Locator Service (GILS) Application Profile. (GILS itself is in U.S. law and policy at Federal, state, and regional levels; internationally in countries such as Canada, Japan, Australia, and the United Kingdom; and in intergovernmental initiatives such as the G7 Global Information Society.)
The GILS Application Profile, approved internationally in May 1994, adopts some Internet RFCs and a subset of ANSI Z39.50-1995. Z39.50 begins to address the handling of multilingual information, supports various security and other arrangements for fee-based and free dissemination of information, and has been implemented in both stateless and stateful modes of operation. The GILS Profile defines only the behaviors of compliant servers--clients are unconstrained and can range from simple user interfaces to sophisticated software agents.
There are commercial GILS-compliant servers as well as freeware implementations for all popular server platforms. These GILS-compliant servers are serving many kinds of information resources with widely varying structure, from HTML to USMARC files, as well as relational and Postgres databases. Gateways exist for Web browser access, in addition to standalone or browser add-on clients that use the search protocol directly.
Because the GILS Profile adopts a subset of the ANSI Z39.50 standard, GILS-aware clients can already freely search hundreds of professionally maintained resources such as library and spatial data catalogs collectively valued in the tens of billions of dollars, with much more available on a fee basis. There are also hundreds of WAIS databases freely accessible, and thousands more WAIS databases maintained behind HTTP servers.
A compliant server appears to a client as though it holds a searchable set of information locator records. Each locator record can characterize other information of any kind, at any level of aggregation, and includes URIs and MIME types for Internet resources. For example, a locator record that describes another server might include a listing of the words most characteristic of that server's contents and so act as an intermediary resource for information discovery.
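As a rough illustration of the idea, the sketch below models a locator record that describes another server and carries a list of characteristic words, then searches a small set of such records. Everything here is an assumption made for illustration: the class name, the field names, the search function, and the example.org URIs are hypothetical and are not element names or behaviors defined by the GILS Profile.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LocatorRecord:
    """Hypothetical locator record; fields are illustrative, not GILS elements."""
    title: str
    uri: str                      # where the described resource lives
    mime_type: str                # how the resource is delivered
    characteristic_words: List[str] = field(default_factory=list)


def search(records, term):
    """Return records whose title contains the term or whose
    characteristic words include it (case-insensitive)."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.title.lower()
        or any(needle == w.lower() for w in r.characteristic_words)
    ]


# Two records, each describing another server (placeholder URIs).
records = [
    LocatorRecord("Spatial data catalog server", "http://example.org/spatial",
                  "text/html", ["maps", "elevation", "hydrography"]),
    LocatorRecord("Library catalog server", "http://example.org/library",
                  "text/html", ["books", "serials", "maps"]),
]

# A client looking for "maps" is pointed at both servers.
for hit in search(records, "maps"):
    print(hit.title, hit.uri)
```

A client could then forward its query only to the servers whose locator records matched, which is the intermediary role described above.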
Searches can be content-based, using full-text searching or other forms of feature extraction. Alternatively, a search may take advantage of structured attributes such as well-known elements and relations, though it is not necessary to have a canonical format for structured metadata. Natively or through gateways, the service can support search of many different metadata structures--HTML, SGML, X.500, SQL databases, PURLs, Handles, Dublin Core, SOIFs, IAFA, Internet mail, DIFs, Whois++ templates, spatial metadata, etc. Whenever appropriate, servers simply map local semantics to registered attributes, and the attribute registry itself is extensible through an established process.
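The mapping of local semantics to registered attributes might be pictured along the following lines. This is a minimal sketch under stated assumptions: the attribute names, the two per-server maps, and the translate_query function are all hypothetical, and real registered attributes are defined by the registry, not by the names used here.

```python
# Hypothetical registry of attribute names shared by all servers.
REGISTERED_ATTRIBUTES = {"title", "author", "subject"}

# Each server maps registered attributes onto its own local fields.
# Both maps are invented for illustration.
SQL_SERVER_MAP = {"title": "doc_name", "author": "creator", "subject": "keywords"}
MAIL_SERVER_MAP = {"title": "Subject", "author": "From"}


def translate_query(query, local_map):
    """Translate {registered attribute: term} into {local field: term}.

    Returns the translated query plus a sorted list of registered
    attributes this server has no local equivalent for. Raises
    ValueError if the query uses an attribute outside the registry.
    """
    unknown = set(query) - REGISTERED_ATTRIBUTES
    if unknown:
        raise ValueError("unregistered attributes: %s" % sorted(unknown))
    translated = {local_map[a]: t for a, t in query.items() if a in local_map}
    unsupported = sorted(a for a in query if a not in local_map)
    return translated, unsupported


# The same client query is expressed once and mapped per server.
query = {"title": "locator", "subject": "metadata"}
print(translate_query(query, SQL_SERVER_MAP))
print(translate_query(query, MAIL_SERVER_MAP))
```

The point of the design is that the client states its query once against the shared attributes, and no canonical record format is imposed on the servers; each server decides how, or whether, a given attribute corresponds to its local structure.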