Advanced Search Facility for Federal Documents

Creating an Advanced Search Facility for Federal Documents on the Internet requires improving existing search engines. This requirement stems from the different user communities such a system must serve. The first community is composed of individuals who may have only a vague idea of what they are looking for. The second is composed of individuals who are looking for a specific document or piece of information. Both groups require better ranking of result sets to focus the selections presented by these systems.

Most current search facilities do not present the results of a query in a form that is useful to either group of users. The problem is the order in which documents are presented. Often commentary or discussion about an information resource is returned rather than a pointer to the resource itself. An Advanced Search Facility for Federal Documents must allow the user to specify the ranking criteria used to determine the order of presentation. Original documents must have a higher rank than commentary or even "pointer pages."
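The ranking requirement above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the Hit record, the "original", "pointer", and "commentary" labels, and the relevance score stand in for whatever a real search engine and its metadata would supply. The user-specified criteria are modeled as an ordered list of key functions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    kind: str         # assumed labels: "original", "pointer", "commentary"
    relevance: float  # assumed text-match score, higher is better


def originals_first(hit):
    # Original documents sort ahead of everything else.
    return 0 if hit.kind == "original" else 1


def by_relevance(hit):
    # Negate so that higher scores sort first.
    return -hit.relevance


def rank(hits, criteria):
    """Order hits by the user's criteria, applied in the order given."""
    return sorted(hits, key=lambda h: tuple(c(h) for c in criteria))


hits = [
    Hit("Discussion of the rule", "commentary", 0.9),
    Hit("Index of agency rules", "pointer", 0.8),
    Hit("Text of the rule", "original", 0.7),
]
ranked = rank(hits, [originals_first, by_relevance])
print([h.title for h in ranked])
```

With these criteria the original document is presented first even though its text-match score is the lowest of the three. A user who passes only by_relevance gets the conventional ordering instead.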

Metadata such as that described in the Government Information Locator Service (GILS) profile can provide the additional guidance needed to properly rank the results returned from a query. The GILS metadata take the form of locator records, which can be created automatically by the indexer software and then exported for editing by humans. Additional metadata could be added and the locator records served for use by search engines in locating and ranking documents. These locator records would add sufficient information to allow the search engine to identify original documents. Existing GILS servers could provide locator records to the system via the Z39.50 protocol.

The use of locator records served via the Z39.50 protocol will allow other systems to provide information for the ranking of query results. The Federal Geographic Data Committee (FGDC) clearinghouse, for example, could provide information about spatial data and documents because its application profile has GILS as a subset. The GILS servers operated by Federal Agencies would provide an additional source of information, since Federal Agencies are required to create and maintain GILS records about their programs. Documents that contain Dublin Core metadata would assist the automatic creation of the locator records.
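As a rough sketch of how embedded Dublin Core metadata might seed a locator record, the Python below collects META tags whose names begin with "DC." from an HTML page. It assumes the metadata is carried in META tags named that way; the draft_locator_record function, the dictionary layout, and the example URL are hypothetical and are not GILS element names. A real indexer would map these values onto the GILS profile before export.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect <meta name="DC.x" content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("dc.") and content is not None:
            # An element may repeat, so keep a list per element name.
            self.fields.setdefault(name[3:], []).append(content)


def draft_locator_record(html_text, url):
    """Return a draft record (hypothetical layout) for later human editing."""
    parser = DublinCoreExtractor()
    parser.feed(html_text)
    record = dict(parser.fields)
    record["url"] = url
    return record


page = """<html><head>
<meta name="DC.title" content="Small Business Opportunities">
<meta name="DC.creator" content="Example Agency">
<meta name="keywords" content="not Dublin Core, so skipped">
</head><body></body></html>"""

record = draft_locator_record(page, "http://www.example.gov/sbo.html")
print(record)
```

The draft keeps only the Dublin Core values and the document address, leaving the remaining locator fields to be filled in by an editor.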

An Advanced Search Facility must also understand place and time as searchable attributes. These are necessary to answer a question of the form: "What small business opportunities are available now near Atlanta, Georgia?" The GILS profile and Dublin Core both support a bounding rectangle that can assist in answering this question. With the minimum bounding rectangle, greater use can be made of existing locator systems like the FGDC Clearinghouse.
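A place-and-time query of this kind reduces to a rectangle overlap test combined with a date range test. The Python sketch below shows one way to do that. The record layout and field names are hypothetical, the coordinates are rough illustrative values in decimal degrees, and rectangles that cross the 180 degree meridian are not handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BoundingBox:
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        # Two rectangles overlap when they overlap on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass
class LocatorRecord:
    title: str
    bbox: BoundingBox
    begin: date  # first day the resource applies
    end: date    # last day the resource applies


def search(records, area, when):
    """Return records whose rectangle overlaps area and whose dates cover when."""
    return [r for r in records
            if r.bbox.intersects(area) and r.begin <= when <= r.end]


records = [
    LocatorRecord("Georgia contract opportunities",
                  BoundingBox(-85.6, 30.4, -80.8, 35.0),
                  date(1996, 1, 1), date(1996, 12, 31)),
    LocatorRecord("Oregon contract opportunities",
                  BoundingBox(-124.6, 42.0, -116.5, 46.3),
                  date(1996, 1, 1), date(1996, 12, 31)),
    LocatorRecord("Georgia opportunities, prior year",
                  BoundingBox(-85.6, 30.4, -80.8, 35.0),
                  date(1995, 1, 1), date(1995, 12, 31)),
]

near_atlanta = BoundingBox(-85.0, 33.2, -83.8, 34.3)  # rough box around Atlanta
matches = search(records, near_atlanta, date(1996, 7, 9))
print([r.title for r in matches])
```

Only the first record satisfies both tests: the Oregon record fails on place and the prior-year record fails on time.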

An Advanced Search Facility, when built, will gather and index all Federal Government information on the Internet in a distributed fashion. Those servers capable of running the indexer will make their indexes available to other systems on the net. The index information will be made available using the Z39.50 version 3 (1995) protocol and perhaps LDAP, as well as protocols internal to the system. Recent tests have demonstrated that GILS-compliant metadata can be searched and served using the LDAP protocol as well as Z39.50.

This page is part of the DISW 96 workshop.
Last modified: Tue Jul 9 17:19:02 EST 1996.