POSITION PAPER: Z39.50 & Ranked Searching

Co-Authors: Dr. Chris Buckley, Chief Scientist, Sabir Research; Peter Ryall, Senior Architect, LEXIS-NEXIS



Abstract

In the current universe of relevancy-based search & retrieval systems, there is a wide diversity of search methodologies, ranging from simple term occurrence/proximity algorithms, to modal, LSI, & connectionist logic, to full natural language processing. Across this spectrum there are many variations in query syntax, & in the degree of control given to the user and/or client over how exactly search terms are interpreted, as well as over the precision & comprehensiveness of the results selected from the target collection(s). Similarly, within the WWW community, a range of syntaxes exists for input of search query terms & criteria (various flavors of structured forms, fields allowing free-form query text, etc.).

The `Type 102 Ranked Query' currently under development for use within the Z39.50 Search & Retrieval protocol has been specifically designed to accommodate the ranked search technologies used by the majority of large-scale commercial information providers and Information Retrieval (IR) software vendors. The set of features specified within the standardized syntax of the Ranked Query is estimated to encompass the functionality supported by 80-90% of mainstream commercial ranked search technologies (including those in wide use across the WWW).

How the Z39.50 Ranked Query Facilitates Distributed Searching

Using the standardized Ranked Query, a consistent query & search term syntax can be used to send searches to multiple search systems, based on the following key elements of the Query:

Client/Server Interaction using the Z39.50 Ranked Query

When a client submits a Z39.50 Ranked Query, it has the option to instruct the server to reformulate the query to better describe the user's information need. The server modifies the query based on its knowledge of the collections it is searching, the vocabularies native to those collections, general linguistics, & the most effective expansions of the query terms as related to the desired precision & comprehensiveness specified by the client. If the client has so requested, processing can stop here, & the reformulated query is shipped back to the client for further modification by client and user.
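The interaction described above can be illustrated with a small sketch. It is not the Type 102 wire format: the class names, the `reformulate` and `return_query_only` flags, the expansion table, & the term-count scoring are all assumptions made for illustration. It shows only the control flow: the client asks for reformulation, optionally stops there to inspect the reformulated query, & then submits it for evaluation.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RankedQuery:
    """Hypothetical stand-in for a ranked query; NOT the Type 102 syntax."""
    terms: Tuple[str, ...]
    reformulate: bool = False        # assumed flag: ask the server to reformulate
    return_query_only: bool = False  # assumed flag: stop after reformulation


@dataclass
class Response:
    query: RankedQuery               # the query as the server actually used it
    results: List[str] = field(default_factory=list)
    searched: bool = False


class ToyServer:
    """Toy server with an in-memory collection and a term expansion table."""

    def __init__(self, documents: Dict[str, str], expansions: Dict[str, List[str]]):
        self.documents = documents
        self.expansions = expansions

    def reformulate(self, query: RankedQuery) -> RankedQuery:
        # Add expansion terms drawn from the server's own vocabulary knowledge.
        terms = list(query.terms)
        for term in query.terms:
            for extra in self.expansions.get(term, ()):
                if extra not in terms:
                    terms.append(extra)
        return replace(query, terms=tuple(terms))

    def handle(self, query: RankedQuery) -> Response:
        if query.reformulate:
            query = self.reformulate(query)
        if query.return_query_only:
            # Processing stops here; the client may modify the query further.
            return Response(query=query)
        scored = []
        for doc_id, text in self.documents.items():
            words = text.lower().split()
            score = sum(words.count(term) for term in query.terms)
            if score > 0:
                scored.append((score, doc_id))
        # Highest score first; ties broken by document id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return Response(query=query, results=[d for _, d in scored], searched=True)
```

A client would typically call `handle` twice: once with both flags set to obtain the reformulated query, & once more, after any edits by the user, with both flags cleared to run the search.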

The session-oriented `state-ful' nature of the Z39.50 protocol facilitates the following types of client-server interactions using the Ranked Query:

Z39.50 Ranked Query increases Client Control over Query Processing

The Z39.50 Ranked Query gives the client more control over processing & evaluation of the query:

Z39.50 Ranked Query allows the Server to Return Postings Information

Because of the less predictable & less deterministic nature of relevance-based searching (as discussed above), a search server may perform query modifications or complex processing that the user query did not specify. Although a client has more control over Z39.50 Ranked Query processing, the reasons behind server query reformulation are still quite difficult for the user/client to understand.

Thus, an important feature of the Ranked Query is the ability for the server to return search result demographic meta-data (often referred to in the IR industry as `postings' data). The format & content of this data are also standardized within the definition of the Ranked Query, making it easier to interpret `postings' data from many different types of search systems.
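To illustrate why a common format helps, the sketch below combines per-term `postings' records from several servers. The record fields (`documents_matched`, `total_occurrences`) & the function name are hypothetical; the actual content of the standardized postings data is defined by the Ranked Query specification, not here. The point is only that records sharing one shape can be merged without per-system translation code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TermPosting:
    """Hypothetical per-term postings record; the field names are assumptions."""
    term: str
    documents_matched: int   # documents in the collection containing the term
    total_occurrences: int   # occurrences of the term across the collection


def merge_postings(per_server: Dict[str, Iterable[TermPosting]]) -> Dict[str, TermPosting]:
    """Sum postings for each term across all servers, keyed by term."""
    doc_totals: Dict[str, int] = defaultdict(int)
    occ_totals: Dict[str, int] = defaultdict(int)
    for postings in per_server.values():
        for posting in postings:
            doc_totals[posting.term] += posting.documents_matched
            occ_totals[posting.term] += posting.total_occurrences
    return {
        term: TermPosting(term, doc_totals[term], occ_totals[term])
        for term in doc_totals
    }
```

Summing counts is only meaningful when the collections do not overlap; a client searching overlapping collections would need a different combination rule.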

 

This page is part of the DISW 96 workshop.
Last modified: Thu Jun 20 18:20:11 EST 1996.