The Library Perspective

QL 98 Position Paper

Ray Denenberg
Library of Congress
18 November 1998

In choosing a query language for the web, it is wise to consider the perspective of the library community. There are several reasons, but the paramount pragmatic one is to ensure the widest possible access to the vast information resources managed or supplied by libraries and related institutions. These resources include more than bibliographic records. Geospatial and medical information is accessible via the Z39.50 Information Retrieval protocol (it has been estimated that more than half of all Internet searches pertain to either geographic location or health), as is information from various other communities and disciplines, including chemistry, biology, genetics, government, and museums. The aggregate value of information accessible via Z39.50 is difficult to assess accurately, but one estimate places it between 10 and 100 billion US dollars.

So it makes economic sense for a web query language to provide query formulation in a manner syntactically compatible with Z39.50. This does not preclude XML-based encoding, a stateless protocol, or HTTP encapsulation (these three Z39.50-interoperability issues seem technically solvable).

Z39.50 can potentially interoperate with an (otherwise compatible) XML-based protocol, provided the notation used to describe the data structures is compatible with that of Z39.50. Thus the query language should be described using ASN.1. This is a good idea, and not only for interoperability with Z39.50. ASN.1 provides richness of expression, and it also provides the concision and rigor needed to bind expressions to registered semantics. Moreover, ASN.1, as an abstract syntax notation for describing structures, is not bound to a specific encoding. Thus, ASN.1-described data structures may be encoded in XML. This decoupling of abstract syntax from encoding should be a required feature of any syntax notation.
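The decoupling of abstract syntax from encoding can be illustrated with a minimal sketch. It is not Z39.50 itself and does not use ASN.1 tooling. It assumes a simplified query structure (a boolean operator over attribute-qualified terms), and every class name, element name, attribute number, and output format below is hypothetical, chosen only for illustration. The same abstract query is defined once and then rendered in two unrelated encodings, one of them XML.

```python
from dataclasses import dataclass
from typing import Union
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Term:
    """Operand: a search term qualified by an (attribute type, value) pair."""
    attr_type: int
    attr_value: int
    text: str


@dataclass(frozen=True)
class BoolOp:
    """Operator node combining two sub-queries."""
    op: str  # e.g. "and", "or"
    left: "Query"
    right: "Query"


# The abstract structure: defined once, independent of any encoding.
Query = Union[Term, BoolOp]


def to_xml(q: Query) -> ET.Element:
    """Encoding 1: render the abstract query as an XML element tree."""
    if isinstance(q, Term):
        el = ET.Element(
            "term",
            {"attrType": str(q.attr_type), "attrValue": str(q.attr_value)},
        )
        el.text = q.text
        return el
    el = ET.Element("operator", {"name": q.op})
    el.append(to_xml(q.left))
    el.append(to_xml(q.right))
    return el


def to_prefix(q: Query) -> str:
    """Encoding 2: render the same query as a compact prefix string."""
    if isinstance(q, Term):
        return f'[{q.attr_type}={q.attr_value}] "{q.text}"'
    return f"({q.op} {to_prefix(q.left)} {to_prefix(q.right)})"


# Placeholder attribute numbers; they carry no registered meaning here.
query = BoolOp("and", Term(1, 4, "retrieval"), Term(1, 1003, "denenberg"))

xml_text = ET.tostring(to_xml(query), encoding="unicode")
prefix_text = to_prefix(query)

print(xml_text)
print(prefix_text)
```

Only the two encoder functions know anything about the wire format. Adding a third encoding would require no change to the query definition, which is the property argued for above.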

Librarians have pondered issues involving searching, and information retrieval in general, for decades, and intensively over the past 15 years during the development of Z39.50. In the process, the library community has engaged a number of other communities and disciplines in these discussions, including museum professionals, chemists, and biologists, to name a few. Thus the library view attempts to reflect the view of the information retrieval and research community in general.

Although the workshop scope may be limited to query language, a comprehensive information retrieval protocol will evolve in the long term as more and more capabilities are demanded. In addition to query capabilities, other potential capabilities of an information retrieval protocol include:

All of the features in the list above, and many more, are supported in Z39.50. Based upon the library community experience in considering and developing these features, we predict that most will eventually be demanded by web users. It makes sense that as much as possible of this existing development be leveraged into a potential web information retrieval protocol, and it follows that maintaining compatibility with Z39.50 from the start is a sound long-term strategy.