Three issues have persisted in these and other projects that try to build services which "cross-search" multiple datasets, query protocols, and search services:
We will touch upon each of the above issues in the following sections.
Some important requirements related to this are:
A query language that has to fulfill the role of a universal front-end to a variety of other query languages and protocols typically faces the following issues:
The distributed nature of the Web presents a challenge for building usable and intuitive resource discovery services. Deployment experience with large-scale search services (e.g. [9]) suggests that new mechanisms are required to manage distributed searches more effectively. If we want to construct systems in which a user enters a single search expression and has that request satisfied by a number of searchable databases, it is essential to have "forward knowledge" about the contents of those databases. Simply broadcasting all queries to multiple databases will not scale.
There are several types of "forward knowledge" which may contribute to a more scalable architecture for distributed searching. This data can be used in a number of scenarios; a common approach is likely to be the "referral" mechanism as used in the WHOIS++ and LDAP protocols. A "referral" is an additional component of a search result which informs the search client about alternative databases that could yield relevant results. An alternative scenario involves a central index server or broker that gathers forward knowledge for multiple databases, redirecting search clients to the most appropriate target(s).
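The two scenarios above can be illustrated with a minimal sketch. It assumes, purely for illustration, that forward knowledge is a plain set of the terms occurring in each database; the class and function names (`Database`, `Broker`, `SearchResult`, `search_with_referrals`) and the sample records are hypothetical and do not reproduce the WHOIS++ or LDAP wire formats.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Database:
    """A searchable database holding free-text records (hypothetical)."""
    name: str
    records: list[str]

    def summary(self) -> set[str]:
        # Forward knowledge (assumed form): every term occurring in any record.
        return {word.lower() for record in self.records for word in record.split()}

    def search(self, term: str) -> list[str]:
        # Return the records that contain the term as a whole word.
        term = term.lower()
        return [r for r in self.records if term in r.lower().split()]


@dataclass
class SearchResult:
    """Local hits plus referrals to other databases worth querying."""
    hits: list[str]
    referrals: list[str] = field(default_factory=list)


class Broker:
    """Central index server that gathers forward knowledge for many databases."""

    def __init__(self) -> None:
        self.index: dict[str, set[str]] = {}

    def register(self, db: Database) -> None:
        # Store only the summary, not the records themselves.
        self.index[db.name] = db.summary()

    def route(self, term: str) -> list[str]:
        # Name only the databases whose summary contains the term,
        # instead of broadcasting the query to all of them.
        term = term.lower()
        return sorted(name for name, terms in self.index.items() if term in terms)


def search_with_referrals(db: Database, broker: Broker, term: str) -> SearchResult:
    # Answer locally, then attach referrals to other promising databases.
    referrals = [name for name in broker.route(term) if name != db.name]
    return SearchResult(hits=db.search(term), referrals=referrals)


# Illustrative data only.
medicine = Database("medicine", ["Aspirin dosage guidelines", "Cardiology review"])
engineering = Database("engineering", ["Bridge design guidelines", "Concrete fatigue"])
law = Database("law", ["Copyright case law"])

broker = Broker()
for database in (medicine, engineering, law):
    broker.register(database)

result = search_with_referrals(medicine, broker, "guidelines")
```

In this sketch a query for "guidelines" sent to the medicine database returns its local hit together with a referral to the engineering database, while the law database is never contacted. A real deployment would also have to decide how summaries are exchanged and kept up to date, which this sketch leaves out.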
Forward knowledge requirements for effective query routing include the following issues. These are largely independent of the choice of query language, but nevertheless form a crucial component of any distributed search system:
[1] Development of a European Service for Information on Research and Education (DESIRE), http://www.desire.org/.
[2] TERENA CHIC-Pilot project, http://www.terena.nl/projects/chic-pilot/.
[3] P. Valkenburg, D. Beckett, M. Hamilton, S. Wilkinson, Standards in the CHIC-Pilot Distributed Indexing Architecture, in: Computer Networks and ISDN Systems special issue "Proceedings of the TERENA Networking Conference 1998", http://www.terena.nl/libr/tech/chic-fr.html.
[4] Resource Organisation and Discovery in Subject-Based Services (ROADS), http://www.ilrt.bris.ac.uk/roads/ (project), http://www.roads.lut.ac.uk/ (software).
[5] D. Brickley, R.V. Guha, A. Layman, Resource Description Framework (RDF) Schema Specification, W3C Working Draft 30 October 1998, http://www.w3.org/TR/WD-rdf-schema/.
[6] TERENA CHIC-Pilot Deliverable D3.1: Search Profile Based on WHOIS++, http://www.terena.nl/projects/chic-pilot/deliverables/D3.1_draft.html.
[7] Version 2 of Application Profile for GILS, http://www.gils.net/prof_v2.html.
[8] L. Gravano, K. Chang, H. Garcia-Molina, C. Lagoze, A. Paepcke, Stanford Protocol Proposal for Internet Search and Retrieval, January 1997, http://www-db.stanford.edu/~gravano/starts.html.
[9] C. Rusbridge, Towards the Hybrid Library, D-Lib Magazine, July/August 1998, http://www.dlib.org/dlib/july98/rusbridge/07rusbridge.html.
[10] J. Knight, D. Brickley, M. Hamilton, J. Kirriemuir, S. Welsh, Cross-Searching Subject Gateways: The Query Routing and Forward Knowledge Approach, D-Lib Magazine, January 1998, http://www.dlib.org/dlib/january98/01kirriemuir.html.
[11] J. Allen, M. Mealling, The Architecture of the Common Indexing Protocol (CIP), work in progress of the IETF Find working group, http://www.ietf.org/ids.by.wg/find.html.
[12] J. Miller (ed.), P. Resnick, D. Singer, Rating Services and Rating Systems (and Their Machine Readable Descriptions) Version 1.1, PICS Working Group, W3C, http://www.w3.org/TR/REC-PICS-services.
[13] E. Worsfold, Subject gateways - fulfilling the DESIRE for knowledge, Computer Networks and ISDN Systems, Vol. 30, Numbers 12-18, 30 September 1998, http://www.desire.org/html/research/publications/tnc98gateways/ (preprint URL).