Combining Browsing and Searching

(A position paper for the W3 Distributed Indexing/Searching Workshop, MIT, May 28-29, 1996)

Browsing and searching are the two main paradigms for finding information on line. The search paradigm has a long history; search facilities of different kinds are available in all computing environments. The browsing paradigm is newer and less ubiquitous, but it is gaining enormous (and unexpected) popularity through the World-Wide Web. Both paradigms have their limitations. Search is sometimes hard for users who do not know how to form a query that is limited to relevant information. Search is also often seen by users as an intimidating "black box" whose content is hidden and whose actions are mysterious. Browsing can make the content come alive, and it is therefore more satisfying to users, who get positive reinforcement as they proceed. However, browsing is slow and time-consuming, and users tend to get disoriented and lose their train of thought and their original goals.

We argue that by combining browsing and searching, users will be given a much more powerful tool to find their way. We envision a system where both paradigms will be offered all the time. You will be able to browse freely -- the usual hypertext model -- and you will also be able to search from any point. The search will cover only material related in some way to the current document. (Of course, global search may also be offered.)

As a first attempt to test this notion, we implemented a system that automatically modifies existing WWW pages to add search facilities such that the search domain from a given page depends on that page. Our system, called WebGlimpse, is based on our Glimpse and GlimpseHTTP search facilities, but it adds the concept of a neighborhood for each page. The neighborhoods are computed (based on options selected by the information provider) at indexing time (e.g., once a night). Only one index is used per site, but the index supports efficient search by neighborhoods. A typical neighborhood may be the list of all pages linked from a given page, all pages within distance 2, all pages in the same site, all pages pointing to that page, etc. WebGlimpse also copies remote pages and automatically adds them to the index. More complex definitions of neighborhoods, which may depend on semantic analysis, can also be added.
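
To make the notion of a neighborhood concrete, the following is a minimal sketch in Python of two of the definitions listed above: all pages within a given link distance of a page, and all pages pointing to a page. It is an illustration only and not WebGlimpse's actual code. The link-graph representation (a dictionary mapping each page to the pages it links to) and the function names forward_neighborhood and backward_neighborhood are assumptions made for this example.

```python
from collections import deque


def forward_neighborhood(links, page, max_distance):
    """Return the pages reachable from `page` by following at most
    `max_distance` links (the page itself is excluded).

    `links` is a hypothetical link graph: {page: [pages it links to]}.
    """
    distance = {page: 0}
    queue = deque([page])
    while queue:
        current = queue.popleft()
        if distance[current] == max_distance:
            continue  # do not expand beyond the distance limit
        for target in links.get(current, ()):
            if target not in distance:
                distance[target] = distance[current] + 1
                queue.append(target)
    return set(distance) - {page}


def backward_neighborhood(links, page):
    """Return the pages that contain a link pointing to `page`."""
    return {
        source
        for source, targets in links.items()
        if page in targets and source != page
    }
```

With max_distance set to 1 this gives the pages linked directly from a page, and with 2 it gives all pages within distance 2. In a system of this kind, such sets would be computed once at indexing time and stored, so that no link traversal is needed when a query arrives.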

In summary, WebGlimpse allows any web site to offer a combination of browsing and searching by automatically analyzing the site, computing neighborhoods, and attaching search interfaces to existing pages. The search is efficient both in terms of time (neighborhoods are explored only at indexing time) and space (only one small index per site).
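
The idea of one index per site with search restricted to a neighborhood can be sketched as follows. This is a toy word-level inverted index in Python, not Glimpse's actual index format or query engine; the names build_index and search, and the precomputed neighborhoods mapping (page to set of pages), are assumptions for illustration.

```python
def build_index(pages):
    """Build one site-wide inverted index: word -> set of page ids.

    `pages` is a hypothetical mapping {page_id: page_text}.
    """
    index = {}
    for page_id, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(page_id)
    return index


def search(index, neighborhoods, current_page, word):
    """Look up `word` in the single site index, then keep only hits
    that fall inside the precomputed neighborhood of `current_page`."""
    hits = index.get(word.lower(), set())
    allowed = neighborhoods.get(current_page, set())
    return sorted(hits & allowed)
```

Because the neighborhoods are stored sets, restricting a query costs only a set intersection at search time, and the same index serves every page on the site.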

WebGlimpse Team: Udi Manber, Burra Gopal, and Michael Smith,
Dept. of Computer Science, University of Arizona.

This page is part of the DISW 96 workshop.
Last modified: Thu Jun 20 18:20:11 EST 1996.