Session V Plenary Notes
-----------------------

Clifford Lynch (breakout chair report)

- Not much distributed search is currently going on; mostly competing
  centralized searching
- we mainly consider geographically distributed searching initially
- we need work on vocabularies and taxonomies
- suggested that we re-visit the distributed db work of the 80's
- someone should come up with crisp language on models and
  architectures (within the next year)
- solutions should not penalize sophisticated users

Standards:
- merged rank results - what data is needed
- duplicate detection
- search result presentation and management (characterize 10,000 hits,
  for example)

suggestion - what lessons can be learned from IBM Info Market

Ray Denenberg (breakout chair report)

- List of issues (see his slide)
- Short term and long term focus areas (see his slide)

Ken Weiss (breakout chair report)

- Who will win? Both centralized and distributed engines will co-exist.
- Distributed searching will be more specialized, serving domain
  specific collections
- Short term / long term list (see slide)

W3C Announcements
-----------------

1) June 20,21 - PICS System
2) Digital signatures - {jmiller,khare}@w3.org
3) working group on intellectual property and collaboration - W3 Consortium

BOF Reports
-----------

Spidering

1) robots meta tag - All, None, No_index, No_follow
2) keywords - comma separated list of phrases
3) description - user controlled summary for searching

Others:

a) ambiguity in robots.txt - www.koller.com/robots.html
b) site canonicalization
c) robotn.txt - last change information
d) please visit facility
e) flow control - specify a retrieval interval

A discussion ensued regarding a grouping mechanism for metadata info.
Stuart Weibel championed these ideas in the following day's sessions.

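
The three meta tags proposed in the spidering report (robots, keywords, description) could be read by a robot roughly as follows. This is only a sketch under stated assumptions, not anything agreed at the workshop: the names MetaTagParser and read_page_hints are hypothetical, and the spellings "No_index" and "No_follow" from the notes are assumed to be equivalent to "noindex" and "nofollow".

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the robots, keywords and description meta tags of a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        if name in ("robots", "keywords", "description"):
            self.meta[name] = (attr.get("content") or "").strip()


def _directive(token):
    # Assumption: "No_index", "no-index" and "NOINDEX" all mean the same thing.
    return token.strip().lower().replace("_", "").replace("-", "")


def read_page_hints(html):
    """Return what a robot may do with the page, plus its search metadata."""
    parser = MetaTagParser()
    parser.feed(html)

    robots = parser.meta.get("robots", "")
    directives = {_directive(t) for t in robots.split(",") if t.strip()}
    deny_all = "none" in directives  # "none" is shorthand for both restrictions

    keywords = parser.meta.get("keywords", "")
    return {
        # With no robots tag at all, the default is to index and follow ("all").
        "index": not (deny_all or "noindex" in directives),
        "follow": not (deny_all or "nofollow" in directives),
        # Keywords are a comma separated list of phrases; drop empty entries.
        "keywords": [p.strip() for p in keywords.split(",") if p.strip()],
        "description": parser.meta.get("description", ""),
    }
```

A robot would call read_page_hints on each fetched page and skip indexing or link extraction when the corresponding flag is false.
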
Session V - Standards directions
--------------------------------

(The following ideas were the few significant statements I recorded)

Clifford Lynch

- we need to differentiate between collection description and open
  index information.
- expressed skepticism about open index information since it assumes
  extraction consensus, which does not exist, in his opinion.

Engine identifier - could include query syntax and result format

- Ron Daniel - suggested it should be a URL

Large discussion of Z39.50

Discussion regarding a Z39.50 lite

- Clifford Lynch - it needs to be determined if we mean a simpler
  general format or a specific type of data (GILS, for example).

People need an easy migration path to Z39.50

An easy path to organize virtual communities

- Shirley Browne - examples are the HPCC and NA communities