Overview of FIND

FIND is a keyword-based information retrieval system which runs on the CERN IBM VM/CMS system. It gives access to a large number of documents, including long documents, newsletter articles, and help pages. There is also a database, compiled with considerable effort, of all people at or around CERN, including their user names on various machines, phone numbers, etc.

The stages in the operation of FIND are:

- The author of a document adds some information about authorship, keywords, and category to the top of the document (or to a separate file).
- There is a list of minidisks (directories) on which such documents are stored. Overnight, a batch program runs and builds an index of all the words used as keywords or in title or author information.
- An enquiring user gives a list of keywords. The index file then very rapidly produces a list of matching documents using an optimized search algorithm.
- Any unsuccessful searches are noted, and someone has the responsibility for extending the keyword associations so that searches which previously failed will in future give suitable documents.

An important aspect of the system is the philosophy that the `customer is always right': the keywords are manually altered to reflect those used by the reader, rather than being those suggested by the writer.

FIND is impressive in its speed, even when coping with of the order of a million keywords. Its limitations are:

- The availability, currently limited to VM/CMS, which would be cured by a hypertext gateway.
- The quality of the documents registered. Any information system gets a bad name if the information in it is out of date or poor. This is not, however, a function of the system. The user feedback attempts to reduce this. An annotation scheme could perhaps increase the user feedback.
- The quality of the keyword associations.
Authors are, in general, not good at thinking of keywords, and not inclined to do so. Keyword synonyms are not handled by FIND. I would suggest that "see also" links between related keywords be added. These would be representable in the hypertext model, and would allow a search to be broadened when the initial keywords fail. Such cross-references are normally essential in a yellow pages telephone book, for example, and especially for readers of a different mother tongue. [One could imagine multilingual keywords as well.]

FIND Concepts

Document

A FIND document obviously maps onto a hypertext node of type document. In FIND, a document has:

- a category, such as news, writeup, etc.,
- a short name,
- a one-line title,
- a list of keywords,
- an author,
- a person responsible (not always the original author),
- creation and last modification dates, and,
- for the writeup category only, information about source and printing.

The name, title, dates and content information all map onto the equivalent information for a node. The category, keywords, author, and responsible person map onto links to other types of nodes.

Category

The category of the document describes to the user what sort of a thing it is. In the case of a document in the writeup category, it tells the system that it is a machine-processable document in a markup language. This latter aspect probably maps onto the node format of a hypertext node. Otherwise, the category could be regarded as either a node type (or subtype?) or a node corresponding to a group of documents (a composite node). This latter would allow one to access the category as an entity in itself, and to perform searches from it. This seems the best representation.

Author

The author of a document maps onto a node of type person, linked by a link of type "A made B". An interesting feature of the WHO database is that it allows one to verify a CERN author's uniqueness.
In fact, it might be useful to insist that any person added as an author (so long as they are at CERN) is found in WHO, and to store a link to the WHO database. This would be a good check that the correct person is identified, and would allow a hypertext link directly to other information about them.

The mapping of FIND onto hypertext must therefore map the structure into composite nodes, and "A includes B" links. Ideally, for human readers, a composite node contains about a screenful of information. Therefore, it might be necessary to tailor the composite node generation in different ways. After all, a category with 10 documents in it might make an ideal group, but one with 2000 documents would be less useful and better split into subgroups. For example, whereas for newsletter articles the newsletter might be appropriate as a grouping, experiments might have their own groupings indicated by their own categories.

Maintaining the web

There might therefore be a case for hand-crafting a basic web, and then filling it out with FIND information. In fact, this might be a good way of managing keywords and categories: FIND could scan a simple hypertext database for the links between keywords, etc. It would be good to make the update of this web as easy as possible. This updating includes the work which is done by an administrator in response to user feedback, and should be slick and simple to do.

Responsible

As for the author, but perhaps a new link type is needed: "A is responsible for B".

Keyword

A keyword also maps onto a hypertext node. A keyword node has links to all the documents which have that keyword in their keyword list. The text of the keyword node is just a list of those links. It is possible that we could incorporate some comment on particular keywords. I also suggest links to synonymous and related keywords. In some cases, a keyword will also be the name of a document node.
In this case, the document and the keyword are not different nodes. The document provides the textual comment on the keyword. The keyword links to other documents are links between the two documents, of type "See also", for example.

Fig 1. A graphical representation of part of a hypertext web generated as a view of the FIND data.

How will the user see FIND through hypertext?

Fig 2. An (inexpertly drawn) screen for an imaginary browser. This is the initial browser screen, when started with no other information. This might be the result of a user simply typing FIND. This is a virtual hypertext node which is generated to give access to a list of categories. The introductory text has been written specially: it is not part of any document in the FIND scheme. This node does not represent a document: it is a composite node representing a set of category nodes.

When the mapping is done, the user will be able to access FIND documents through a regular hypertext browser. Suppose an initial command line invocation of the browser gives one, by default, a node with a set of all FIND general categories (Fig 2). To search by keyword, the browser would have to provide a search-by-name feature, for example by a panel (Fig 3).

Fig 4. The imaginary browser at a node corresponding to a FIND document. The top one of the three sections of the screen is concerned with the running of the browser. The bottom section is a rendering of part of the text of this node. The center section shows information about the node: its title, and the links from it. On the left of this section the link types are listed, and against each the title of the destination of each link. Each mouse-sensitive part (anchor) has a box around it in this example.
Fig 5. An author is also represented as a node, and one can browse through an author's list of publications just as one can browse through a publication's list of authors.

Hyperizing `FIND'

(DRAFT - Tim Berners-Lee 31-2-12. Info from Bernd Pollerman)

Keywords: Hypertext, Information management, Keywords, FIND, mapping, documentation retrieval.

See also: Information Management: a proposal, and Hypertext Design Issues.

Here we discuss a possible mapping between the existing FIND information, and a generic hypertext model, with the aim of making the FIND information available network-wide to a generic hypertext browser.

Hierarchical Browsing

There is sometimes [1] a need for users to be able to see the structure of the documents when one exists. They often ask for a hierarchical structure to be available. In many cases, there is a natural sequence to documents such as newsletter articles, or newsletters. These can be represented in hypertext using composite nodes. One would imagine the set of newsletters, an individual newsletter, and the articles in that newsletter, being successive levels in a composite node structure.

[1] This is a much-disputed point, being a subject of personal taste, and application area, among other things. The best we can do is to provide for the possibility but not enforce it.

Fig 3. A possible search panel. The buttons allow one to filter the nodes by node type. It might be useful to also allow some control over depth of search and degree of match.

The search panel on a hypertext browser will of course be independent of the database. However, the scope of the keyword search will be determined by the server's index.
From the hypertext point of view, the keyword search operation starts with a search for each keyword node by name. At this stage, any unknown keywords can be rejected. Then, the set of keyword nodes is used as the starting point set for an N-way search. This returns a set of found nodes, which are put into the user's pot of nodes marked for interest.

As an alternative, the user could browse through the hierarchical structure, or browse by author. In each case, the browser would read textual nodes with anchors in them. Figure 4 shows a node representing a document, and Figure 5 one representing an author. In this case, we assume that the WHO database is also incorporated to provide details of people. The node shown is the merging of data from the document and WHO databases. This should be done automatically by a browser when an identity link is found between the two, or in this case it could be done by the FIND gateway.

Transfer format

The SGML-style transfer format described in Design Issues could be used by an adapted FIND program to output marked-up information to a hypertext browser. An example of the conversion into hypertext might be a text file such as

    Documents found with keywords COPY TAPE VMS IBM

    The following 20 documents have these keywords.

    Minutes of the HGQI meeting of 21-July
    Minutes of the HGQI meeting of 14-July
    Minutes of the HGQI meeting of 2-July

    The CMS COPYTAPE command

and so on.
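
The keyword search and the result node described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FIND's actual algorithm or data format: the names KeywordNode, build_keyword_nodes, keyword_search and render_result_node are hypothetical, the N-way search is read here as "find the documents linked from every one of the N keyword nodes", unknown keywords are set aside rather than aborting the search, and the sample keyword lists are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordNode:
    """Hypothetical keyword node: its content is just a set of links to documents."""
    name: str
    documents: set = field(default_factory=set)


def build_keyword_nodes(doc_keywords):
    """Build keyword nodes from a mapping of document title -> list of keywords."""
    nodes = {}
    for title, keywords in doc_keywords.items():
        for kw in keywords:
            key = kw.upper()
            nodes.setdefault(key, KeywordNode(key)).documents.add(title)
    return nodes


def keyword_search(nodes, query):
    """Look up each keyword node by name, set aside unknown keywords, then
    intersect the document sets (an assumed reading of the N-way search)."""
    known = [nodes[k.upper()] for k in query if k.upper() in nodes]
    rejected = [k for k in query if k.upper() not in nodes]
    if not known:
        return [], rejected
    found = set.intersection(*(node.documents for node in known))
    return sorted(found), rejected


def render_result_node(query, found):
    """Render the found set as plain text shaped like the example listing."""
    lines = [
        "Documents found with keywords " + " ".join(k.upper() for k in query),
        f"The following {len(found)} documents have these keywords.",
    ]
    lines.extend(found)
    return "\n".join(lines)


# Invented sample data: titles echo the example above, keyword lists are made up.
SAMPLE_DOCS = {
    "Minutes of the HGQI meeting of 21-July": ["COPY", "TAPE", "VMS", "IBM"],
    "The CMS COPYTAPE command": ["COPY", "TAPE", "VMS", "IBM", "CMS"],
    "A hypothetical help page": ["HELP"],
}

if __name__ == "__main__":
    nodes = build_keyword_nodes(SAMPLE_DOCS)
    found, rejected = keyword_search(nodes, ["copy", "tape", "vms", "ibm"])
    print(render_result_node(["copy", "tape", "vms", "ibm"], found))
```

A real gateway would presumably emit the marked-up transfer format rather than bare text, and would record the rejected keywords so that an administrator can extend the keyword associations.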