Machine aided construction of hypertext bases

Rainer Kuhlen

This was rather badly presented, but I did get some ideas from it. Simple grammars may help in reading texts and identifying the parts that are important for linking, so relatively simple tools can be built to transform classical, linear texts into hypertexts, provided the author has expressed himself reasonably well. Since it became clear to me (RC) during the conference that dynamic construction of links is necessary, tools that help find important text sequences will be very useful, however imperfect they are. Two problems were addressed:

Conversion is defined as the process which allows segmentation of the text into coherent units in three stages:

  1. identification of coherent units
  2. reconstruction of cohesive boundness (semantically closed units are sought)
  3. transformation / integration of the units.

    Information units have the following structure:

    1. Contents:
      • label (name or title of the unit)
      • conceptual reference (index term)
      • condensation reference (abstract)
      • informative part
    2. Functions:
      • information functions (links, paths, orientation means)
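
    The structure above maps naturally onto a small record type. The following is only a sketch of how such a unit might be represented; the class and field names (InformationUnit, Link, index_terms and so on) are my own assumptions and were not given in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    """A typed connection to another unit (hypothetical representation)."""
    relation: str  # e.g. "next-passage" or "share-concept"
    target: str    # label of the destination unit


@dataclass
class InformationUnit:
    # Contents
    label: str              # name or title of the unit
    index_terms: List[str]  # conceptual reference
    abstract: str           # condensation reference
    body: str               # informative part
    # Functions: links, paths and orientation means would live here
    links: List[Link] = field(default_factory=list)


# Invented sample data, for illustration only.
unit = InformationUnit(
    label="Amiga peripherals",
    index_terms=["Amiga", "peripherals", "microcomputer"],
    abstract="Overview of devices that attach to the Amiga.",
    body="Full text of the passage goes here.",
)
unit.links.append(Link(relation="next-passage", target="Amiga software"))
```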

    It was stressed that conversion should not imitate, but add value for the new medium. As this means understanding the subject matter, and is therefore a cognitive process, it can today only be done with the help of a human author. However, artificial intelligence techniques using the concepts of frames and inheritance can help a lot. Even simple context-free grammars that specify noun groups can be used successfully to identify, for example, the label part of an information unit.
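
    To make the noun-group idea concrete, here is a minimal sketch. The grammar (an optional determiner, any number of adjectives, then one or more nouns) and the tiny hand-made lexicon are my own assumptions, chosen for illustration; the grammars actually used by the speaker were not shown.

```python
# Toy grammar (an assumption, not the speaker's):
#   NounGroup -> [DET] ADJ* NOUN+
# The lexicon is hand-made and covers only the example sentence.
LEXICON = {
    "the": "DET", "a": "DET",
    "external": "ADJ", "new": "ADJ",
    "disk": "NOUN", "drive": "NOUN", "amiga": "NOUN",
    "connects": "VERB", "to": "PREP",
}


def tag(word):
    """Look up the part of speech of a word; unknown words get 'OTHER'."""
    return LEXICON.get(word.lower(), "OTHER")


def parse_noun_group(tags, i):
    """Return the index just past a noun group starting at i, or None."""
    j = i
    if j < len(tags) and tags[j] == "DET":
        j += 1
    while j < len(tags) and tags[j] == "ADJ":
        j += 1
    first_noun = j
    while j < len(tags) and tags[j] == "NOUN":
        j += 1
    # At least one noun is required for a match.
    return j if j > first_noun else None


def noun_groups(sentence):
    """Collect all noun groups in a sentence, scanning left to right."""
    words = sentence.split()
    tags = [tag(w) for w in words]
    groups, i = [], 0
    while i < len(words):
        end = parse_noun_group(tags, i)
        if end is None:
            i += 1
        else:
            groups.append(" ".join(words[i:end]))
            i = end
    return groups


groups = noun_groups("The external disk drive connects to the Amiga")
# groups == ["The external disk drive", "the Amiga"]
```

    A converter could then propose the first noun group of a passage's opening sentence as a candidate label and leave the final choice to the human author.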

    In well-structured linear texts (and we should hope that most texts worth converting are also well-written) syntagmatic relations (next-passage, previous-passage, ...) can be detected fairly easily.
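
    Syntagmatic links follow directly from the order of the passages. A minimal sketch, assuming the units have already been identified and are given as an ordered list of labels (the function name and the triple format are hypothetical):

```python
def syntagmatic_links(labels):
    """Build next-passage / previous-passage links from passage order.

    Returns (source, relation, target) triples.
    """
    links = []
    for before, after in zip(labels, labels[1:]):
        links.append((before, "next-passage", after))
        links.append((after, "previous-passage", before))
    return links


# Invented passage labels, for illustration only.
passages = ["Introduction", "Hardware", "Software"]
links = syntagmatic_links(passages)
```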

    Paradigmatic relations provide more information and can be detected using knowledge bases (share-concept, have-same-features, have-same-info, ...).
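
    As a rough illustration of a share-concept relation, the sketch below links two units whenever their index terms overlap. Plain term overlap is my own stand-in for a real knowledge base, which could also relate different terms that denote the same concept; the function name and data are hypothetical.

```python
from itertools import combinations


def share_concept_links(index_terms):
    """Link every pair of units that have at least one index term in common.

    index_terms maps a unit label to its set of index terms.
    Returns (unit_a, "share-concept", unit_b, shared_terms) tuples.
    """
    links = []
    for a, b in combinations(sorted(index_terms), 2):
        shared = index_terms[a] & index_terms[b]
        if shared:
            links.append((a, "share-concept", b, sorted(shared)))
    return links


# Invented units and index terms, for illustration only.
units = {
    "Amiga peripherals": {"amiga", "peripherals"},
    "Amiga software": {"amiga", "software"},
    "Printer drivers": {"printers", "software"},
}
concept_links = share_concept_links(units)
```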

    Using these techniques, one can let the user query, and produce in real time, for example, an abstract that is a function of the query. An article on "Amiga peripherals" would thus produce a short abstract for a query looking for "microcomputer", but a longer one for a query looking for "Amiga" or for "peripherals".
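
    One way to read the Amiga example is that the abstract grows when the query term is specific to the article and shrinks when it only matches a broader concept. The sketch below implements that reading. The split into specific and broader terms, the sentence counts and the sample sentences are all my own assumptions, not something presented in the talk.

```python
def abstract_for(query, unit, short=1, long=3):
    """Return an abstract whose length depends on how the query matches.

    A query matching a specific term yields up to `long` sentences,
    a query matching only a broader term yields `short` sentences,
    and anything else yields an empty abstract.
    """
    q = query.lower()
    if q in unit["specific_terms"]:
        n = long
    elif q in unit["broader_terms"]:
        n = short
    else:
        n = 0
    return " ".join(unit["sentences"][:n])


# Invented article data, for illustration only.
article = {
    "specific_terms": {"amiga", "peripherals"},
    "broader_terms": {"microcomputer"},
    "sentences": [
        "The article surveys peripherals for the Amiga.",
        "It groups the devices by the port they attach to.",
        "Each device is described with its typical use.",
    ],
}

short_abstract = abstract_for("microcomputer", article)
long_abstract = abstract_for("Amiga", article)
```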

    This seminar showed once more that dynamic treatment of text is of great importance to the user who actually wants to make the machine locate information, rather than browse through it himself until he stumbles upon what he is looking for.