KR and the Web

Adam Farquhar, Knowledge Systems Lab, (afarquhar@ksl.stanford.edu).

Knowledge Representation Needs the Web

Acquiring and representing knowledge is the key to building large and powerful intelligent systems. Unfortunately, knowledge base construction is difficult and time-consuming, and, in most cases, developing an intelligent system requires constructing an entirely new knowledge base. As a result, most knowledge-based systems remain small to medium in size and require substantial knowledge engineering expertise to develop. A promising approach to overcoming these difficulties is to develop techniques for encoding knowledge in a reusable form, so that large portions of a knowledge base for a given application can be assembled from knowledge repositories and other systems.

Although notable progress has been made in the technology for representing knowledge in reusable forms, substantial repositories of such knowledge do not yet exist. One of the key problems is that usefully representing knowledge for multiple uses is an inherently distributed, collaborative task.

Example: doctors, hospitals, insurance companies, and government agencies want to use a common vocabulary for referring to therapeutic drugs.

No one person has the expertise. No one group has the authority. Constructing such a vocabulary requires collaboration between multiple authors, reviewers, and critics. Any tool that supports a single user at a single workstation cannot solve this task.

At the Stanford Knowledge Systems Lab, the Web has become a key component in our approach to overcoming this limitation.

We have developed a Web-based Ontology Server, which is publicly available at http://www-ksl-svc.stanford.edu:5915/. Users employ their familiar off-the-shelf Web browsers to interact with the server. The Ontology Server is a sophisticated environment for accessing, assembling, editing, and browsing ontologies. It has been publicly available since February 1995 and, as of December 1995, supports over 150 active users in groups world-wide. The multi-threaded server supports collaboration, persistent state, multi-user sessions, and multiple network APIs (see our CHI96 paper for a discussion of its design).

The Ontology Server provides access to a growing ontology library of representational modules. Users can assemble, modify, and extend these modules to form new representations. They can add their new components to the library so that other users (and programs) can access and extend them. The ontologies can be both imported and exported in multiple formats, including stand-alone HTML documents.

The Web Needs Knowledge Representation

Today the Web is rightly generating a tremendous amount of excitement. It gives us rapid access to information that we could not have found before. Tomorrow, however, we will expect even more than instant access. We will want software tools that will use that information to solve problems. We will want to turn that information into knowledge.

Example: Combine the information in the classified ads, Consumer Reports evaluations, and the "Blue Book" to generate a list of used cars that are recommended and within 15% of Blue Book value. And get the manufacturer's information on them as well.
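Once the three sources are available in structured form, the combination step itself is simple. The following is a minimal sketch, not a description of any existing KSL tool: the Ad record, the recommend function, and all of the sample data are invented for illustration, and "within 15% of Blue Book value" is assumed to mean that the asking price differs from the book value by at most 15% in either direction. The hard part, extracting such records from the sources in the first place, is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    """One classified ad, already extracted into structured form (hypothetical)."""
    model: str
    year: int
    price: float


def recommend(ads, recommended, blue_book, tolerance=0.15):
    """Return ads for recommended cars priced within `tolerance` of book value.

    `recommended` is a set of (model, year) pairs; `blue_book` maps
    (model, year) pairs to a book value. Both are stand-ins for the
    real sources named in the example.
    """
    matches = []
    for ad in ads:
        key = (ad.model, ad.year)
        book_value = blue_book.get(key)
        if book_value is None or key not in recommended:
            continue  # no book value, or not a recommended car
        if abs(ad.price - book_value) <= tolerance * book_value:
            matches.append(ad)
    return matches


# Invented sample data, for illustration only.
ads = [
    Ad("Sedan A", 1990, 5200.0),
    Ad("Sedan A", 1990, 7000.0),
    Ad("Coupe B", 1991, 4000.0),
]
recommended = {("Sedan A", 1990)}
blue_book = {("Sedan A", 1990): 5000.0, ("Coupe B", 1991): 4100.0}

result = recommend(ads, recommended, blue_book)
```

With this sample data, only the first ad survives: the second is priced too far above book value, and the third is not a recommended car.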

The wealth of information that is becoming available will support this if we can learn to extract knowledge from it and develop techniques for authors to characterize the meaning of their data.

At the Knowledge Systems Lab, we are developing network-based Information Brokers that will supply information-seeking clients with access to the information in multiple sources, and information providers with techniques to effectively describe the meaning of their data.

The Web protocols themselves would benefit from a richer representation of intended meaning. Currently, a hyperlink between two "documents" states that there is some unspecified relation between them. There is, however, an implicit ontology of these links. If authors could make the meaning of the links explicit, then browsers and other tools could provide more suitable interactions to users. HTML has some elements that suggest semantics (e.g., address, code, listing, variable), but they are only used to provide syntactic markup.

Example: The browser renders PRO links from POSITION paragraphs in green, and CON links in red.
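A rough sketch of how a tool might act on such typed links follows. It assumes, purely for illustration, that authors record the link type in the anchor's rel attribute and that the browser owns the mapping from type to colour; the names LINK_COLOURS and TypedLinkCollector are hypothetical, and for brevity the sketch ignores the type of the enclosing paragraph.

```python
from html.parser import HTMLParser

# Hypothetical mapping from link type to display colour. The browser (or the
# user) chooses it once, instead of each author choosing colours per page.
LINK_COLOURS = {"pro": "green", "con": "red"}
DEFAULT_COLOUR = "blue"


class TypedLinkCollector(HTMLParser):
    """Collect (href, colour) pairs for every anchor in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # Assumption: the link type is carried in the rel attribute.
        link_type = (attributes.get("rel") or "").lower()
        colour = LINK_COLOURS.get(link_type, DEFAULT_COLOUR)
        self.links.append((attributes.get("href"), colour))


# Invented sample page, for illustration only.
page = """
<p class="position">We should adopt typed links.
<a rel="PRO" href="tools.html">Tools can reason about them.</a>
<a rel="CON" href="effort.html">Authors must do more work.</a>
<a href="related.html">Related reading.</a></p>
"""

collector = TypedLinkCollector()
collector.feed(page)
links = collector.links
```

Because the author states only the type of each link, a different browser could substitute icons, grouping, or speech cues for the colours without any change to the page.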

A browser can render this sort of semantic markup uniformly for a user. The current alternative is for each author to specify an idiosyncratic color for text or links. Given the variety of browsers and computing environments, any such decision is unlikely to work well for large numbers of users. The current movement towards including procedural components (Java, etc.) in documents will have an unfortunate side-effect: the meaning of a document will be much more opaque.

Adam Farquhar