Knowledge-Guided Resource Location

John C. Mallery
Intelligent Information Infrastructure Project
Artificial Intelligence Laboratory
Massachusetts Institute of Technology

December 10, 1995


The spectacular explosion of the World Wide Web from an experiment at CERN into a global information infrastructure over the past two years has created a tremendous need for intelligent systems to connect information seekers with resources offered by providers. For the Artificial Intelligence community, this historic development affords an opportunity to deliver intelligent systems that fulfill real world needs. Smart computer agents should help people navigate global information Webs, as they harvest knowledge to make themselves more effective assistants.

This position statement suggests steps required to introduce knowledge representation into the infrastructure. The objective is to annotate opaque Web resources with descriptions that computers can understand transparently. Intelligent assistants that use these descriptions will help people locate and manage information. Over time, I expect the World Wide Web to shade gradually into the World Wide Knowledge Web as increasing quantities of assertional knowledge become associated with Web resources. Here, I focus on knowledge-guided resource location as an appropriate starting point, and defer many other exciting possibilities for later.

Problem: Finding Relevant Information

To use the Web effectively, people need to be able to locate information resources using retrieval methods that meet the following criteria:

Current retrieval methods on the Web do not meet all these criteria. They offer no guarantees of completeness or correctness. They are usually well out of date because walking the large contemporary Web takes so long. They do not consider felicity, but typing in a few words is quite transparent for users, even if the search results returned are unintuitive.

Opportunity: Intelligent Indexation & Retrieval

Ultimately, centralized indices do not scale well. Consequently, a superior indexation architecture must be distributed around the Web and built into the infrastructure. Knowledge representation techniques overcome the shortcomings of keyword search because they have sufficient expressive power to represent natural language. Significantly, these techniques preserve much of the grammatical and referential structure of text. As a result, they index knowledge with fine enough granularity to answer questions about the information, and even to make simple inferences about implicit knowledge. Most importantly, complete and correct retrieval systems can be built with certain knowledge representation systems based on ternary relations.
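To make the idea of indexing with ternary relations concrete, here is a minimal sketch, assuming an in-memory store of (subject, relation, object) assertions. The class name, method names, and example URNs are hypothetical illustrations and are not taken from RELATUS, START, or any published specification. Every assertion is indexed under each of its three terms, so a pattern with wildcards returns all stored assertions that match it and no others.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store of (subject, relation, object) assertions (hypothetical)."""

    def __init__(self):
        self.triples = set()
        # One index per position, so each assertion is reachable from any of its terms.
        self.index = [defaultdict(set), defaultdict(set), defaultdict(set)]

    def assert_fact(self, subject, relation, obj):
        triple = (subject, relation, obj)
        self.triples.add(triple)
        for position, term in enumerate(triple):
            self.index[position][term].add(triple)

    def match(self, subject=None, relation=None, obj=None):
        """Return every stored assertion consistent with the pattern; None is a wildcard."""
        pattern = (subject, relation, obj)
        candidates = None
        for position, term in enumerate(pattern):
            if term is None:
                continue
            found = self.index[position].get(term, set())
            candidates = found if candidates is None else candidates & found
        return set(self.triples) if candidates is None else set(candidates)


# Example assertions; the URNs and relation names are made up for illustration.
store = TripleStore()
store.assert_fact("urn:example:report-1", "has-topic", "knowledge representation")
store.assert_fact("urn:example:report-1", "written-by", "J. Doe")
store.assert_fact("urn:example:report-2", "has-topic", "information retrieval")

topic_matches = store.match(relation="has-topic", obj="knowledge representation")
```

Because retrieval works by intersecting exact index entries, the result is complete and correct only with respect to what has been asserted in this store; it makes no inferences about implicit knowledge.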

Solution: Knowledge-Based Name Service

Action: Steps Towards Knowledge Webs

  1. HTTP Method: Create a specification for an HTTP extension method that allows Web servers to operate as assertion servers, accepting assertions and responding to queries.
  2. Light-Weight Knowledge Base: Develop a reference knowledge base to serve as the back end for the HTTP extension method.
  3. Knowledge-based Name Scheme: Create a specification for a Universal Resource Name (URN) scheme that allows users to resolve resource names based on constraint descriptions. The constraint descriptions are resolved against assertions about the names made by providers and others. (See also WWW95 Developers Day Panel)
  4. Indexation: Develop an indexation mechanism that allows URN queries to be resolved quickly using shared indices in the infrastructure.
  5. Caching: Develop a caching mechanism that migrates assertional knowledge towards users and thus minimizes latency and bandwidth consumption.
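Steps 3 through 5 above can be illustrated with a small sketch, assuming that assertions are (name, relation, value) triples and that a constraint description is a set of (relation, value) pairs. All names here (ASSERTIONS, build_index, resolve, CachingResolver) and the example URNs are hypothetical; the sketch does not reflect any actual URN scheme or HTTP extension method.

```python
# Hypothetical sketch: none of these names come from a published specification.

ASSERTIONS = {
    ("urn:example:paper-a", "topic", "name service"),
    ("urn:example:paper-a", "year", "1995"),
    ("urn:example:paper-b", "topic", "name service"),
    ("urn:example:paper-b", "year", "1994"),
}


def build_index(assertions):
    """Map each (relation, value) pair to the set of names it describes."""
    index = {}
    for name, relation, value in assertions:
        index.setdefault((relation, value), set()).add(name)
    return index


def resolve(constraints, index):
    """Return the names that satisfy every (relation, value) constraint."""
    if not constraints:
        return frozenset()
    matches = [index.get(constraint, set()) for constraint in constraints]
    return frozenset(set.intersection(*matches))


class CachingResolver:
    """Answer repeated queries from a local cache instead of consulting the index again."""

    def __init__(self, index):
        self.index = index
        self.cache = {}
        self.remote_lookups = 0  # counts queries that were not served from the cache

    def resolve(self, constraints):
        key = frozenset(constraints)
        if key not in self.cache:
            self.remote_lookups += 1
            self.cache[key] = resolve(key, self.index)
        return self.cache[key]


resolver = CachingResolver(build_index(ASSERTIONS))
query = [("topic", "name service"), ("year", "1995")]
first = resolver.resolve(query)
second = resolver.resolve(query)  # served from the cache
```

The sketch omits cache invalidation: a deployed design would need some way to expire or refresh cached answers when providers add or retract assertions.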

Issues: Architectural Challenges

Conclusions

At the M.I.T. Artificial Intelligence Laboratory, the 1994 wide-area collaboration experiment for the Vice President's Open Meeting on the National Performance Review used taxonomic structure and argument connectives in a persistent knowledge representation to connect people with information and each other. Already, Boris Katz has a user interface agent that answers English questions about Web resources. Try out his START System. In contrast, the RELATUS Natural Language Environment implements a completely self-indexed knowledge base and provides a constraint-guided reference system. All of these systems use ternary relations to represent the arbitrarily expressive knowledge found in natural languages. I have argued elsewhere that the design principles behind complete self-indexation and efficient constraint-guided reference can be extended into the infrastructure to support wide-area knowledge representations. A World Wide Web Consortium working group is developing proposals along these lines. Contact me for further information.

References

Roger Hurwitz & John C. Mallery, ``The Open Meeting: A Web-Based System for Conferencing and Collaboration,'' Proceedings of The Fourth International Conference on The World-Wide Web, Boston: MIT, December 12, 1995.

John C. Mallery, ``Semantic Content Analysis: A New Methodology for The RELATUS Natural Language Environment,'' in Artificial Intelligence and International Politics, V. Hudson, ed., Boulder: Westview Press, 1991.

John C. Mallery, ``Wide-Area Knowledge Representation: A Foundation for the Noosphere,'' invited presentation in the Distributed Objects and Procedures session at the ACM SIGCOMM'95 Workshop on Middleware, Cambridge, August 28-29, 1995.

John C. Mallery, Roger Hurwitz & Gavan Duffy, ``Hermeneutics,'' The Encyclopedia of Artificial Intelligence, New York: John Wiley & Sons, 1987.