Semantic Web Development

Intent of Work

Feb, 2002

Principal Investigator:
Tim Berners-Lee, MIT W3C <timbl@w3.org>
Co-Principal Investigators:
David R. Karger <Karger@mit.edu>, Lynn Andrea Stein <las@olin.edu>, Ralph R. Swick <swick@w3.org>, Daniel J. Weitzner <djweitzner@w3.org>

Introduction

This document outlines the work to be done by MIT/LCS under the MIT/AFRL cooperative agreement number F30602-00-2-0593 during the twelve months beginning January 2002. It responds to a request from Murray Burke, DAML Program Manager, for informal information on both technical work and collaborative work with other groups. In places these are difficult to separate, as the role of the MIT/LCS team consists in large part of liaison between the basic research and the deployment path foreseen via the World Wide Web Consortium, co-hosted by MIT/LCS, INRIA (Institut National de Recherche en Informatique et en Automatique), and Keio University of Japan. (This is, as requested, a best guess and in no way constitutes a commitment.)

Basic common tools

We are building parsers, proof generators, and proof checkers based upon the W3C Resource Description Framework (RDF) and the DAML Ontology Language (DAML+OIL). We are specifying a framework, the Semantic Web Logic Language (SWeLL), on top of RDF and DAML+OIL in which a variety of logic systems can be expressed for interchange between applications.

We have developed several basic tools for working with RDF, DAML+OIL, and SWeLL; the swap, algae, and blindfold toolkits include:

These are the start of our breadboard system for constructing intercommunication pipelines of components consisting of parsers, data stores, proof checkers, and other RDF/DAML processing modules. This intercommunication breadboard system will allow alternate implementations of each component to be substituted to meet the requirements of specific applications.
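
As a rough illustration of the breadboard idea, the following Python sketch wires a parser and a data store into a pipeline in which either stage can be swapped for another implementation. All names here (parse_lines, SetStore, run_pipeline) are hypothetical and are not part of the swap, algae, or blindfold toolkits, and the one-triple-per-line input is a simplified stand-in for a real RDF syntax.

```python
from typing import Callable, Iterable, Protocol, Tuple

Triple = Tuple[str, str, str]


class Store(Protocol):
    """Minimal interface a data-store component offers (an assumption)."""

    def add(self, triple: Triple) -> None: ...
    def match(self, s=None, p=None, o=None) -> Iterable[Triple]: ...


class SetStore:
    """One possible store: an in-memory set of triples."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard in each position.
        return [t for t in self._triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]


def parse_lines(text: str) -> Iterable[Triple]:
    """Toy parser: one whitespace-separated 'subject predicate object' per line."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            yield (parts[0], parts[1], parts[2])


def run_pipeline(text: str,
                 parser: Callable[[str], Iterable[Triple]],
                 store: Store) -> Store:
    """Feed parser output into the store; either component can be replaced."""
    for triple in parser(text):
        store.add(triple)
    return store


store = run_pipeline("doc1 createdAt meeting7\nalice attended meeting7",
                     parse_lines, SetStore())
attendees = [s for (s, _, _) in store.match(p="attended", o="meeting7")]
```

Because run_pipeline depends only on the parser's call signature and the store's add/match interface, a different parser or a persistent store could be substituted without changing the pipeline itself.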

The components we expect to have substantially complete during these twelve months are:

We expect these components to be useful for the 2002 DAML Experiment, as they become available.

Specific Applications

Access Control

Technical Goal: documents stored in a SWeLL-mediated Web server are protected by rules that express authority to access the document based upon properties of the document in addition to properties of the requestor.

We expect to have within 12 months a partial implementation of an HTTP server that does RDF/DAML-ONT/SWeLL proof checking for qualifying access to Web resources.

Example: a rule may state that any document created at a meeting is readable by all the participants who were present at that meeting. Then, the electronic log of a meeting that was constructed using collaborative meeting facilitation tools is by default accessible to each participant. The meeting participants and the documents produced during the meeting are themselves recorded using RDF/DAML.
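
The meeting rule above might be evaluated along the following lines. This is a minimal Python sketch that checks the rule directly against a small set of facts; it does not perform SWeLL proof checking, and the predicate names (createdAt, presentAt) and identifiers are hypothetical placeholders for whatever RDF/DAML vocabulary is actually used.

```python
# Facts as (subject, predicate, object) triples; all identifiers are hypothetical.
FACTS = {
    ("log42", "createdAt", "meeting7"),
    ("alice", "presentAt", "meeting7"),
    ("bob", "presentAt", "meeting9"),
}


def may_read(requestor: str, document: str, facts=FACTS) -> bool:
    """Rule: a document created at a meeting is readable by anyone present there."""
    # Collect every meeting at which the document was created.
    meetings = {o for (s, p, o) in facts if s == document and p == "createdAt"}
    # Grant access if the requestor was present at any of those meetings.
    return any((requestor, "presentAt", m) in facts for m in meetings)
```

Here may_read("alice", "log42") grants access because alice was present at the meeting where log42 was created, while may_read("bob", "log42") does not.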

Schedule Coordination and Dependency Tracking

Technical Goal: Manage document workflow within an organization such as the W3C by deriving the status of a document from messages written in RDF/DAML/SWeLL that affect the state of the document. Manage dependencies by implementing export and import of schedule and calendaring information from a minimum of two popular calendaring/ToDo list management systems to RDF/DAML/SWeLL form.

Scenario: in a highly dynamic organization such as the W3C, resources that have been committed to other tasks may be required on short notice. Proof that a meeting can occur at which the resources required to reach a decision are able to be present will depend on the ability to identify all the resources, including personnel, meeting facilities (room, teleconference system), and prerequisite documents. Any participant can use this proof to synchronize independent databases, including personal planners. Proofs that a meeting took place at which all prerequisites were met and a decision was taken become messages that state, for example, that a document progressed from Working Draft to Last Call Working Draft.

Example: The W3C teleconference schedule is a single Web page listing the times and (virtual) locations of the teleconferences for each Working Group. This page is presently maintained by hand and duplicates information that is distributed separately in other forms. It would be more accurate and simpler to maintain if it were derived from RDF/DAML/SWeLL messages exchanged with the individual working groups.

The W3C document publication workflow process will be the testbed for this work. SWeLL rules specify which messages are authoritative in determining document state. These messages are processed from a variety of sources including, for example, e-mail, IRC (synchronous text messaging), and HTTP PUT and POST operations.
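
One way to picture deriving document state from authoritative messages is sketched below in Python. The authority rule here is reduced to a fixed set of trusted senders, which is an assumption made for illustration; in the work described, that decision would come from SWeLL rules. The Message fields, the sender addresses, and derive_status are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical authority rule: only these senders may change a document's status.
AUTHORITATIVE_SENDERS = {"director@example.org", "chair@example.org"}


@dataclass(frozen=True)
class Message:
    sender: str
    document: str
    new_status: str
    timestamp: int  # any increasing value, e.g. seconds since the epoch


def derive_status(document: str,
                  messages: Iterable[Message],
                  initial: str = "Working Draft") -> str:
    """Replay authoritative messages in time order; the latest one wins."""
    status = initial
    for msg in sorted(messages, key=lambda m: m.timestamp):
        if msg.document == document and msg.sender in AUTHORITATIVE_SENDERS:
            status = msg.new_status
    return status
```

Messages from non-authoritative senders are simply ignored, so the same function can be fed a mixed stream gathered from e-mail, IRC, and HTTP without pre-filtering.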

We will not be working on user-friendly read/write interfaces to calendar information.

Personal Information Management Schema (PINS)

It is increasingly straightforward to automate various inferences that people make when moving knowledge from one document to another; but as we do so, we must not forget that when people move information around, they exercise discretion about which pieces of information are intended for which audiences.

Example: We have supplemented traditional teleconferencing facilities with an automated agent, Zakim, that offers web-based interfaces for listing conference participants, etc. Zakim collects information about the correspondence between telephone numbers and names in order to make the display of conference participants easier to understand. But it is careful not to disclose telephone numbers; that is, it is implemented using various tricks to hide (parts of) phone numbers. With a personal information management schema, Zakim will be able to collect not only telephone-number-to-name mappings, but also information about who else Zakim is licensed to share this information with, and when.
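
To make the idea of audience-dependent disclosure concrete, here is a small Python sketch. It is not how Zakim is implemented; the record layout, the share_with field, and the masking scheme are assumptions chosen for illustration, and the "when" dimension of a sharing license is left out for brevity.

```python
# Hypothetical records: each mapping carries the set of parties it may be shared with.
RECORDS = [
    {"phone": "+1-617-555-0100", "name": "Alice", "share_with": {"w3c-team"}},
    {"phone": "+1-617-555-0199", "name": "Bob", "share_with": set()},
]


def mask(phone: str, visible: int = 4) -> str:
    """Hide all but the last `visible` digits, keeping punctuation in place."""
    total = sum(ch.isdigit() for ch in phone)
    seen = 0
    out = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total - visible else "*")
        else:
            out.append(ch)
    return "".join(out)


def display(record: dict, audience: str) -> str:
    """Show the full number only to audiences the record licenses."""
    if audience in record["share_with"]:
        phone = record["phone"]
    else:
        phone = mask(record["phone"])
    return f'{record["name"]} ({phone})'
```

With these records, the "w3c-team" audience sees Alice's full number, while any other audience sees only its last four digits; Bob's number is masked for everyone.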

Annotea: Web-based collaboration

The Annotea project uses an annotation server built from generic RDF components (the algae parser, generator, store, and query engine) that communicates with clients using an HTTP-based protocol. The main client is Amaya, a web browser/editor, which has been enhanced to support shared annotations integrated into the browsing and authoring experience.

We plan to enhance Annotea with shared bookmark facilities to support collaborative cataloging, classification, and organization of web resources.

Example: W3C hosts hundreds of mailing lists with archives available via HTTP. Due to an overwhelming load of unsolicited commercial email (spam), the archives are increasingly difficult to navigate. The burden of filtering the spam from the index can be shared among the users of the archives: anyone can use Amaya to annotate messages, categorizing them as spam. An enhanced index builder can integrate the results of querying the annotation server to filter out spam.
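
The index-building step in this example could look roughly like the Python sketch below. The annotation tuples stand in for the results of querying the annotation server, and the URLs, the "spam" category label, and the vote threshold are all assumptions made for illustration; the Annotea protocol and the algae query language are not modeled here.

```python
from collections import Counter

# Hypothetical annotations as (annotator, message_url, category) tuples.
ANNOTATIONS = [
    ("alice", "http://example.org/archive/0001.html", "spam"),
    ("bob", "http://example.org/archive/0001.html", "spam"),
    ("carol", "http://example.org/archive/0002.html", "spam"),
]


def build_index(message_urls, annotations, threshold=2):
    """Drop messages that at least `threshold` distinct annotators marked as spam."""
    # set() removes exact duplicates, so each annotator counts once per message.
    votes = Counter(url for (_who, url, cat) in set(annotations) if cat == "spam")
    return [url for url in message_urls if votes[url] < threshold]
```

Requiring more than one annotator before a message is hidden is one possible guard against a single mistaken or malicious annotation; a threshold of 1 would hide a message as soon as anyone marks it.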

Haystack: Natural User Interface built upon RDF repository

Haystack gives the user a convenient interface to search their own corpora of knowledge. Users will be able to import a variety of their typical information types (documents, email, calendar, web pages) into a single unified RDF repository. The user interface will enable unified access to all of this information for organization, navigation, and search.

Transition to standards - W3C liaison

Objective: DAML research work is transitioned into industry standards-track activities at the earliest feasible time.

The World Wide Web Consortium is an industry consortium created to lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability. W3C has more than 500 Member organizations from around the world. W3C is responsible for developing the XML and RDF standards and for managing the evolution of these standards.

The language specifications for the DAML work build upon XML and RDF layers. A Web Ontology Working Group was launched in November 2001, using the DAML+OIL specification of March 2001 as a technical baseline. The group is chartered to review this specification and its relationship to RDF, and to develop consensus in the W3C community.

As each remaining component of the DAML work reaches the stage of having significant existing practical experience and a need for open and fair process for derivation of the common language, MIT/LCS will undertake to propose to the W3C membership to begin standards-track working groups. When these working groups are formed, personnel from the MIT/LCS Semantic Web Development project will participate in order to provide liaison with the DAML work and to provide the experiences drawn from our own technology development.

Work on DAML+OIL Query and DAML+OIL rules may become ready to transition this year.

As other W3C Working Groups are chartered to produce standards in areas that may benefit from DAML technology, we will facilitate the introduction of DAML concepts into the discussions of those Working Groups. Possible examples of this include a description language for Web services and the use of XML in remote access protocols.


Tim Berners-Lee <timbl@w3.org>
Ralph R. Swick <swick@w3.org>

$Id: iow2.html,v 1.13 2002/02/11 22:47:14 connolly Exp $