SWAD-Europe Deliverable 12.2.2: RDF-based annotation systems

Project name:
Semantic Web Advanced Development for Europe (SWAD-Europe)
Project Number:
IST-2001-34732
Workpackage name:
12.2. Annotations demonstrator
Workpackage description:
http://www.w3.org/2001/sw/Europe/plan/workpackages/live/esw-wp-12.2.html
Deliverable title:
12.2.2: Annotation Demonstration Server Report
URI:
http://www.w3.org/2001/sw/Europe/reports/annotation_demo_server_report
Authors:
Charles McCathieNevile, W3C
Abstract:
This report surveys the state of semantic web systems to support annotation, editing or commenting on documents by people or agents who need not have any control over the original form of the document.
Status:

First version published 2002-12-03. This is a completed report, last updated 2003-06-06.
This document may be updated during the life of the SWAD-Europe project to reflect or link to further developments in this area.

Comments on this document are welcome and should be sent to the public-esw@w3.org list, archived at http://lists.w3.org/Archives/Public/public-esw/. General discussion of annotation techniques and especially Annotea should be sent to the W3C Annotation list www-annotation@w3.org, archived at http://lists.w3.org/Archives/Public/www-annotation/

Contents


1 Introduction
2 Background
3 Annotea Protocol
4 Implementation
5 Collaboration
6 Dissemination
7 Future work
8 Outcomes
9 Frequently Asked Questions (FAQ)
References


1 Introduction

This report is part of SWAD-Europe Work package 12.2: Annotations Demonstrator and addresses the scope, features and purpose of tools for annotating or commenting on web data using existing systems that are licensed as Free Software or Open Source.

For those in a hurry: go straight to the Frequently Asked Questions (FAQ) section.

Scope

This report covers annotation systems known to be using the Annotea protocol, including systems developed for the SWAD-Europe project and others developed independently. It also mentions development that appears to be closely related.

Terminology

Annotation
This term is used to describe information which is explicitly commentary on a Web resource, and which can be discovered by some method using that resource as a key. Simple mechanisms include the use of the rel="rev" attribute to describe links in HTML. More powerful systems allow for flexible storage options and sophisticated lookup.
Annotea
Annotea is used in this document to refer to the Annotea protocol developed by W3C. The name is occasionally also used to refer to the user interface for that protocol in the Amaya client, and to some servers for that protocol.

2 Background

The Web as deployed today provides a simple, powerful system for making links between information and services. These links are one-way: from some information describing a resource to the resource itself. This has proven to be extremely useful, but it is difficult to search such a system since it requires keeping large tables of data. The Semantic Web consists primarily of data about other data, or of machine-processable data designed for aggregation. In each case, finding information about a resource starting from that resource has proven difficult.

Several annotation systems have been implemented on the Web. Early approaches included collecting Web pages (still done by organisations like Google, but impractical for most organisations). HTML also allowed the attribute rel="rev" to be attached to a link in a document, indicating resources that described or in some way linked (forward) to the document. This in principle allows for harvesting, but again, scale is a problem in implementing this approach.

More recent services for providing specific types of annotation (reviews of documents or movies, for example) have proven to be successful in enabling lookup of annotations by using the resource they annotate as a key in limited circumstances. These annotations have normally been plain text encapsulated within a closed system.

W3C has been developing the Annotea protocol, and clients and servers that use it, to provide a standards-based, open and flexible way of allowing annotation of documents.

3 Annotea protocol

The Annotea protocol is in development within W3C as an advanced development project. Since this is considered pre-standardisation it is not being developed as a W3C standards-track specification, but rather as a demonstration.

The protocol [PROTOCOL] is documented, and implementations are in development within the Amaya client and as a module for Apache servers.

The protocol was chosen for this project because it is based on RDF and was already developed within the context of W3C. Different open source implementations for many parts of the protocol are readily available, which makes it a good framework for developing interoperable applications.
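Because the protocol is based on RDF, an annotation is simply a small RDF description that points at the annotated resource. The following is a minimal sketch in Python, not the reference implementation. It assumes the namespace URIs, the property names (annotates, body, created) and the w3c_annotates query parameter as recalled from the Annotea documentation; all of these should be checked against [PROTOCOL]. The function names and the example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace URIs as recalled from the Annotea documentation (assumptions;
# verify against [PROTOCOL] before relying on them).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ANNO = "http://www.w3.org/2000/10/annotation-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def build_annotation(target, author, body_url, created):
    """Return RDF/XML (as a str) describing one annotation of `target`."""
    # Register readable prefixes so the serialised XML uses rdf:, a: and dc:.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("a", ANNO)
    ET.register_namespace("dc", DC)

    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description")
    # The description is typed as an annotation ...
    ET.SubElement(desc, f"{{{RDF}}}type",
                  {f"{{{RDF}}}resource": ANNO + "Annotation"})
    # ... and points at the resource it comments on.
    ET.SubElement(desc, f"{{{ANNO}}}annotates",
                  {f"{{{RDF}}}resource": target})
    ET.SubElement(desc, f"{{{DC}}}creator").text = author
    ET.SubElement(desc, f"{{{ANNO}}}created").text = created
    # The body is referenced by URL here; it could be any Web resource.
    ET.SubElement(desc, f"{{{ANNO}}}body",
                  {f"{{{RDF}}}resource": body_url})
    return ET.tostring(root, encoding="unicode")


def query_url(server, target):
    """Build a URL asking `server` for annotations of `target`."""
    return server + "?" + urlencode({"w3c_annotates": target})
```

A client would send the output of build_annotation to an annotation server over HTTP and later retrieve annotations with the URL from query_url; the network calls are left out so that the sketch stays self-contained.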

4 Implementation

Some further development of tools has been undertaken as part of the SWAD-Europe work package on annotations, leading to a library of tools [ANNOTOOLS] for use with the protocol as publicly documented. These include utility functions which can be incorporated into other software or serve as simple models for copying. They are open source, available under the terms of the W3C software copyright license [LICENSE], a BSD-style license.

The MUTAT evaluation tool [MUTAT] was adapted to use Annotea as a storage and retrieval system, working with the experimental EARL server provided by W3C.

A more complete list of known implementations [IMPLEMENTATIONS] is maintained by the Annotea project at W3C with the assistance of this project.

5 Collaboration

Other Annotea-based systems have been implemented both within and outside Europe.

An experimental Annotea aggregating query server was developed, making use of the ruby-rdf work funded by the SWAD-E project. Currently the source code is available under the GPL license, as it uses a GPL-licensed HTTP server.

Work has been done in collaboration with European developers of Annotea-based software. In particular, for EARL, an RDF vocabulary for supporting Quality Assurance processes, collaboration with developers and W3C's Web Accessibility Initiative has led to existing tools being developed further to take advantage of Annotea.

The ZAnnot server was packaged for easy installation under Mac OS X using the system provided by the fink project, with simple ZAnnot installation instructions provided for a variety of platforms.

This project has funded some European participation in the Annotea project at W3C.

6 Dissemination

As part of this project the use of Annotea was presented as an option both for reporting tools using EARL and for annotating images, at the workshop on EARL and Image annotation held in Bristol in June 2002.

Discussion of Annotea and of annotation systems in general has also taken place on W3C's public annotation list [WWW-ANNOTATION], including how to leverage the properties of the Semantic Web to provide more powerful and flexible approaches integrated with the wider technology of the Web.

SWAD-E resources have been used within W3C to ensure that European Annotea work is included in the documentation produced by the Annotea project.

7 Future work

Development work is being undertaken on incorporating Annotea into software development. As part of the project W3C is assisting with the supervision of open-source student projects at ESSI in France. These projects use Annotea to add value to tools for improving the knowledge management capacity of IRC, and incorporate Annotea into an open source accessibility evaluation tool.

8 Outcomes

9 Frequently Asked Questions (FAQ)

What software is available for Annotea?

A list of known software [IMPLEMENTATIONS] is maintained by the Annotea project. It includes Annotea servers and Annotea clients.

What platforms do Annotea clients run on?

Clients are available for a number of platforms. Amaya is distributed in release form for Windows and Linux, and is available for many other variants of Unix, including OS X. There are client tools in JavaScript for use within a JavaScript-capable Web browser. The tools developed as part of this project [ANNOTOOLS] run in Ruby, a language that can easily be installed on many systems including Windows and Unix-based platforms.

What servers are available?

There are open source Annotea servers written in Perl for Apache, and in Python for the Zope system. There is also a query server written in Ruby. It does not accept annotations, but can aggregate responses from multiple servers. There is a list of known servers [IMPLEMENTATIONS] maintained as part of the Annotea project.
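The idea of aggregating responses from several servers can be illustrated with a short Python sketch. This is not how the Ruby query server is implemented (this report does not describe its internals); it is a deliberately simple approach that assumes each server returns an RDF/XML document and that descriptions of the same annotation share an rdf:about URI. The function name aggregate is hypothetical, and fetching the responses over HTTP is left out.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def aggregate(responses):
    """Merge the top-level descriptions from several RDF/XML documents.

    `responses` is an iterable of RDF/XML strings, one per server.
    Descriptions that carry the same rdf:about URI are kept only once
    (first occurrence wins); descriptions without rdf:about are always kept.
    """
    merged = ET.Element(f"{{{RDF}}}RDF")
    seen = set()
    for text in responses:
        for node in ET.fromstring(text):
            about = node.get(f"{{{RDF}}}about")
            if about is not None:
                if about in seen:
                    continue  # already collected from an earlier server
                seen.add(about)
            merged.append(node)
    return merged
```

A fuller aggregator would merge the underlying RDF graphs rather than XML elements, so that statements about the same annotation from different servers are combined instead of the first copy winning.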

How do I extend the protocol?

Because the protocol is in RDF it can be readily extended. An example is the work done within the Annotea project to provide threaded replies to annotations. The reply mechanism is explained in the general documentation of the protocol [PROTOCOL], but uses a new schema for threaded annotations [ANNOTEA-R] developed for this extension.
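To illustrate why an RDF-based protocol is easy to extend, the Python sketch below models a reply as a handful of triples that add properties from a second namespace alongside the existing annotation vocabulary. The thread namespace is the [ANNOTEA-R] URI with "#" appended; the class and property names (Reply, root, inReplyTo), the base annotation namespace, and the helper function names are assumptions to be verified against [PROTOCOL] and [ANNOTEA-R].

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed base annotation namespace (check against [PROTOCOL]).
ANNO = "http://www.w3.org/2000/10/annotation-ns#"
# Thread schema from [ANNOTEA-R], with "#" appended (assumption).
THREAD = "http://www.w3.org/2001/03/thread#"


def reply_triples(reply, root, in_reply_to, body):
    """Return (subject, predicate, object) triples describing one reply.

    `root` is the annotation that started the thread; `in_reply_to` is the
    annotation or reply being answered directly.
    """
    return [
        (reply, RDF_TYPE, THREAD + "Reply"),
        (reply, THREAD + "root", root),
        (reply, THREAD + "inReplyTo", in_reply_to),
        # The existing annotation vocabulary is reused unchanged for the body.
        (reply, ANNO + "body", body),
    ]


def replies_in_thread(triples, root):
    """Return the subjects whose thread root is `root`, in input order."""
    return [s for (s, p, o) in triples if p == THREAD + "root" and o == root]
```

The point of the sketch is that the extension only adds new terms; software that understands just the base vocabulary can still read the body statement and ignore the threading properties.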

Where do I find the Protocol and Schemas?

The protocol [PROTOCOL] is documented (with links to the schemas) as part of the Annotea Project.

References

Annotation tools, specifications and documents

[ANNOTEA-R]
The Schema for threaded annotations, developed to allow for threaded responses, is available at http://www.w3.org/2001/03/thread
[ANNOTOOLS]
A small library of tools developed for use with the Annotea protocol, and as example code for people wanting to develop their own tools. These tools are described at http://www.w3.org/2001/sw/Europe/200209/annodemo/readme.html
[IMPLEMENTATIONS]
A list of known Annotea Implementations is maintained by W3C's Annotea project at http://www.w3.org/2001/Annotea/#Comp
[MUTAT]
The Open Source MUTAT tool is designed to provide an interview-style interface for producing conformance reports in the EARL [EARL] format. It has an extension which allows the reports to be posted as annotations to an Annotea server. An online version of the tool is available at http://www.w3.org/QA/Tools/MUTAT/
[PROTOCOL]
The Annotea protocol is documented at http://www.w3.org/2001/Annotea/User/Protocol
[WWW-ANNOTATION]
The www-annotation@w3.org mailing list is a public discussion forum for annotation systems, including Annotea. Its archives (including instructions for subscribing) are available at http://lists.w3.org/Archives/Public/www-annotation/

Other references

[EARL]
EARL (the Evaluation and Reporting Language) is a specification in development by the W3C's Evaluation and Repair Tools group. It is an RDF vocabulary for expressing conformance to arbitrary requirements. The latest published draft is available at http://www.w3.org/TR/EARL10
[LICENSE]
The W3C Software Copyright license is a BSD-style license allowing the free use of software in open-source or proprietary products with appropriate acknowledgement. The full license is available at http://www.w3.org/Consortium/Legal/copyright-software