SWAD-Europe Deliverable 3.12: Developer Workshop Report 5 - Image annotation

Project name:: Semantic Web Advanced Development for Europe (SWAD-Europe)
Project Number:: IST-2001-34732
Workpackage name:: 3 Dissemination and Implementation
Workpackage description:: http://www.w3.org/2001/sw/Europe/plan/workpackages/live/esw-wp-3
Deliverable title:: 3.12 Developer Workshop Report 5
URI:: http://www.w3.org/2001/sw/Europe/reports/dev_workshop_report_5
Author:: Charles McCathieNevile
Abstract:: This report summarises the fifth SWAD-Europe developer workshop, held in Madrid on 7-8 June 2004. The workshop explored the topic of image description and annotation.
STATUS:: This is a completed report published Tuesday 27 July 2004. The first draft was published on 15 June 2004. This report may be updated over the life of the SWAD-Europe Project to link to new work emerging on the topics of the workshop.

Executive Summary

Introduction
Background
Workshop
Outcomes

Appendix A. Tools

Appendix B. Vocabularies

Executive Summary

This workshop brought together developers and users working on the semantic description and annotation of images.

The workshop was hosted by the Facultad de Informática at the Universidad Politécnica de Madrid.

The image annotation workshop had the following outcomes:

A survey of some existing annotation systems based on Semantic Web technology, with particular reference to the state of development since the earlier SWAD-E workshop on this topic, held in Bristol in June 2002.
Development of existing image annotation work to cover multimedia.
Further development of tools for image annotation.
Ongoing discussion of uses for annotated images, and of techniques and tools.
Recognition of some important areas for future work.
Creation of a new W3C-hosted mailing list to discuss this particular topic.

1 Introduction

This report is part of the SWAD-Europe project Work package 3: Dissemination and Implementation.

It describes a developer workshop held in Madrid in June 2004, on the topic of image description and annotation.

2 Background

A short list of background reading for workshop participants is available.

Image Annotation

Several systems have been developed for annotating image content, serving a range of use cases. In particular, resource discovery, searching through information held in graphic form, and providing alternative representations for people with disabilities have each given rise to annotation systems and databases.

Standards

There are a number of systems or vocabularies in use in a range of tools, including

Dublin Core
MPEG in its various forms
RDF
SVG
XMP

In addition, a large number of vocabularies are in use for specific types of description. There is currently a lot of overlap between deployed vocabularies, but relatively little use is made of the ability of OWL or RDF/S to provide mappings between different vocabularies.

Related work

A developer workshop on this topic was held in Bristol in June 2002, as the first SWAD-E developer workshop. Partly as a result of that workshop, there has been development in many aspects of the topic, and activity in the area has grown more generally since then.

3 Workshop

The workshop was attended by developers from

Australia
Denmark
France
the Netherlands
Spain
Sweden
United Kingdom

It was broadcast via the #rdfig IRC channel, allowing remote participation. In particular developers from the UK and the USA took part in the discussion during relevant stages of the workshop.

4 Outcomes

Technical discussion

A number of tools and vocabularies were presented or discussed, and are listed with a short description in appendices A (Tools) and B (Vocabularies). The first day's discussion log, the first day's "chumped" highlights, the second day's log and the second day's highlights are all available.

It was clear that in some areas there is a lot of development, and tools are moving towards the standard of products developed commercially for end users, while other areas still involve research and development.

In particular, tools dealing with time or location in any complexity tend to be in the early phases of development. One SWAD-Europe deliverable, demonstrating the use of RDF to build user interfaces for geographic information, will be updated in part as a result of the workshop.

Enhancements, improvements in functionality, modelling or standards conformance were suggested in a number of areas. Particular detailed recommendations that came out of the workshop included

CCF: How to use the Jibbering tools to identify or search for a visual representation of a concept; and how to use Creative Commons and similar vocabularies to determine whether a particular symbol can be freely used (in commercial systems the symbols themselves are typically proprietary, which can be a major barrier to communication between people who have different systems).
Region Vocabulary: A number of technical improvements to the specification were suggested.

One outcome of the workshop is that a discussion list covering the specific topic of describing and annotating multimedia resources has been created. It is under the aegis of the W3C Semantic Web Interest Group, in order to ensure long-term viability that could not be provided if it were reliant on SWAD-E project resources.

Work still needed

A clear outcome of the workshop was the need for simple step-by-step explanations of how to use vocabularies, oriented to developers who want to copy working examples rather than understand the entire theoretical base and then derive their own tools and code.
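As an illustration of the kind of copyable example called for here, the following Python sketch emits N-Triples describing a photograph with Dublin Core and FOAF. It is a minimal sketch, not a recommended tool: the image URI, the blank-node labels and the helper names (literal, uri, describe_image) are hypothetical, and only the vocabulary namespaces and the terms dc:title, foaf:Image, foaf:Person, foaf:depicts and foaf:name come from the published vocabularies.

```python
# Minimal sketch: describe one photograph as N-Triples using Dublin Core and FOAF.
# The helper names and any URIs passed in are illustrative assumptions.

DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text):
    """Serialise a plain string as an N-Triples literal, escaping special characters."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    return '"' + escaped + '"'


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def describe_image(image_uri, title, person_names):
    """Return N-Triples lines saying the image has a title and depicts the named people."""
    lines = [
        f"{uri(image_uri)} {uri(RDF_TYPE)} {uri(FOAF + 'Image')} .",
        f"{uri(image_uri)} {uri(DC + 'title')} {literal(title)} .",
    ]
    for index, name in enumerate(person_names):
        person = f"_:person{index}"  # blank node: the person has no URI of their own
        lines.append(f"{uri(image_uri)} {uri(FOAF + 'depicts')} {person} .")
        lines.append(f"{person} {uri(RDF_TYPE)} {uri(FOAF + 'Person')} .")
        lines.append(f"{person} {uri(FOAF + 'name')} {literal(name)} .")
    return lines


if __name__ == "__main__":
    for line in describe_image("http://example.org/photos/1.jpg",
                               "Workshop dinner", ["Alice", "Bob"]):
        print(line)
```

Each person is written as a blank node because, in this sketch, nothing is assumed about how people are identified; a real tool might reuse an existing identifier instead.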

Dealing with time and place at varying levels of granularity is a common use case that is currently not well addressed, either in interface development or in the underlying interoperable vocabularies. Some progress has been made in the area of calendar-type data, but more tools for querying across different levels of granularity are needed.
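The granularity problem can be made concrete with a small sketch. Assuming, purely for illustration, that dates are recorded as ISO 8601 calendar strings of the form YYYY, YYYY-MM or YYYY-MM-DD, the hypothetical function below compares a query period with an annotated period and distinguishes a certain match from a merely possible one. It is not taken from any tool discussed at the workshop.

```python
# Minimal sketch: matching dates recorded at different granularities.
# Assumes values are ISO 8601 calendar strings: YYYY, YYYY-MM, or YYYY-MM-DD.


def parts(value):
    """Split a date string into a tuple of integers, e.g. '2004-06' -> (2004, 6)."""
    return tuple(int(p) for p in value.split("-"))


def match(query, annotation):
    """Return 'certain', 'possible', or 'none' for a query period against an annotation."""
    q, a = parts(query), parts(annotation)
    shared = min(len(q), len(a))
    if q[:shared] != a[:shared]:
        return "none"
    # The annotation is at least as fine as the query, so it lies wholly inside it.
    if len(a) >= len(q):
        return "certain"
    # The annotation is coarser: the photo may or may not fall inside the query period.
    return "possible"


if __name__ == "__main__":
    print(match("2004", "2004-06-07"))     # annotation finer than the query
    print(match("2004-06-07", "2004-06"))  # annotation coarser than the query
    print(match("2004-06", "2004-07-01"))  # different months
```

The same idea would apply to place (a city inside a country, say), although containment between regions is rarely as simple as a string prefix.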

Current Semantic Web-based image description systems tend to be heavily text-oriented. More work seems useful on tools that start from graphic objects, for example to find things based on an example picture, or to find pictures that contain particular graphical features (a sunset, a cup, a particular person).

Image annotation and Semantic Web Standardisation

A number of the areas that are still in development in the overall Semantic Web - trust and authority management, interoperability of information that deals with continua (such as time and location) where different applications use different granularities and scales - are important to image description and annotation.

Image annotation provides an interesting approach to internationalisation and localisation of information, since most graphic material is not dependent on a particular written or spoken language. Interfaces should anticipate multilingual use, and tools could take advantage of assisted translation (for example through SKOS systems) to support partial description being supplied and used across multiple languages, instead of being restricted to a single language.

Appendix A Tools

Amaya: A browser/editor for the Web which has been developed by W3C/INRIA. This tool includes a user interface for annotations which can be made on SVG images or parts of the image.
As a result of the workshop, further suggestions for improvements to the image annotation functionality have been made, and are expected to be incorporated into the Amaya development plan.
Batch Annotator: A tool from Morten Frederiksen, supporting semi-automated generation of RDF for each object in a large collection
CCF demo: Developed by the Concept Coding Framework group, in which the SWAD-E project has participated, these tools allow a user to create a message using symbols in place of words. The message is stored in RDF (a vocabulary derived from SKOS), which allows the user to change the symbol set used.
Foafnaut: A browser for FOAF information, in SVG, that searches for images of people based on RDF information.
Image filtering tools: Developed by Dan Brickley, this is a small set of tools that integrate annotations identifying objects with filtering algorithms to provide highlighting (or anonymisation by blurring) of regions of pictures identified according to user-defined schemes.
Image metadata default: Some notes contributed by Norm Walsh to the workshop
Jibbering.com FOAF tools: Developed by Jim Ley, these are a group of tools for annotating images with metadata and then using annotated images. One feature of these tools is the ability to store path information in RDF that can be used to generate images in SVG format based on regions of the originally annotated image.
As a result of the first workshop, these tools were updated, and as a result of this workshop they were again updated.
Jibbering.com photosearch: A tool which can provide images of particular objects or people, where they have been identified in the context of a larger image. As a result of the workshop, this tool is likely to serve as a model for a new approach to CCF messaging using real photographs of events.
Jibbering.com SVG Whiteboard: Also developed by Jim Ley, this is a shared whiteboard developed in SVG. Following the workshop, it is hoped that this will be updated.
Kanzaki image annotation: A tool with a Web interface that allows users to annotate a photograph, or select a rectangular region of it to describe.
Map tool: An interactive map, presented via SVG, that allows a user to select a region visually and then select a particular defined region or set of regions (for example "within 50 miles of a fixed point", an administrative region such as a post code or country, "what Charles thinks of as Venice", "close to Budapest airport"). The user can then generate RDF to declare that they are in that region, and search for other people who are in the same region; where necessary, the tool merges information about people's location to suggest they are in the selected region or one that partially overlaps. This tool was developed as part of the SWAD-E project.
As a result of the workshop, this tool will be updated and modularised, so that it functions more efficiently and so one part of it can be incorporated as a user interface component for generating RDF about location within other tools.
Mia: Multimedia Information Analysis, a Dutch project looking at ways to analyse multimedia and produce useful semantic information.
Mindswap: A mindswap example - contribution from Jim Hendler to the workshop via IRC.
Qbic: A tool for searching for images based on graphical features (colour, shapes, etc). An example of Qbic searching over a museum archive is available.
RDFPic: A tool developed by W3C/INRIA for adding RDF metadata directly to images. This tool has a corresponding module developed which allows the Jigsaw server to accept requests for RDF or JPEG formats and serve the appropriate format from a single JPEG image.
As a result of the Bristol workshop this tool was updated to support adding RDF path information that can be used to generate SVG defining regions, for example to use with FOAF tools
RDFPic Extended (mixed site in English, French and Spanish): A tool developed by Vincent Tabard, in collaboration with Fundación Sidar, based on RDFPic from W3C. This version functions with an infrastructure of Apache/PHP, and is designed to function via a Web-based interface.
RDFWeb CoDepiction tools: Developed by Dan Brickley, Libby Miller, and Damian Steer, these are an RDF vocabulary and a suite of web-based services for describing information relating people who appear together in photographs. They include a system for generating metadata about the people depicted in an image, a database which collects references to such information, and tools which query the information using the Squish query language.
Swordfish annotation tool: A tool developed as part of the SWAD-E project to assist users in annotating images
Tidepool: Semantic Web tools for social networking
W3Photo: A commercial site offering the ability to describe photographs of people at an event (The World Wide Web Conference 2004, in New York) and an interface for querying the information to retrieve images.
WH4: A tool that runs as a program in IRC, and allows searching for images of given people or things.
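The co-depiction idea behind several of the tools above can be sketched without an RDF store: given (image, person) depiction pairs, find the pairs of people who appear together in the same image. This Python sketch is illustrative only. It is not the RDFWeb implementation, which queries RDF data using Squish, and the function name and sample data are invented.

```python
# Minimal sketch of co-depiction: which pairs of people appear in the same image?
# The input is a list of (image, person) pairs; all identifiers are made up.

from collections import defaultdict
from itertools import combinations


def codepictions(depictions):
    """Map each unordered pair of people to the sorted list of images depicting both."""
    people_by_image = defaultdict(set)
    for image, person in depictions:
        people_by_image[image].add(person)

    result = defaultdict(list)
    for image in sorted(people_by_image):
        # Sorting the people gives each pair a single canonical ordering.
        for pair in combinations(sorted(people_by_image[image]), 2):
            result[pair].append(image)
    return dict(result)


if __name__ == "__main__":
    sample = [
        ("img1", "alice"), ("img1", "bob"),
        ("img2", "bob"), ("img2", "carol"),
        ("img3", "alice"), ("img3", "bob"),
    ]
    for pair, images in codepictions(sample).items():
        print(pair, images)
```

Following chains of such pairs (alice with bob, bob with carol) is what lets a co-depiction service link people who never appear in the same photograph.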

Appendix B Vocabularies

AccLIP: An RDF vocabulary from the IMS project, describing needs of users with respect to their abilities to use different types of information according to complexity, or sensory requirements (vision, hearing, etc). Used in context with EARL and CC/PP to provide appropriate material primarily in learning contexts, for example to determine which of a range of educationally valuable alternatives for teaching content is the most useful for a given user.
Annotea: A protocol for annotations on Web content, using RDF. The project includes development of a server, a client implementation in Amaya, and the protocol itself.
Concept Coding Framework: A SKOS-based vocabulary for identifying concepts and symbols that are used to represent them graphically for people who are unable to use spoken or written language to communicate.
CC/PP: An RDF vocabulary describing capacities and preferences of a user and their device, used for determining an appropriate version of content to serve to them. It has been implemented in mobile telephones, in particular describing the multimedia capacities (screen, audio, etc).
Creative Commons: An RDF vocabulary describing intellectual property rights and permitted usage of resources.
Cyc: A very large ontology, designed to identify many processes in use in everyday life.
Dublin Core: A very widely used vocabulary primarily for bibliographic metadata. While this is perhaps the most widely-used vocabulary (consisting of a simple set of 15 elements with a very large number of "qualifiers"), its usage is often very inconsistent. In addition, because it doesn't specify a syntax, there are different versions of it in deployment.
EXIF: A vocabulary used by many digital cameras to describe various technical aspects of a photograph.
FOAF: A vocabulary for describing people, including depictions of them. The depiction and depicts properties of this vocabulary are often used in image annotation systems.
Geo latlong: A small vocabulary that describes location in terms of latitude, longitude, and elevation.
ICRA (PICS): A vocabulary used to describe whether the content of a resource contains particular levels of sexual content, violence, or other content that may be considered objectionable. It has been widely used in web browsing environments to exclude content according to a user-defined set of preferences
Region: A small vocabulary designed to provide the ability to identify a region of an image. This was discussed and some suggestions for further development were made during the workshop.
SKOS: A framework for developing RDF-based thesauri.
Wordnet: A large vocabulary of English nouns, often used to identify objects depicted in pictures. There are actually several different encodings of Wordnet in RDF, and the W3C's Semantic Web Best Practices and Deployment group has a Wordnet task force working in this area.
XMP: A subset of RDF used by Adobe in their software range, allowing for the insertion of metadata in a range of documents and multimedia.
