Photo RDF, Metadata and Pictures

Project acronym: QUESTION-HOW
Project Full Title: Quality Engineering Solutions via Tools, Information and Outreach for the New Highly-enriched Offerings from W3C: Evolving the Web in Europe
Project/Contract No. IST-2000-28767
Workpackage 1, Deliverable D1.3

Project Manager: Daniel Dardailler <danield@w3.org>
Author of this document: Yves Lafon <ylafon@w3.org>

Introduction

The tools developed in this project aim to demonstrate an easy way of creating metadata, as well as ways of associating that metadata with the content, so that neither part is lost because the content and its description have fallen out of sync.

Concept

The main concept is to generate a description of the picture using RDF. RDF is a model for representing named properties and property values, which makes it very easy to describe basic properties of the picture: Picture -author-> Yves, Picture -location-> Sophia-Antipolis.

Most properties applicable to describe pictures are already defined by the Dublin Core Metadata Initiative. Some other properties can be added, like more technical details on the picture itself, what tool was used to generate it, the value of some color correction filters, and such. The metadata has to be extensible (hence the use of RDF) to allow users to define their own property sets and add them to the pictures.
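As an illustration of the two statements above, the following minimal sketch serializes them as RDF/XML with Python's standard library. It is not the code used by rdfpic. The picture URI is hypothetical, and the mapping of "author" to the Dublin Core element "creator" and of "location" to "coverage" is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe_picture(picture_uri, properties):
    """Return an RDF/XML string with one rdf:Description for the picture.

    `properties` maps Dublin Core element names (for example "creator")
    to literal values.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": picture_uri},
    )
    # Each property becomes one child element: subject -property-> value.
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical picture URI; property mapping is an assumption (see above).
rdf_xml = describe_picture(
    "http://example.org/photo.jpg",
    {"creator": "Yves", "coverage": "Sophia-Antipolis"},
)
print(rdf_xml)
```

Because every property is just another child element in its own namespace, a user-defined property set can be added by registering one more namespace, which is the extensibility argument made above.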

Interface

Adding metadata has to be as simple as possible. The best way is to have a preset list of attributes and change only the ones needed, especially when entering metadata for a large batch of files. The interface is the following (this picture is of version 2; the current version is 3):

The picture is on the right side; on the left side is a list of properties (Title, Creator...) with pull-down menus of values containing the current values and the previously entered ones. The tick to the left of a value is a "sticky" tag: it means that the value will not change for the next picture.

Many RDF Schemas can be loaded; each one generates more entries in the tab panel, after the default DC (Dublin Core), Technical and FOAF Schemas.

The rdfpic program can automatically build a user interface for two kinds of RDF Schemas: those that associate properties with the whole of the image, such as the Dublin Core (in the illustration above) and the Technical Schema; and those that associate properties with a region of the image, such as the FOAF Schema, which allows people in the image to be described. For the latter kind of Schema, rdfpic provides an outline drawing tool that allows the user to trace the outline of an object in the picture with the mouse.

The user interface can be switched to other languages, provided the loaded RDF Schemas support those languages. The three default Schemas that are delivered with the program contain English, French and Dutch descriptions.

Metadata storage

As explained previously, one of the key points of this project is to associate the content with its description. This is done by storing the generated RDF description in a header of the JPEG file. That way, if the file is moved, its description moves with it and is not lost. Version 2 of the rdfpic program used the Comment field of the JPEG file; version 3 also recognizes APP fields, which is where Adobe's tools (and others based on Adobe's "XMP" specification) store the RDF metadata.
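To make the storage idea concrete, here is a minimal sketch of writing a text description into a JPEG Comment (COM) segment. It is not rdfpic's implementation: the function name is hypothetical, and placing the comment after any leading APPn segments is a choice made for this example so that a JFIF APP0 header stays directly after the start-of-image marker.

```python
import struct

SOI = b"\xff\xd8"   # start-of-image marker
COM = b"\xff\xfe"   # comment marker


def embed_comment(jpeg_bytes, rdf_text):
    """Return a copy of jpeg_bytes with rdf_text added as a COM segment."""
    if not jpeg_bytes.startswith(SOI):
        raise ValueError("not a JPEG stream")
    payload = rdf_text.encode("utf-8")
    # The 16-bit segment length counts its own two bytes.
    if len(payload) > 0xFFFF - 2:
        raise ValueError("description too large for a single comment segment")

    # Skip leading APP0..APP15 segments (markers 0xFFE0..0xFFEF).
    pos = 2
    while (pos + 4 <= len(jpeg_bytes)
           and jpeg_bytes[pos] == 0xFF
           and 0xE0 <= jpeg_bytes[pos + 1] <= 0xEF):
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        pos += 2 + length

    segment = COM + struct.pack(">H", len(payload) + 2) + payload
    return jpeg_bytes[:pos] + segment + jpeg_bytes[pos:]
```

The image data itself is untouched, so the result remains an ordinary JPEG file that any viewer can display.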

Several third-party tools can be used to extract such comment fields from a JPEG file, making it easy to crawl lists of files and extract all the descriptions from them. Such tools have been used, for example, to create "thumbnail" pages for a collection of photos.
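The extraction step is simple enough to sketch. The following example, which is an illustration and not one of the third-party tools mentioned, walks the marker segments at the start of a JPEG stream and collects the text of every Comment segment. It assumes the comments are UTF-8 text and it stops at the start of the compressed image data; the function name is hypothetical.

```python
import struct


def extract_comments(jpeg_bytes):
    """Return the text of every COM segment found before the image data."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    comments = []
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # not at a marker; give up rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more headers
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if marker == 0xFE:
            data = jpeg_bytes[pos + 4:pos + 2 + length]
            comments.append(data.decode("utf-8", errors="replace"))
        pos += 2 + length
    return comments
```

A crawler could call this function on each file in a directory and feed the returned strings to an RDF parser, without ever decoding the image itself.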

Also, image files with embedded metadata allow Web servers to select what needs to be sent to a client, using the "Accept:" header of HTTP. Clients accepting images will get the whole file and view the picture; text-only clients may want to retrieve only the description of the picture and not the whole file. In Jigsaw, you can do this by setting the MIME type of the comment field. The server will automatically extract the metadata on demand if the client asks for this MIME type.
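The selection logic can be sketched as follows. This is not Jigsaw's code: the function names are hypothetical, the metadata MIME type "application/rdf+xml" is an assumption standing in for whatever type is configured for the comment field, and the Accept parsing is deliberately simplified (it honours q-values and wildcards but ignores other media type parameters).

```python
def parse_accept(header):
    """Return a list of (media_type, q) pairs from an Accept header value."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    return entries


def quality(entries, media_type):
    """q-value given to media_type by its most specific matching entry."""
    main_type = media_type.split("/")[0]
    best_q, best_rank = 0.0, -1
    for pattern, q in entries:
        if pattern == media_type:
            rank = 2
        elif pattern == main_type + "/*":
            rank = 1
        elif pattern == "*/*":
            rank = 0
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q


def choose_representation(accept_header, metadata_type="application/rdf+xml"):
    """Decide whether to send the whole image or only its description."""
    entries = parse_accept(accept_header)
    image_q = quality(entries, "image/jpeg")
    metadata_q = quality(entries, metadata_type)
    if metadata_q > image_q:
        return "metadata"
    if image_q > 0:
        return "image"
    return "not acceptable"
```

With this sketch, a client sending "image/jpeg, image/*;q=0.8" is given the whole file, while a client that lists only the metadata type is given the extracted description.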

Download

The rdfpic program is written in Java. It is available, both as source code and as a ready-to-run binary, at http://jigsaw.w3.org/rdfpic

Note that the two programs are independent: Jigsaw is a standard, general-purpose HTTP server; rdfpic reads and writes standard JPEG files, which can be served by any Web server. However, the HTTP content negotiation mentioned in the previous section is so far only available with Jigsaw.