Describing and retrieving photos using RDF and HTTP

W3C Note, 03 April 2000

Yves Lafon, W3C, ylafon@w3.org
Bert Bos, W3C, bert@w3.org


This note describes a project for describing and retrieving (digitized) photos with (RDF) metadata. It describes the RDF schemas, a data-entry program for quickly entering metadata for large numbers of photos, a way to serve the photos and the metadata over HTTP, and some suggestions for search methods to retrieve photos based on their descriptions.

The data-entry program has been implemented in Java, and a dedicated Jigsaw frame has been written to retrieve the RDF from the image over HTTP. The metadata uses the Dublin Core schema as well as additional schemas for technical data.

We already have a demo site, and we expect to have the source code available for download in a few weeks. The software is open source.

The system can be useful for collections of holiday snapshots as well as for more ambitious photo collections.

Status of this document

This document is a NOTE made available by the W3C for discussion only. Publication of this Note by W3C indicates no endorsement by W3C or the W3C Team, or any W3C Members.

We plan to update this note after some more experience has been gained with the system and the schemas.

Please send comments to the authors.

A list of current W3C technical reports and publications, including Working Drafts and Notes, can be found at http://www.w3.org/TR/.

1. Goals of the project

The goals of the project are partially personal, partially to promote W3C technology. The personal reasons are that we, the authors, have large numbers of photos but always have difficulty finding the exact ones that we want to show to somebody. Digitizing them and describing them in RDF should make it quicker to find the ones we are looking for at any moment.

We also think that a concrete example of an RDF schema and a working system around it can help explain the potential of metadata on the Web, especially since traditional, text-based search engines, as they are used for HTML documents, will clearly not work for photos. Also, using metadata will automatically provide a non-visual description of the photos, hence contributing to accessibility.

The project, then, is to take the existing pieces of technology (RDF [RDF], HTTP [HTTP] and Jigsaw [Jigsaw] from W3C; JPEG [JPEG], Java [Java] from elsewhere) and provide some glue between them to produce an interesting as well as useful application.

2. Overview of the system

Diagram of the parts of the photo-RDF system. Top left: the pictures are digitized and stored as JPEG images. Bottom left: metadata is written into the pictures with the data-entry program (and can also be edited if corrections are necessary). Right: requests from the Web are served by Jigsaw, by sending either the picture or the metadata, depending on the form of the request.

The system comprises the following, largely independent, pieces:

  1. Scanning the photos and storing them in JPEG format. We scan from negatives, for best quality, but any process that yields JPEG could be used, including digital cameras. We will not deal with this part below.
  2. A data-entry program that allows easy entry/editing of the metadata for each photo and stores the data in RDF form inside the JPEG file. This program is described below.
  3. A module for the Jigsaw server that can serve either the JPEG image data or the RDF description that is stored in it, using HTTP content negotiation to determine which of the two a client wants. It is described below.

Some digital cameras are already producing information about the picture, which may be read and reformatted in RDF by scripts. We will not deal with that in this version of the metadata editor.

The RDF data is expressed in three separate schemas, one of which is the Dublin Core schema. The other two deal with technical data of the photo and with subject categories. The reason for using three schemas is solely to allow each of them to be used in other projects; to the users of the data-entry program the actual RDF is completely hidden.

3. The data-entry program "rdfpic"

Screen dump of rdfpic, the metadata editor, showing the screen to enter technical data. (The screen dump has been reduced by 50%)

The data-entry program is very simple. It has been designed to enable quick entry of metadata for lots of photos, under the assumption that the photos will usually be from one or a few series. Most fields therefore show by default the value that was entered for the previous photo, and give quick access to the values entered for the last few photos. Typically, only very few fields will have to be changed from one photo to the next and the amount of typing will be minimized.
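
The defaulting behaviour described above can be illustrated with a small sketch. This is not the rdfpic code (which is written in Java); the class name `FieldHistory` and the history size of five are assumptions made only for this illustration.

```python
from collections import deque


class FieldHistory:
    """Remembers the last few values entered for one metadata field.

    Hypothetical helper: the name and the default size are assumptions,
    not taken from rdfpic.
    """

    def __init__(self, size=5):
        # Most recent value first; older values fall off the right end.
        self._recent = deque(maxlen=size)

    def default(self):
        """Value to pre-fill for the next photo (empty if nothing entered yet)."""
        return self._recent[0] if self._recent else ""

    def recent(self):
        """Values offered for quick selection, most recent first."""
        return list(self._recent)

    def record(self, value):
        """Store the value entered for the current photo."""
        if not value:
            return
        if value in self._recent:
            # Move a re-used value to the front instead of duplicating it.
            self._recent.remove(value)
        self._recent.appendleft(value)
```

With one such object per field, only the fields that actually change between two photos of a series need to be retyped.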

The program is written in Java, but the user interface is in fact generated at run-time directly from a machine-readable version of the schemas (not the RDF syntax, but a mechanical transformation of it, with equivalent information). This means that the program does not need to be changed when we change the RDF schemas.

The RDF data is stored in the JPEG file in comment blocks (labeled with the keyword "COM", as defined by ISO DIS 10918-1). According to the JPEG standard, a comment block can contain arbitrary text. There is no way to assign a type to the text. We simply rely on the fact that RDF can easily be distinguished from plain text by heuristics. JPEG limits each comment block to 64K, but there can be as many blocks as necessary, so arbitrary amounts of text can be added. In practice, the descriptions generated by the rdfpic program are typically only a few hundred bytes long.
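
As an illustration of the storage format, the following sketch reads the comment blocks back out of a JPEG stream. It is not the code used by rdfpic or Jigsaw. It assumes that all comment blocks precede the compressed image data, and it uses the presence of the string `<rdf:RDF` as the heuristic for recognizing RDF; both are assumptions of this sketch. Each segment carries a two-byte length that counts itself, so a single comment holds at most 65,533 bytes of text.

```python
import struct

SOI = b"\xff\xd8"   # start of image
COM = 0xFE          # comment segment marker (0xFFFE)
SOS = 0xDA          # start of scan: compressed image data follows
EOI = 0xD9          # end of image


def iter_comments(data):
    """Yield the payload of every COM segment found before the image data."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before a marker: skip it.
            pos += 1
            continue
        if marker in (SOS, EOI):
            # Assumption of this sketch: stop once the image data starts.
            return
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            # Stand-alone markers carry no length field.
            pos += 2
            continue
        # The big-endian length includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == COM:
            yield data[pos + 4:pos + 2 + length]
        pos += 2 + length


def rdf_comments(data):
    """Return the comment blocks that look like RDF (heuristic, see above)."""
    texts = (c.decode("utf-8", errors="replace") for c in iter_comments(data))
    return [t for t in texts if "<rdf:RDF" in t]
```

For a file on disk, `rdf_comments(open("photo.jpg", "rb").read())` would return the embedded descriptions; the file name is only an example.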

4. The Jigsaw extension

To serve either the RDF version or the complete image using existing browsers and tools, the best way was to use Content Negotiation. Of course, that doesn't exclude the use of other techniques, such as HTTP extensions, to be able to retrieve and store metadata in a better way.

Using Content Negotiation has two benefits: it will work right away with all text-based browsers (Lynx, Emacs with Emacspeak, etc.), and the output can be rendered directly by selecting, e.g., the title or the description from the RDF. Also, an RDF crawler will be able to get all the descriptions of a collection of photos to create a knowledge database, just by asking for the right MIME type.

In Jigsaw [Jigsaw], a frame has been created to simulate two different resources under the same URI, that of the image itself. Each of the two resources has its own set of HTTP values, such as ETag, Content-Length and others, and the result is sent out using the classic Content Negotiation of HTTP.
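
To make the selection step concrete, here is a simplified sketch of how a server could choose between the two variants from the Accept header of a request. It is not the Jigsaw implementation. The function names are hypothetical, and the tie-break in favour of the image when both variants are equally acceptable is an assumption of this sketch.

```python
VARIANTS = ("image/jpeg", "text/rdf")   # the two resources behind one URI


def parse_accept(header):
    """Split an Accept header into (media-range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media = parts[0].lower()
        if not media:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media, q))
    return ranges


def quality(media_type, ranges):
    """Quality of a media type: the most specific matching range wins."""
    main_type = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == main_type + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_variant(accept):
    """Return the MIME type to serve, or None if neither is acceptable."""
    # A missing Accept header means that anything is acceptable.
    ranges = parse_accept(accept) if accept else [("*/*", 1.0)]
    scored = [(quality(variant, ranges), variant) for variant in VARIANTS]
    best = max(q for q, _ in scored)
    if best == 0:
        return None
    # Assumption: on a tie, the first variant (the image) is served.
    return next(variant for q, variant in scored if q == best)
```

A request with `Accept: text/*` thus gets the metadata, one with `Accept: image/*` gets the photo, and a client that accepts neither would receive an error response.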

The RDF can also be fetched directly, without Content Negotiation, by appending the wanted MIME type to the URI after a ';'. Example: foo.jpg;text/rdf

Note that it is also possible to modify the RDF description using the PUT method, provided the ETag of the description is in the HTTP header of the request.
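
From the client side, the three kinds of request can be sketched as follows. The requests are only constructed, not sent. The function names are hypothetical, and two details are assumptions of this sketch rather than statements about Jigsaw: that the ETag is passed in an `If-Match` header, and that the PUT is addressed to the ";text/rdf" form of the URI.

```python
from urllib.request import Request


def negotiated_request(photo_url):
    """GET request that asks for the RDF variant through content negotiation."""
    return Request(photo_url, headers={"Accept": "text/rdf"})


def direct_rdf_url(photo_url):
    """URI that names the RDF variant directly, without negotiation."""
    return photo_url + ";text/rdf"


def update_request(photo_url, rdf_xml, etag):
    """PUT request that replaces the RDF description.

    Assumption: the ETag of the current description travels in If-Match.
    """
    return Request(
        direct_rdf_url(photo_url),
        data=rdf_xml.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/rdf", "If-Match": etag},
    )
```

Passing one of these objects to `urllib.request.urlopen` would perform the actual exchange with a server.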

5. The RDF schemas

The metadata is separated into three different schemas:

  1. Dublin Core schema. The Dublin Core [DC] schema is a general schema for identifying original works, typically books and articles, but also films, paintings or photos. It contains such properties as creator, editor, title, date of publishing and publisher. It is being developed by the Dublin Core Metadata Initiative and the version we use is the RDF-format of version 1.1.
  2. Technical schema. This schema captures technical data about the photo and the camera, such as the type of camera, the type of film, the date the film was developed and the scanner and software used for digitizing.
  3. Content schema. This schema is used to categorize the subject of the photo by means of a controlled vocabulary. This schema allows photos to be retrieved based on such characteristics as portrait, group portrait, landscape, architecture, sport, animals, etc.

All the properties are optional. The more properties are given values, the better the photo will be described and the easier it will be to find it, but leaving properties undefined doesn't make the metadata invalid.

There are no dependencies between the properties: each property can be given a value independent of whether any other property has a value. The values are also independent, except for restrictions of common sense: a photo cannot have been taken after the date on which the film was developed...

5.1. The Dublin Core Schema

We don't use all properties defined by the Dublin Core (that is to say: the others can be added, but are ignored by our metadata editor). Here is an interpretation of the Dublin Core properties, applied to photo material. A machine-readable schema is included in appendix A. The label that is shown in the user interface of rdfpic is given in parentheses if it is different from the property name.

title
a short description of the photo. Example: Marian climbs on the "elephant"
subject
a set of keywords to describe the photo. See the content schema below for the list of keywords. Example: portrait, landscape
description
a longer description of the photo. Example: Marian attempts to climb on the granite rock that is nicknamed "the elephant"
creator ("author/creator")
the photographer, as a URL that can be further described with other schemas. Example: http://www.w3.org/People/Bos
publisher
the person or institution making the photo available, often the same as the creator. Example: http://www.w3.org/People/Bos
contributor
a person who contributed in some way, e.g., the person who digitized the photo; may be a URL or a name.
date
the date and time the photo was taken, conforming to ISO format [ISOdate]. The year is required, everything else can be omitted: yyyy[-mm[-dd[Thh:mm[:ss[.sTZD]]]]]. The default time zone is UTC. Example: 1999-10-01
type
always "image" (see the Dublin Core's List of Resource Types)
format
always "image/jpeg"
identifier ("number")
a number for the photo that is meaningful to the publisher. This is not the URL of the photo and it does not have to be globally unique. Example: 312
source
not used.
language
not used.
relation
identifies a series: the event or topic for a series of photographs. Can be a URL or a string. Example: Marian in Le Sidobre.
coverage ("location")
the location shown on the photo. (Note that we only use the "spatial coverage," not the "temporal coverage," since we assume that a photo is instantaneous and thus the date field is enough.) Example: Le Sidobre (Tarn)
rights
copyright statement, or the URL for one. Example: http://www.w3.org/People/Lafon/Copyright?1998
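
The date pattern yyyy[-mm[-dd[Thh:mm[:ss[.sTZD]]]]] can be checked mechanically. The sketch below is one possible reading of it, not code from rdfpic: it accepts an optional time zone designator after any time of day (slightly more permissive than the pattern as written) and fills in UTC when a time is given without one. The function names are hypothetical.

```python
import re
from datetime import datetime

# Each optional group nests inside the previous one, so a day cannot be
# given without a month, a time not without a day, and so on.
DATE_RE = re.compile(
    r"^(?P<year>\d{4})"
    r"(?:-(?P<month>\d{2})"
    r"(?:-(?P<day>\d{2})"
    r"(?:T(?P<hour>\d{2}):(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2})(?:\.(?P<fraction>\d+))?)?"
    r"(?P<tzd>Z|[+-]\d{2}:\d{2})?"
    r")?)?)?$"
)


def parse_photo_date(value):
    """Return the components present in a date value, as strings.

    Raises ValueError if the value does not follow the pattern or if a
    component is out of range (e.g. month 13).
    """
    match = DATE_RE.match(value)
    if match is None:
        raise ValueError("not a valid date value: %r" % value)
    parts = {k: v for k, v in match.groupdict().items() if v is not None}
    # Let datetime do the range checking; omitted parts get neutral values.
    datetime(
        int(parts["year"]),
        int(parts.get("month", 1)),
        int(parts.get("day", 1)),
        int(parts.get("hour", 0)),
        int(parts.get("minute", 0)),
        int(parts.get("second", 0)),
    )
    if "hour" in parts:
        parts.setdefault("tzd", "Z")   # the default time zone is UTC
    return parts


def is_valid_photo_date(value):
    """True if the value can be parsed by parse_photo_date."""
    try:
        parse_photo_date(value)
    except ValueError:
        return False
    return True
```

The same check applies to any other property that is required to use the form of the date property.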

5.2. The Technical Schema

The technical schema defines the following properties (for the formal definition, see appendix A):

the brand and type of the camera, or a URL for the camera. If the latter, the URL identifies one actual camera, not all cameras of that type. Example: http://www.w3.org/People/Lafon/FooCamera8000i
the brand and type of film. In contrast to the camera property, this is not an individual roll of film, but identifies all films of the same type. (We assume films of the same type are sufficiently similar; except for fabrication errors, they are interchangeable.) The value may be a string or a URL that is further described elsewhere. As a convention, digital cameras should be considered as "digital" film. Example: Ilfoo HP5
a description of the lens used: either a URI describing the lens, a URI pointing to the camera (for compact cameras), or just a plain-text description. Example: FooLens AF:70-210
date on which the film was developed. The date must be in the same form as the date property. Example: 1998-08-04
any specific information about the development, from the processing lab to the chemical product used.
the source of light used. It can include its temperature or the number of light sources used; usually it will be "daylight".

5.3. The content schema

The content schema contains the keywords we use in the "subject" property of the Dublin Core schema. That property should contain as many of the following keywords as are applicable. The keywords have the following meaning:

Portrait
The photo contains a portrait of one person.
Group-portrait
The photo contains a portrait of a group of people.
Landscape
The photo contains a landscape or skyline.
Baby
The photo contains a baby.
Architecture
The photo contains interesting buildings.
Wedding
The photo contains scenes from a wedding.
Macro
The photo contains an extreme close-up and would, when viewed under normal circumstances, be larger than life-size.
Graphic
The photo contains a pattern, texture or design that is interesting for its abstract, graphic quality.
Panorama
The photo contains a wide-angle view of a landscape or skyline.
Animal
The photo contains an animal.
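
Because the vocabulary is controlled, a "subject" value can be checked against it. The sketch below uses the keyword identifiers of the content schema in appendix A. It assumes that the value is a comma-separated list and that matching ignores case and treats spaces like hyphens; those conventions, and the function name, are assumptions of this sketch and not part of the schema.

```python
KEYWORDS = (
    "Portrait", "Group-portrait", "Landscape", "Baby", "Architecture",
    "Wedding", "Macro", "Graphic", "Panorama", "Animal",
)

# Lookup table from a normalized spelling to the identifier in the schema.
_CANONICAL = {keyword.lower(): keyword for keyword in KEYWORDS}


def check_subject(value):
    """Split a subject value into (known keywords, unknown words)."""
    known, unknown = [], []
    for raw in value.split(","):
        word = raw.strip()
        if not word:
            continue
        canonical = _CANONICAL.get(word.lower().replace(" ", "-"))
        if canonical is not None:
            known.append(canonical)
        else:
            unknown.append(word)
    return known, unknown
```

A data-entry program could use the second list to warn about keywords that a search on the controlled vocabulary would never find.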

6. Suggestions for extensions

Here are some ideas for extensions to the system that we are still studying. In no particular order:

7. The online demo

A sample server has been set up, and some pictures are available. A request for the text version of one of those pictures returns the RDF description of the picture: an HTTP request for MIME type image/jpeg or image/* returns the photo, while a request for text/rdf or text/* returns the metadata. You can also view the metadata by adding ";text/rdf" at the end of the picture's URI.

We plan to steadily increase the number of photos that are online.

8. Downloading the code

The Jigsaw extension and the JPEG-related classes are available in the Jigsaw 2.0.4 distribution; the metadata editor rdfpic will also be made available.

9. Related work

An apparently very similar system to ours was developed by Jane Hunter and Zhimin Zhan [HunterZhan], but for the PNG image format and with PNG's built-in keyword/value format rather than RDF to express the metadata, although they use RDF to specify the metadata schemas. We plan to compare their schemas with ours in more detail in a future update of this Note.

The IPTC has a list of keywords for describing photo-journalistic images. Adobe Photoshop supports a subset of them.

The proposed DIG2000 [DIG2000] file format for the (also proposed) JPEG2000 [JPEG2000] image compression algorithm contains an XML-based metadata block with entries for people, places, events, GPS location, camera type, etc. It allows extensions with additional entries. The draft of October 1998 doesn't use RDF.

10. Acknowledgments

The rdfpic metadata editor has been written by Thierry Kormann (of Bull, France). Colas Nahaboo (also of Bull) has given valuable advice.

Janne Saarela (of Pro-Solutions, Finland) has written the original RDF schema from which the current schemas descend and has helped with checking and reviewing the schemas. His program SiRPAC has been a great help in checking and visualizing the schemas as well as the actual metadata generated by the metadata editor.

11. References

[DC] Dublin Core Metadata Initiative. Dublin Core metadata element set, version 1.1. July 1999. Dublin Core recommendation. URL: http://purl.oclc.org/docs/core/documents/rec-dces-19990702.htm
[DIG2000] Digital Imaging Group. DIG2000 file format proposal. Oct 1998. Report (draft) ISO/IEC JTC1/SG29/WG1 N1017. URL: http://www.digitalimaging.org/pdf/wg1n1017.pdf
[HTTP] Fielding, Roy; et al. Hypertext Transfer Protocol - HTTP/1.1. June 1999. Internet RFC 2616. URL: ftp://ftp.isi.edu/in-notes/rfc2616.txt
[HunterZhan] Hunter, Jane; Zhan, Zhimin. "An Indexing and Querying System for Online Images Based on the PNG Format and Embedded Metadata" in: ARLIS/ANZ Conference. Sep 1999. Brisbane, Australia. URL: http://archive.dstc.edu.au/RDU/staff/jane-hunter/PNG/paper.html
[ISOdate] Wolf, Misha; Wicksteed, Charles. Date and time formats. Sep 1997. Submission to W3C. URL: http://www.w3.org/TR/1998/NOTE-datetime-19980827
[JPEG] Hamilton, Eric. JPEG File Interchange Format. C-Cube Microsystems. Sep 1992. Milpitas, CA, USA. URL: http://www.w3.org/Graphics/JPEG/jfif3.pdf
[JPEG2000] Joint Photographic Experts Group (JPEG). JPEG 2000 image coding system. 9 Dec 1999. Report (draft) ISO/IEC CD15444-1:1999. URL: http://www.jpeg.org/cd15444-1.pdf
[Java] Gosling, James; Joy, Bill; Steele, Guy. The Java language specification. Addison-Wesley. 1998. URL: http://java.sun.com/docs/books/jls/index.html
[Jigsaw] Jigsaw Team (Yves Lafon & Benoit Mahe). Jigsaw 2.0 internal design. July 1999. URL: http://www.w3.org/Jigsaw/Doc/Programmer/design.html
[RDF] Lassila, Ora; Swick, Ralph R. (eds). Resource Description Framework (RDF) model and syntax specification. Feb 1999. W3C Recommendation. URL: http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
[Schema] Brickley, Dan; Guha, R. V. Resource Description Framework (RDF) Schema Specification. 1999. W3C working draft. URL: http://www.w3.org/TR/1999/PR-rdf-schema-19990303/

Appendix A: The RDF schemas

The three schemas below (Dublin Core, technical and content) are machine-readable schemas in the syntax proposed by the RDF schemas draft [Schema].

The (modified) Dublin Core schema

The Dublin Core schema's official home is http://purl.org/DC/documents/rec-dces-19990702.htm (look at the source of that page), but the schema shown there is incorrect (as of December 1999). The schema below is a shortened version (the comments have been left out) and is hopefully error-free. We changed the human-readable labels to those we use in the metadata editor. The French translations of the labels are based on those by Anne-Marie Vercoustre.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" >

  <rdf:Property rdf:ID="title">
    <label xml:lang="en">Title</label>
    <label xml:lang="fr">Titre</label>
    <label xml:lang="nl">Titel</label>
  </rdf:Property>

  <rdf:Property rdf:ID="creator">
    <label xml:lang="en">Author/creator</label>
    <label xml:lang="fr">Auteur/créateur</label>
    <label xml:lang="nl">Auteur/maker</label>
  </rdf:Property>

  <rdf:Property rdf:ID="subject">
    <label xml:lang="en">Subject</label>
    <label xml:lang="fr">Sujet</label>
    <label xml:lang="nl">Onderwerp</label>
    <range rdf:resource="http://www.w3.org/PhotoRDF/content-1.0#Keywords"/>
  </rdf:Property>

  <rdf:Property rdf:ID="description">
    <label xml:lang="en">Description</label>
    <label xml:lang="fr">Description</label>
    <label xml:lang="nl">Beschrijving</label>
  </rdf:Property>

  <rdf:Property rdf:ID="publisher">
    <label xml:lang="en">Publisher</label>
    <label xml:lang="fr">Éditeur</label>
    <label xml:lang="nl">Uitgever</label>
  </rdf:Property>

  <rdf:Property rdf:ID="contributor">
    <label xml:lang="en">Contributor</label>
    <label xml:lang="fr">Contributeur</label>
    <label xml:lang="nl">Medewerker</label>
  </rdf:Property>

  <rdf:Property rdf:ID="date">
    <label xml:lang="en">Date</label>
    <label xml:lang="fr">Date</label>
    <label xml:lang="nl">Datum</label>
    <!-- use http://www.w3.org/TR/NOTE-datetime
      format: YYYY[-MM[-DD[Thh:mm[:ss[.sTZD]]]]]
      example: 1999-10-01T17:53
      if TZD is omitted the timezone is UTC -->
  </rdf:Property>

  <rdf:Property rdf:ID="type">
    <label xml:lang="en">Resource type</label>
    <label xml:lang="fr">Type de ressource</label>
    <label xml:lang="nl">Categorie</label>
    <!-- always "image" in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="format">
    <label xml:lang="en">Format</label>
    <label xml:lang="fr">Format</label>
    <label xml:lang="nl">Formaat</label>
    <!-- always "image/jpeg" in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="identifier">
    <label xml:lang="en">Number</label>
    <label xml:lang="fr">Numéro</label>
    <label xml:lang="nl">Nummer</label>
  </rdf:Property>

  <rdf:Property rdf:ID="source">
    <!-- not used in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="language">
    <!-- not used in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="relation">
    <!-- not used in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="coverage">
    <label xml:lang="en">Location</label>
    <label xml:lang="fr">Endroit</label>
    <label xml:lang="nl">Plaats</label>
    <!-- restricted to spatial coverage in PhotoRDF -->
  </rdf:Property>

  <rdf:Property rdf:ID="rights">
    <label xml:lang="en">Rights</label>
    <label xml:lang="fr">Droits</label>
    <label xml:lang="nl">Rechten</label>
  </rdf:Property>

</rdf:RDF>

The technical schema

See the description above for detailed explanations of each of the properties. The name of this schema is http://www.w3.org/2000/PhotoRDF/technical-1.0#


  <Class rdf:ID="Technical-data">
    <comment xml:lang="en">A class that represents technical
      data about a photo</comment>
    <comment xml:lang="fr">Une classe qui représente
      les données techniques d'une photo</comment>
    <comment xml:lang="nl">Een class die de technische
      gegevens van een foto representeert.</comment>
  </Class>

  <rdf:Property rdf:ID="camera">
    <label xml:lang="en">Camera</label>
    <label xml:lang="fr">Camera</label>
    <label xml:lang="nl">Camera</label>
    <comment xml:lang="en">Brand and type of camera</comment>
    <comment xml:lang="fr">Marque et type de camera</comment>
    <comment xml:lang="nl">Cameramerk en -type</comment>
    <domain rdf:resource="#Technical-data"/>
  </rdf:Property>

  <rdf:Property rdf:ID="film">
    <label xml:lang="en">Film</label>
    <label xml:lang="fr">Film</label>
    <label xml:lang="nl">Film</label>
    <comment xml:lang="en">Brand and type of film</comment>
    <comment xml:lang="fr">Marque et type de film</comment>
    <comment xml:lang="nl">Filmmerk en -type</comment>
    <domain rdf:resource="#Technical-data"/>
  </rdf:Property>

  <rdf:Property rdf:ID="lens">
    <label xml:lang="en">Lens</label>
    <label xml:lang="fr">Objectif</label>
    <label xml:lang="nl">Lens</label>
    <comment xml:lang="en">Brand and type of lens.</comment>
    <comment xml:lang="fr">Marque et type d'objectif.</comment>
    <comment xml:lang="nl">Lensmerk en -type.</comment>
    <domain rdf:resource="#Technical-data"/>
  </rdf:Property>

  <rdf:Property rdf:ID="devel-date">
    <label xml:lang="en">Development date</label>
    <label xml:lang="fr">Date de développement</label>
    <label xml:lang="nl">Ontwikkeldatum</label>
    <comment xml:lang="en">Date on which the film was developed.</comment>
    <comment xml:lang="fr">Date à laquelle le film a été développé.</comment>
    <comment xml:lang="nl">Datum waarop de film is ontwikkeld.</comment>
    <domain rdf:resource="#Technical-data"/>
    <!-- use http://www.w3.org/TR/NOTE-datetime
      format: YYYY[-MM[-DD[Thh:mm[:ss[.sTZD]]]]]
      example: 1999-10-01T17:53
      if TZD is omitted the timezone is UTC -->
  </rdf:Property>

  <!-- [more?] -->


The content schema

We left out the human-readable comments; see the descriptions of the keywords above. The name of this schema is: http://www.w3.org/2000/PhotoRDF/content-1.0#

  <!-- "" is the same as "http://www.w3.org/2000/PhotoRDF/content-1.0#" -->

  <Class rdf:ID="Keywords">
    <comment xml:lang="en">An enumeration of keywords to
      describe the subject of photos.</comment>
    <comment xml:lang="fr">Une énumération de mots-clés
      pour décrire le sujet d'une photo.</comment>
    <comment xml:lang="nl">Een opsomming van sleutelwoorden
      om het onderwerp van foto's te beschrijven.</comment>
  </Class>

  <content:Keywords rdf:ID="Portrait">
    <label xml:lang="en">Portrait</label>
    <label xml:lang="fr">Portrait</label>
    <label xml:lang="nl">Portret</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Group-portrait">
    <label xml:lang="en">Group portrait</label>
    <label xml:lang="fr">Portrait de groupe</label>
    <label xml:lang="nl">Groepsportret</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Landscape">
    <label xml:lang="en">Landscape</label>
    <label xml:lang="fr">Paysage</label>
    <label xml:lang="nl">Landschap</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Baby">
    <label xml:lang="en">Baby</label>
    <label xml:lang="fr">Bébé</label>
    <label xml:lang="nl">Baby</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Architecture">
    <label xml:lang="en">Architecture</label>
    <label xml:lang="fr">Architecture</label>
    <label xml:lang="nl">Architectuur</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Wedding">
    <label xml:lang="en">Wedding</label>
    <label xml:lang="fr">Mariage</label>
    <label xml:lang="nl">Trouwerij</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Macro">
    <label xml:lang="en">Macro</label>
    <label xml:lang="fr">Macro</label>
    <label xml:lang="nl">Macro</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Graphic">
    <label xml:lang="en">Graphic</label>
    <label xml:lang="fr">Graphique[?]</label>
    <label xml:lang="nl">Grafisch</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Panorama">
    <label xml:lang="en">Panorama</label>
    <label xml:lang="fr">Panorama</label>
    <label xml:lang="nl">Panorama</label>
  </content:Keywords>

  <content:Keywords rdf:ID="Animal">
    <label xml:lang="en">Animal</label>
    <label xml:lang="fr">Animal</label>
    <label xml:lang="nl">Dier</label>
  </content:Keywords>


Appendix B: example of metadata

This is an example of the metadata in RDF format that is generated by rdfpic, and subsequently served by Jigsaw.

<?xml version="1.0"?>
  <rdf:Description about="">
    <DC:Coverage>Montredon-Labessoni&#233; (Tarn)</DC:Coverage>
    <DC:Relation>Marian in the Tarn</DC:Relation>
    <DC:Description>Marian brings the sheep to the field in the morning. The lamb she carries had been born that night.</DC:Description>
    <DC:Publisher rdf:resource="http://www.w3.org/People/Bos/"/>
    <DC:Title>Marian with sheep</DC:Title>
    <DC:Creator rdf:resource="http://www.w3.org/People/Bos/"/>
    <Technical:camera rdf:resource="http://www.w3.org/People/Bos/CanonEos"/>