Metadata
Activity Statement

Work on Metadata is part of W3C's Technology and Society Domain.

Note: The W3C Metadata Activity was replaced with the W3C Semantic Web Activity when the latter was chartered in February, 2001. The "future" work described herein (last updated in November 2000) is obsoleted by the Semantic Web Activity Statement.

  1. Introduction
  2. Role of W3C
  3. Current Situation
  4. What the future holds

Introduction

There is now a wealth of information on every subject available on the Net. For many, however, the true excitement of the Web is in the services that can be accessed from home or office. Today's Web gives people access to news, to the weather and to financial services. Via the Web, users can purchase books, computers, clothes, and any number of other items; they can book seats on planes and rooms in hotels. The possible uses of the Web seem endless, but the technology is missing a crucial piece: a part of the Web that contains information about information. This means labeling, cataloging and descriptive information structured so that Web pages can be properly searched and processed, in particular by computer. In other words, what is now very much needed on the Web is metadata.

W3C's Metadata Activity is concerned with ways to model and encode metadata. A particular priority of W3C is to use the Web to document the meaning of the metadata. Our strong interest in metadata has prompted development of the Resource Description Framework (RDF) and its relative PICS (Platform for Internet Content Selection). PICS is now complete; work on RDF continues.

PICS: Platform for Internet Content Selection

PICS consists of a suite of specifications which enable people to distribute metadata about the content of digital material in the form of "labels". These contain information about the content in simple, computer-readable form. Once information has been given a label, computers can process that label in the background according to settings previously specified by the user, filtering out undesirable material or directing users to sites that may be of special interest to them. While PICS has general applicability to labeling pages for a variety of metadata purposes, the PICS specification was originally designed to allow parents and teachers to screen out materials unsuitable for children using the Internet. Rather than simply censoring the information itself, as various legislative bodies have suggested, PICS gives users the responsibility to control personally, or to delegate control of, what they receive on their browsers.
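The filtering idea can be sketched in a few lines of Python. This is only an illustration of a label being checked against user settings; it does not use the real PICS label syntax, and the URLs, category names, rating scales and the `allowed` function are all invented for the example.

```python
# Hypothetical labels: ratings that some labeling service assigned to pages.
# Category names and numeric scales are invented for illustration.
LABELS = {
    "http://example.org/news": {"violence": 1, "language": 0},
    "http://example.org/game": {"violence": 3, "language": 2},
}

# Hypothetical user settings: highest acceptable value per category.
LIMITS = {"violence": 2, "language": 2}


def allowed(url, labels=LABELS, limits=LIMITS, block_unlabeled=True):
    """Return True if the page's label satisfies every user limit."""
    label = labels.get(url)
    if label is None:
        # No label is available; the user's settings decide what happens.
        return not block_unlabeled
    # Assumption: a category missing from the label is treated as 0.
    return all(label.get(cat, 0) <= limit for cat, limit in limits.items())


for page in ("http://example.org/news", "http://example.org/game"):
    print(page, "shown" if allowed(page) else "filtered")
```

The decision is made entirely from the label and the user's own settings, which is the point of the approach: the content itself is never altered.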

The Resource Description Framework - RDF

PICS work led to the development of the Resource Description Framework (RDF), which provides a more general treatment of metadata. RDF is a declarative language and provides a standard way of using XML to represent metadata in the form of statements about properties and relationships of items on the Web. Such items are known as resources, and a resource can be almost anything, provided it has a Web address. This means that you can associate metadata with a Web page, a graphic, an audio file, a movie clip, and so on.

Simple explanation of concepts

A simple example of RDF

This is a very simple example of RDF to give a feel of the way it works, and to demonstrate in a basic way the related concepts of RDF schemas and XML namespaces.

<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
<Description about="http://www.w3.org/Press/99Folio.pdf">
<dc:title>The W3C Folio 1999</dc:title>
<dc:creator>W3C Communications Team</dc:creator>
<dc:date>1999-03-10</dc:date>
<dc:subject>Web development, World Wide Web
Consortium, Interoperability of the Web</dc:subject>
</Description>
</RDF> 

In this example, RDF has been used to express data about the W3C Folio, the Consortium's Prospectus available on-line on the W3C site. The basic concept is that metadata about this item on the Web is described through a collection of properties called an RDF Description. Notice that RDF uses the familiar arrangements of angle brackets, slashes, tag names, attributes, and other elements of syntax which are part and parcel of XML. RDF is, indeed, written in XML.
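Because the description is ordinary XML, a generic XML parser is enough to read the properties back out. The following Python sketch uses only the standard library's `xml.etree.ElementTree`; the `statements` helper is a hypothetical name, the sample is shortened to three properties, and the code handles only the simple element form used in this example, not the full RDF syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A shortened copy of the Folio description.
RDF_XML = """\
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
<Description about="http://www.w3.org/Press/99Folio.pdf">
<dc:title>The W3C Folio 1999</dc:title>
<dc:creator>W3C Communications Team</dc:creator>
<dc:date>1999-03-10</dc:date>
</Description>
</RDF>
"""


def statements(rdf_xml):
    """Return (resource, property, value) tuples, one per property element."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get("about")
        for prop in description:
            # ElementTree expands a prefixed tag to "{namespace}local";
            # drop the leading brace and join the two parts into one URI.
            namespace, local = prop.tag[1:].split("}")
            value = (prop.text or "").strip()
            result.append((resource, namespace + local, value))
    return result


for statement in statements(RDF_XML):
    print(statement)
```

Each tuple pairs the described resource with a fully qualified property name and its value, which is the sense in which a Description is a collection of properties.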

RDF Schemas

RDF provides a framework in which independent communities can develop vocabularies that suit their specific needs and share vocabularies with other communities. In order to share vocabularies, the meaning of the terms must be spelled out in detail. The descriptions of these vocabulary sets are called RDF Schemas. A schema defines the meaning, characteristics, and relationships of a set of properties, and this may include constraints on potential values and the inheritance of properties from other schemas. The RDF language allows each document containing metadata to clarify which vocabulary is being used by assigning each vocabulary a Web address. The schema specification language is a declarative representation language influenced by ideas from knowledge representation (e.g. semantic nets, frames, predicate logic) as well as database schema specification languages and graph data models.
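To make the ideas of constraints and inheritance concrete, here is a minimal Python sketch. It models a schema as a plain dictionary rather than in the RDF Schema language itself, and the vocabulary, property names and helper functions are all made up for illustration.

```python
# Hypothetical vocabulary. "sub_property_of" names the more general property
# that a property specializes; "allowed" lists permitted values (None = any).
SCHEMA = {
    "rating": {"sub_property_of": None, "allowed": None},
    "starRating": {
        "sub_property_of": "rating",
        "allowed": {"1", "2", "3", "4", "5"},
    },
    "location": {"sub_property_of": None, "allowed": None},
}


def ancestors(prop):
    """Return the properties that `prop` specializes, nearest first."""
    chain = []
    parent = SCHEMA[prop]["sub_property_of"]
    while parent is not None:
        chain.append(parent)
        parent = SCHEMA[parent]["sub_property_of"]
    return chain


def is_valid(prop, value):
    """Check that the property is defined and its value is permitted."""
    if prop not in SCHEMA:
        return False
    allowed = SCHEMA[prop]["allowed"]
    return allowed is None or value in allowed


print(ancestors("starRating"))
print(is_valid("starRating", "4"), is_valid("starRating", "11"))
```

An application that only understands the general `rating` property could still make use of a `starRating` statement, because the schema records that one specializes the other.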

One of the best-known schemas is the Dublin Core, invented by the library community (their first meeting was in Dublin, Ohio, USA). Other schemas might deal with quite different domains. For an application on "English Pubs", properties might relate to "location", "food quality", "star rating", "nearby attractions" and so on. But what happens if two applications use the same tag names? This is where XML namespaces become important.

XML namespaces in the context of RDF

RDF uses the idea of the XML namespace to allow RDF statements to reference a particular RDF vocabulary or "schema". Bear in mind that two applications might adopt the same headings and categories when it comes to organizing material. Perhaps the property address is used to mean a company location in one application, and a company's Web address in another. Potential conflicts are resolved because a tag for a property name can carry a short prefix, and each prefix is bound to a Web address that identifies the vocabulary to which the tag "belongs". The "Namespaces in XML" specification describes this mechanism in detail and is useful not only in the context of RDF but also for many other XML applications.
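The address conflict can be illustrated with a short Python sketch using the standard library's `xml.etree.ElementTree`. Both namespace addresses, the `record` element and the `find_text` helper are hypothetical; only the prefix mechanism itself comes from the "Namespaces in XML" specification.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a property named "address".
DOC = """\
<record xmlns:loc="http://example.org/location#"
        xmlns:web="http://example.org/website#">
  <loc:address>12 High Street, Oxford</loc:address>
  <web:address>http://www.example.org/</web:address>
</record>
"""

root = ET.fromstring(DOC)

# The parser replaces each prefix with the address it is bound to, so the
# two tags are distinct even though their local names are identical.
tags = [child.tag for child in root]
print(tags)


def find_text(element, namespace, local_name):
    """Return the text of the child with this qualified name, or None."""
    found = element.find(f"{{{namespace}}}{local_name}")
    return None if found is None else found.text


print(find_text(root, "http://example.org/website#", "address"))
```

The prefixes `loc` and `web` are arbitrary; what distinguishes the two properties is the address each prefix stands for.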

Practical uses of RDF

There are many practical uses of RDF. Here is a sampling of what is likely to be in the pipeline.

The Role of W3C

W3C has created the RDF specifications as a framework for application-specific vocabularies. We have links with the Dublin Core Workshop series. The Dublin Core is an attempt to define bibliographic categories for Web pages. W3C expects to coordinate work on specific vocabularies only when they cross several application domains.

Working groups

The RDF Model and Syntax Working Group defined the RDF data model and selected the RDF/XML syntax. The RDF Schema Working Group developed a vocabulary to specify the sets of vocabularies specific to each application. The RDF Model and Syntax Working Group has completed its deliverable and has adjourned. The RDF Schema Working Group has adjourned pending implementation feedback on its Candidate Recommendation for the RDF Schema description language. An RDF Interest Group has been established to provide a forum for developers who are using the RDF specifications and to provide a locus for establishing critical mass for further work in the metadata area. The Metadata Coordination Group is the forum in which dependencies on and from other activities such as XML and P3P are managed.

Current Situation

Resource Description Framework (RDF)

The Resource Description Framework Model and Syntax Specification became a W3C Recommendation on 22 February 1999. The Resource Description Framework (RDF) Schema Specification 1.0 was released as a W3C Candidate Recommendation on 27 March 2000.

Member review of the RDF Schema Proposed Recommendation, while supporting the release of RDF Schema as a Recommendation, brought some feedback that a closer connection with, and possibly a merger into, the XML Schema work should be considered. A technical meeting was held in August, 1999 to consider this question. The results of that meeting were published as a W3C Note in October, 1999.

Platform for Internet Content Selection (PICS)

The PICS technical specifications remain stable W3C Recommendations. These specifications include the PICS Label Distribution Label Syntax and Communication Protocols Recommendation, the PICS Rating Services and Rating Systems Recommendation, the PICSRules Recommendation for writing filtering profiles, and the PICS Signed Labels (DSig) Recommendation.

What the Future Holds

The RDF Interest Group is the primary forum in which W3C Members and the public will discuss specific ideas for future work and gauge when there is sufficient interest to make a formal proposal for a new Working Group. Possible work items include: a rule language for augmenting client-side scripting facilities that acts on RDF metadata, a search (resource discovery) protocol based on RDF, and a metadata query protocol based on RDF. Specific application metadata vocabularies are generally not within the scope of a W3C activity; however, W3C is open to work proposals for applications such as Privacy and Mobile Access which may be characterized as necessary for the infrastructure of the Web.

Contacts

The contact for the Metadata Activity is Ralph Swick.



Last modified $Date: 2002/08/23 16:56:51 $

Copyright © 1998-2000 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.