W3C Signed RDF Workshop, A Position Paper.

Martin Lee, AND Data Ltd.

AND Data Ltd, Kings Mead House, Oxpens Road, Oxford OX1 1RX, UK.
tel. +44 1865 200 800
email: m.lee@andtech.co.uk.


Perspective.

AND Data's mission is to provide a language and knowledge framework that will improve communication between people by providing standard data within computer systems on a global level.

To this end AND is building a database of the English language with each sense of each word mapped to a classification scheme. By classifying words in this manner we can remove ambiguity from word meanings, which facilitates comprehension of documents and aids both machine processing and human understanding. In addition AND Data is compiling classified sets of proper nouns of people, places, organizations, and products for specific market segments and business applications. Using this data we can determine possible classifications for a document and identify the people, products and other entities mentioned in it. We can express this meta-information explicitly using RDF.

Why Signed RDF?

Communication is based on common understanding. For two people to communicate effectively they must share a common vocabulary, share understandings and share trust (or at least be aware of each other's trustworthiness) in what the other is saying.

The advent of electronic publishing, and especially of the World Wide Web, has introduced new facets to communication. Anonymous communication is facilitated, removing traditional social cues such as facial expression or tone of voice. Barriers to publication have been lowered: manuscripts no longer have to pass through an editor, a printing press and a distribution network before being read. Anyone can publish anything on the web without correction or review.

RDF goes some way towards expressing what a document is about, but on its own cannot express any level of trust in the document. Is the reported author really the author of the document? Who generated the metadata referring to the document? Does the metadata refer to the document as it now stands, or has the document been modified?

Digital signature specifications, such as DSig, allow assertions about a document to be signed. Hence a keyholder who signs an RDF document can be associated with the assertions that the RDF makes about a document. The element of trust, or at least the ability to judge a level of trust, has been introduced: someone can be identified as being responsible for the assertions made.

Issues.

Keeping Flexibility.

One of RDF's great strengths is its flexibility. Metadata can be expressed in multiple metadata vocabularies, and the use of unique identifiers allows different vocabularies to be mixed within RDF to enrich the meta-information. Not only can further information be added to the document, allowing the inclusion of information that cannot be expressed by any single metadata vocabulary, but the use of multiple vocabularies also maximises the chances of a vocabulary being recognised and supported by an application. One of the tenets of RDF was to allow different communities to use different vocabularies tailored to their own needs. The inclusion of metadata expressed in many vocabularies allows a document to be shared between communities who may normally use only one vocabulary.
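As an illustration of mixing vocabularies, the following minimal sketch holds metadata about one document as (subject, property, value) triples and shows how an application that recognises only one vocabulary can still use part of the metadata. The Dublin Core namespace is the published one; the AND classification namespace, the document URI and the helper name `understood_by` are hypothetical and are used here only for illustration.

```python
# Namespaces identify which vocabulary each property belongs to.
DC = "http://purl.org/dc/elements/1.1/"           # Dublin Core element set
AND = "http://example.org/and-classification#"    # hypothetical vocabulary

DOC = "http://example.org/report.html"            # hypothetical document URI

# Metadata about a single document, drawn from both vocabularies.
triples = [
    (DOC, DC + "creator", "A. N. Author"),
    (DOC, DC + "title", "An example report"),
    (DOC, AND + "topic", "metadata"),
]


def understood_by(triples, supported_namespaces):
    """Return the triples whose property belongs to a supported vocabulary."""
    return [
        t for t in triples
        if any(t[1].startswith(ns) for ns in supported_namespaces)
    ]


# An application that knows only Dublin Core still recovers two statements.
dc_only = understood_by(triples, [DC])
# An application that knows both vocabularies recovers all three.
both = understood_by(triples, [DC, AND])
```

Because each property carries its own namespace, adding the third statement costs the Dublin Core application nothing; it simply ignores what it does not recognise.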

Signed RDF should not restrict this flexibility.

Multiple levels of trust.

Put simply, 'who said what'.

One method for keeping the flexibility of RDF would be to wrap up the whole of the RDF written by one keyholder about a document and sign it. In some circumstances an assertion by the author of a document about its content may provide a sufficient level of trust. This is very similar to unsigned RDF: an author writes a page, generates the metadata themselves and publishes the document. In digitally signing the RDF, the author is stating that they are responsible for the metadata content. The major difference from unsigned RDF is that the author (or keyholder) can be unambiguously identified by their digital signature.

In many cases additional levels of trust may be required. A second keyholder may review a document and generate their own metadata relating to the document, quite separate from the author's. It may be sufficient for a completely separate RDF text to be written, signed by the second keyholder and associated with the author's signed metadata and the document.

However, a second keyholder who has reviewed the metadata may be willing (or able) to sign only part of the author's metadata, e.g. signing that the claimed author is genuinely who they claim to be. In that case it would be useful for the signed RDF to consist of a single block of:

Keyholder A makes assertions 1-10 about the document.
Keyholder B agrees with assertions 1 and 7 and signs these in addition to Keyholder A.

Instead of multiple blocks of:

Keyholder A makes assertions 1-10 about a document.
Keyholder B makes assertions 11-12 about a document.
Where keyholder B's assertions coincide with assertions 1 and 7 of keyholder A, but this is not explicitly marked.
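The single-block arrangement can be sketched as a data structure in which each assertion carries its own set of signatures, so that 'who said what' is explicit. This is only a sketch under stated assumptions: an HMAC with a per-keyholder secret stands in for a real public-key signature so that the example runs on the Python standard library alone, and the keys, assertion texts and function names are all hypothetical. It does not represent DSig or any proposed signed RDF format.

```python
import hashlib
import hmac

# Assumption: HMAC secrets stand in for keyholders' private signing keys.
KEYS = {"A": b"secret-of-A", "B": b"secret-of-B"}  # hypothetical keys


def sign(keyholder, assertion):
    """Produce a stand-in signature over one assertion."""
    key = KEYS[keyholder]
    return hmac.new(key, assertion.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(keyholder, assertion, signature):
    """Check a stand-in signature over one assertion."""
    return hmac.compare_digest(sign(keyholder, assertion), signature)


# Assertions 1-10 about the document (placeholder text).
assertions = {n: f"assertion {n} about the document" for n in range(1, 11)}

# Keyholder A makes and signs all ten assertions.
block = {
    n: {"text": text, "signatures": {"A": sign("A", text)}}
    for n, text in assertions.items()
}

# Keyholder B agrees with assertions 1 and 7 only, and co-signs those.
for n in (1, 7):
    block[n]["signatures"]["B"] = sign("B", block[n]["text"])


def signers(block, n):
    """List the keyholders whose signature on assertion n verifies."""
    entry = block[n]
    return sorted(
        k for k, sig in entry["signatures"].items()
        if verify(k, entry["text"], sig)
    )
```

Here `signers(block, 1)` reports both keyholders while `signers(block, 2)` reports only Keyholder A, which is the distinction the multiple-block arrangement leaves implicit.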

Dynamic Documents and Assuring the Right Metadata is With the Right Document.

The signed RDF must be assured of relating to the document in the state it was in when it was signed. Otherwise the signed assertions do not promote trust, since they become 'the keyholder believes the document was like this when the assertions were signed, but the document may have changed completely since then.' Mechanisms such as the DSig common manifest format allow a document and related RDF to be attached together, and allow the signing of RDF relating to dynamic documents.
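One common way to tie metadata to a particular state of a document is to record a cryptographic digest of the document alongside the metadata and to sign the two together. The sketch below shows only the digest check. It is not the DSig manifest format; the field names, the function names and the choice of SHA-256 are assumptions made for illustration, and in practice the manifest itself would also be signed.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a hex digest identifying one exact state of the document."""
    return hashlib.sha256(data).hexdigest()


def make_manifest(document: bytes, rdf: str) -> dict:
    """Bind the RDF to the document as it is at signing time."""
    return {"document_digest": digest(document), "rdf": rdf}


def still_applies(manifest: dict, document: bytes) -> bool:
    """True only if the document is unchanged since the manifest was made."""
    return manifest["document_digest"] == digest(document)


# Hypothetical document and metadata.
original = b"<html>Version 1 of the page</html>"
manifest = make_manifest(original, "creator = A. N. Author")

unchanged_ok = still_applies(manifest, original)
modified_ok = still_applies(manifest, original + b" edited")
```

With this check a reader can tell that the assertions were made about an earlier version, rather than trusting metadata that silently no longer matches the document.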

However, it should be remembered that the document may be static and the RDF dynamic, for example a list of people who have read and approved a document, expressed in RDF and updated each time the document is accessed. Signed RDF should be able to cope with this.

Compatibility.

While the needs of signed metadata are unique, whatever system is adopted for signed RDF should be broadly compatible with, and use similar techniques to, other W3C projects such as signed XML, signed PICS labels and DSig.

Expectations.

From this workshop I would like to gather opinions on what place RDF and signed RDF are going to have in the real world, and to detail both the issues that signed RDF must address and the constraints on the format. I would like the result of this workshop to be an outline of a signed RDF working draft.