RDFwithContexts

From RDF Working Group Wiki

RDFC: RDF with Contexts

This is a proposal to extend RDF with contexts, giving users the ability to create, link and use contexts to modify the intended meaning of RDF graphs. A context is intended to represent a social agreement about the intended meaning of a given vocabulary of IRIs. This proposal grew from an attempt to reconcile the 2004 RDF semantics with Antoine Zimmermann's proposed semantics for SPARQL datasets, so it can be considered an alternative to that proposal.

Background

The 2004 RDF specs are based upon a vision in which IRIs (there called URI references) are globally unique and do not change their meaning with time, so that every token of a given IRI has exactly the same meaning. (From RDF Semantics 2004, section 1.2: "This document does not take any position on the way that URI references may be composed from other expressions, e.g. from relative URIs or QNames; the semantics simply assumes that such lexical issues have been resolved in some way that is globally coherent, so that a single URI reference can be taken to have the same meaning wherever it occurs. Similarly, the semantics has no special provision for tracking temporal changes. It assumes, implicitly, that URI references have the same meaning whenever they occur.") Call this the global hypothesis.

In marked contrast, many current RDF users and developers treat RDF content and the IRIs it contains as belonging to an "island" of accepted meanings, where the meaning of an IRI might change between different "islands" (a term introduced by Andy Seaborne). In this spirit, Antoine Zimmermann's proposed semantics for RDF datasets treats each named graph in a dataset as isolated in meaning from all the others, so that the graph name defines a local context in which a given IRI might mean something different from the same IRI in a different named graph; and this situation is seen as being globally satisfiable. Call this the local hypothesis.

Both views have their adherents and there are good arguments for both positions, but they seem incompatible. One, rather weak, resolution is to say that the global hypothesis applies to RDF graphs and the local to SPARQL datasets, but this fails to account for the apparent clash of intuitions and provides no account of the relationship between datasets and graphs. Moreover, if the arguments for the local hypothesis are sound, then they apply to RDF in the large.

Two current ideas

1. Working within the global hypothesis, Sandro has suggested a way to distinguish graph names from mere graph labels in a SPARQL dataset, using the class rdf:Graph. This allows IRIs to be used as graph labels without necessarily denoting the graph itself. We will use this below.

2. Capturing an extreme version of the localist hypothesis, Antoine has suggested a model theory for SPARQL datasets which extends the RDF semantics with a mapping con from the vocabulary to RDF interpretations. We will adapt this idea, slightly modified and generalized, to give a semantics for a more general notion of RDF contexts.

Overview of proposal

We extend RDF with the idea of 'web contexts'. A context represents a social agreement about the intended meaning of some vocabulary of IRIs, called the reserved vocabulary of the context. Each context has an associated vocabulary of IRIs and a semantic constraint on this vocabulary, which formally is a set of RDF interpretations of the reserved vocabulary.

RDF graphs are always asserted in a context. To assert a graph in a context is to make a public declaration that one is conforming to the semantic constraints on the reserved vocabulary.

Contexts are related by inheritance, which behaves very similarly to OWL imports. One context A inherits another B when A's reserved vocabulary includes that of B, and A imposes at least the same semantic conditions on B's vocabulary as B does. (This can be stated more mathematically as a condition on interpretations, but that belongs in an appendix.) Multiple inheritance is allowed. There is a 'top' context called the RDF context which has the RDF namespace as its reserved vocabulary and imposes the semantic constraints defined by the 2004 specs. All contexts inherit the RDF context. Similarly there are the RDFS, OWL-DL/RDF, OWL-Full/RDF, etc. contexts. Users can also define new contexts related to these.
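The inheritance condition above can be sketched over finite sets of interpretations. This is only one possible reading, not the definition deferred to the appendix: the Context class, the restrict and inherits helpers, and the ex: IRIs are all hypothetical, and a context is assumed to list its permitted interpretations explicitly (so a context with no stated constraints cannot be represented here).

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    vocabulary: frozenset                        # reserved IRIs (assumed finite)
    allowed: list = field(default_factory=list)  # permitted interpretations, each a dict IRI -> denotation

def restrict(interpretation, vocabulary):
    """Keep only the part of an interpretation that covers `vocabulary`."""
    return {iri: value for iri, value in interpretation.items() if iri in vocabulary}

def inherits(a, b):
    """One reading of 'a inherits b' over finite sets of interpretations."""
    # a's reserved vocabulary must include that of b.
    if not b.vocabulary <= a.vocabulary:
        return False
    # Everything a permits must, on b's vocabulary, also be permitted by b.
    permitted_by_b = [restrict(j, b.vocabulary) for j in b.allowed]
    return all(restrict(i, b.vocabulary) in permitted_by_b for i in a.allowed)

# Hypothetical contexts: `metric` narrows what `units` allows for ex:unit.
units = Context(frozenset({"ex:unit"}),
                [{"ex:unit": "metre"}, {"ex:unit": "foot"}])
metric = Context(frozenset({"ex:unit", "ex:length"}),
                 [{"ex:unit": "metre", "ex:length": "length in metres"}])
```

Under these assumptions inherits(metric, units) holds, while inherits(units, metric) fails because the vocabulary of units does not include ex:length.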


Syntax

To assert a graph in a context C, just include a triple

<> rdf:inherits C

in the graph (compare owl:imports). The RDF context is the default: if a graph is asserted without any context specified, it is understood to be asserted in the RDF context. This preserves the meanings of all current RDF graphs.
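As a minimal sketch of the defaulting rule, the following reads off the contexts in which a graph is asserted. Triples are modelled as plain tuples of strings; the identifier "rdf:" for the RDF context and the helper name asserted_contexts are assumptions made for illustration, not part of the proposal.

```python
RDF_INHERITS = "rdf:inherits"  # property named in the proposal
RDF_CONTEXT = "rdf:"           # assumed identifier for the top RDF context
SELF = "<>"                    # the graph's own (empty relative) IRI

def asserted_contexts(graph):
    """Return the set of contexts in which `graph` is asserted."""
    declared = {o for (s, p, o) in graph if s == SELF and p == RDF_INHERITS}
    # With no declaration, fall back to the default RDF context.
    return declared or {RDF_CONTEXT}

# Hypothetical graphs: one legacy graph, one asserted in ex:peopleContext.
plain = {("ex:alice", "ex:knows", "ex:bob")}
scoped = plain | {(SELF, RDF_INHERITS, "ex:peopleContext")}
```

Here asserted_contexts(plain) yields only the RDF context, which is how existing graphs keep their 2004 meaning.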

Anything can be used as a context. Being a context is a role rather than a classification. So an IRI might denote, say, a person or a document or a galaxy, and still be used to indicate a context. It is not required to define the reserved vocabulary and semantic constraints in order to use a context, but the utility of a context is increased by supplying more detail. The recommended way to do this is to create a document describing the vocabulary and semantic constraints, create an IRI which retrieves this document by HTTP, and use this IRI to indicate the context. The document may be more or less formal, and we do not restrict the format used in the specifying document. If the document includes (a textual rendering of) an RDF graph, that graph is called the context graph; the reserved vocabulary of the context is then the non-reserved vocabulary of this graph, and the semantic constraints of the context include at least the condition that this graph is true in every interpretation. Inheritance of other contexts can be indicated by including triples of the form <> rdf:inherits C in the context graph.

The special case where the entire defining document is an RDF graph, so that the context is completely defined by this graph, is called a graph context. Inheriting a graph context is semantically identical to importing the context graph of the context. Not all contexts can be defined as graph contexts, however (e.g., the RDF context is not a graph context).

Inheritance between contexts, and assertion of a graph in a context, are determined by the property rdf:inherits, which is part of the RDF namespace and hence its meaning is determined by the top RDF context, so it cannot be changed by any other context. This means that the overall structure of contexts is global and not itself context-sensitive. Inheritance defines a directed acyclic graph structure on contexts.

Semantics

An RDFC interpretation I of a vocabulary V is an RDF interpretation IJ of V together with a mapping con from the universe UJ of IJ to the set of RDF interpretations that map subsets of V into subsets of UJ. Define voc(x) to be the vocabulary of con(x).

A triple sss rdf:inherits ooo is true in I just when voc(IJ(ooo)) is the reserved vocabulary specified for the context indicated by the IRI ooo and con(IJ(sss)) satisfies the semantic conditions specified for the context indicated by the IRI ooo.

For an IRI uuu in the context C, I(uuu) = con(C)(uuu) if uuu is in voc(C); otherwise I(uuu) = IJ(uuu).
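The denotation rule for IRIs can be sketched as a two-level lookup. This is an illustration only: interpretations are reduced to plain dictionaries from IRIs to denotations, and the function name denote, the sample IRIs and the string stand-ins for members of the universe are all hypothetical.

```python
def denote(iri, context, ij, con):
    """I(uuu) for an IRI occurring in a graph asserted in `context`.

    ij  : dict IRI -> thing, standing in for the base interpretation IJ
    con : dict thing -> (dict IRI -> thing), standing in for the mapping con
    """
    local = con.get(context, {})  # con(C); its keys play the role of voc(C)
    if iri in local:
        return local[iri]         # the context's own reading of the IRI
    return ij[iri]                # otherwise fall back to IJ

# Hypothetical base interpretation and one context that reserves ex:person.
ij = {"ex:person": "any human", "ex:name": "the name relation"}
con = {"living-people context": {"ex:person": "a living human"}}
```

With these values, ex:person is read locally inside the living-people context, while ex:name, which that context does not reserve, keeps its IJ denotation.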

The other semantic conditions for triples, graphs, blank nodes, etc., are exactly as in the 2004 RDF semantics.


Open question

What about IRIs defining datatypes? Maybe they should be excluded from the context machinery? We could have a datatype context just under the RDF top context, or include XML Schema in the top context.

Comments

The vocabulary of a context may be identifiable as a namespace, but this is not required; and the semantic constraints imposed by the context may be defined by an RDF graph, but this is also not required. In fact, it is not actually required that any definition be given of the intended semantics of a context, in which case the context IRI serves simply as a kind of public flag of mutual vocabulary agreement between a number of RDF graphs.

Context inheritance allows semantic conditions on a vocabulary to be applied incrementally. For example, a context may declare that the IRI ex:person must denote a human being. Another context may inherit that, but add the constraint that this same IRI must denote a living person; still a third context may inherit the second and require that ex:person denote a living US citizen. In this way, a single IRI may serve a variety of distinct, but related, uses to indicate a variety of shades or specializations of meaning. Two RDF sources which discover that they both use a given IRI with subtly different meanings can create new contexts defining the differences in meaning and agree to inherit these different contexts, which is a much simpler change than re-writing sets of data to avoid a terminology clash.
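The ex:person example can be sketched by treating each context's constraint as a test that accumulates along the inheritance chain. The context names (ctx:human and so on), the single-parent PARENT table and the toy individuals are hypothetical, and a constraint is simplified to a check on whether an individual may be classed as ex:person.

```python
# Hypothetical constraint added by each context on what ex:person may cover.
CONSTRAINTS = {
    "ctx:human": lambda x: x["species"] == "human",
    "ctx:living": lambda x: x["alive"],
    "ctx:uscitizen": lambda x: x["citizenship"] == "US",
}
# Each context inherits the one before it (single inheritance, for brevity).
PARENT = {"ctx:living": "ctx:human", "ctx:uscitizen": "ctx:living"}

def may_be_person(context, individual):
    """True if `individual` meets the constraints of `context` and all it inherits."""
    while context is not None:
        if not CONSTRAINTS[context](individual):
            return False
        context = PARENT.get(context)  # move up the inheritance chain
    return True

ada = {"species": "human", "alive": False, "citizenship": "UK"}
bo = {"species": "human", "alive": True, "citizenship": "US"}
```

In this toy setting ada qualifies under ctx:human but not under ctx:living, whereas bo qualifies under all three contexts, so the same IRI carries progressively narrower meanings.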

Global and graph-local extreme cases

RDF which does not use rdf:inherits remains identical in meaning to that specified in the 2004 specs, with the globalist assumption about the IRIs in such graphs.

Asserting a graph in itself as a context basically isolates the IRIs in the graph from other occurrences of the same IRIs elsewhere. This captures the extreme localist view of IRI meanings. In particular, a dataset in which graph labels denote the graphs, and those graphs are asserted in themselves as a context, has the same meaning as it would under Antoine's proposed semantics for datasets. Using Sandro's naming trick, this comes out as:

{ <u1> a rdf:Graph }

<u1> { <u1> rdf:inherits <u1> ...}


We could call this a solipsist graph.

Some valid entailments

To make this easier, write [G in C] to mean the graph obtained by adding the triple <> rdf:inherits C to the graph G, and [G & H] to mean the merge of G and H.


1. Graphs asserted in the same contexts can be merged. (They are on the same "island".)


2. More generally, [G1 in C] , [G2 in D] , D rdf:inherits C

entail

[[G1 & G2] in D]

But not in C (since D might impose extra semantic conditions on G2 which do not apply in C), and not without the D-C inheritance (since G1 might not be transferable from C to D).
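Rules 1 and 2 can be sketched as a small decision function. It covers only these two rules; the function name merge_context and the representation of inheritance as a set of (child, parent) pairs are assumptions for illustration.

```python
def merge_context(c, d, inherits):
    """Context in which [G1 & G2] may be asserted, given [G1 in c] and [G2 in d].

    `inherits` is a set of (child, parent) pairs for which
    'child rdf:inherits parent' is known to hold.
    Returns None when neither rule licenses a merge.
    """
    if c == d:
        return c              # rule 1: same context, same "island"
    if (d, c) in inherits:
        return d              # rule 2: d inherits c, so the merge holds in d
    if (c, d) in inherits:
        return c              # rule 2 with the roles of the two graphs swapped
    return None               # no inheritance link: no merge entailed

# Hypothetical contexts: ex:D inherits ex:C; ex:E is unrelated.
known = {("ex:D", "ex:C")}
```

Note that the result is always the inheriting context, never the inherited one, which matches the remark that the merge is not entailed in C.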


3. { <u2> a rdf:Graph , <u3> a rdf:Graph }

<u1> { <a> <b> <c> }

<u2> { <d> <e> <f> }

<u3> { <u3> rdf:inherits <u3> , <g> <h> <i> }


entails


<u2> { <d> <e> <f> }


but not <u1> { <a> <b> <c>} because <u1> is a mere label, not a graph name;

and not <u3> { <g> <h> <i> } because <u3> is asserted in its own local context and so its IRIs have no meaning outside that graph.
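The reasoning in example 3 can be replayed mechanically. The sketch below handles only the three cases that appear in the example (mere label, graph name, solipsist graph); the tuple encoding of the dataset and the helper name entailed_graphs are hypothetical.

```python
RDF_GRAPH = "rdf:Graph"
INHERITS = "rdf:inherits"

# The dataset from example 3, with triples as plain tuples.
default_graph = {("<u2>", "a", RDF_GRAPH), ("<u3>", "a", RDF_GRAPH)}
named = {
    "<u1>": {("<a>", "<b>", "<c>")},
    "<u2>": {("<d>", "<e>", "<f>")},
    "<u3>": {("<u3>", INHERITS, "<u3>"), ("<g>", "<h>", "<i>")},
}

def entailed_graphs(default_graph, named):
    """Labels of named graphs whose contents are entailed by the dataset."""
    result = set()
    for label, triples in named.items():
        is_name = (label, "a", RDF_GRAPH) in default_graph  # name, not mere label
        solipsist = (label, INHERITS, label) in triples     # asserted in itself
        if is_name and not solipsist:
            result.add(label)
    return result
```

Applied to the dataset above, only <u2> is returned: <u1> is excluded as a mere label and <u3> as a graph asserted in its own context.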