RDFwithContexts

From RDF Working Group Wiki
Revision as of 06:58, 16 April 2012 by Phayes3 (Talk | contribs)


RDFC: RDF with Contexts

This is a proposal to extend RDF with contexts, giving users the ability to create, link and use contexts to modify the intended meaning of RDF graphs. A context is intended to represent a social agreement about the intended meaning of a given vocabulary of IRIs. This proposal grew from an attempt to reconcile the 2004 RDF semantics with Antoine Zimmermann's proposed semantics for SPARQL datasets, so it can be considered an alternative to that proposal.

Background

The 2004 RDF specs are based upon a vision in which IRIs (there called URI references) are globally unique and do not change their meaning with time, so that every token of a given IRI has exactly the same meaning. (From RDF Semantics 2004, section 1.2: "This document does not take any position on the way that URI references may be composed from other expressions, e.g. from relative URIs or QNames; the semantics simply assumes that such lexical issues have been resolved in some way that is globally coherent, so that a single URI reference can be taken to have the same meaning wherever it occurs. Similarly, the semantics has no special provision for tracking temporal changes. It assumes, implicitly, that URI references have the same meaning whenever they occur.") Call this the global hypothesis.

In marked contrast, many current RDF users and developers treat RDF content and the IRIs it contains as belonging to an "island" of accepted meanings, where the meaning of an IRI might change between different "islands". (This term was introduced by Andy Seaborne.) In this spirit, Antoine Zimmermann's proposed semantics for RDF datasets treats each named graph in a dataset as isolated in meaning from all the others, so that the graph name defines a local context in which a given IRI might mean something different from the same IRI in a different named graph; and this situation is seen as being globally satisfiable. Call this the local hypothesis.

Both views have their adherents and there are good arguments for both positions, but they seem incompatible. One, rather weak, resolution is to say that the global hypothesis applies to RDF graphs and the local to SPARQL datasets, but this fails to account for the apparent clash of intuitions and provides no account of the relationship between datasets and graphs. Moreover, if the arguments for the local hypothesis are sound, then they apply to RDF in the large.

Two current ideas

1. Working within the global hypothesis, Sandro has suggested a way to distinguish graph names from mere graph labels in a SPARQL dataset, using the class rdf:Graph. This allows URIs to be used as graph labels without necessarily denoting the graph itself. We will use this below.

2. Capturing an extreme version of the localist hypothesis, Antoine has suggested a model theory for SPARQL datasets which extends the RDF semantics with a mapping con from the vocabulary to RDF interpretations. We will adapt this idea, slightly modified and generalized, to give a semantics for a more general notion of RDF contexts.

Overview of proposal

We extend RDF with the idea of 'web contexts'. A context represents a social agreement about the intended meaning of some vocabulary of IRIs, called the reserved vocabulary of the context. Each context has an associated vocabulary of IRIs and a semantic constraint on this vocabulary, which formally is a set of RDF interpretations of the reserved vocabulary.

RDF graphs are always asserted in a context. To assert a graph in a context is to make a public declaration that one is conforming to the semantic constraints on the reserved vocabulary.

Contexts are related by inheritance, which behaves very similarly to OWL imports. One context A inherits another B when A's reserved vocabulary includes that of B, and A imposes at least the same semantic conditions on B's vocabulary as B does. (This can be stated more mathematically as a condition on interpretations, but that belongs in an appendix.) Multiple inheritance is allowed. There is a 'top' context called the RDF context which has the RDF namespace as its reserved vocabulary and imposes the semantic constraints defined by the 2004 specs. All contexts inherit the RDF context. Similarly there are the RDFS, OWL-DL/RDF, OWL-Full/RDF, etc. contexts. Users can also define new contexts related to these.
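One possible way to write the inheritance condition down is sketched here. The symbols are notation assumed only for this illustration and are not part of the proposal: V_X stands for the reserved vocabulary of context X, S_X for the set of RDF interpretations of V_X that X permits, and I|_V for the restriction of an interpretation I to the vocabulary V.

```latex
A \text{ inherits } B
\;\iff\;
V_B \subseteq V_A
\;\wedge\;
\{\, I|_{V_B} \;:\; I \in S_A \,\} \subseteq S_B
```

Read this way, every interpretation that A permits must, once restricted to B's vocabulary, also be permitted by B, which is one reading of "A imposes at least the same semantic conditions on B's vocabulary as B does".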


Syntax

To assert a graph in a context C, just include a triple

<> rdf:inherits C

in the graph (compare owl:imports). The RDF context is a default, so that if a graph is asserted without any context specified, it is understood to be asserted in the RDF context. This preserves the meanings of all current RDF graphs.

Anything can be used as a context. Being a context is a role rather than a classification. So an IRI might denote, say, a person or a document or a galaxy, and still be used to indicate a context. It is not required to define the reserved vocabulary and semantic constraints in order to use a context, but the utility of a context is increased by supplying more detail. The recommended way to do this is to create a document describing the vocabulary and semantic constraints, create an IRI from which this document can be retrieved by HTTP, and use this IRI to indicate the context. The document may be more or less formal, and we do not restrict the format used in the specifying document, but if this document includes (a textual rendering of) an RDF graph, then it is called the context graph, and the reserved vocabulary of the context is the non-reserved vocabulary of this graph and the semantic constraints of the context include at least the condition that this graph is true in every interpretation. Inheritance of other contexts can be indicated by including triples of the form <> rdf:inherits C in the context graph.

The special case where the entire defining document is an RDF graph, so that the context is completely defined by this graph, is called a graph context. Inheriting a graph context is semantically identical to importing the context graph of the context. Not all contexts can be defined as graph contexts, however (e.g., the RDF context is not a graph context).

Inheritance between contexts and assertion of a graph in a context are both determined by the property rdf:inherits, which is part of the RDF namespace; hence its meaning is determined by the top RDF context and cannot be changed by any other context. This means that the overall structure of contexts is global and not itself context-sensitive. Inheritance defines a directed acyclic graph structure on contexts.
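The directed acyclic structure can be checked mechanically. The following is a minimal sketch, assuming that the rdf:inherits links have already been collected into a Python dictionary; the label TOP, the function names, and the example context names are all hypothetical and are not defined by the proposal.

```python
from graphlib import TopologicalSorter, CycleError

TOP = "RDF"  # placeholder label for the top RDF context (an assumption)


def with_top(inherits):
    """Return a copy of the inheritance map in which every context also inherits TOP.

    `inherits` maps a context label to the set of contexts it directly inherits.
    """
    contexts = set(inherits) | {p for ps in inherits.values() for p in ps} | {TOP}
    full = {}
    for c in contexts:
        parents = set(inherits.get(c, ()))
        if c != TOP:
            parents.add(TOP)  # all contexts inherit the RDF context
        full[c] = parents
    return full


def inheritance_order(inherits):
    """List contexts so that each one appears after everything it inherits.

    Raises ValueError if the inheritance links contain a cycle.
    """
    try:
        return list(TopologicalSorter(with_top(inherits)).static_order())
    except CycleError as err:
        raise ValueError(f"inheritance is not acyclic: {err.args[1]}") from err


# Hypothetical example: ex:c inherits both ex:a and ex:b (multiple inheritance).
order = inheritance_order({"ex:c": {"ex:a", "ex:b"}, "ex:a": set(), "ex:b": set()})
```

In this sketch the top context always comes first in the returned order, because it is the only context that inherits nothing.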


Semantics

An RDFC interpretation I of a vocabulary V is an RDF interpretation J of V together with a mapping con from the universe UJ of J to the set of RDF interpretations that map subsets of V into subsets of UJ. Define voc(x) to be the vocabulary of con(x).

A triple sss rdf:inherits ooo is true in I just when voc(I(ooo)) is the reserved vocabulary specified for the context indicated by the URI ooo and con(I(sss)) satisfies the semantic conditions specified for the context indicated by the URI ooo.

For a URI uuu in the context C, I(uuu) = con(C)(uuu) if uuu is in voc(C), otherwise I(uuu) = J(uuu).
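The lookup rule for I(uuu) can be illustrated with a small sketch. It assumes, purely for illustration, that J and con are finite and stored as Python dictionaries and that a context is identified by a plain label; the function name and the toy data are hypothetical.

```python
def denotation(iri, context, J, con):
    """Return I(iri) for an IRI occurring in `context`.

    J   : dict from IRI to a thing in the universe (the base interpretation).
    con : dict from a context to a dict IRI -> thing; the keys of con[x]
          play the role of voc(x).
    """
    local = con.get(context, {})  # a context with no entry reserves nothing
    if iri in local:              # iri is in voc(context): use con(context)
        return local[iri]
    return J[iri]                 # otherwise fall back to J


# Toy data; every name below is made up for illustration.
J = {"ex:person": "any human", "ex:age": "age in years"}
con = {"ctx:living": {"ex:person": "a living human"}}

in_context = denotation("ex:person", "ctx:living", J, con)
fallback = denotation("ex:age", "ctx:living", J, con)
```

Here ex:person is in the vocabulary of ctx:living and so takes its meaning from that context, while ex:age is not and so keeps the meaning given by J.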

The other semantic conditions for triples, graphs, blank nodes, etc., are exactly as in the 2004 RDF semantics.

Extra Stuff

The vocabulary of a context may be identifiable as a namespace, but this is not required; and the semantic constraints imposed by the context may be defined by an RDF graph, but this is also not required. In fact, it is not actually required that any definition be given of the intended semantics of a context, in which case the context IRI serves simply as a kind of public flag of mutual vocabulary agreement between a number of RDF graphs. (The global hypothesis perspective could be stated as the idea that there is a single global 'flag' context of this kind, with no definition. At the other extreme, an RDF graph may itself be considered to be a context which defines the semantic constraints of all the (non-RDF) IRIs which occur in it; and if asserted in itself, then that graph is effectively isolated in meaning from the rest of the Web. An extreme version of the local hypothesis could be stated as the idea that every RDF graph is asserted in itself as its own context.)

Being a context is not a classification: anything can be treated as a context. What an IRI denotes or refers to need not be the context itself, even though it is used to indicate a context. For example, an IRI denoting a person might be used to indicate a context of all facts that are relevant to that person's life; or a literal denoting a time/date might be used to indicate a context of facts true at the time. However, it is good practice to coin a unique IRI which actually denotes a document which can be retrieved over HTTP from the IRI and which defines the vocabulary of the context and the semantic constraints it imposes on that vocabulary. The document IRI can then serve to indicate the context. For example, we could use the IRI "http://www.w3.org/TR/rdf-concepts/" to indicate the top RDF context.


Contexts are related by inheritance, which behaves very similarly to OWL imports (and in fact coincides with it when the contexts are represented as RDF graphs). Backward compatibility with current RDF is provided by assuming a default "top context" which is inherited by all other contexts and represents an agreement to treat the vocabulary of the RDF namespace as defined by the RDF 2004 semantics. Context inheritance allows semantic conditions on a vocabulary to be applied incrementally. For example, a context may declare that the IRI ex:person must denote a human being. Another context may inherit that, but add the constraint that this same IRI must denote a living person; a third context may inherit the second and require that ex:person denote a living US citizen. In this way, a single IRI may serve a variety of distinct, but related, uses, indicating a variety of shades or specializations of meaning. Dually, two RDF sources which discover that they both use a given IRI with subtly different meanings can create new contexts defining the difference in meaning and agree to inherit these different contexts, which perhaps both inherit from a more global context which has a more generic meaning for that IRI. We allow multiple inheritance, so the inheritance structure is a directed acyclic graph rather than a tree; and this structure is global, which is ensured by an RDF semantic restriction on the vocabulary which states the inheritance relationship, stated in the globally 'top' RDF context.
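The ex:person example can be sketched in code. This is only an illustration, under the assumption that a semantic constraint can be modelled as a Python predicate over candidate denotations; the Context class, its fields, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    name: str
    parents: list = field(default_factory=list)      # contexts this one inherits
    constraints: dict = field(default_factory=dict)  # IRI -> predicate on a thing

    def all_constraints(self, iri):
        """Collect the constraints on `iri` from this context and all it inherits."""
        found = [self.constraints[iri]] if iri in self.constraints else []
        for parent in self.parents:
            found.extend(parent.all_constraints(iri))
        return found

    def permits(self, iri, thing):
        """True if `thing` is an acceptable denotation of `iri` in this context."""
        return all(check(thing) for check in self.all_constraints(iri))


humans = Context("humans", constraints={"ex:person": lambda t: t["human"]})
living = Context("living", [humans], {"ex:person": lambda t: t["alive"]})
citizens = Context("citizens", [living], {"ex:person": lambda t: t["us_citizen"]})

candidate = {"human": True, "alive": True, "us_citizen": False}
results = [c.permits("ex:person", candidate) for c in (humans, living, citizens)]
```

Each inheriting context only adds conditions, so a denotation accepted by the most specific context is also accepted by every context it inherits.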

Although this is a proposal to change RDF, we will refer to it as RDFC in what follows, to avoid confusion with the 2004 RDF specs.

RDFC syntax

To use an IRI to indicate a context, simply put that IRI in a context-indicating position.

The most obvious context-indicating position is as a "graph name" in a TriG document, thus:

{ <context> { <graph> }}

In RDFC this is understood to be an assertion of the graph in the context. This differs from a simple assertion of the graph because the context can fix the meanings of some of the vocabulary in the graph. (Formal semantics are given below.)

Graphs asserted in the same context can be merged, just as specified by the 2004 semantics. (This is a special case of a more general rule given later.)

{ :con { :a :b :c . }}

{ :con { :d :e :f . }}

together entail

{ :con { :a :b :c .  :d :e :f . }}

But

{ :con1 { :a :b :c . }}

{ :con2 { :d :e :f . }}

have no nontrivial entailments. (Contrast with Sandro's proposal.)
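The merging behaviour in these two examples can be sketched as follows. The sketch assumes ground triples only (no blank nodes, which a real merge would need to keep apart) and represents each triple as a plain tuple of strings; the function name is hypothetical.

```python
from collections import defaultdict


def merge_by_context(assertions):
    """Merge graphs context by context.

    `assertions` is an iterable of (context, triples) pairs, where `triples`
    is a collection of ground triples. Graphs are combined only when they are
    asserted in the same context.
    """
    merged = defaultdict(set)
    for context, triples in assertions:
        merged[context] |= set(triples)
    return dict(merged)


same = merge_by_context([
    (":con", {(":a", ":b", ":c")}),
    (":con", {(":d", ":e", ":f")}),
])

different = merge_by_context([
    (":con1", {(":a", ":b", ":c")}),
    (":con2", {(":d", ":e", ":f")}),
])
```

In the first call both triples end up in a single graph under :con; in the second, each context keeps its own one-triple graph and nothing is combined.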

Note that in RDFC, the "graph name" in a SPARQL datastore indicates a context. There is no blanket assumption that a "graph name" actually names a graph. (However, it CAN name a graph: see below.)