Difference between revisions of "AnotherSpin"

From RDF Working Group Wiki

Revision as of 19:02, 22 April 2012

Another Spin on RDF with Contexts

OK, describing the RDFC idea as finding a compromise between global and local views of IRI meanings, and as a revision of RDF itself, has raised some hackles and concerns. This Wiki page is a different way to spin essentially the same idea, making it sound a lot less threatening (I hope). However, apart from the wording used to describe it, this is exactly the same proposal.

We don't say we are redefining RDF, and we don't talk about RDFC. We don't use the term 'local' or talk about changing the meanings of IRIs. Instead, we talk about extending RDF with an option that allows users to describe and name their own semantic extensions, so that they and others can use them to express more things in RDF. It's a kind of do-it-yourself RDF-extending kit.

The idea of semantic extension is already introduced and sketchily defined in the 2004 specs, the relevant text being "Particular uses of RDF, including as a basis for more expressive languages such as DAML+OIL [DAML] and OWL [OWL], may impose further semantic conditions in addition to those described here, and such extra semantic conditions can also be imposed on the meanings of terms in particular RDF vocabularies. Extensions or dialects of RDF which are obtained by imposing such extra semantic conditions may be referred to as semantic extensions of RDF. " So the proposal is now to take this idea and run a little further with it.

Although the terminology has changed, the formal machinery and its intended uses are almost exactly as in the earlier proposal, so I have kept the same IRI names for comparison. However, if this way of describing it is preferred, then we might want to change rdf:inherits to something like rdf:inExtension.

Semantic Extensions to RDF

A semantic extension, or simply an extension, to RDF represents a named public agreement to use a particular vocabulary of IRIs, called the reserved vocabulary of the extension, with a particular meaning defined by the extension. Semantic extensions must not violate the basic semantics of RDF, but they can extend it by imposing special meanings on IRIs. The OWL and RDFS standards are semantic extensions to RDF defined by their W3C specifications documents. (In these cases the reserved vocabulary constitutes a namespace, but this is not required.) Users may invent their own extensions and indicate them by an IRI. Then, including a triple

<> rdf:inherits C .

in an RDF graph, where C indicates the semantic extension, means that C-reserved IRIs which occur in the rest of the graph are to be interpreted using the semantic constraints of that extension. Several such triples may be included in a graph, in which case all the indicated extensions apply, each to its own reserved vocabulary. When a graph includes one or more rdf:inherits triples, we will say that the graph inherits the semantic extensions, and that any URIs in the graph which are not part of the reserved vocabulary of any inherited extensions are unreserved.
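As a rough illustration of the bookkeeping just described, the following Python sketch sorts the IRIs of a graph into reserved (per inherited extension) and unreserved. Everything in it is an assumption made for illustration: triples are modelled as tuples of strings, the `EXTENSIONS` table stands in for whatever the defining documents would supply, and the `ex:`, `chem:` and `t:` IRIs are invented.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples of strings.
INHERITS = "rdf:inherits"

# Assumed lookup table: extension-indicating IRI -> membership test
# for that extension's reserved vocabulary.
EXTENSIONS = {
    "ex:chem": lambda iri: iri.startswith("chem:"),
    "ex:time": lambda iri: iri in {"t:begins", "t:ends"},
}

def inherited_extensions(graph):
    """Return the IRIs that occur as objects of rdf:inherits triples."""
    return {o for (s, p, o) in graph if p == INHERITS}

def classify_iris(graph):
    """Split the IRIs of the non-inherits triples into reserved and unreserved."""
    inherited = inherited_extensions(graph)
    reserved = {ext: set() for ext in inherited}
    unreserved = set()
    for triple in graph:
        if triple[1] == INHERITS:
            continue  # the inheritance triples themselves are not classified
        for iri in triple:
            owners = [e for e in inherited
                      if e in EXTENSIONS and EXTENSIONS[e](iri)]
            for e in owners:
                reserved[e].add(iri)
            if not owners:
                unreserved.add(iri)
    return reserved, unreserved

graph = [
    ("<>", INHERITS, "ex:chem"),
    ("chem:Carbon", "rdf:type", "chem:Element"),
    ("chem:Carbon", "t:begins", "x:sometime"),
]
reserved, unreserved = classify_iris(graph)
```

Note that `t:begins` comes out unreserved in this graph: it belongs to the reserved vocabulary of `ex:time`, but the graph does not inherit that extension.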

(***This presumes that there is a clear notion of a graph boundary, which is not true in current (2004) RDF, but is widely presumed. This is something we need to fix.***)

Note, we use the term 'indicates' because the indicating URI might denote something else. In general, any IRI may be used to indicate a semantic extension, but it does not therefore denote or refer to or identify that extension. Semantic extensions are not considered to be RDF resources; they are not things in the universe of discourse of RDF. An IRI indicates a semantic extension simply by being used as the object of an rdf:inherits triple. The recommended practice is to have the IRI which is used to indicate an extension identify a document which defines the reserved vocabulary and semantic conditions of the extension; but any IRI, for example one which identifies a human person or an OWL class, may be used to indicate an extension.

(***This is to allow current practice where an IRI is used to 'label' an RDF graph in a dataset while also being used to denote something else. Several major RDF supporters use this, so we need to allow it.***)

It is possible that two extensions might impose inconsistent conditions on the same reserved vocabulary, so that a graph which inherits both of them cannot be satisfied. In this case the graph is considered to be RDF-inconsistent, and this situation may be flagged as an error.

A description of a semantic extension must define the reserved vocabulary of the extension, so that there is an algorithm which will determine, for any IRI, whether or not it is in the vocabulary; and may describe the intended meanings of these IRIs in a form which defines a set of RDF interpretations on the vocabulary as being those which satisfy the semantic conditions of the extension. A semantic extension may specify certain syntactic conditions on RDF graphs or combinations of RDF triples; in which case an algorithm must be defined which can determine, for any RDF graph, whether or not it satisfies these syntactic constraints of the extension. (*** Thinking of OWL-DL, obviously. ***)

A (recommended) way to define the reserved vocabulary is to include an RDF graph in the documentation which uses all the IRIs in the reserved vocabulary, called the context graph. This context graph may also represent some of the semantic conditions on the vocabulary. However, semantic conditions may be defined in any way, for example by stating conditions on interpretations directly, by providing axioms or rules, or in natural-language text. They may be omitted altogether, but of course the less information that is provided about the semantic conditions, the less useful the extension may be. We leave this situation under-defined deliberately, in anticipation of a scenario where a relatively under-defined extension gradually becomes clearer as it gets used by a community.
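The two decision procedures asked for above (vocabulary membership and, optionally, a syntactic check on graphs) might look roughly like the following Python sketch. It assumes triples are tuples of strings, and it assumes that the reserved vocabulary is exactly the set of IRIs used in the context graph minus an explicit `exclude` set; the text only requires that the context graph use all the reserved IRIs, so that reading is a simplification. The syntactic condition shown is an invented one, loosely in the spirit of OWL-DL, not a condition taken from any specification.

```python
def vocabulary_of(context_graph, exclude=frozenset()):
    """IRIs used anywhere in the context graph, minus the excluded ones."""
    return frozenset(
        term for triple in context_graph for term in triple
        if term not in exclude
    )

def make_membership_test(context_graph, exclude=frozenset()):
    """Build the decision procedure: is this IRI in the reserved vocabulary?"""
    vocab = vocabulary_of(context_graph, exclude)
    return lambda iri: iri in vocab

def no_class_used_as_property(graph):
    """Hypothetical syntactic condition: no object of rdf:type is also a predicate."""
    classes = {o for (s, p, o) in graph if p == "rdf:type"}
    predicates = {p for (s, p, o) in graph}
    return classes.isdisjoint(predicates)

context_graph = [
    ("chem:Carbon", "rdf:type", "chem:Element"),
    ("chem:Element", "rdfs:subClassOf", "chem:Substance"),
]
in_vocab = make_membership_test(
    context_graph, exclude={"rdf:type", "rdfs:subClassOf"}
)

# A graph that breaks the invented syntactic condition: x:p is a class and a predicate.
bad_graph = [("x:a", "rdf:type", "x:p"), ("x:b", "x:p", "x:c")]
```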

(***I believe that it is important to keep this under-defined, allowing informally expressed descriptions of extensions, to make it very easy for people to use extensions even if their definitions are informal. It also provides a semi-official way to connect RDF semantic content to what used to be called 'social meaning'. For example, the text might specify that a given class name IRI is interpreted to mean the class of all adult human beings, without giving any further explanation or being obliged to 'axiomatize' this idea.***)

A semantic extension which does not define its semantic constraints may be used as a public flag to draw attention to the fact that graphs which inherit that extension are all in explicit agreement concerning the meaning of the reserved vocabulary, whatever it may be. Of course, simply using a given set of IRIs should imply this agreement, but in practice this rule is sometimes unreliable, and the semantic extension mechanism provides an additional level of explicit confirmation of an intention to use a vocabulary in a strictly consistent manner.

The context graph, if provided, must be true under the semantic conditions of the extension. An extension may be completely specified by a context graph, in which case the semantic constraint is simply that the graph be true. In this case, called a graph extension, rdf:inherits has exactly the same meaning as owl:imports. This is understood to be the case when the extension is indicated by an IRI identifying an RDF graph, or a graph container, or a document which parses to a graph in some accepted RDF notation.

If the context graph of an extension A inherits the extension B, then the extension A is called an extension of B, and includes all of the semantic constraints of B, plus (presumably) some others of its own. The reserved vocabulary of A may overlap with that of B, which indicates that A imposes further conditions on this part of the vocabulary. An extension of an extension B must not contradict any of the conditions imposed by B, but it may add further conditions which are understood to be conjoined ("and-ed") to the B conditions. (For example, if A extends B and B specifies that the IRI x:Person is the class of human beings, then A may specify that this IRI is the class of American citizens, but it may not claim that it is the class of insects.) An extension may be an extension of several other extensions.
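The "and-ing" of conditions along a chain of extensions can be sketched in Python as follows. This is only an illustration under stated assumptions: an interpretation is reduced to a dictionary from class IRIs to sets of member names, the `Extension` class and its fields are invented, and the x:Person example is modelled loosely with subset conditions rather than with a full account of class meaning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Assumed toy model of an interpretation: class IRI -> set of member names.
Interpretation = Dict[str, Set[str]]

@dataclass
class Extension:
    name: str
    # This extension's own semantic condition.
    condition: Callable[[Interpretation], bool]
    # Extensions inherited by this extension's context graph.
    parents: List["Extension"] = field(default_factory=list)

def all_conditions(ext, seen=None):
    """Collect the conditions of ext and of every extension it extends."""
    seen = set() if seen is None else seen
    if ext.name in seen:
        return []  # already visited (shared ancestor or cycle)
    seen.add(ext.name)
    conds = [ext.condition]
    for parent in ext.parents:
        conds.extend(all_conditions(parent, seen))
    return conds

def conforms(interp, ext):
    """An interpretation conforms when every collected condition holds."""
    return all(cond(interp) for cond in all_conditions(ext))

HUMANS = {"alice", "bob", "carol"}
AMERICANS = {"alice", "bob"}

B = Extension("B", lambda i: i["x:Person"] <= HUMANS)
A = Extension("A", lambda i: i["x:Person"] <= AMERICANS, parents=[B])

ok = conforms({"x:Person": {"alice"}}, A)        # satisfies both A and B
only_b = conforms({"x:Person": {"carol"}}, A)    # fine for B, too broad for A
insect = conforms({"x:Person": {"beetle"}}, A)   # violates B, hence A as well
```

Because A's own condition is only ever added to B's, A can narrow what x:Person means but cannot make acceptable anything that B rules out.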

A special case

A graph may be asserted in itself; that is, in the graph extension defined by treating it as the context graph of that extension. This means that its entire vocabulary is classified as restricted to it, and so has the effect of 'isolating' the IRIs in the graph from any meaning they might have in other graphs, allowing them to be interpreted locally in that particular graph as a unique context of meaning. We could call this a solipsist graph. It reproduces exactly the semantics of datasets as defined by Antoine. This technique may be found useful when sorting RDF content found "in the wild" into coherent groups, without being obliged to treat IRIs in separate groups as necessarily identical in meaning. This can be done using graph naming in a SPARQL dataset:

{ <u1> a rdf:Graph }

<u1> { <u1> rdf:inherits <u1> ...}

Semantics

(*** For semanticians, this is more like a 'punning' approach, different from the way the context semantics was defined. Although that was more in the spirit of the 2004 semantics, I think this is more intuitive and has better entailments. But we could go either way.***)

A semantic extension selects a subset of the interpretations of its reserved vocabulary as being those which satisfy its constraints. We model this by mappings voc from IRIs to sets of IRIs, and CON from IRIs to sets of interpretations, so that J in CON(X) means that J is an interpretation of voc(X) which satisfies the constraints of the extension indicated by the IRI X. Note that voc and CON are defined independently from any interpretation of X itself. If J is an RDF interpretation of a vocabulary including voc(X) and J^voc(X) (that is, J restricted to the vocabulary voc(X) ) is in CON(X) then we say that J conforms to the extension indicated by X. Again, note that this makes no reference to the denotation of X under J: the extension machinery is orthogonal, in this semantics, to the normal semantic reference machinery.

If X indicates an extension with a context graph G, then J being in CON(X) implies that J satisfies G (but J may also have other properties, of course). If the extension is a graph extension, then J is in CON(X) if and only if J satisfies G.

An RDF interpretation I satisfies an RDF graph G just when I conforms to any extensions inherited by G and I satisfies G in the sense defined by the 2004 semantics.


The only way this semantics differs from that defined previously is that the machinery of indicating extensions by IRIs is here made completely independent from that which defines what resources the IRIs refer to, so equality reasoning does not apply to extensions:

A owl:sameAs B

{ <> rdf:inherits A . S P O . }

does not entail

{ <> rdf:inherits B . S P O . }

I actually think this is better, because allowing equality and class reasoning to apply to extensions could get things into an indescribable tangle where what IRIs mean depends upon obscure OWL reasoning.
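The definitions in this section, and the non-entailment just noted, can be illustrated with a small Python sketch. It is a toy model under several assumptions: an interpretation is a dictionary from IRIs to denotations, CON(X) is represented by a membership test rather than an explicit set of interpretations, `satisfies_2004` is a drastically simplified stand-in for the 2004 truth conditions on ground triples, and the rdf:inherits triples themselves are not checked against the interpretation. All IRIs and denotation names are invented.

```python
INHERITS = "rdf:inherits"

def restrict(interp, vocab):
    """J restricted to a vocabulary: keep only the IRIs in vocab."""
    return {iri: value for iri, value in interp.items() if iri in vocab}

def conforms(interp, x, voc, con):
    """True when interp, restricted to voc(x), is accepted by CON(x).

    voc and con are keyed by the IRI x itself, never by what x denotes.
    """
    return con[x](restrict(interp, voc[x]))

def satisfies_2004(interp, graph, extent):
    """Simplified stand-in for 2004 satisfaction of ground triples."""
    return all(
        (interp[s], interp[o]) in extent.get(interp[p], set())
        for (s, p, o) in graph
    )

def satisfies(interp, graph, voc, con, extent):
    """Conformance to every inherited extension plus ordinary satisfaction."""
    inherited = [o for (s, p, o) in graph if p == INHERITS]
    body = [t for t in graph if t[1] != INHERITS]
    return (all(conforms(interp, x, voc, con) for x in inherited)
            and satisfies_2004(interp, body, extent))

# Two indicating IRIs with the same reserved vocabulary but different constraints.
voc = {"ex:A": {"x:Person"}, "ex:B": {"x:Person"}}
con = {
    "ex:A": lambda j: j.get("x:Person") == "class-of-humans",
    "ex:B": lambda j: j.get("x:Person") == "class-of-insects",
}

# ex:A and ex:B denote the same resource, as "ex:A owl:sameAs ex:B" would require.
interp = {
    "ex:A": "r0", "ex:B": "r0",
    "x:Person": "class-of-humans",
    "x:alice": "alice",
    "rdf:type": "type-property",
}
extent = {"type-property": {("alice", "class-of-humans")}}

body = [("x:alice", "rdf:type", "x:Person")]
graph_a = [("<>", INHERITS, "ex:A")] + body
graph_b = [("<>", INHERITS, "ex:B")] + body

sat_a = satisfies(interp, graph_a, voc, con, extent)
sat_b = satisfies(interp, graph_b, voc, con, extent)
```

Here `interp` makes ex:A and ex:B co-denote and satisfies the graph that inherits ex:A, yet it fails the graph that inherits ex:B, which is the counter-model behind the non-entailment: the lookup into `voc` and `con` goes by IRI, so equality of denotation has no effect on it.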

Why bother?

One might ask, why bother, since the effect of a semantic extension can be achieved by defining a vocabulary of IRIs and specifying what it is intended to mean, more or less formally, just as we do now? There are several responses to this.

First, the suggested machinery can be viewed simply as supplying some normative discipline to this existing practice, giving a standard way to refer to the defining documents for such a namespace and to allow explicit linking to the important semantic sources rather than relying on (what some may consider to be a mis-use of) the IRI de-hashing convention. It also allows for a decoupling of RDF semantic extension vocabularies from the HTTP/XML namespace conventions, since the reserved vocabulary can be any set of IRIs.

Second, this machinery can be used to go beyond the ontology-plus-namespace convention, by allowing for hierarchies of extensions to build upon one another in a series of semantic refinements, without needing to invent a whole new vocabulary. Suppose for example an ontology is published, and used, which defines a set of concepts in chemistry, but does not provide the notion of isotope. To use this ontology in a context in which isotopes – say, carbon-12 and carbon-14 – are distinguished, might require re-writing it to change the sense of "element", and this re-writing requires inventing an entire new namespace, which then needs to be related to the previous one, probably by mis-using owl:sameAs. With the current machinery, however, one could define an extension which introduces the concept of isotopes, re-defines chemical element to be the union of isotope classes, and retains the old terminology unchanged. Data can then be transferred to this new ontology simply by changing one inheritance triple in a large graph, with no internal re-writing of data required. Similar advantages accrue when ontologies are changed or updated; the old terminology can be re-used with a new semantic inheritance, even though a strict adherence to the "cool URIs" doctrine would require that the data be re-formulated using a new namespace, to track the change in meaning. (There is an existing use case. The semantic extension machinery is very similar to the 'microtheory' machinery developed by Guha and used extensively and successfully in the large-scale CYC ontology, largely in the way just outlined, to allow context-dependent refinements of meaning applied to a single vocabulary. See http://www.cyc.com/doc/context-space.pdf for an extended description. For "context" read "extension".)
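The migration step in the chemistry example, where only the inheritance triple changes, can be sketched in Python like this. The tuple representation of triples, the `retarget` helper, and the `ex:chem`, `ex:chem-isotopes` and `lab:` IRIs are all invented for illustration.

```python
INHERITS = "rdf:inherits"

def retarget(graph, old_ext, new_ext):
    """Copy the graph, pointing its inheritance triple at new_ext instead of old_ext."""
    return [
        (s, p, new_ext) if (p == INHERITS and o == old_ext) else (s, p, o)
        for (s, p, o) in graph
    ]

data = [
    ("<>", INHERITS, "ex:chem"),
    ("lab:sample1", "rdf:type", "chem:Carbon"),
    ("lab:sample2", "rdf:type", "chem:Carbon"),
]

migrated = retarget(data, "ex:chem", "ex:chem-isotopes")

# Only the inheritance triple differs; the data triples are untouched.
changed = [new for old, new in zip(data, migrated) if old != new]
```

The data triples still say `chem:Carbon`; what that IRI is taken to mean is now governed by the isotope-aware extension.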

The "Cool URI" idea can be re-stated here as "Cool extensions". What the RDF Web needs in order to be stable and reliable is not stable URIs as such, but stable URIs-in-semantic-extensions; and for this to be possible, semantic extensions must be stable and reliably linked to IRIs. This will require a certain discipline to be adopted in coining and especially modifying semantic extension defining documents. Rather than changing the definition of a semantic extension, a new extension should be defined and linked to the older one. If possible, it should be an extension of the older version, but if this is not possible (as it often will not be) there should be some way to find the current version of any extension from any older ones. <At this point Pat is reduced to handwaving.>

Examples

Time-dependent properties and intervals.

Progressive refinement a la Cyc

Mutual agreement on standard definitions, e.g. in ISO

Links to legal texts, government documents, etc.