Tim Berners-Lee
Date: 1998, last change: $Date: 2024/02/16 13:03:01 $
Status: personal view only. Editing status: first draft.

RDF Anonymous nodes and quantification

Introduction

I believe that the way to progress from RDF to a logic language is to make that logic language a superset of RDF, so that every RDF statement becomes a valid expression of the logic language, but not necessarily the other way around. At the same time, of course, it will be possible to provide an RDF description of any logic expression, so that path may be able to add some extra capacity for interchange between systems with full logic and systems with RDF. I believe (2000/11) that the "ontology" layer can be done entirely in RDF: the data can all be expressed in RDF, as can the ontologies (ontology level schemas), though the expression of the meaning of something like daml:transitive will be done in English in the specs, as it would require logic to do formally.

Semantics of anonymous nodes

In the course of experimenting with RDF parsers, with a view to considering the result to be a logical formula, I have been assuming that:

An RDF statement ( s p o ) maps to a two-parameter predicate p(s,o);
A set of RDF statements [a b c], [d e f] ... maps to an unordered set of conjoined predicates b(a,c) & e(d,f) ...;
An anonymous RDF node implies an implicit variable existentially quantified in the scope of the conjunction. (In n3, "w3c:Dan :livesIn [ :inState :Texas ] " means that there exists something such that Dan lives in it and it is in Texas.)
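
The mapping above can be sketched in a few lines of Python. This is only an illustration, not part of any RDF parser: the "_:" prefix for anonymous nodes, the Triple type and the to_formula function are all assumptions made for the example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

def is_anonymous(term):
    # Assumption for this sketch: anonymous nodes are written with a "_:" prefix.
    return term.startswith("_:")

def variable(term):
    # Strip the "_:" marker so the node reads as a plain variable name.
    return term[2:] if is_anonymous(term) else term

def to_formula(triples):
    """Render a set of triples as an existentially quantified conjunction."""
    # Each statement (s p o) becomes the two-parameter predicate p(s, o).
    # The set is unordered, so the atoms are sorted only to get stable output.
    atoms = sorted(
        f"{t.predicate}({variable(t.subject)}, {variable(t.obj)})" for t in triples
    )
    # Each anonymous node becomes a variable bound over the whole conjunction.
    variables = sorted(
        {variable(x) for t in triples for x in (t.subject, t.obj) if is_anonymous(x)}
    )
    body = " & ".join(atoms)
    if variables:
        return f"exists {', '.join(variables)} . ({body})"
    return body

doc = [
    Triple("w3c:Dan", ":livesIn", "_:place"),
    Triple("_:place", ":inState", ":Texas"),
]
formula = to_formula(doc)
print(formula)
# exists place . (:inState(place, :Texas) & :livesIn(w3c:Dan, place))
```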

How should this be represented in the model? Well, up until now, the node has been left in the graph as an unlabelled node, and the existential quantification has been implicit. The RDF system only considers one set of statements (at a time). Any RDF document just boils down to that set (bag) of statements.

From Triple to Quadruple

In practice, any real RDF processor needs to distinguish between information from different sources. This affects how it is trusted and how it is processed. For any statement, one must remember the context in which that statement occurred. A simple RDF document may be considered as the logical conjunction (and) of a set of statements. When you look at a logical expression as containing RDF statements, then the context becomes the part of the expression tree within a document in which the RDF occurred. So while the RDF model is of a flat set of statements, the logical model is a tree. For example, in the expression "{ w3c:peter a person } logical:or { w3c:peter a inanimateObject }" the curly braces are two contexts, each with one RDF statement.

When a triple is stored in such a machine, then, one can think more of a quadruple which also includes a pointer back to where the statement came from. (This won't in practice be necessary when the storage technique saves the triple inside some container which represents the context - it is possible to have systems which do not have a pointer back to the context. But once one starts indexing statements across multiple contexts, it seems to become useful.)
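
As a rough sketch of the quadruple idea, the following Python keeps each statement both in a per-context container and in an index that cuts across contexts. The Quad and Store names, and the context labels <#left> and <#right>, are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import NamedTuple

class Quad(NamedTuple):
    context: str    # pointer back to where the statement came from
    subject: str
    predicate: str
    obj: str

class Store:
    def __init__(self):
        # Container per context: here the back pointer would be redundant.
        self.by_context = defaultdict(list)
        # Index across all contexts: here the back pointer earns its keep.
        self.by_subject = defaultdict(list)

    def add(self, context, subject, predicate, obj):
        quad = Quad(context, subject, predicate, obj)
        self.by_context[context].append(quad)
        self.by_subject[subject].append(quad)
        return quad

    def contexts_saying(self, subject, predicate, obj):
        """Return the contexts in which the given statement occurs."""
        return sorted(
            q.context
            for q in self.by_subject[subject]
            if q.predicate == predicate and q.obj == obj
        )

store = Store()
# The two branches of the "logical:or" example, each in its own context.
store.add("<#left>", "w3c:peter", "rdf:type", ":person")
store.add("<#right>", "w3c:peter", "rdf:type", ":inanimateObject")
print(store.contexts_saying("w3c:peter", "rdf:type", ":person"))  # ['<#left>']
```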

Expressing quantification

Now consider what happens when you come to extend the language to include more powerful things, say "not" (see for example my Toolbox thought experiments): it then becomes important that the scope of the quantification be explicit. ("There is a time when everyone is happy" is quite different from "For everyone, there is a time when they are happy").
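
The difference between the two sentences can be checked on a small finite model. The people, times and "happy" facts below are invented for the illustration.

```python
people = ["alice", "bob"]
times = ["morning", "evening"]
# Hypothetical facts: who is happy when.
happy = {("alice", "morning"), ("bob", "evening")}

# "There is a time when everyone is happy":
#   exists t . forall p . happy(p, t)
exists_then_forall = any(all((p, t) in happy for p in people) for t in times)

# "For everyone, there is a time when they are happy":
#   forall p . exists t . happy(p, t)
forall_then_exists = all(any((p, t) in happy for t in times) for p in people)

print(exists_then_forall, forall_then_exists)  # False True
```

The same facts make one reading false and the other true, so a language that leaves the scope implicit cannot say which is meant.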

I also felt it was useful to distinguish between anonymous nodes (existentially quantified variables) and nodes with identifiers to which it is useful to refer. The statement "Dan lives in some thing that is in Texas" can indeed be written "Dan lives in some thing X, which is in Texas", if one says nothing else about X. When switching between syntaxes, or simply in reformatting an RDF document, it may become impossible to let an anonymous node remain anonymous - so generated Ids (genids) have become the norm. But in an engineering system one might be tempted to. It is also in practice much more readable to use new regenerated local identifiers for anonymous nodes in the output of a system which has merged data from several sources. So I found I was tracking the bit which represented that an Id was arbitrarily generated.

The scope of the quantification is, then, a relationship between a node in the graph and the context within which it is quantified. What more logical way to represent this than as a triple [ context forSome localObject ]? It means that the localObject has been given an arbitrary Id, but the context intends to imply that the graph is valid for some object. Note that the localObject is indeed the object, not the string which corresponds to the object's URI.

This triple only has relevance when the RDF is viewed in a larger context - when the local context of those statements is another resource. However, the logical context of the new triple is the context it refers to - or perhaps an outer context. (@@@)

In n3 notation, where {} represents the resource which is the set of enclosed statements, the statement

<#businessCard> = { w3c:dan :hasAddress [ :inState Texas . ] . }

can be represented without the anonymous node if the existential quantification is made explicit:

<#businessCard> = {
          w3c:dan  :hasAddress <#_g1> .
          <#_g1> :inState Texas .
          <#businessCard> logic:forSome <#_g1>  }

Here, "forSome" is the relationship indicating that the context between the braces is in fact true for some object here identified by <#_g1>.
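
The rewriting from an anonymous node to a generated Id plus a forSome triple can be sketched as follows. The Anon class and the make_explicit function are assumptions of this example, and the "<#_gN>" naming simply mimics the genid used above.

```python
import itertools

class Anon:
    """An anonymous node: it has identity but no name."""

def make_explicit(context, triples, counter=None):
    """Give each anonymous node a generated Id and record its quantification."""
    if counter is None:
        counter = itertools.count(1)
    names = {}

    def name(term):
        if isinstance(term, Anon):
            if term not in names:
                names[term] = f"<#_g{next(counter)}>"
            return names[term]
        return term

    out = [(name(s), p, name(o)) for (s, p, o) in triples]
    # One forSome triple per generated Id ties it to the quantifying context.
    out += [(context, "logic:forSome", genid) for genid in names.values()]
    return out

address = Anon()
card = [
    ("w3c:dan", ":hasAddress", address),
    (address, ":inState", "Texas"),
]
explicit = make_explicit("<#businessCard>", card)
for triple in explicit:
    print(*triple, ".")
# w3c:dan :hasAddress <#_g1> .
# <#_g1> :inState Texas .
# <#businessCard> logic:forSome <#_g1> .
```

Because the forSome triple survives in the output, a later stage can still tell that <#_g1> was arbitrarily generated rather than an identifier someone chose.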

Universal quantification

The same technique can be used for universal quantification: forAll.

  { { :x a:includes :y } log:implies
    { :y a:partOf :x } } log:forAll :x, :y .

I am not currently sure whether the forAll is best expressed as part of the quantified expression or outside it.
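
To make the intended reading of the includes/partOf rule concrete, here is a minimal sketch of applying a universally quantified implication to a set of facts. It handles only a single-triple premise and conclusion, and the function names and sample facts are assumptions of the example, not a description of any existing engine.

```python
def match(pattern, triple, variables, binding):
    """Extend binding so that pattern equals triple, or return None."""
    binding = dict(binding)
    for p, t in zip(pattern, triple):
        if p in variables:
            # A universally quantified variable may stand for any term,
            # but must stand for the same term throughout the pattern.
            if binding.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return binding

def apply_rule(premise, conclusion, variables, facts):
    """Return the conclusions licensed by one premise/conclusion pair."""
    derived = set()
    for fact in facts:
        binding = match(premise, fact, variables, {})
        if binding is not None:
            derived.add(tuple(binding.get(term, term) for term in conclusion))
    return derived

facts = {
    (":car", "a:includes", ":engine"),
    (":engine", "a:madeOf", ":steel"),
}
derived = apply_rule(
    premise=(":x", "a:includes", ":y"),
    conclusion=(":y", "a:partOf", ":x"),
    variables={":x", ":y"},
    facts=facts,
)
print(derived)  # {(':engine', 'a:partOf', ':car')}
```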

Ambiguities

An obvious question arises when universal and existential quantification occur bound to the same context: what is the precedence? The following can be read as, "This document is true for all x. There is that which loves x".

<> l:forAll :x .
[ soc:loves :x ]

In principle it could mean that for all x, there exists y such that y loves x, or that there exists y such that for all x, y loves x. The (arbitrary) choice we make is that the forAll has wider scope than the forSome.
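
The precedence choice can be written down as a small rendering function that always puts the universals of a context outside its existentials. This is a sketch under assumptions: the anonymous node is shown with a generated Id <#_g1> and an explicit l:forSome triple (a spelling not used above), and context_formula is a made-up name.

```python
QUANTIFIERS = ("l:forAll", "l:forSome")

def context_formula(context, triples):
    """Render one context as a formula, with forAll outside forSome."""
    universals = [o for (s, p, o) in triples if s == context and p == "l:forAll"]
    existentials = [o for (s, p, o) in triples if s == context and p == "l:forSome"]
    atoms = [f"{p}({s}, {o})" for (s, p, o) in triples if p not in QUANTIFIERS]
    # The arbitrary choice: every forAll has wider scope than every forSome.
    prefix = [f"forall {v}" for v in universals]
    prefix += [f"exists {v}" for v in existentials]
    return " . ".join(prefix + [" & ".join(atoms)])

doc = [
    ("<>", "l:forSome", "<#_g1>"),   # listed first on purpose
    ("<>", "l:forAll", ":x"),
    ("<#_g1>", "soc:loves", ":x"),
]
rendered = context_formula("<>", doc)
print(rendered)
# forall :x . exists <#_g1> . soc:loves(<#_g1>, :x)
```

The order in which the quantifier triples appear in the document does not matter; only the convention does.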

Nested contexts

The scope of a quantifier is traditionally lexical. In the test case

<#premise> = {  <#p> daml:inverse <#q> } .
<#conclusion> = { <#q> daml:inverse <#p> } .
{ <#premise> log:implies <#conclusion> } forall <#p>, <#q> .

what is the meaning, if anything? The quantified context { <#premise> log:implies <#conclusion> } does not lexically contain the variables.

If lexical nesting is to be the rule, then it must be represented in the model. Yuk, another link, you say. Indeed - another link, "lexicallyWithin".
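
One way to picture such a link is to treat each pair of braces as a node in a tree and ask whether a term occurs anywhere beneath it. The Formula class and lexically_within function below are hypothetical, a sketch of the idea rather than a proposal for the model.

```python
class Formula:
    """A context: the set of statements between one pair of braces."""
    def __init__(self, *statements):
        self.statements = statements

def lexically_within(term, formula):
    """True if term occurs in formula or in any formula nested inside it."""
    for statement in formula.statements:
        for part in statement:
            if part == term:
                return True
            if isinstance(part, Formula) and lexically_within(term, part):
                return True
    return False

# The test case above, with each {} written as a Formula.
premise = Formula(("<#p>", "daml:inverse", "<#q>"))
conclusion = Formula(("<#q>", "daml:inverse", "<#p>"))
quantified = Formula(("<#premise>", "log:implies", "<#conclusion>"))
document = Formula(
    ("<#premise>", "=", premise),
    ("<#conclusion>", "=", conclusion),
    (quantified, "forall", "<#p>"),
    (quantified, "forall", "<#q>"),
)

print(lexically_within("<#p>", quantified))  # False
print(lexically_within("<#p>", document))    # True
```

The quantified context mentions the premise and conclusion only by name, so under a strictly lexical rule the variables fall outside it.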

@@@

Problems?

This is very like the mapping of quantification into RDF which DanC used earlier and found it had some hole in it. What was the problem? In a revision of that mapping grounded in KIF, I couldn't work out the use/mention issues; I'm not sure this is a first order logic -- DanC