Graphs Design 6.1/Sem

From RDF Working Group Wiki

Extensions to the RDF Semantics

Dataset interpretations

Let DS = (G, (u1,G1), … , (un, Gn)) be a dataset, where G is the Default Graph and Gi, i=1,…,n, are the named graphs.

Let the vocabulary for the dataset be:

V(DS) = V(G) ∪ {ui: i = 1,…,n} ∪ {rdf:hasGraph, rdf:Graph} ∪ rdfV

where V(G) is the vocabulary set of G, and rdfV is the RDF Vocabulary (as defined in the RDF Semantics document). Let I be an RDF interpretation on V(DS), such that the following conditions also hold:

  1. The range of I is a superset of {Gi : i = 1,…,n}
  2. {Gi : i = 1,…,n} ∩ LV = ∅
  3. If x ∈ V(DS) and y ∈ V(DS) and I(x) = I(y) = G, where G ∈ {Gi : i = 1,…,n}, then x = y. (Per Andy's remark, this condition is probably an unnecessary restriction and should be taken out.)
  4. If <I(ui),I(rdf:Graph)> ∈ IEXT(I(rdf:type)) then I(ui) = Gi
  5. ∀ui: ∃! xi ∈ V(DS): <I(ui),I(xi)> ∈ IEXT(I(rdf:hasGraph)) and I(xi) = Gi

Then I is an RDF-interpretation of the dataset DS. If rdfV is replaced by the corresponding RDFS or OWL Vocabulary, the same definition extends to these automatically. (This means a definition for the RDF Compatible Semantics of OWL; I am not sure how this can be translated to the Direct Semantics.)
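The construction of V(DS) can be sketched in a few lines of Python. This is only an illustration: graphs are modelled as sets of (subject, predicate, object) string triples, blank nodes are assumed to carry a `_:` prefix, `RDF_V` is a tiny stand-in for the real RDF Vocabulary, and the function names are hypothetical.

```python
RDF_HAS_GRAPH = "rdf:hasGraph"
RDF_GRAPH = "rdf:Graph"
# Assumption: a small illustrative subset standing in for rdfV.
RDF_V = frozenset({"rdf:type", "rdf:Property"})


def graph_vocabulary(graph):
    """Names used in a graph of (s, p, o) triples; blank nodes are not names."""
    return {term for triple in graph for term in triple
            if not term.startswith("_:")}


def dataset_vocabulary(default_graph, named_graphs):
    """V(DS) = V(G) ∪ {ui} ∪ {rdf:hasGraph, rdf:Graph} ∪ rdfV."""
    labels = {label for label, _ in named_graphs}
    return (graph_vocabulary(default_graph) | labels
            | {RDF_HAS_GRAPH, RDF_GRAPH} | RDF_V)


default_graph = frozenset({("ex:a", "ex:p", "ex:b")})
named_graphs = [("ex:u1", frozenset({("ex:c", "ex:q", "_:x")}))]
vocab = dataset_vocabulary(default_graph, named_graphs)
```

Note that, following the definition literally, only the Default Graph and the labels contribute names: `ex:c` and `ex:q`, which occur only inside a named graph, are not in `vocab`.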

A few words about each condition:

  1. Condition #1 means that the Gi can be “denoted” by terms in V(DS). It does not say what this “denoting” means; for example, it does not refer to the HTTP GET semantics (nor do I believe that can be explained within this semantics).
  2. Condition #2 states that these graphs are not literals and that literals cannot denote a graph.
  3. Condition #3 ensures that this denotation is unique within the vocabulary, i.e., that the same Graph cannot be denoted by two different terms. (Per Andy's remark, this condition is probably an unnecessary restriction and should be taken out.)
  4. Condition #4 defines the type for these denoting terms.
  5. Condition #5 describes what graph labeling means: there is one and only one term in the vocabulary that denotes the corresponding Graph, and the label ui is related to that term by rdf:hasGraph. Due to the restrictions of RDF this also means that ui cannot be a literal (a literal cannot appear as the subject of a triple).
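To make the conditions concrete, here is a rough Python check over a finite toy interpretation. Everything in it is an assumption made for illustration: `denote` plays the role of I on names, `iext` maps a property's denotation to its extension, `lv` is the set of literal values, graphs are frozensets of triples, and condition #3 is left out in line with the remark above. It is a sketch of the definition, not a reasoner.

```python
def is_dataset_interpretation(vocab, named_graphs, denote, iext, lv):
    """Check conditions #1, #2, #4 and #5 on a finite toy interpretation."""
    graphs = [g for _, g in named_graphs]
    value_range = set(denote.values())
    # Condition 1: every Gi is in the range of I.
    if any(g not in value_range for g in graphs):
        return False
    # Condition 2: no Gi is a literal value.
    if any(g in lv for g in graphs):
        return False
    type_ext = iext.get(denote["rdf:type"], set())
    has_graph_ext = iext.get(denote["rdf:hasGraph"], set())
    for label, graph in named_graphs:
        # Condition 4: a label typed as rdf:Graph must denote Gi itself.
        typed = (denote[label], denote["rdf:Graph"]) in type_ext
        if typed and denote[label] != graph:
            return False
        # Condition 5: exactly one term denotes Gi and is the object of
        # rdf:hasGraph with the label as subject.
        witnesses = [x for x in vocab
                     if denote[x] == graph
                     and (denote[label], denote[x]) in has_graph_ext]
        if len(witnesses) != 1:
            return False
    return True


g1 = frozenset({("ex:a", "ex:b", "ex:c")})
vocab = {"ex:u", "ex:g", "rdf:type", "rdf:Graph", "rdf:hasGraph"}
denote = {"ex:u": "U", "ex:g": g1, "rdf:type": "TYPE",
          "rdf:Graph": "GRAPHCLASS", "rdf:hasGraph": "HASGRAPH"}
named_graphs = [("ex:u", g1)]

ok_result = is_dataset_interpretation(
    vocab, named_graphs, denote, {"HASGRAPH": {("U", g1)}}, set())
bad_result = is_dataset_interpretation(
    vocab, named_graphs, denote, {"HASGRAPH": set()}, set())
```

In the first call the label `ex:u` is linked by rdf:hasGraph to the single term `ex:g` that denotes the graph, so the check passes; in the second call that link is missing and condition #5 fails.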

Some consequences, and relations to the discussion on the mailing list

Unicity of the denotation (i.e., can the same graph be denoted by two different URIs?)

Condition #5 ensures that this unicity holds for a specific dataset.

In a more general sense: it is possible to talk about the union of two datasets:

DS1 = (G, (u1,G1), … , (un, Gn))
DS2 = (H, (v1,H1), … , (vk, Hk))

then, roughly:

DS1 ∪ DS2 = (G ∪ H, (u1,G1), … , (un, Gn), (v1,H1), … , (vk, Hk))

(the exact mathematical formula should make sure that if ui = vj and Gi = Hj, then the corresponding pair appears only once in the dataset.)
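A minimal Python sketch of this union, under stated assumptions: a dataset is modelled as a pair (default graph, list of (label, graph) pairs), graphs are frozensets of triples, and the default graphs are combined by plain set union (no blank node renaming, since blank node sharing is not settled here). The name `dataset_union` is hypothetical.

```python
def dataset_union(ds1, ds2):
    """Union of two datasets; identical (label, graph) pairs appear once."""
    default1, named1 = ds1
    default2, named2 = ds2
    merged = []
    for pair in list(named1) + list(named2):
        if pair not in merged:  # drop a pair only if label AND graph agree
            merged.append(pair)
    return (default1 | default2, merged)


g = frozenset({("ex:a", "ex:p", "ex:b")})
h = frozenset({("ex:c", "ex:p", "ex:d")})
ds1 = (frozenset({("ex:x", "ex:p", "ex:y")}), [("ex:u", g)])
ds2 = (frozenset({("ex:z", "ex:p", "ex:w")}), [("ex:u", g), ("ex:v", h)])

union_default, union_named = dataset_union(ds1, ds2)
```

Here the pair `("ex:u", g)` occurs in both inputs and is kept once; a pair with the same label but a different graph would be kept as a second entry.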

Coming back to the issue of unicity: if it so happens that we have:

  1. ∃i: i=1,…,n and ∃j: j=1,…,k such that ui = vj, and
  2. (ui, rdf:type, rdf:Graph) and (vj, rdf:type, rdf:Graph) both hold, and
  3. Gi ≠ Hj

then DS1 ∪ DS2 is inconsistent (i.e., there can be no valid interpretation). This is as far as we can go in terms of semantics.
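The three conditions can be turned into a simple syntactic check. The sketch below makes a strong simplifying assumption: a typing triple “holds” if it is asserted in the union of the two default graphs (since ui = vj, the two typing triples coincide). Datasets are modelled as (default graph, list of (label, graph) pairs), and `conflicting_labels` is a hypothetical name.

```python
def conflicting_labels(ds1, ds2):
    """Labels that would make the union of ds1 and ds2 inconsistent."""
    (default1, named1), (default2, named2) = ds1, ds2
    merged_default = default1 | default2
    conflicts = set()
    for u, g in named1:
        for v, h in named2:
            # Assumption: "holds" is approximated by explicit assertion.
            typed = (u, "rdf:type", "rdf:Graph") in merged_default
            if u == v and typed and g != h:
                conflicts.add(u)
    return conflicts


g = frozenset({("ex:a", "ex:p", "ex:b")})
h = frozenset({("ex:a", "ex:p", "ex:c")})
typing = frozenset({("ex:u", "rdf:type", "rdf:Graph")})

typed_conflicts = conflicting_labels((typing, [("ex:u", g)]),
                                     (frozenset(), [("ex:u", h)]))
untyped_conflicts = conflicting_labels((frozenset(), [("ex:u", g)]),
                                       (frozenset(), [("ex:u", h)]))
```

With the typing triple present, `ex:u` is reported; without it, the same label attached to two different graphs is not flagged.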

Note that the second condition above is essential to establish inconsistency; this means that there may be several labels for the same Graph. If we want to enforce unicity there too, this should be added to the interpretation; however, the difference between labels and denoting terms then becomes insignificant…

Subgraphs or Graphs

This issue came up on the mailing list; in case of:

<u> { ... }

Do the curly brackets denote the graph labeled by <u> or a subgraph thereof? The semantics is pretty much in terms of identity, and it is not clear how the “subgraph” feature could be added to it. That being said, this issue might be one of serialization rather than of semantics. After all, it is perfectly feasible to have

<u> { ... }
{ ... }
<u> { ... }

But, conceptually, <u> labels the union of the two sets of terms in curly brackets, which are taken to be in the same graph.
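Read this way, a parser would simply accumulate repeated blocks under one label. A small Python sketch, assuming the serialization has already been parsed into (label, triples) blocks with `None` standing for the unlabeled default graph block (both the input shape and the name `collect_graphs` are hypothetical):

```python
def collect_graphs(blocks):
    """Merge repeated blocks: each label maps to the union of its blocks."""
    graphs = {}
    for label, triples in blocks:
        graphs.setdefault(label, set()).update(triples)
    return graphs


blocks = [
    ("ex:u", {("ex:a", "ex:p", "ex:b")}),
    (None, {("ex:x", "ex:p", "ex:y")}),   # default graph block
    ("ex:u", {("ex:c", "ex:p", "ex:d")}),
]
graphs = collect_graphs(blocks)
```

After this pass, `graphs["ex:u"]` holds the triples of both `<u>` blocks, which matches the reading that the label refers to the whole graph rather than to each fragment.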

Graphs are quoted

The consequence of the definition is that the Gi are really “quoted” here. For example:

{ <b> rdf:type owl:FunctionalProperty . }
<u> { <a> <b> <c>. 
      <a> <b> <d>. 
      <c> owl:differentFrom <d> .}

is not inconsistent, because the interpretation on the default graph is not carried over to the named graphs themselves.
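The effect of quoting can be imitated with a toy, purely syntactic clash detector (not an OWL reasoner). The sketch assumes graphs are sets of string triples, uses `ex:` names in place of `<a>`, `<b>`, `<c>`, `<d>`, and looks only for the one pattern in the example: a functional property with two values declared different. The function name is hypothetical.

```python
FUNCTIONAL = ("rdf:type", "owl:FunctionalProperty")


def functional_clashes(graph):
    """Find (s, p, o1, o2) where p is functional and o1 differs from o2."""
    functional = {s for s, p, o in graph if (p, o) == FUNCTIONAL}
    different = {(s, o) for s, p, o in graph if p == "owl:differentFrom"}
    clashes = set()
    for s1, p1, o1 in graph:
        for s2, p2, o2 in graph:
            if p1 in functional and p1 == p2 and s1 == s2:
                if (o1, o2) in different:
                    clashes.add((s1, p1, o1, o2))
    return clashes


default_graph = {("ex:b", "rdf:type", "owl:FunctionalProperty")}
named_graph = {("ex:a", "ex:b", "ex:c"),
               ("ex:a", "ex:b", "ex:d"),
               ("ex:c", "owl:differentFrom", "ex:d")}

quoted_default = functional_clashes(default_graph)
quoted_named = functional_clashes(named_graph)
unquoted = functional_clashes(default_graph | named_graph)
```

Checked separately, neither graph yields a clash; only when the triples are pooled (which the dataset semantics above does not do) does the functional property declaration meet the two distinct values.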

Shared blank nodes

This is an open issue: the Working Group has said, up to now, that blank nodes are shared among different Graphs within the same dataset, but it is not clear at this moment how to fold that into the semantics.

Actually… it is worth noting that there is nothing in the various semantic conditions that makes use of the fact that the Gi-s are graphs. They could be anything… Not clear whether this is a bug or a feature!