GRAPH constraints

SHEX currently has no support for graphs. This page is a first proposal for adding that support.

The proposal consists of the following five definitions:

Graph definition
Value type that references a graph
Reference to a shape within another graph
Triple stored within another graph
Use of a graph as a shape

First, we define shape expressions for each graph in the database using the following proposed syntax:

ex:GraphUsers [[ #Shape expression definition for the graph itself
  ex:UserShape {
   foaf:name xsd:string .
 }
]]

ex:GraphReport [[ 
  ex:ReportShape {
   ex:title xsd:string+ .
 }
]]

With the following syntax a subject can reference a graph:

ex:GraphUsers [[
  ex:UserShape {
   foaf:name xsd:string .
   ex:reportSet []ex:GraphReport . #reference to the GraphReport Graph
 }
]]

ex:GraphReport [[ 
  ex:ReportShape {
   ex:title xsd:string+ .
 }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" ;
           ex:reportSet ex:reportSet1 .
  ex:user2 foaf:name "user2" ;
           ex:reportSet ex:reportSet2 .
}

ex:reportSet1 {
  ex:report1 ex:title "report1" .
  ex:report2 ex:title "report2" .
}

ex:reportSet2 {
  ex:report3 ex:title "report3" .
  ex:report4 ex:title "report4" .
}
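
The check implied by a graph-valued arc can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the proposal: the dataset is held as a plain dictionary from graph name to triples, `Literal`, `conforms_to_graph_report` and `dangling_report_sets` are hypothetical names, and "conforms to ex:GraphReport" is read as "the graph exists and every subject in it has at least one ex:title with an xsd:string value".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal."""
    value: str
    datatype: str = "xsd:string"


# Dataset: graph name -> list of (subject, predicate, object) triples.
dataset = {
    "ex:allusers": [
        ("ex:user1", "foaf:name", Literal("user1")),
        ("ex:user1", "ex:reportSet", "ex:reportSet1"),
        ("ex:user2", "foaf:name", Literal("user2")),
        ("ex:user2", "ex:reportSet", "ex:reportSet2"),
    ],
    "ex:reportSet1": [
        ("ex:report1", "ex:title", Literal("report1")),
        ("ex:report2", "ex:title", Literal("report2")),
    ],
    "ex:reportSet2": [
        ("ex:report3", "ex:title", Literal("report3")),
        ("ex:report4", "ex:title", Literal("report4")),
    ],
}


def conforms_to_graph_report(graph_name):
    """Assumed reading of ex:GraphReport: the graph exists and every subject
    in it has at least one ex:title arc, all with xsd:string values."""
    triples = dataset.get(graph_name)
    if not triples:
        return False
    titles = {}
    for s, p, o in triples:
        values = titles.setdefault(s, [])  # register every subject
        if p == "ex:title":
            values.append(o)
    return all(
        values and all(isinstance(o, Literal) and o.datatype == "xsd:string"
                       for o in values)
        for values in titles.values()
    )


def dangling_report_sets(users_graph):
    """Return ex:reportSet values that do not name a conforming graph."""
    return [o for _, p, o in dataset[users_graph]
            if p == "ex:reportSet" and not conforms_to_graph_report(o)]


print(dangling_report_sets("ex:allusers"))
```

An ex:reportSet value that names no graph in the dataset (for example a misspelled graph IRI) would show up in the returned list.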

With the following syntax a shape can reference a shape definition inside another graph.

NOTE: If no graph is given, the shape is looked up in the current shape graph (the 'default shape graph').

ex:GraphUsers [[ 
  ex:UserShape {
   foaf:name xsd:string .
   ex:report [@ReportShape+ -> []ex:GraphReport] . #reference to shape ReportShape in the graph GraphReport
 }
]]

ex:GraphReport [[
  ex:ReportShape {
   ex:title xsd:string .
 }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" ;
           ex:report ex:report1, ex:report2.
  ex:user2 foaf:name "user2" ;
           ex:report ex:report3, ex:report4.
}

ex:reportSet1 {
  ex:report1 ex:title "report1" .
  ex:report2 ex:title "report2" .
  ex:report3 ex:title "report3" .
  ex:report4 ex:title "report4" .
}
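
A rough Python sketch of how such a cross-graph shape reference might be resolved follows. It assumes the dataset is a list of quads, that ex:ReportShape means "exactly one ex:title with an xsd:string value", and that the referenced node may sit in any graph other than the referring one. `Literal`, `matches_report_shape`, `graphs_where_report` and `unresolved_reports` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal."""
    value: str
    datatype: str = "xsd:string"


# Quads: (graph, subject, predicate, object).
quads = [
    ("ex:allusers", "ex:user1", "ex:report", "ex:report1"),
    ("ex:allusers", "ex:user1", "ex:report", "ex:report2"),
    ("ex:allusers", "ex:user2", "ex:report", "ex:report3"),
    ("ex:allusers", "ex:user2", "ex:report", "ex:report4"),
    ("ex:reportSet1", "ex:report1", "ex:title", Literal("report1")),
    ("ex:reportSet1", "ex:report2", "ex:title", Literal("report2")),
    ("ex:reportSet1", "ex:report3", "ex:title", Literal("report3")),
    ("ex:reportSet1", "ex:report4", "ex:title", Literal("report4")),
]


def matches_report_shape(node, graph):
    """Assumed ex:ReportShape: exactly one ex:title arc with an xsd:string."""
    titles = [o for g, s, p, o in quads
              if g == graph and s == node and p == "ex:title"]
    return (len(titles) == 1
            and isinstance(titles[0], Literal)
            and titles[0].datatype == "xsd:string")


def graphs_where_report(node, home_graph):
    """Graphs other than home_graph in which node matches ex:ReportShape."""
    graphs = {g for g, _, _, _ in quads if g != home_graph}
    return sorted(g for g in graphs if matches_report_shape(node, g))


def unresolved_reports(home_graph):
    """ex:report values that match ex:ReportShape in no other graph."""
    return [o for g, _, p, o in quads
            if g == home_graph and p == "ex:report"
            and not graphs_where_report(o, home_graph)]


print(unresolved_reports("ex:allusers"))
```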

With the following syntax we can state that the triples for a certain arc are stored in another graph.

NOTE: If no graph is given, the triples for an arc are expected in the current graph (the 'default graph').

ex:GraphUsers [[
  ex:UserShape {
   foaf:name xsd:string .
   [ex:age xsd:integer] -> []ex:AgeGraph . #this triple is stored within the specified graph
 }
]]

ex:AgeGraph [[ 
   ex:AgeShape { #extra/double validation inside the AgeGraph itself
     ex:age xsd:integer 
   }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" .
  ex:user2 foaf:name "user2" .
}

ex:ageGraph {
  ex:user1 ex:age 24.
  ex:user2 ex:age 28.
}
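
The following Python sketch shows one way a validator could look for an arc in a graph other than the subject's own. It is an illustration only: the quad list uses one ex:age value per user, IRIs and string literals are both plain Python strings for brevity, integers stand in for xsd:integer, and `ages_outside_home` and `check_user` are hypothetical names.

```python
# Quads: (graph, subject, predicate, object).
quads = [
    ("ex:allusers", "ex:user1", "foaf:name", "user1"),
    ("ex:allusers", "ex:user2", "foaf:name", "user2"),
    ("ex:ageGraph", "ex:user1", "ex:age", 24),
    ("ex:ageGraph", "ex:user2", "ex:age", 28),
]


def ages_outside_home(subject, home_graph):
    """Collect (graph, value) pairs for ex:age arcs stored outside home_graph."""
    return [(g, o) for g, s, p, o in quads
            if s == subject and p == "ex:age" and g != home_graph]


def check_user(subject, home_graph="ex:allusers"):
    """Assumed reading of '[ex:age xsd:integer] -> []ex:AgeGraph':
    exactly one integer ex:age arc, stored outside the home graph,
    and no ex:age arc inside the home graph."""
    in_home = [o for g, s, p, o in quads
               if g == home_graph and s == subject and p == "ex:age"]
    outside = ages_outside_home(subject, home_graph)
    return (not in_home
            and len(outside) == 1
            and isinstance(outside[0][1], int))


print(check_user("ex:user1"), check_user("ex:user2"))
```

A full validator would additionally check that the graph holding the arc conforms to ex:AgeGraph; that step is left out here.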

With the following syntax the graph IRI can be used as a subject, so that we can say something about the graph itself.

ex:GraphUsers [[
  ex:UserShape {
   foaf:name xsd:string .
   [ex:age xsd:integer] -> []ex:AgeGraph . #this triple is stored within the specified graph
 }
 []ex:AgeGraph { #some extra information on the age graph
   ex:source xsd:string .
 }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" .
  ex:user2 foaf:name "user2" .
  ex:ageGraph ex:source "world wide web" .
}

ex:ageGraph {
  ex:user1 ex:age 24.
  ex:user2 ex:age 28.
}

Combining several of these constructs, we can define the following:

ex:GraphUsers [[ 
  ex:UserShape {
   foaf:name xsd:string .
   [ex:report [@ReportShape+ -> []ex:GraphReport]] -> []ex:GraphReportLink . #reference to shape ReportShape in the graph GraphReport and triple is stored in the GraphReportLink graph shape
   }
]]

ex:GraphReportLink [[
  ex:ReportLink { #double validation
    ex:report [@ReportShape+ -> []ex:GraphReport] .
  }
]]

ex:GraphReport [[
  ex:ReportShape {
   ex:title xsd:string .
 }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" .
  ex:user2 foaf:name "user2" .
}
  
ex:reportLink {
  ex:user1 ex:report ex:report1, ex:report2.
  ex:user2 ex:report ex:report3, ex:report4.
}

ex:reportSet1 {
  ex:report1 ex:title "report1" .
  ex:report2 ex:title "report2" .
}
ex:reportSet2 {
  ex:report3 ex:title "report3" .
  ex:report4 ex:title "report4" .
}

Problems

This initial proposal can already express quite a lot; however, there are several problems.

No method exists to state that the graph and the shape subject are the same

In the following use case we would like to state that the graph and the subject should be the same IRI; however, that is not possible with the initial proposal given here.

ex:interaction123 { 
  ex:interaction123 :upregulates ex:protein456
}
ex:protein456 { 
  ex:protein456 ex:name "lexa"
}

Schema definition:

ex:Interaction [[
  []ex:Interaction {
    ex:protein []ex:Protein, #reference to graph
    ex:protein [@[]ex:Protein -> ex:Protein] #as well reference to the shape within 
  }
]]

ex:Protein [[
  []ex:Protein { #could be IRI of an other protein graph
    ex:name xsd:string
  }
]]

One option to solve this would be some kind of variable binding, as in SPARQL; however, this could dramatically increase the expressiveness of the language and the complexity of the validation process. It would look something like this:

ex:Interaction [[
  []ex:Interaction {
    ex:protein []ex:Protein, #reference to graph
    ex:protein [@[]ex:Protein -> ex:Protein] #as well reference to the shape within 
  }
]]

ex:Protein ?uri [[ #bind uri of the graph
  []ex:Protein ?uri { #shape uri should be the same of the one of the graph
    ex:name xsd:string
  }
]]

However, the use of bindable variables in SHEX would be a discussion of its own.
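
To make the intended constraint concrete, here is a small Python sketch of the check that such a binding would express. It assumes one particular reading, namely that every subject inside a graph must be the graph IRI itself, and it works on a plain list of quads. `graphs_violating_binding` is a hypothetical name, and the data repeats the interaction/protein use case.

```python
# Quads: (graph, subject, predicate, object).
quads = [
    ("ex:interaction123", "ex:interaction123", ":upregulates", "ex:protein456"),
    ("ex:protein456", "ex:protein456", "ex:name", "lexa"),
]


def graphs_violating_binding(quads):
    """Return graphs that contain a subject different from the graph IRI.

    This mimics binding ?uri to the graph name and requiring the shape's
    subject to carry the same value.
    """
    return sorted({g for g, s, _, _ in quads if s != g})


print(graphs_violating_binding(quads))
```

Even this simple equality test needs access to the graph name during node validation, which hints at the added complexity mentioned above.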

Defining the (reverse) multiplicity between a subject and the graph is impossible

For the following definition both of the data sets that follow are valid; there is no method to say something about the multiplicity between a graph and a subject.

ex:GraphUsers [[
  ex:UserShape {
   foaf:name xsd:string .
   [ex:report [@ReportShape+ -> []ex:GraphReport]] -> []ex:GraphReportLink . #reference to shape ReportShape in the graph GraphReport and triple is stored in the GraphReportLink graph shape
   }
]]

ex:GraphReportLink [[
  ex:ReportLink { #double validation
    ex:report [@ReportShape+ -> []ex:GraphReport] .
  }
]]

ex:GraphReport [[
  ex:ReportShape {
   ex:title xsd:string .
 }
]]

Example data

ex:allusers {
  ex:user1 foaf:name "user1" .
  ex:user2 foaf:name "user2" .
}
  
ex:reportLink {
  ex:user1 ex:report ex:report1, ex:report2.
  ex:user2 ex:report ex:report3, ex:report4.
}

ex:reportSet1 {
  ex:report1 ex:title "report1" .
  ex:report2 ex:title "report2" .
}
ex:reportSet2 {
  ex:report3 ex:title "report3" .
  ex:report4 ex:title "report4" .
}

However, the following data also fits, which could be unwanted:

ex:reportSet1 {
  ex:report1 ex:title "report1" .
  ex:report2 ex:title "report2" .
  ex:report3 ex:title "report3" .
  ex:report4 ex:title "report4" .
}

We could solve this by defining some kind of multiplicity after the -> sign, which would look something like this:

ex:GraphUsers [[
  ex:UserShape {
   foaf:name xsd:string .
   [ex:report [@ReportShape+ ->1 []ex:GraphReport]] ->1:1 []ex:GraphReportLink . #all ex:report triples must be found in one graph, and that graph may not contain any other triples matching this arc and subject
   }
]]

ex:GraphReportLink [[
  ex:ReportLink { #double validation
    ex:report [@ReportShape+ ->1:1 []ex:GraphReport] . #all definitions must be found in one graph, and that graph may only contain references from this arc and subject
  }
]]

ex:GraphReport [[
  ex:ReportShape {
   ex:title xsd:string .
 }
]]

However, the exact details, complexity and related problems are not clear at the moment.
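
As one possible reading only, the 1:1 multiplicity could be checked as in the Python sketch below. The assumptions are that "1:1" means each subject's referenced reports all live in exactly one graph and that no such graph is shared between two subjects, and that a report is "located" in a graph when it has an ex:title there. `report_graphs_per_user`, `satisfies_one_to_one`, `titles`, `split` and `merged` are hypothetical names.

```python
from collections import defaultdict

# Link triples, stored in the ex:reportLink graph.
links = [("ex:reportLink", u, "ex:report", r) for u, r in [
    ("ex:user1", "ex:report1"), ("ex:user1", "ex:report2"),
    ("ex:user2", "ex:report3"), ("ex:user2", "ex:report4"),
]]


def titles(placement):
    """Build ex:title quads from a report -> graph placement."""
    return [(g, r, "ex:title", r.split(":")[1]) for r, g in placement.items()]


# Reports spread over two graphs, one graph per user.
split = links + titles({
    "ex:report1": "ex:reportSet1", "ex:report2": "ex:reportSet1",
    "ex:report3": "ex:reportSet2", "ex:report4": "ex:reportSet2",
})
# All reports in a single graph.
merged = links + titles({r: "ex:reportSet1" for r in
                         ("ex:report1", "ex:report2", "ex:report3", "ex:report4")})


def report_graphs_per_user(quads, link_graph):
    """Map each subject to the set of graphs holding its referenced reports."""
    located = defaultdict(set)  # report -> graphs where it has an ex:title
    for g, s, p, _ in quads:
        if p == "ex:title":
            located[s].add(g)
    per_user = defaultdict(set)
    for g, s, p, o in quads:
        if g == link_graph and p == "ex:report":
            per_user[s] |= located[o]
    return dict(per_user)


def satisfies_one_to_one(per_user):
    """Each subject maps to exactly one graph, and no graph is shared."""
    graphs = [next(iter(gs)) for gs in per_user.values() if len(gs) == 1]
    return len(graphs) == len(per_user) and len(set(graphs)) == len(graphs)


print(satisfies_one_to_one(report_graphs_per_user(split, "ex:reportLink")))
print(satisfies_one_to_one(report_graphs_per_user(merged, "ex:reportLink")))
```

Under this reading the data set with two report graphs passes, while the one that puts all four reports into ex:reportSet1 is rejected.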

Has a complex effect on the validation process