Shacl-sparql-presentation

From RDF Data Shapes Working Group

A SHACL Specification based on SPARQL, Presentation

A SHACL based on SPARQL

Basis of the Specification

  • Constraint language, in the style of OWL Constraints, Stardog ICV, SPIN, RDFUnit
  • Uses a simple translation to a single SPARQL query per constraint
    • Constraint violations are the result of the query when run on a standard SPARQL engine under the RDFS entailment regime
    • RDFS vocabulary is interpreted as in RDFS
    • Specification is very simple, and standard
  • There is nothing special about a shape or constraint that is also a class
  • Validation takes two arguments
    • (1) a constraint graph and (2) a data and ontology graph (or dataset)
      • Could be the same graph, but no special treatment if so
    • No access to the web at large, except via directly using SPARQL
  • The RDF encoding hides the details of SPARQL
    • A profile allowing only certain RDF-based constraints does not need a full SPARQL engine for implementation
    • Adding a new RDF-based shape construct is simply a matter of determining its translation to SPARQL, no extra work is needed

Simple Example of a constraint

  • the offspring of people are all people
ex:Person <= all ex:offspring ex:Person (OWL Constraints, using a publication language for DLs)

Data Graph:

 ex:Student rdfs:subClassOf ex:Person .
 ex:John rdf:type ex:Person .
 ex:Mary rdf:type ex:Student .
 ex:John ex:offspring ex:Mary .
 ex:Mary ex:offspring ex:Susan .

Constraint Graph:

 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope ex:Person ;
   sh:shape [ rdf:type sh:Shape ;
   	     sh:predicate ex:offspring ;
  	     sh:valueType ex:Person ]
 ]

Results:

  • John is OK because his only stated offspring is stated to be a student and thus a person
  • Mary is a violation because Susan is not stated to be a person
  • Susan is irrelevant because she is not stated to be a person
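
As a rough illustration of the single-query translation, the constraint above could correspond to a query like the following. This is a hand-written sketch, not the translation defined by the specification. The ex: namespace IRI is a placeholder, and the query is assumed to run under the RDFS entailment regime so that ex:Mary is also matched as an ex:Person.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>   # placeholder namespace (assumption)

# Scope: every node that is an ex:Person.
# Violation: the node has an ex:offspring value that is not an ex:Person.
SELECT DISTINCT ?this
WHERE {
  ?this rdf:type ex:Person .
  ?this ex:offspring ?value .
  FILTER NOT EXISTS { ?value rdf:type ex:Person }
}
```

On the data graph above, this query would be expected to return only ex:Mary, in line with the results listed.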

Simplified View of the Proposal

  • A constraint graph contains constraints
  • A constraint has both a scope and a shape
  • Basic operation is validating a constraint graph against a data graph
    • Validate each constraint in the constraint graph against the data graph
  • Validating a constraint against a data graph is essentially verifying that every node that belongs to the scope also belongs to the shape

Scope of a constraint is either

  • A single node
  • Instances of a class
  • Nodes that belong to an RDF-encoded shape
  • Nodes that "satisfy" a SPARQL expression

Shape of a constraint is either

  • Nodes that belong to an RDF-encoded shape
  • Nodes that "satisfy" a SPARQL expression

RDF encoding for shapes

  • Similar to that in https://w3c.github.io/data-shapes/shacl/
  • Does not add any expressive power
    • Simply a way of insulating from complexities of SPARQL
  • Does not cover all of SPARQL
  • Currently includes
    • conjunction, disjunction
    • individual, class instances
    • cardinality, value, allowed values, node type, value type, value shape
  • Adding other constructs is simple as long as they have a SPARQL translation

Examples

UC1: The model is broken

  • classes and properties in an ontology graph have to conform to a particular (but perhaps stupid) ontology design philosophy
{ rdfs:Class } <= <=0 rdfs:subClassOf
rdfs:Class <= {rdfs:Class} | ( =1 rdfs:subClassOf & all rdfs:subClassOf rdfs:Class )
no non-trivial rdfs:subClassOf loops
rdf:Property <= =1 rdfs:domain & all rdfs:domain rdfs:Class & =1 rdfs:range & all rdfs:range rdfs:Class
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:individualScope rdfs:Class ;
   sh:shape [ rdf:type sh:Shape ;
   	     sh:predicate rdfs:subClassOf ;
  	     sh:cardinality 0 ]
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope rdfs:Class ;
   sh:shape 
     [ sh:or ( [ rdf:type sh:Shape ;
       	      	sh:individual rdfs:Class ] 
      	      [ rdf:type sh:Shape ;
  	      	sh:predicate rdfs:subClassOf ;
  		sh:valueType rdfs:Class ;
 		sh:cardinality 1 ] ) ]
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope rdf:Property ;
   sh:shape 
     [ sh:and ( [ rdf:type sh:Shape ;
	      	 sh:predicate rdfs:domain ;
		 sh:valueType rdfs:Class ;
		 sh:cardinality 1 ]
     	       [ rdf:type sh:Shape ;
	      	 sh:predicate rdfs:range ;
		 sh:valueType rdfs:Class ;
		 sh:cardinality 1 ] ) ] 
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:constraint 
     """SELECT ?this WHERE 
 	{ ?this rdfs:subClassOf ?that .
 	  FILTER ( ?this != ?that )	  
 	  ?that rdfs:subClassOf ?this . }""" 
 ]

UC2: Enforcing cardinality

  • every person has to have one or more names, and each name is a string
ex:Person <= >=1 ex:name & all ex:name xsd:string
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope ex:Person ;
   sh:shape [ rdf:type sh:Shape ;
   	     sh:predicate ex:name ;
  	     sh:valueType xsd:string ;
  	     sh:minCardinality 1 ] 
 ]
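
The UC2 constraint combines a minimum cardinality with a datatype restriction. A query in the following style could report the violating nodes. It is only a sketch: the ex: namespace IRI is a placeholder, and checking sh:valueType xsd:string through datatype() is an assumption about how datatypes would be handled.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex:  <http://example.org/>   # placeholder namespace (assumption)

SELECT DISTINCT ?this
WHERE {
  {
    # Violation 1: a person with no ex:name at all.
    ?this rdf:type ex:Person .
    FILTER NOT EXISTS { ?this ex:name ?anyName }
  }
  UNION
  {
    # Violation 2: a person with an ex:name value that is not an xsd:string literal.
    ?this rdf:type ex:Person ;
          ex:name ?name .
    FILTER ( !isLiteral(?name) || datatype(?name) != xsd:string )
  }
}
```

The scope pattern is repeated in both UNION branches because each branch is evaluated on its own, so ?this would otherwise be unbound inside the first FILTER.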

UC4: Variations on a shape

  • Different constraints are active for a node depending on data associated with the node
ex:issue <= all ex:status { ex:reported, ex:verified }
ex:issue & ex:status : reported <= >=1 ex:reportedBy
ex:issue & ex:status : verified <= >=2 ex:reportedBy
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope ex:issue ;
   sh:shape [ rdf:type sh:Shape ;
   	     sh:predicate ex:status ;
  	     sh:allowedValues ( ex:reported ex:verified ) ]
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:shapeScope [ sh:and ( [ sh:type ex:issue ]
   		    	     [ rdf:type sh:Shape ;
			       sh:predicate ex:status ;
			       sh:hasValue ex:reported ] ) ] ;
   sh:shape [ rdf:type sh:Shape ;
   	       sh:predicate ex:reportedBy ;
	       sh:minCardinality 1 ] ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:shapeScope [ sh:and ( [ sh:type ex:issue ]
   		    	     [ rdf:type sh:Shape ;
			       sh:predicate ex:status ;
			       sh:hasValue ex:verified ] ) ] ;
   sh:shape [ rdf:type sh:Shape ;
   	       sh:predicate ex:reportedBy ;
	       sh:minCardinality 2 ] ]
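
For the rule "ex:issue & ex:status : verified <= >=2 ex:reportedBy", a counting query is one possible translation. The sketch below is an illustration rather than the specified translation, and the ex: namespace IRI is a placeholder.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>   # placeholder namespace (assumption)

# Scope: issues whose status is ex:verified.
# Violation: fewer than two distinct ex:reportedBy values.
SELECT ?this
WHERE {
  ?this rdf:type ex:issue ;
        ex:status ex:verified .
  # OPTIONAL keeps issues that have no reporter, so they count as 0.
  OPTIONAL { ?this ex:reportedBy ?reporter }
}
GROUP BY ?this
HAVING ( COUNT(DISTINCT ?reporter) < 2 )
```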

UC9: Contract time intervals

  • Contracts have a time interval where they are valid
  • For bonds, this time interval must have a single end date
ex:Contract <= =1 ex:valid ex:TimeInterval
ex:Bond <= all ex:valid ( =1 ex:endTime xsd:date )

RDFS ontology:

ex:Bond rdfs:subClassOf ex:Contract .
ex:valid rdfs:domain ex:Contract .
ex:valid rdfs:range ex:TimeInterval .
ex:endTime rdfs:domain ex:TimeInterval .
ex:endTime rdfs:range xsd:date .

Constraints:

 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope ex:Contract ;
   sh:shape [ rdf:type sh:Shape ;
   	       sh:predicate ex:valid ;
	       sh:cardinality 1 ;
	       sh:valueType ex:TimeInterval ]
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope ex:Bond ;
   sh:shape [ rdf:type sh:Shape ;
   	       sh:predicate ex:valid ;
	       sh:cardinality 1 ;
	       sh:valueShape [ rdf:type sh:Shape ;
 		    	       sh:predicate ex:endTime ;
			       sh:cardinality 1 ]
 	]
 ]
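
The nested value shape in the ex:Bond constraint can also be flattened into one query. The following sketch covers only the inner requirement, that every ex:valid value has exactly one ex:endTime. It is not the specified translation, and the ex: namespace IRI is a placeholder.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>   # placeholder namespace (assumption)

SELECT DISTINCT ?this
WHERE {
  {
    # Violation 1: a validity interval with no end time.
    ?this rdf:type ex:Bond ;
          ex:valid ?interval .
    FILTER NOT EXISTS { ?interval ex:endTime ?end }
  }
  UNION
  {
    # Violation 2: a validity interval with two different end times.
    ?this rdf:type ex:Bond ;
          ex:valid ?interval .
    ?interval ex:endTime ?end1, ?end2 .
    FILTER ( !sameTerm(?end1, ?end2) )
  }
}
```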

UC23: schema.org constraints

  • Validation of schema.org data against some particular constraints, e.g.
    • On schema:children, ...: Irreflexivity
    • On schema:Person: Children must be born after the parent, deathDate must be after birthDate
 [ rdf:type sh:Constraint ;
   sh:severity sh:warning ;
   sh:constraint 
     """SELECT ?this WHERE 
 	{ ?this schema:child+ ?this }"""
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:warning ;
   sh:classScope ex:Person ;
   sh:sparqlShape """?this schema:birthDate ?bdate .
   		   ?this schema:child ?child .
   		   ?child schema:birthDate ?cbdate .
 		   FILTER ( ?bdate < ?cbdate )"""
 ]
 [ rdf:type sh:Constraint ;
   sh:severity sh:warning ;
   sh:classScope ex:Person ;
   sh:sparqlShape """?this schema:birthDate ?bdate .
 		?this schema:deathDate ?ddate .
  		FILTER ( ?bdate < ?ddate )"""
 ]
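
One way to read sh:sparqlShape, following the idea that every node in the scope must also belong to the shape, is to wrap the fragment in FILTER NOT EXISTS. The sketch below does this for the deathDate constraint. It is an assumption about the translation, not taken from the specification, and the ex: namespace IRI is a placeholder.

```sparql
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>
PREFIX ex:     <http://example.org/>   # placeholder namespace (assumption)

# Scope: every ex:Person.
# Violation: the shape fragment has no solution for the node.
SELECT ?this
WHERE {
  ?this rdf:type ex:Person .
  FILTER NOT EXISTS {
    ?this schema:birthDate ?bdate .
    ?this schema:deathDate ?ddate .
    FILTER ( ?bdate < ?ddate )
  }
}
```

Under this reading, a person with no schema:deathDate would also be reported, because the fragment cannot be matched for that node.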

UC33: validate medical procedure

  • Check that a medical observation has only one outcome
bridg:PerformedObservation <= =1 :result
 [ rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope bridg:PerformedObservation;
   sh:shape [ rdf:type sh:Shape ;
   	       sh:predicate :result ;
 	       sh:cardinality 1 ]
 ]


Bugs from Primer

This is similar to the bug example in the Primer.

 my:IssueShape rdf:type sh:Shape ;
   sh:shape [ sh:and ( [ rdf:type sh:Shape ;
   	     	      	sh:predicate ex:state ;
  			sh:allowedValues (ex:unassigned ex:assigned) ;
  			sh:cardinality 1 ]
  		      [ rdf:type sh:Shape ;
   	     	      	sh:predicate ex:reportedBy ;
 			sh:valueShape my:UserShape ;
  			sh:cardinality 1 ] ) ] .
 
 my:UserShape rdf:type sh:Shape ;
   sh:shape [ sh:and ( [ rdf:type sh:Shape ;
   	     	        sh:predicate foaf:name ;                   
  			sh:valueType xsd:string ;                  
  			sh:cardinality 1 ]
 		      [ rdf:type sh:Shape ;
  		      	sh:predicate foaf:mbox ;                   
  			sh:nodeKind sh:IRI ;                      
  			sh:minCardinality 1 ] ) ] .
 
 my:IssueConstraint rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:shapeScope [ rdf:type sh:Shape ;
   		  sh:predicate ex:reportedBy ;
 		  sh:minCardinality 1 ] ;
   sh:shape my:IssueShape .

Choices from Primer

 my:UserShape a sh:Shape ;
  sh:and ( [
   sh:or ( [ sh:and ( [ rdf:type sh:Shape ;
   	    	       sh:predicate foaf:name;
 		       sh:cardinality 0 ]
 		     [ rdf:type sh:Shape ;
 		       sh:predicate foaf:givenName ; 
 		       sh:valueType xsd:string ;
 		       sh:minCardinality 1 ] 
 		     [ rdf:type sh:Shape ;
 		       sh:predicate foaf:familyName ; 
 		       sh:valueType xsd:string ;
 		       sh:cardinality 1 ] ) ]
 	  [ sh:and ( [ rdf:type sh:Shape ;
   	    	       sh:predicate foaf:givenName;
 		       sh:cardinality 0 ]
 		     [ rdf:type sh:Shape ;
   	    	       sh:predicate foaf:familyName;
 		       sh:cardinality 0 ]
 		     [ rdf:type sh:Shape ;
 		       sh:predicate foaf:name ; 
 		       sh:valueType xsd:string ;
 		       sh:cardinality 1 ] ) ] )
  ] [ sh:predicate foaf:name ;             
      sh:valueType xsd:string ;            
      sh:cardinality 1 
  ] ) .
 
 my:PersonConstraint rdf:type sh:Constraint ;
   sh:severity sh:fatalError ;
   sh:classScope foaf:Person ;
   sh:shape my:UserShape .

Limitations and Problems

  • This is a SPARQL-only solution
    • No recursive shapes
    • No closed shapes construct per se
      • SPARQL can be used to implement most (perhaps all) closed shapes
  • Only the guts of SHACL have been specified
    • No nice string-based reporting of violations
    • Only three kinds of violations
    • No templates (but they could be added)
  • The RDF syntax could be improved
  • This is a paper-only proposal.
    • No implementation
    • My company will not be building an implementation.