Shacl-sparql-presentation
From RDF Data Shapes Working Group
A SHACL Specification based on SPARQL, Presentation
Basis of the Specification
- Constraint language, in the style of OWL Constraints, Stardog ICV, SPIN, and RDFUnit
- Uses a simple translation to a single SPARQL query per constraint
- Constraint violations are the result of the query when run on a standard SPARQL engine under the RDFS entailment regime
- RDFS vocabulary is interpreted as in RDFS
- Specification is very simple, and standard
- There is nothing special about a shape or constraint that is also a class
- Validation takes two arguments
- 1/ Constraint graph 2/ Data and ontology graph (or dataset)
- Could be the same graph, but no special treatment if so
- No access to the web at large, except via directly using SPARQL
- The RDF encoding hides the details of SPARQL
- A profile allowing only certain RDF-based constraints does not need a full SPARQL engine for implementation
- Adding a new RDF-based shape construct is simply a matter of determining its translation to SPARQL; no extra work is needed
Simple Example of a constraint
- the offspring of people are all people
- ex:Person <= all ex:offspring ex:Person (OWL Constraints, using a publication language for DLs)
Data Graph:
ex:Student rdfs:subClassOf ex:Person .
ex:John rdf:type ex:Person .
ex:Mary rdf:type ex:Student .
ex:John ex:offspring ex:Mary .
ex:Mary ex:offspring ex:Susan .
Constraint Graph:
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope ex:Person ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:offspring ; sh:valueType ex:Person ] ]
Results:
- John is OK because his only stated offspring is stated to be a student and thus a person
- Mary is a violation because Susan is not stated to be a person
- Susan is irrelevant because she is not stated to be a person
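The three results above can be reproduced with a small sketch. It is not the proposal's SPARQL translation: it assumes the data graph is a plain Python set of (subject, predicate, object) tuples, and it models only the rdfs:subClassOf part of RDFS entailment. The function names are hypothetical.

```python
# Hypothetical sketch: the data graph as a set of (subject, predicate, object) tuples.
DATA = {
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:John", "rdf:type", "ex:Person"),
    ("ex:Mary", "rdf:type", "ex:Student"),
    ("ex:John", "ex:offspring", "ex:Mary"),
    ("ex:Mary", "ex:offspring", "ex:Susan"),
}

def superclasses(graph, cls):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

def instances(graph, cls):
    """Nodes typed as cls or as a subclass of cls (assumption: subclass inference only)."""
    return {s for s, p, o in graph
            if p == "rdf:type" and cls in superclasses(graph, o)}

def violations(graph, scope_class, predicate, value_type):
    """Scope nodes with a predicate value that is not an instance of value_type."""
    typed = instances(graph, value_type)
    return {s for s in instances(graph, scope_class)
            for s2, p, o in graph
            if s2 == s and p == predicate and o not in typed}

print(sorted(violations(DATA, "ex:Person", "ex:offspring", "ex:Person")))
```

Only ex:Mary is reported. ex:Susan is never checked because she is not in the scope, and ex:John passes because ex:Mary counts as a person through the subclass statement.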
Simplified View of the Proposal
- A constraint graph contains constraints
- A constraint has both a scope and a shape
- Basic operation is validating a constraint graph against a data graph
- Validate each constraint in the constraint graph against the data graph
- Validating a constraint against a data graph is essentially verifying that every node that belongs to the scope also belongs to the shape
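The basic operation described above (every node in the scope must also belong to the shape) can be sketched as follows. This is an illustration under stated assumptions, not the specification: scopes and shapes are modelled as Python callables instead of SPARQL queries, and `Constraint`, `validate`, `class_scope`, and `has_some` are hypothetical names.

```python
from typing import Callable, Iterable, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]

class Constraint(NamedTuple):
    severity: str
    scope: Callable[[Graph], Iterable[str]]  # nodes the constraint applies to
    shape: Callable[[Graph, str], bool]      # does a node belong to the shape?

def validate(constraints, data):
    """Return (severity, node) for every scope node that is not in the shape."""
    return [(c.severity, node)
            for c in constraints
            for node in sorted(c.scope(data))
            if not c.shape(data, node)]

def class_scope(cls):
    """Scope: nodes explicitly typed as cls (no entailment in this sketch)."""
    return lambda g: {s for s, p, o in g if p == "rdf:type" and o == cls}

def has_some(predicate):
    """Shape: nodes with at least one value for predicate."""
    return lambda g, node: any(s == node and p == predicate for s, p, o in g)

DATA = {
    ("ex:John", "rdf:type", "ex:Person"),
    ("ex:John", "ex:name", "John"),
    ("ex:Mary", "rdf:type", "ex:Person"),
}
CONSTRAINTS = [
    Constraint("sh:fatalError", class_scope("ex:Person"), has_some("ex:name")),
]
print(validate(CONSTRAINTS, DATA))
```

Keeping scope and shape as separate callables mirrors the split in the proposal: either side could be swapped for a different kind of scope or shape without touching `validate`.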
Scope of a constraint is either
- A single node
- Instances of a class
- Nodes that belong to an RDF-encoded shape
- Nodes that "satisfy" a SPARQL expression
Shape of a constraint is either
- Nodes that belong to an RDF-encoded shape
- Nodes that "satisfy" a SPARQL expression
RDF encoding for shapes
- Similar to that in https://w3c.github.io/data-shapes/shacl/
- Does not add any expressive power
- Simply a way of insulating from complexities of SPARQL
- Does not cover all of SPARQL
- Currently includes
- conjunction, disjunction
- individual, class instances
- cardinality, value, allowed values, node type, value type, value shape
- Adding other constructs is simple as long as they have a SPARQL translation
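To illustrate how constructs such as cardinality, allowed values, conjunction, and disjunction compose, here is a minimal sketch in which each construct is a function returning a node test. The proposal defines these constructs by their SPARQL translation; the Python combinators below (`exact_cardinality`, `all_of`, `any_of`, and so on) are hypothetical stand-ins for illustration only.

```python
def cardinality(graph, node, predicate):
    """Number of distinct values of predicate on node."""
    return len({o for s, p, o in graph if s == node and p == predicate})

def min_cardinality(predicate, n):
    return lambda graph, node: cardinality(graph, node, predicate) >= n

def exact_cardinality(predicate, n):
    return lambda graph, node: cardinality(graph, node, predicate) == n

def allowed_values(predicate, values):
    """Every value of predicate on the node must be one of values."""
    return lambda graph, node: all(
        o in values for s, p, o in graph if s == node and p == predicate)

def all_of(*shapes):  # conjunction, in the spirit of sh:and
    return lambda graph, node: all(shape(graph, node) for shape in shapes)

def any_of(*shapes):  # disjunction, in the spirit of sh:or
    return lambda graph, node: any(shape(graph, node) for shape in shapes)

DATA = {
    ("ex:alice", "foaf:name", "Alice Smith"),
    ("ex:bob", "foaf:givenName", "Bob"),
    ("ex:bob", "foaf:familyName", "Jones"),
    ("ex:carol", "foaf:givenName", "Carol"),
}

# Either a single foaf:name, or given and family names but no foaf:name.
name_choice = any_of(
    all_of(exact_cardinality("foaf:name", 1),
           exact_cardinality("foaf:givenName", 0),
           exact_cardinality("foaf:familyName", 0)),
    all_of(exact_cardinality("foaf:name", 0),
           min_cardinality("foaf:givenName", 1),
           exact_cardinality("foaf:familyName", 1)),
)
print([n for n in ("ex:alice", "ex:bob", "ex:carol") if name_choice(DATA, n)])
```

In this sketch a new construct is just another function returning a node test, which loosely parallels the point that a new RDF-based construct only needs a translation.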
Examples
UC1: The model is broken
- classes and properties in an ontology graph have to conform to a particular (but perhaps stupid) ontology design philosophy
- { rdfs:Class } <= <=0 rdfs:subClassOf
- rdfs:Class <= {rdfs:Class} | ( =1 rdfs:subClassOf && all rdfs:subClassOf rdfs:Class )
- no non-trivial rdfs:subClassOf loops
- rdf:Property <= =1 rdfs:domain & all rdfs:domain rdfs:Class & =1 rdfs:range & all rdfs:range rdfs:Class
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:individualScope rdfs:Class ; sh:shape [ rdf:type sh:Shape ; sh:predicate rdfs:subClassOf ; sh:cardinality 0 ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope rdfs:Class ; sh:shape [ sh:or ( [ rdf:type sh:Shape ; sh:individual rdfs:Class ] [ rdf:type sh:Shape ; sh:predicate rdfs:subClassOf ; sh:valueType rdfs:Class ; sh:cardinality 1 ] ) ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope rdf:Property ; sh:shape [ sh:and ( [ rdf:type sh:Shape ; sh:predicate rdfs:domain ; sh:valueType rdfs:Class ; sh:cardinality 1 ] [ rdf:type sh:Shape ; sh:predicate rdfs:range ; sh:valueType rdfs:Class ; sh:cardinality 1 ] ) ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:constraint """SELECT ?this WHERE { ?this rdfs:subClassOf ?that . FILTER ( ?this != ?that ) ?that rdfs:subClassOf ?this . }""" ]
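The loop query in the last constraint relies on the entailment regime to make rdfs:subClassOf transitive, so it catches loops of any length. A rough Python equivalent, assuming the ontology graph is a set of triples and computing the transitive closure explicitly (the function names are hypothetical), might look like this:

```python
def subclass_closure(graph):
    """Transitive closure of rdfs:subClassOf, which RDFS entailment would supply."""
    pairs = {(s, o) for s, p, o in graph if p == "rdfs:subClassOf"}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs

def subclass_loops(graph):
    """Classes that are subclasses of a different class that is also their subclass."""
    closure = subclass_closure(graph)
    return {a for a, b in closure if a != b and (b, a) in closure}

ONTOLOGY = {
    ("ex:A", "rdfs:subClassOf", "ex:B"),
    ("ex:B", "rdfs:subClassOf", "ex:C"),
    ("ex:C", "rdfs:subClassOf", "ex:A"),  # closes a loop A -> B -> C -> A
    ("ex:D", "rdfs:subClassOf", "ex:A"),  # feeds into the loop but is not part of it
    ("ex:E", "rdfs:subClassOf", "ex:E"),  # trivial loop, excluded by the a != b test
}
print(sorted(subclass_loops(ONTOLOGY)))
```

The `a != b` test plays the role of the FILTER in the query: a class being its own subclass is the trivial case and is not reported.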
UC2: Enforcing cardinality
- every person has to have one or more names, and each name is a string
- ex:Person <= >=1 ex:name & all ex:name xsd:string
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope ex:Person ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:name ; sh:valueType xsd:string ; sh:minCardinality 1 ] ]
UC4: Variations on a shape
- Different constraints are active for a node depending on data associated with the node
- ex:issue <= all ex:status { ex:reported, ex:verified }
- ex:issue & ex:status : reported <= >=1 ex:reportedBy
- ex:issue & ex:status : verified <= >=2 ex:reportedBy
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope ex:issue ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:status ; sh:allowedValues ( ex:reported ex:verified ) ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:shapeScope [ sh:and ( [ sh:type ex:issue ] [ rdf:type sh:Shape ; sh:predicate ex:status ; sh:hasValue ex:reported ] ) ] ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:reportedBy ; sh:minCardinality 1 ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:shapeScope [ sh:and ( [ sh:type ex:issue ] [ rdf:type sh:Shape ; sh:predicate ex:status ; sh:hasValue ex:verified ] ) ] ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:reportedBy ; sh:minCardinality 2 ] ]
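The idea of a shape-based scope in this use case (which constraint applies depends on the issue's status) can be sketched as below. It follows the informal statements for UC4: reported issues need at least one ex:reportedBy value and verified issues need at least two. The graph representation and the function names are assumptions made for illustration.

```python
def values(graph, node, predicate):
    """Distinct values of predicate on node."""
    return {o for s, p, o in graph if s == node and p == predicate}

def issues_with_status(graph, status):
    """Shape-based scope: ex:issue instances whose ex:status includes status."""
    return {s for s, p, o in graph
            if p == "rdf:type" and o == "ex:issue"
            and status in values(graph, s, "ex:status")}

def too_few_reporters(graph, status, minimum):
    """Scope nodes with fewer than `minimum` ex:reportedBy values."""
    return {node for node in issues_with_status(graph, status)
            if len(values(graph, node, "ex:reportedBy")) < minimum}

DATA = {
    ("ex:i1", "rdf:type", "ex:issue"), ("ex:i1", "ex:status", "ex:reported"),
    ("ex:i1", "ex:reportedBy", "ex:ann"),
    ("ex:i2", "rdf:type", "ex:issue"), ("ex:i2", "ex:status", "ex:verified"),
    ("ex:i2", "ex:reportedBy", "ex:ann"),
    ("ex:i3", "rdf:type", "ex:issue"), ("ex:i3", "ex:status", "ex:reported"),
}
found = (too_few_reporters(DATA, "ex:reported", 1)
         | too_few_reporters(DATA, "ex:verified", 2))
print(sorted(found))
```

Here ex:i2 fails because a verified issue needs two reporters, and ex:i3 fails because a reported issue needs at least one.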
UC9: Contract time intervals
- Contracts have a time interval where they are valid
- For bonds, this time interval must have a single end date
- ex:Contract <= =1 ex:valid ex:TimeInterval
- ex:Bond <= all ex:valid ( =1 ex:endTime xsd:date )
RDFS ontology:
ex:Bond rdfs:subClassOf ex:Contract .
ex:valid rdfs:domain ex:Contract .
ex:valid rdfs:range ex:TimeInterval .
ex:endTime rdfs:domain ex:TimeInterval .
ex:endTime rdfs:range xsd:date .
Constraints:
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope ex:Contract ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:valid ; sh:cardinality 1 ; sh:valueType ex:TimeInterval ] ]
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope ex:Bond ; sh:shape [ rdf:type sh:Shape ; sh:predicate ex:valid ; sh:cardinality 1 ; sh:valueShape [ rdf:type sh:Shape ; sh:predicate ex:endTime ; sh:cardinality 1 ] ] ]
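The bond constraint nests one shape inside another through sh:valueShape. A minimal sketch of that nesting, assuming a set-of-triples graph and ignoring RDFS entailment entirely (so bonds are passed in explicitly instead of being found by class), could read as follows. The function names are hypothetical.

```python
def values(graph, node, predicate):
    """Distinct values of predicate on node."""
    return {o for s, p, o in graph if s == node and p == predicate}

def has_single_end(graph, interval):
    """Inner shape: exactly one ex:endTime value."""
    return len(values(graph, interval, "ex:endTime")) == 1

def bond_ok(graph, bond):
    """Outer shape: exactly one ex:valid value, and it satisfies the inner shape."""
    intervals = values(graph, bond, "ex:valid")
    return len(intervals) == 1 and all(has_single_end(graph, i) for i in intervals)

DATA = {
    ("ex:b1", "ex:valid", "ex:t1"),
    ("ex:t1", "ex:endTime", "2030-01-01"),
    ("ex:b2", "ex:valid", "ex:t2"),
    ("ex:t2", "ex:endTime", "2030-01-01"),
    ("ex:t2", "ex:endTime", "2031-01-01"),  # two end dates: inner shape fails
}
print({bond: bond_ok(DATA, bond) for bond in ("ex:b1", "ex:b2", "ex:b3")})
```

ex:b3 has no ex:valid value at all, so it fails the outer cardinality test before the inner shape is ever consulted.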
UC23: schema.org constraints
- Validation of schema.org data against some particular constraints, e.g.
- On schema:children, ...: Irreflexivity
- On schema:Person: Children must be born after the parent, deathDate must be after birthDate
[ rdf:type sh:Constraint ; sh:severity sh:warning ; sh:constraint """SELECT ?this WHERE { ?this schema:child+ ?this }""" ]
[ rdf:type sh:Constraint ; sh:severity sh:warning ; sh:classScope ex:Person ; sh:sparqlShape """?this schema:birthDate ?bdate . ?this schema:child ?child . ?child schema:birthDate ?cbdate . FILTER ( ?bdate < ?cbdate )""" ]
[ rdf:type sh:Constraint ; sh:severity sh:warning ; sh:classScope ex:Person ; sh:sparqlShape """?this schema:birthDate ?bdate . ?this schema:deathDate ?ddate . FILTER ( ?bdate < ?ddate )""" ]
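The two date rules can also be checked directly on a small graph. The sketch below reports violating persons instead of mirroring the sh:sparqlShape patterns, and it skips persons with missing dates; both are simplifying assumptions, as are the set-of-triples representation and the function names.

```python
from datetime import date

def value(graph, node, predicate):
    """A value of predicate on node, or None if there is none."""
    return next((o for s, p, o in graph if s == node and p == predicate), None)

def date_order_violations(graph):
    """Persons whose death date or whose child's birth date is not after their birth date."""
    found = set()
    for person, p, born in graph:
        if p != "schema:birthDate":
            continue
        died = value(graph, person, "schema:deathDate")
        if died is not None and not born < died:
            found.add((person, "deathDate not after birthDate"))
        for s, p2, child in graph:
            if s == person and p2 == "schema:child":
                child_born = value(graph, child, "schema:birthDate")
                if child_born is not None and not born < child_born:
                    found.add((person, "child born before parent"))
    return found

DATA = {
    ("ex:ann", "schema:birthDate", date(1950, 1, 1)),
    ("ex:ann", "schema:deathDate", date(2020, 5, 5)),
    ("ex:ann", "schema:child", "ex:bob"),
    ("ex:bob", "schema:birthDate", date(1975, 3, 3)),
    ("ex:cat", "schema:birthDate", date(1980, 1, 1)),
    ("ex:cat", "schema:deathDate", date(1979, 1, 1)),
    ("ex:cat", "schema:child", "ex:dan"),
    ("ex:dan", "schema:birthDate", date(1970, 1, 1)),
}
print(sorted(date_order_violations(DATA)))
```

Both violations are attributed to ex:cat; ex:ann and her child pass because every comparison holds.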
UC33: validate medical procedure
- Check that a medical observation has only one outcome
- bridg:PerformedObservation <= =1 :result
[ rdf:type sh:Constraint ; sh:severity sh:fatalError ; sh:classScope bridg:PerformedObservation; sh:shape [ rdf:type sh:Shape ; sh:predicate :result ; sh:cardinality 1 ] ]
Bugs from Primer
This is similar to the bug example in the Primer.
my:IssueShape rdf:type sh:Shape ;
    sh:shape [ sh:and (
        [ rdf:type sh:Shape ; sh:predicate ex:state ; sh:allowedValues ( ex:unassigned ex:assigned ) ; sh:cardinality 1 ]
        [ rdf:type sh:Shape ; sh:predicate ex:reportedBy ; sh:valueShape my:UserShape ; sh:cardinality 1 ]
    ) ] .
my:UserShape rdf:type sh:Shape ;
    sh:shape [ sh:and (
        [ rdf:type sh:Shape ; sh:predicate foaf:name ; sh:valueType xsd:string ; sh:cardinality 1 ]
        [ rdf:type sh:Shape ; sh:predicate foaf:mbox ; sh:nodeKind sh:IRI ; sh:minCardinality 1 ]
    ) ] .
my:IssueConstraint rdf:type sh:Constraint ;
    sh:severity sh:fatalError ;
    sh:shapeScope [ rdf:type sh:Shape ; sh:predicate ex:reportedBy ; sh:minCardinality 1 ] ;
    sh:shape my:IssueShape .
Choices from Primer
my:UserShape a sh:Shape ;
    sh:and (
        [ sh:or (
            [ sh:and (
                [ rdf:type sh:Shape ; sh:predicate foaf:name ; sh:cardinality 0 ]
                [ rdf:type sh:Shape ; sh:predicate foaf:givenName ; sh:valueType xsd:string ; sh:minCardinality 1 ]
                [ rdf:type sh:Shape ; sh:predicate foaf:familyName ; sh:valueType xsd:string ; sh:cardinality 1 ]
            ) ]
            [ sh:and (
                [ rdf:type sh:Shape ; sh:predicate foaf:givenName ; sh:cardinality 0 ]
                [ rdf:type sh:Shape ; sh:predicate foaf:familyName ; sh:cardinality 0 ]
                [ rdf:type sh:Shape ; sh:predicate foaf:name ; sh:valueType xsd:string ; sh:cardinality 1 ]
            ) ]
        ) ]
        [ sh:predicate foaf:name ; sh:valueType xsd:string ; sh:cardinality 1 ]
    ) .
my:PersonConstraint rdf:type sh:Constraint ;
    sh:severity sh:fatalError ;
    sh:classScope foaf:Person ;
    sh:shape my:UserShape .
Limitations and Problems
- This is a SPARQL-only solution
- No recursive shapes
- No closed shapes construct per se
- SPARQL can be used to implement most (perhaps all) closed shapes
- Only the guts of SHACL have been specified
- No nice string-based reporting of violations
- Only three kinds of violations
- No templates (but they could be added)
- The RDF syntax could be improved
- This is a paper-only proposal.
- No implementation
- My company will not be building an implementation.