ISSUE-47: Can SPARQL-based constraints access the shape graph, and how?

From RDF Data Shapes Working Group

This page collects use cases of ?shapesGraph access in the current SHACL spec as well as outside of the spec, and discusses design alternatives.

Page started by Holger Knublauch

Use Cases for the Core Vocabulary

sh:allowedValues

SPARQL behind sh:allowedValues currently requires walking the allowed values, which are stored in an rdf:List present in the shapes graph.

Design Alternative: SPARQL code generation with FILTER NOT IN (...). This would work OK because the allowed values cannot really be blank nodes. However, it means that we need a different mechanism here (code injection) compared to how other templates are executed.
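
The code-generation alternative could look roughly like the sketch below. It is only an illustration, not the mechanism of any SHACL draft: the function name allowed_values_query, the query shape, and the ?this/?value variable names are assumptions made for this example, and the allowed values are assumed to arrive already serialized as SPARQL terms (IRIs in angle brackets or quoted literals).

```python
def allowed_values_query(predicate, allowed_terms):
    """Build a SPARQL SELECT that returns values of `predicate` outside the allowed set.

    `predicate` and each entry of `allowed_terms` are assumed to be
    pre-serialized SPARQL terms, e.g. '<http://example.org/a>' or '"red"'.
    """
    for term in allowed_terms:
        # Blank nodes cannot be referenced by label inside a FILTER
        # expression, so they cannot be injected into the generated query.
        if term.startswith("_:"):
            raise ValueError("blank nodes cannot be injected into NOT IN")
    value_list = ", ".join(allowed_terms)
    return (
        "SELECT ?this ?value WHERE {\n"
        f"  ?this {predicate} ?value .\n"
        f"  FILTER (?value NOT IN ({value_list}))\n"
        "}"
    )
```

The allowed values end up inside the query text itself, so executing the query needs no access to the shapes graph. The price is the separate injection step that other templates do not require.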

sh:AndConstraint, sh:OrConstraint

SPARQL behind sh:AndConstraint currently requires shapes graph access for two things: to walk the rdf:List of operands, and to make the recursive sh:hasShape call for each.
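
The two steps above (walking the operand list, then calling back for each operand) can be sketched outside of SPARQL as follows. This is a minimal Python illustration in which the shapes graph is modeled as an iterable of (subject, predicate, object) string triples. The helper names and the has_shape callback are hypothetical stand-ins for the engine, not part of any spec.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_members(shapes_triples, head):
    """Return the members of the rdf:List starting at `head`, in order."""
    index = {(s, p): o for s, p, o in shapes_triples}
    members, seen, node = [], set(), head
    while node != RDF + "nil":
        if node in seen:
            raise ValueError("cyclic rdf:List")
        seen.add(node)
        # A malformed list (missing rdf:first or rdf:rest) raises KeyError.
        members.append(index[(node, RDF + "first")])
        node = index[(node, RDF + "rest")]
    return members


def and_constraint(node, operands_head, shapes_triples, has_shape):
    """True if `node` has every operand shape (has_shape is a hypothetical callback)."""
    shapes = list_members(shapes_triples, operands_head)
    return all(has_shape(node, shape) for shape in shapes)


def or_constraint(node, operands_head, shapes_triples, has_shape):
    """True if `node` has at least one operand shape."""
    shapes = list_members(shapes_triples, operands_head)
    return any(has_shape(node, shape) for shape in shapes)
```

Both functions need the shapes graph twice: once to read the list, and once, indirectly, through the callback.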

Design Alternative: ?

sh:ClosedShape

SPARQL behind sh:ClosedShape currently requires walking the current shape declaration in the shapes graph, to find out which properties have been declared.

Design Alternative: SPARQL code generation with FILTER NOT IN (properties). While doable in principle, this is again a "hard-coded" custom mechanism compared to how other templates are defined.
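
As a rough sketch of this alternative, the declared predicates could be collected from the shapes graph up front and then injected into a generated query. The example assumes that each property declaration is a node linked via sh:property which names its predicate via sh:predicate. That vocabulary detail, the function names, and the query shape are assumptions for illustration only. rdf:type and sh:nodeShape are always permitted, following the interpretation described on this page.

```python
SH = "http://www.w3.org/ns/shacl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def declared_predicates(shapes_triples, shape):
    """Collect predicate IRIs declared on `shape` (assumes sh:property / sh:predicate)."""
    triples = list(shapes_triples)
    constraints = {o for s, p, o in triples if s == shape and p == SH + "property"}
    return sorted(o for s, p, o in triples if s in constraints and p == SH + "predicate")


def closed_shape_query(shapes_triples, shape):
    """Generate a SPARQL query listing triples whose predicate is not declared."""
    allowed = declared_predicates(shapes_triples, shape)
    allowed += [RDF_TYPE, SH + "nodeShape"]  # always permitted
    iri_list = ", ".join(f"<{iri}>" for iri in allowed)
    return (
        "SELECT ?this ?predicate WHERE {\n"
        "  ?this ?predicate ?any .\n"
        f"  FILTER (?predicate NOT IN ({iri_list}))\n"
        "}"
    )
```

Note that the generator itself still reads the shapes graph. Only the generated query runs without it.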

sh:NotConstraint

SPARQL behind sh:NotConstraint currently uses sh:hasShape.

Design Alternative: ?

sh:hasShape: General Recursion and Mixing Execution Languages

The sh:hasShape function is used by several core vocabulary features as a means of calling back to the SHACL engine to evaluate a node against a given shape. This is related to ?shapesGraph access. We could in principle remove the ?shapesGraph argument and assume that the surrounding engine "knows" which shapes graph it is supposed to use (e.g. via some ThreadLocal variable trick in Java). However, such callbacks assume that the SPARQL engine and the SHACL processor can communicate with each other. This assumption holds in Dataset-like scenarios (left hand side of my diagram) but not in SPARQL endpoint scenarios. As a result, it is unclear how SPARQL endpoints would handle recursion and cases in which a SPARQL query calls out to a JavaScript-based template.
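
The idea of an engine that implicitly "knows" its shapes graph can be sketched with a context-local variable. The example below uses Python's contextvars module as a stand-in for the Java ThreadLocal trick. The names validating_with and has_shape and the evaluate callback are hypothetical. The sketch only works when the query engine and the SHACL processor share a process, which is exactly the assumption that fails for SPARQL endpoints.

```python
import contextvars
from contextlib import contextmanager

# Holds the shapes graph of the validation run that is currently active.
_current_shapes_graph = contextvars.ContextVar("current_shapes_graph")


@contextmanager
def validating_with(shapes_graph):
    """Make `shapes_graph` implicitly available to nested has_shape calls."""
    token = _current_shapes_graph.set(shapes_graph)
    try:
        yield
    finally:
        _current_shapes_graph.reset(token)


def has_shape(node, shape, evaluate):
    """Callback without an explicit shapes-graph argument.

    `evaluate(shapes_graph, node, shape)` is a hypothetical engine entry point.
    """
    try:
        shapes_graph = _current_shapes_graph.get()
    except LookupError:
        raise RuntimeError("has_shape called outside of a validation run") from None
    return evaluate(shapes_graph, node, shape)
```

A call such as has_shape(node, shape, evaluate) inside a `with validating_with(graph):` block then resolves the graph from the surrounding context instead of from an argument.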

Design Alternative: ?

sh:qualifiedValueShape

SPARQL behind sh:qualifiedValueShape currently uses a recursive call to sh:hasShape (nested in a helper function sh:valuesWithShapeCount) to count the number of property values that have the given shape.
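
The counting step can be illustrated with a small Python sketch. The data graph is modeled as (subject, predicate, object) triples, and has_shape is a hypothetical callback standing in for the recursive engine call. The function name mirrors sh:valuesWithShapeCount but is not its actual definition.

```python
def values_with_shape_count(data_triples, focus, predicate, shape, has_shape):
    """Count the distinct values of `predicate` at `focus` that have `shape`."""
    values = {o for s, p, o in data_triples if s == focus and p == predicate}
    # Each value triggers one (potentially recursive) callback to the engine.
    return sum(1 for value in values if has_shape(value, shape))
```

The resulting count would then be compared against whatever cardinality bounds the constraint declares.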

Design Alternative: ?

sh:XorConstraint

SPARQL behind sh:XorConstraint currently uses recursive calls to sh:hasShape to test whether exactly one of the property values has the given shape.

Design Alternative: ?

Use Cases outside of the Core Vocabulary

General use of sh:hasShape

sh:hasShape is arguably a useful feature for all kinds of constraint templates. Given that it is used in several Core templates, it is plausible to assume that other templates will also benefit, extending the expressivity of high-level languages. It would be poor language design if the Core Vocabulary could do things that are not also available to end users.

Template arguments that are rdf:Lists

Some Core templates take rdf:Lists as arguments. These are stored in the shapes graph. Some algorithms need to traverse those lists at run-time. Some values in those lists may be blank nodes. Any SHACL user can define templates that also take lists as arguments.

Design Alternative: ?

Variations of sh:ClosedShape

The implementation of sh:ClosedShape assumes one specific interpretation: currently all predicates mentioned in sh:properties at the class, but excluding rdf:type and sh:nodeShape. It is quite plausible that not everybody will agree with this particular definition, and that some will want to cover additional cases. Some platforms (such as the current TopBraid/Jena implementation) have no problem accessing the ?shapesGraph, so why prohibit this for everyone? At least it could be an optional feature.

Form Builders and similar algorithms

Similar to sh:ClosedShape, it will be helpful for many tools to be able to dynamically discover which properties are defined for a given shape. This includes walking the sh:property definitions, value types, cardinalities etc.
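
A form builder's discovery step might look like the following sketch, which makes no assumption about which facets a property definition carries: it returns every predicate/object pair found on each sh:property node. The shapes graph is modeled as (subject, predicate, object) triples, and the function name describe_properties is hypothetical.

```python
from collections import defaultdict

SH_PROPERTY = "http://www.w3.org/ns/shacl#property"


def describe_properties(shapes_triples, shape):
    """Return one dict per sh:property node of `shape`.

    Each dict maps a predicate used on that node (e.g. a cardinality or
    value type facet) to the list of its objects.
    """
    triples = list(shapes_triples)
    nodes = [o for s, p, o in triples if s == shape and p == SH_PROPERTY]
    descriptions = []
    for node in nodes:
        facets = defaultdict(list)
        for s, p, o in triples:
            if s == node:
                facets[p].append(o)
        descriptions.append(dict(facets))
    return descriptions
```

A tool could then render one input field per returned dict, using whichever facets it understands and ignoring the rest.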

Use Cases that we haven't thought of yet

We could easily try to convince ourselves that the SHACL WG in 2015 already knows what people will want to do with SHACL in the next few years. Sorry, but we don't. It is perfectly normal in RDF-based applications to have generic queries that dynamically react to whatever information they find in the class/properties model. The fact that RDFS/OWL classes are also just triples makes this an attractive value proposition. The equivalent in the SHACL world is the shapes graph.