- 1 Peter's Email 2016-08-16
- 1.1 pre-binding
- 1.2 syntax of SPARQL variables
- 1.3 pre-binding optional?
- 1.4 $PATH vs other $-prefixed variables
- 1.5 $value
- 1.6 aggregation
- 1.7 ASK validators syntax
- 1.8 different levels of SHACL implementation
- 1.9 order of processing for filters
- 1.10 $shapesGraph
- 1.11 circular filters
- 1.12 EXISTS and blank nodes
- 1.13 union operations on data graphs and shapes graphs
- 1.14 $targetNode
- 1.15 MAY
- 1.16 MAY 2
- 1.17 Graphs SHOULD
- 1.18 Suggestions
- 1.19 Different shapes graph
- 1.20 Deep copy
- 1.21 Filter role
- 1.22 Incomplete table
Peter's Email 2016-08-16
Here are a few of the problems with the current public working draft I found during a quick scan of it.
pre-binding
SPARQL does not evaluate variables that occur in basic graph patterns. This means that the definition of pre-binding has unusual behaviour. For example, the normative SPARQL definition of sh:class will return validation results for every pair of nodes in the graph such that there is an rdf:type/rdfs:subClass* path from the first to the second.
This problem affects many parts of the definition of SHACL. It means that the normative definition of many SHACL constructs is counter to intuitions. This problem is not ameliorated by the caution box in Appendix B.
- Comment (HK): WG is waiting for input from the SPARQL EXISTS CG on this topic.
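The reported behaviour can be illustrated with a toy sketch. This is not a SPARQL engine: the `match` helper, the two-triple graph, and the single pattern `$this rdf:type ?class` (standing in for the rdf:type/rdfs:subClassOf* path) are all assumptions made for illustration. It contrasts a variable that is simply left in a basic graph pattern with one whose value is supplied as an initial binding.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical data graph used only for this illustration.
GRAPH: Set[Triple] = {
    ("ex:a", "rdf:type", "ex:C"),
    ("ex:b", "rdf:type", "ex:D"),
}


def is_var(term: str) -> bool:
    """SPARQL accepts both '?' and '$' as variable markers."""
    return term.startswith("?") or term.startswith("$")


def match(pattern: Triple, graph: Set[Triple],
          initial: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield one binding per triple that matches a single triple pattern.

    `initial` plays the role of pre-bound values that constrain the match.
    """
    for triple in sorted(graph):
        binding = dict(initial or {})
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term):
                # Bind the variable, or reject if it already has another value.
                if binding.setdefault(p_term[1:], g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            yield binding


pattern = ("$this", "rdf:type", "?class")

# The variable is only a pattern position: every matching pair comes back.
unconstrained = list(match(pattern, GRAPH))

# The value for `this` constrains the match: one solution remains.
constrained = list(match(pattern, GRAPH, {"this": "ex:a"}))
```

Under these assumptions `unconstrained` holds a solution for each typed node, while `constrained` holds only the solution for `ex:a`, which is the intuitive outcome the comment says the definition fails to guarantee.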
syntax of SPARQL variables
SPARQL treats $ and ? as equivalent so $PATH and ?PATH both refer to the PATH variable. SHACL uses $ as a special marker and includes $ and ? as part of the variable.
Would ?PATH be substituted as $PATH is? If a SPARQL query for a SHACL constraint only used ?this would the variable this be pre-bound?
- Comment (HK): I have tried to address this here (https://github.com/w3c/data-shapes/commit/4871ced946aa03cd2bd91d808d8e4a1b33e64ef6) so that the text no longer refers to things like $PATH as a variable, but instead to PATH.
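That SPARQL treats the two markers as interchangeable can be sketched with a small tokenizer. The regular expression below only approximates SPARQL's variable tokens (it ignores string literals, comments and IRIs), and the helper name `variable_names` is hypothetical.

```python
import re
from typing import Set

# Rough approximation of SPARQL's two variable forms: '?' NAME and '$' NAME.
VAR_TOKEN = re.compile(r"[?$](\w+)")


def variable_names(query: str) -> Set[str]:
    """Return the variable names in a query, with the marker stripped."""
    return set(VAR_TOKEN.findall(query))


q_dollar = "SELECT $this WHERE { $this $PATH ?value }"
q_question = "SELECT ?this WHERE { ?this ?PATH $value }"

# Both queries mention the same three variables.
same_variables = variable_names(q_dollar) == variable_names(q_question)
```

Because the marker is not part of the name, a processor that substitutes only the `$`-spelled form would treat two spellings of one SPARQL variable differently.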
pre-binding optional?
"SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." Are some $-marked variables not necessarily pre-bound, counter to the earlier requirement?
- Comment (HK): The "should" was indeed a mistake, it's not optional. Removed: https://github.com/w3c/data-shapes/commit/ecdad602d5d4bfeb3a2a876298349fe69d0c4e60
$PATH vs other $-prefixed variables
The variable PATH is treated specially in SHACL. However, the general description of $ does not specially call out PATH: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution."
- Comment (HK): Addressed here, pointing out the special treatment of PATH: https://github.com/w3c/data-shapes/commit/a5db1204433b19a0da099a8a89af76186d865f6c
$value
$value is used in many ASK queries. However the definition of ASK validators does not appear to pre-bind value.
- Comment (HK): 4.1 states "These queries are interpreted against each value node, bound to the variable value." A similar statement exists in section 6.4.2. So I am not sure what is missing here.
aggregation
The prohibition "Furthermore, any query that uses the variable $this in an aggregation is invalid." is vague. It appears to disallow the use of $this in any part of the SPARQL 1.1 aggregation machinery, as the pointer in the sentence is to Section 11 of the SPARQL specification. This would rule out all of the examples of aggregation in the SHACL document.
- Comment (HK): I have tried to clarify that this is only about the use of ?this in expressions. This is allowing its use in GROUP BY, in case you were referring to this. Apart from that I don't see uses of ?this in aggregations in the SHACL document. https://github.com/w3c/data-shapes/commit/0c6939ba95ffd6c7fee2285a3638c144a97f8528
ASK validators syntax
The syntax for ASK queries in SPARQL 1.1 is
"ASK" DatasetClause* WhereClause SolutionModifier
The syntax for WhereClause is
"WHERE"? GroupGraphPattern
The syntax for EXISTS constructs in SPARQL 1.1 is
"EXISTS" GroupGraphPattern
Stripping the ASK from the beginning of an ASK query does not generally end up with a GroupGraphPattern that can be used as the argument for EXISTS.
- Comment (HK): Thanks for pointing out this detail. I have tried to address this with: https://github.com/w3c/data-shapes/commit/d820e0bac287944fb13edc86040995927f02e20d
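The stripping problem can be shown with a naive check. The helpers `strip_ask` and `is_bare_group` are hypothetical, and the brace matching ignores braces inside string literals or IRIs, so this is only a sketch of the point and not a parser.

```python
def strip_ask(query: str) -> str:
    """Remove a leading ASK keyword and surrounding whitespace."""
    text = query.strip()
    if text[:3].upper() != "ASK":
        raise ValueError("not an ASK query")
    return text[3:].strip()


def is_bare_group(text: str) -> bool:
    """True if `text` is exactly one brace-delimited block (naive check)."""
    if not text.startswith("{"):
        return False
    depth = 0
    for index, char in enumerate(text):
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                # The block must end exactly where the text ends.
                return index == len(text) - 1
    return False


examples = [
    "ASK { $value a ex:C }",
    "ASK WHERE { $value a ex:C }",
    "ASK FROM <http://example.org/g> { ?s ?p ?o }",
    "ASK { ?s ?p ?o } LIMIT 1",
]

usable_after_strip = [is_bare_group(strip_ask(q)) for q in examples]
```

Only the first example leaves a bare brace-delimited group after stripping; the others leave a WHERE keyword, a dataset clause, or a solution modifier that EXISTS does not accept.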
It appears that the values of sh:ask are never used as ASK queries by SHACL processors. Why then are these of the form of ASK queries?
- Comment (HK): While in theory we could have stated GroupGraphPattern, I think ASK is more intuitive to explain and allows stand-alone execution with copy and paste. Furthermore they align with the use of functions, which can also have ASK queries as their bodies.
different levels of SHACL implementation
There are several different kinds of SHACL implementations that are hinted at in the document.
"SHACL implementations may, but are not required to, support entailment regimes."
"Access to the shapes graph is not a requirement for supporting the SHACL Core language."
"This sections [sic] defines the built-in SHACL constraint components that MUST be supported by all SHACL validation engines."
"Not all SHACL validation engines need to support this variable."
"The same support policies as for $shapesGraph apply for this variable."
"SPARQL engines with full SHACL support can install a new SPARQL function based on the SPARQL 1.1 Extensible Value Testing mechanism."
"SHACL validation engines are not required to support any entailment regimes."
"SHACL implementations with full support of the SHACL SPARQL extension mechanism must implement a function sh:hasShape, ...."
"A SHACL validation engine MUST implement all constructs in the Core of SHACL (Sections 2, 3, 4). A SHACL engine MAY not implement the other parts of SHACL."
"Implementations that cover only the the SHACL Core features are not required to implement these mechanisms or the sh:hasShape function."
"SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."
"A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph."
"A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary."
There needs to be a section that explicitly defines the different levels of implementation.
- Comment (HK): Not sure what to do about this. There is an almost infinite amount of combinations of these above, so one could define many dialects. But only one of them is the full SHACL. I would prefer all SHACL engines to support all these features but there was too much resistance, e.g. from those favoring a single-query-code-generation approach or working against SPARQL end points. The resulting mess is reflecting the heterogeneous nature of the SPARQL universe, whether we want it or not.
- Comment (DK): What if we created a section at the end of part II called "Optional features of the SHACL SPARQL extension mechanism" (or something similar) where we list all optional features?
- Comment (HK): Ok, I have added an appendix with the goal of enumerating all optional features. Could you double check this: https://github.com/w3c/data-shapes/commit/e198bc9689c95e89e8caeb8c3c787b9efa579856
order of processing for filters
The discussion of how filters are processed appears to be contradictory. First there is: "SHACL validation engines MAY alter the order of the depicted steps as long as the returned validation results are correct." Later there is: "Filter shapes MUST be evaluated before validating the associated shapes or constraints."
- Comment (HK): Yes, the first sentence is IMHO incorrect and I have taken it out (https://github.com/w3c/data-shapes/commit/3777e8e80aec9f9c1ba1bbb0dfdfce2b2acb9a12). The problem is that if an engine does filtering after validation, it may run into a failure that is otherwise not reached. I don't remember why we added that statement in the first place, do you @Dimitris?
- Comment (DK): This was changed to address a comment from Peter on March 7th and resulted in this commit
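The concern about filtering after validation can be sketched as follows. The data, the filter `is_numeric` and the check `is_positive` are invented for illustration; the point is only that a check may fail outright on a node the filter would have excluded.

```python
from typing import Dict, List, Union

# Hypothetical node values; "ex:c" carries a value the check cannot handle.
DATA: Dict[str, Union[int, str]] = {"ex:a": 5, "ex:b": -1, "ex:c": "n/a"}


def is_numeric(node: str) -> bool:
    """Stand-in for a filter shape: keep only nodes with integer values."""
    return isinstance(DATA[node], int)


def is_positive(node: str) -> bool:
    """Stand-in for a constraint; raises TypeError on a string value."""
    return DATA[node] > 0


def filter_then_validate(nodes: List[str]) -> List[str]:
    """Apply the filter first, so the check never sees excluded nodes."""
    return [n for n in nodes if is_numeric(n) and not is_positive(n)]


def validate_then_filter(nodes: List[str]) -> List[str]:
    """Run the check on every node, then discard filtered-out results."""
    failed = [n for n in nodes if not is_positive(n)]
    return [n for n in failed if is_numeric(n)]


nodes = sorted(DATA)
violations = filter_then_validate(nodes)

try:
    validate_then_filter(nodes)
    reordered_failed = False
except TypeError:
    reordered_failed = True
```

Here filtering first reports only `ex:b`, while the reordered version hits a failure on `ex:c` before the filter has a chance to exclude it.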
$shapesGraph
The status of $shapesGraph is unclear: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."
- Comment (HK): The MAY is clarified in the following sentence (Access to the shapes graph is not a requirement etc). I believe it would be confusing to soften up the must in the first sentence because of this exception.
circular filters
What happens if a shape is one of its own filters?
- Comment (HK): The same as with other recursive scenarios - it's undefined.
EXISTS and blank nodes
The definition of ASK binds the value variable and then uses it inside an EXISTS. The definition of SPARQL provides a counter-intuitive result if this variable is bound to a blank node, resulting in, for example, a sh:class constraint with class ex:C returning no violation for _:d in any data graph containing the triple
ex:c rdf:type ex:C .
- Comment (HK): We are awaiting input from the SPARQL Maintenance (EXISTS) community group.
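The blank node case can be sketched with a toy pattern matcher. It assumes, as in SPARQL graph patterns, that a blank node written in a pattern behaves like a variable and not like a reference to a particular node. The helper names and the textual substitution of `$value` are assumptions for illustration only.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Data graph from the example: only ex:c is typed as ex:C.
GRAPH: Set[Triple] = {("ex:c", "rdf:type", "ex:C")}


def is_wildcard(term: str) -> bool:
    """In a pattern, variables and blank nodes both match any term."""
    return term.startswith("?") or term.startswith("_:")


def exists(pattern: Triple, graph: Set[Triple]) -> bool:
    """True if some triple matches the single triple pattern."""
    return any(
        all(is_wildcard(p) or p == g for p, g in zip(pattern, triple))
        for triple in graph
    )


def substitute(template: Triple, value: str) -> Triple:
    """Replace $value in the template by the given term."""
    return tuple(value if term == "$value" else term for term in template)


template = ("$value", "rdf:type", "ex:C")

iri_result = exists(substitute(template, "ex:d"), GRAPH)
blank_result = exists(substitute(template, "_:d"), GRAPH)
```

With the IRI `ex:d` the pattern finds nothing, so a violation would be reported. With `_:d` the pattern matches the triple about `ex:c`, so no violation is reported even though `_:d` has no type at all.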
union operations on data graphs and shapes graphs
It is unclear just what the data graph and the shapes graph are. There is wording that both of these cannot be changed. However, there is also wording that various kinds of union operations are to be performed on shapes and data graphs.
- Comment (HK): The only place I could find "union" was about handling of owl:imports, which states that the result of this union is used as shapes graph. This looks OK to me. Could you clarify what you mean?
- Comment (DK): I tried to make the wording clearer here: https://github.com/w3c/data-shapes/commit/b6fd2db5719cc9c9bdec464acdd2aefc8d0b5b68
$targetNode
It is unclear what is meant by: "The variable $targetNode is assumed to be pre-bound to the given value of sh:targetNode." Is this something that SHACL implementations have to do? There are several occurrences of this kind of wording.
- Comment (HK): I don't see anything wrong here. "is assumed to" is IMHO OK because this section is merely describing the formal semantics without prescribing an implementation. Implementations will (almost certainly) not use a SPARQL query.
MAY
MAY is used in 1.5 but defined in 1.6
- Comment (HK): Ok, moved higher up https://github.com/w3c/data-shapes/commit/bda4e2c4781494ac0e26eb132c7e7dae15932739
MAY 2
"A SHACL engine MAY not implement the other parts of SHACL." reads as if no SHACL engine is allowed to implement any non-core part of SHACL.
- Comment (HK): See https://github.com/w3c/data-shapes/commit/2ba049e6e39096bf47355b03d1de02c2e0e84f59
Graphs SHOULD
"The data graph SHOULD include all the ontology axioms related to the data and especially all the rdfs:subClassOf triples in order for SHACL to correctly identify class targets and validate Core SHACL constraints." Data graphs are just graphs. How thus can SHOULD be applied to them?
- Comment (HK): I have replaced the SHOULD with "is expected to": https://github.com/w3c/data-shapes/commit/fd3fbeac7826f9df87111af878e65e34a502331c
Suggestions
"A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph." Can this be done even when an explicit shapes graph is provided to the engine?
- Comment (HK): Attempted to clarify at https://github.com/w3c/data-shapes/commit/601631a5f4b965fa79f7b44a5a348702326ef315
Different shapes graph
"The same mechanism applies for ontologies or vocabularies included in the shapes graph. The ontology or the vocabulary IRI can point to one or more shapes graphs with the predicate sh:shapesGraph. A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary." If there already is a shapes graph in play, why is there any need for a different shapes graph to be used?
- Comment (HK): I have changed the prose to clarify that sh:shapesGraph only points at graphs, not shape graphs: https://github.com/w3c/data-shapes/commit/c88df2cf50cbc5f31feaabf610a0143d3ebcf0fb
- Comment (DK): I removed the "in the shapes graph" here. This was meant as a general property for ontology design not only when it is used in one of the shapes/data graph
Deep copy
"a deep copy of sh:path as its sh:path" What is "deep copy" in this context?
- Comment (HK): I have attempted to clarify this here: https://github.com/w3c/data-shapes/commit/d3f8f858f95b010d1f2a0e4681da203bcbfbc217
- Comment (kc): Unless "deep copy" has some pre-defined meaning that I am unaware of, I would suggest dropping it and saying: The value of sh:path of each validation result must copy all triples that are required by the <a href="#path-syntax">SHACL well-formed path syntax rules</a> from the <a>shapes graph</a> into the graph containing the validation results.
- Comment (HK): The first google match of "deep copy" is pretty close to what I wanted to express, so I believe the term should be familiar to many people and may be helpful for implementers. Also I had surrounded the term with "...". Anyway, I have no strong opinion and let others decide.
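One possible reading of "deep copy" for a path is sketched below: collect every triple reachable from the path node by following blank nodes, so that a sequence or inverse path arrives intact in the results graph. The shapes graph, the helper `copy_path_triples`, and the decision to keep blank node labels unchanged are all assumptions; a real implementation would probably mint fresh blank nodes.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical shapes graph: a sequence path (ex:parent / ^ex:child).
SHAPES: Set[Triple] = {
    ("ex:S", "sh:property", "_:ps"),
    ("_:ps", "sh:path", "_:l1"),
    ("_:ps", "sh:minCount", "1"),
    ("_:l1", "rdf:first", "ex:parent"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", "_:inv"),
    ("_:l2", "rdf:rest", "rdf:nil"),
    ("_:inv", "sh:inversePath", "ex:child"),
}


def copy_path_triples(shapes_graph: Set[Triple], path_node: str) -> Set[Triple]:
    """Collect the triples that describe the path rooted at `path_node`."""
    copied: Set[Triple] = set()
    seen: Set[str] = set()
    pending: List[str] = [path_node]
    while pending:
        node = pending.pop()
        # IRIs need no extra triples; only blank nodes carry structure.
        if node in seen or not node.startswith("_:"):
            continue
        seen.add(node)
        for triple in shapes_graph:
            if triple[0] == node:
                copied.add(triple)
                pending.append(triple[2])
    return copied


sequence_copy = copy_path_triples(SHAPES, "_:l1")
iri_copy = copy_path_triples(SHAPES, "ex:name")
```

For the sequence path the sketch copies the two list cells and the inverse-path node but not the unrelated sh:minCount triple. For a plain IRI path nothing beyond the IRI itself is needed.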
Filter role
"A filter is a shape in a shapes graph that can be used to limit the nodes that are validated against a given constraint or shape." Are there some filters that cannot be used in this way? Which ones?
- Comment (HK): I don't understand this comment. The current statement does not exclude any filters from being used this way.
- Comment (DK): This commit should fix this issue.
Incomplete table
"The following table enumerates variables that have special meaning in SPARQL constraints. When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." However, many other variables also need to be pre-bound, such as the variables corresponding to parameters.
- Comment (HK): First, the statement above does not exclude other variables from being pre-bound. It doesn't claim that the table contains "all" variables. Second, this is in a chapter about SPARQL Constraints, where parameters have no meaning. So I don't think anything is wrong here.
- Comment (DK): I think this commit helps more with this issue. I am not sure if we should move that table into the pre-binding section, since it affects pre-binding as a whole, not only SPARQL constraints.