Public Comments

From RDF Data Shapes Working Group

Peter's Email 2016-08-16

https://lists.w3.org/Archives/Public/public-rdf-shapes/2016Aug/0005.html

Here are a few of the problems with the current public working draft I found during a quick scan of it.

pre-binding

SPARQL does not evaluate variables that occur in basic graph patterns. This means that the definition of pre-binding has unusual behaviour. For example, the normative SPARQL definition of sh:class will return validation results for every pair of nodes in the graph such that there is an rdf:type/rdfs:subClassOf* path from the first to the second.

This problem affects many parts of the definition of SHACL. It means that the normative definition of many SHACL constructs is counter to intuitions. This problem is not ameliorated by the caution box in Appendix B.
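
As a rough illustration of the set of pairs described above, the following Python sketch enumerates every (node, class) pair connected by an rdf:type/rdfs:subClassOf* path in a tiny graph. It is not taken from the SHACL draft: the triples, the ex: names, and the helper type_subclass_pairs are hypothetical, and the graph is modelled as a plain set of string tuples rather than with an RDF library.

```python
def type_subclass_pairs(graph):
    """Return all (node, cls) pairs linked by rdf:type/rdfs:subClassOf*."""
    # Direct superclass lookup built from the rdfs:subClassOf triples.
    superclasses = {}
    for s, p, o in graph:
        if p == "rdfs:subClassOf":
            superclasses.setdefault(s, set()).add(o)

    pairs = set()
    for s, p, o in graph:
        if p != "rdf:type":
            continue
        # Follow zero or more rdfs:subClassOf steps from the direct type.
        seen, stack = set(), [o]
        while stack:
            cls = stack.pop()
            if cls not in seen:
                seen.add(cls)
                stack.extend(superclasses.get(cls, ()))
        pairs.update((s, cls) for cls in seen)
    return pairs


# Hypothetical data graph (an assumption, for illustration only).
graph = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:rex", "rdf:type", "ex:Dog"),
}

all_pairs = type_subclass_pairs(graph)
```

Here all_pairs holds three pairs: ex:alice with ex:Student and with ex:Person, and ex:rex with ex:Dog. If the variables in the pattern are matched like ordinary variables instead of being held fixed to one node and one class, the pattern ranges over all of these pairs, which appears to be the behaviour the comment objects to.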

  • Comment (HK): WG is waiting for input from the SPARQL EXISTS CG on this topic.

syntax of SPARQL variables

SPARQL treats $ and ? as equivalent so $PATH and ?PATH both refer to the PATH variable. SHACL uses $ as a special marker and includes $ and ? as part of the variable.

Would ?PATH be substituted as $PATH is? If a SPARQL query for a SHACL constraint only used ?this would the variable this be pre-bound?
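
The equivalence this comment relies on can be sketched in a few lines of Python. In the SPARQL 1.1 grammar, VAR1 is '?' VARNAME and VAR2 is '$' VARNAME, and the marker is not part of the variable name. The regular expression below is a simplified, ASCII-only stand-in for VARNAME, and the helper names variable_name and variables_in are hypothetical; this is not a SPARQL parser.

```python
import re

# Simplified stand-in for the SPARQL 1.1 VAR1 / VAR2 productions:
# a '?' or '$' marker followed by a name. The real VARNAME production
# allows more characters; the ASCII-only pattern is an assumption.
VAR_TOKEN = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")


def variable_name(token):
    """Return the variable name with its '?' or '$' marker removed."""
    match = VAR_TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"not a SPARQL variable token: {token!r}")
    return match.group(1)


def variables_in(query):
    """Collect the distinct variable names used in a simple query string."""
    return {m.group(1) for m in VAR_TOKEN.finditer(query)}


# '$this' and '?this' are the same variable as far as SPARQL is concerned.
names = variables_in("SELECT $this WHERE { ?this $PATH ?value }")
```

Under this reading, names contains only this, PATH, and value, so a rule that treats $PATH differently from ?PATH has to be defined by SHACL itself and cannot be derived from SPARQL.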

pre-binding optional?

"SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." Are some $-marked variables not necessarily pre-bound, counter to the earlier requirement?

$PATH vs other $-prefixed variables

The variable PATH is treated specially in SHACL. However, the general description of $ does not specially call out PATH: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution."

$value

$value is used in many ASK queries. However the definition of ASK validators does not appear to pre-bind value.

  • Comment (HK): 4.1 states "These queries are interpreted against each value node, bound to the variable value." A similar statement exists in section 6.4.2. So I am not sure what is missing here.

aggregation

The prohibition "Furthermore, any query that uses the variable $this in an aggregation is invalid." is vague. It appears to disallow the use of $this in any part of the SPARQL 1.1 aggregation machinery, as the pointer in the sentence is to Section 11 of the SPARQL specification. This would rule out all of the examples of aggregation in the SHACL document.

ASK validators syntax

The syntax for ASK queries in SPARQL 1.1 is

 "ASK" DatasetClause* WhereClause SolutionModifier

The syntax for WhereClause is

 'WHERE'? GroupGraphPattern

The syntax for EXISTS constructs in SPARQL 1.1 is

 'EXISTS' GroupGraphPattern

Stripping the ASK from the beginning of an ASK query does not generally end up with a GroupGraphPattern that can be used as the argument for EXISTS.
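
A small Python sketch makes the mismatch concrete. It strips the leading ASK keyword from a few syntactically valid ASK queries and applies a deliberately crude test for whether the remainder could be a GroupGraphPattern (it must start with '{' and end with '}'). The function names and the example queries are assumptions for illustration; the check is not a SPARQL parser.

```python
import re


def strip_ask(query):
    """Naively drop the leading ASK keyword from a query string."""
    return re.sub(r"^\s*ASK\b\s*", "", query, count=1, flags=re.IGNORECASE)


def looks_like_group_graph_pattern(text):
    """Crude check: a GroupGraphPattern is delimited by '{' and '}'."""
    stripped = text.strip()
    return stripped.startswith("{") and stripped.endswith("}")


# All four strings are valid under the ASK production quoted above.
queries = {
    "bare": "ASK { ?s ?p ?o }",
    "with_where": "ASK WHERE { ?s ?p ?o }",
    "with_dataset": "ASK FROM <http://example.org/g> { ?s ?p ?o }",
    "with_modifier": "ASK { ?s ?p ?o } LIMIT 1",
}

usable_after_strip = {
    name: looks_like_group_graph_pattern(strip_ask(text))
    for name, text in queries.items()
}
```

Only the bare form survives the stripping step. The forms with WHERE, a dataset clause, or a solution modifier leave extra tokens around the braces, so they cannot be passed to EXISTS as they stand.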

It appears that the values of sh:ask are never used as ASK queries by SHACL processors. Why then are these of the form of ASK queries?

  • Comment (HK): While in theory we could have stated GroupGraphPattern, I think ASK is more intuitive to explain and allows stand-alone execution with copy and paste. Furthermore they align with the use of functions, which can also have ASK queries as their bodies.

different levels of SHACL implementation

There are several different kinds of SHACL implementations that are hinted at in the document.

  • "SHACL implementations may, but are not required to, support entailment regimes."
  • "Access to the shapes graph is not a requirement for supporting the SHACL Core language."
  • "This sections [sic] defines the built-in SHACL constraint components that MUST be supported by all SHACL validation engines."
  • "Not all SHACL validation engines need to support this variable."
  • "The same support policies as for $shapesGraph apply for this variable."
  • "SPARQL engines with full SHACL support can install a new SPARQL function based on the SPARQL 1.1 Extensible Value Testing mechanism."
  • "SHACL validation engines are not required to support any entailment regimes."
  • "SHACL implementations with full support of the SHACL SPARQL extension mechanism must implement a function sh:hasShape, ...."
  • "A SHACL validation engine MUST implement all constructs in the Core of SHACL (Sections 2, 3, 4). A SHACL engine MAY not implement the other parts of SHACL."
  • "Implementations that cover only the the SHACL Core features are not required to implement these mechanisms or the sh:hasShape function."
  • "SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."
  • "A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph."
  • "A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary."

There needs to be a section that explicitly defines the different levels of implementation.

  • Comment (HK): Not sure what to do about this. There is an almost infinite number of combinations of the above, so one could define many dialects. But only one of them is the full SHACL. I would prefer all SHACL engines to support all these features, but there was too much resistance, e.g. from those favoring a single-query-code-generation approach or working against SPARQL endpoints. The resulting mess reflects the heterogeneous nature of the SPARQL universe, whether we want it or not.
  • Comment (DK): What if we created a section at the end of part II called "Optional features of the SHACL SPARQL extension mechanism" (or something similar) where we list all optional features?
  • Comment (HK): Ok, I have added an appendix with the goal of enumerating all optional features. Could you double check this: https://github.com/w3c/data-shapes/commit/e198bc9689c95e89e8caeb8c3c787b9efa579856

order of processing for filters

The discussion of how filters are processed appears to be contradictory. First there is: "SHACL validation engines MAY alter the order of the depicted steps as long as the returned validation results are correct." Later there is: "Filter shapes MUST be evaluated before validating the associated shapes or constraints."

  • Comment (HK): Yes, the first sentence is IMHO incorrect and I have taken it out (https://github.com/w3c/data-shapes/commit/3777e8e80aec9f9c1ba1bbb0dfdfce2b2acb9a12). The problem is that if an engine does filtering after validation, it may run into a failure that is otherwise not reached. I don't remember why we added that statement in the first place, do you @Dimitris?
  • Comment (DK): This was changed to address a comment from Peter on March 7th and resulted in this commit
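
The concern in the first reply, that filtering after validation can hit a failure that filtering first would never reach, can be sketched in Python. Everything here is hypothetical: the constraint, the filter, and the ValidationFailure exception are stand-ins chosen for illustration and do not come from the SHACL draft.

```python
class ValidationFailure(Exception):
    """Stands in for a processing failure (not a constraint violation)."""


def validate(node):
    # Hypothetical constraint: numbers must be positive. For any other
    # kind of node the check cannot be evaluated at all and fails.
    if not isinstance(node, (int, float)):
        raise ValidationFailure(f"cannot evaluate constraint on {node!r}")
    return node > 0


def passes_filter(node):
    # Hypothetical filter shape: only numeric nodes are in scope.
    return isinstance(node, (int, float))


def filter_then_validate(nodes):
    """Apply the filter first, so out-of-scope nodes are never validated."""
    return {n: validate(n) for n in nodes if passes_filter(n)}


def validate_then_filter(nodes):
    """Validate everything first; this may raise on out-of-scope nodes."""
    results = {n: validate(n) for n in nodes}
    return {n: ok for n, ok in results.items() if passes_filter(n)}


nodes = [1, -2, "x"]
report = filter_then_validate(nodes)
```

With these inputs, filter_then_validate reports one conforming and one non-conforming node, while validate_then_filter(nodes) raises ValidationFailure on "x" before the filter is ever consulted. The two orders are therefore not interchangeable once failures are possible.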

$shapesGraph

The status of $shapesGraph is unclear: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."

  • Comment (HK): The MAY is clarified in the following sentence (Access to the shapes graph is not a requirement etc). I believe it would be confusing to soften up the must in the first sentence because of this exception.

circular filters

What happens if a shape is one of its own filters?

  • Comment (HK): The same as with other recursive scenarios - it's undefined.

EXISTS and blank nodes

The definition of ASK binds the value variable and then uses it inside an EXISTS. The definition of SPARQL provides a counter-intuitive result if this variable is bound to a blank node, resulting in, for example, a sh:class constraint with class ex:C returning no violation for _:d in any data graph containing the triple

 ex:c rdf:type ex:C .

  • Comment (HK): We are awaiting input from the SPARQL Maintenance (EXISTS) community group.
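
The effect described in this comment can be reproduced with a toy matcher in Python. The sketch assumes the substitution-based reading of EXISTS, in which bound variables are replaced by their values before the pattern is matched, and it relies on the SPARQL rule that a blank node label inside a basic graph pattern behaves like a variable. The helper names are hypothetical, triples are plain string tuples, and only a single triple pattern is handled.

```python
def is_wildcard(term):
    # Variables and blank node labels in a pattern can match any term.
    return term.startswith(("?", "$", "_:"))


def substitute(pattern, bindings):
    """Replace bound variables in a triple pattern by their values."""
    return tuple(
        bindings.get(term[1:], term) if term[0] in "?$" else term
        for term in pattern
    )


def pattern_matches(pattern, graph):
    """True if some triple in the graph matches the single triple pattern.

    Simplification: each wildcard position is matched independently, which
    is enough for a pattern with only one variable or blank node.
    """
    return any(
        all(is_wildcard(p) or p == t for p, t in zip(pattern, triple))
        for triple in graph
    )


graph = {("ex:c", "rdf:type", "ex:C")}
pattern = ("$value", "rdf:type", "ex:C")

# value bound to an IRI that has no rdf:type triple: no match.
iri_result = pattern_matches(substitute(pattern, {"value": "ex:d"}), graph)

# value bound to the blank node _:d: after substitution the blank node
# acts as a wildcard and matches ex:c.
bnode_result = pattern_matches(substitute(pattern, {"value": "_:d"}), graph)
```

In this model the check fails for ex:d, as expected, but succeeds for _:d even though _:d has no type at all, so no violation would be reported for it.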

union operations on data graphs and shapes graphs

It is unclear just what the data graph and the shapes graph are. There is wording that both of these cannot be changed. However, there is also wording that various kinds of union operations are to be performed on shapes and data graphs.

$targetNode

It is unclear what is meant by: "The variable $targetNode is assumed to be pre-bound to the given value of sh:targetNode." Is this something that SHACL implementations have to do? There are several occurrences of this kind of wording.

  • Comment (HK): I don't see anything wrong here. "is assumed to" is IMHO OK because this section is merely describing the formal semantics without prescribing an implementation. Implementations will (almost certainly) not use a SPARQL query.

MAY

MAY is used in Section 1.5 but is not defined until Section 1.6.

MAY 2

"A SHACL engine MAY not implement the other parts of SHACL." reads as if no SHACL engine is allowed to implement any non-core part of SHACL.

Graphs SHOULD

"The data graph SHOULD include all the ontology axioms related to the data and especially all the rdfs:subClassOf triples in order for SHACL to correctly identify class targets and validate Core SHACL constraints." Data graphs are just graphs. How, then, can SHOULD be applied to them?

Suggestions

"A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph." Can this be done even when an explicit shapes graph is provided to the engine?

Different shapes graph

"The same mechanism applies for ontologies or vocabularies included in the shapes graph. The ontology or the vocabulary IRI can point to one or more shapes graphs with the predicate sh:shapesGraph. A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary." If there already is a shapes graph in play, why is there any need for a different shapes graph to be used?

Deep copy

"a deep copy of sh:path as its sh:path" What is "deep copy" in this context?

  • Comment (HK): I have attempted to clarify this here: https://github.com/w3c/data-shapes/commit/d3f8f858f95b010d1f2a0e4681da203bcbfbc217
  • Comment (kc): Unless "deep copy" has some pre-defined meaning that I am unaware of, I would suggest dropping it and saying: The value of sh:path of each validation result must copy all triples that are required by the <a href="#path-syntax">SHACL well-formed path syntax rules</a> from the <a>shapes graph</a> into the graph containing the validation results.
  • Comment (HK): The first Google match of "deep copy" is pretty close to what I wanted to express, so I believe the term should be familiar to many people and may be helpful for implementers. Also I had surrounded the term with "...". Anyway, I have no strong opinion and will let others decide.
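
One possible reading of "deep copy" in this thread is sketched below in Python: starting from the value of sh:path, collect the triples that describe the path by following blank nodes. This is only an interpretation. The traversal rule (follow blank-node objects, stop at IRIs), the helper name copy_path_triples, and the example triples are assumptions, and a real implementation would probably also mint fresh blank nodes in the results graph, which this sketch does not do.

```python
def copy_path_triples(shapes_graph, path_node):
    """Collect the triples reachable from path_node through blank nodes."""
    copied, seen, stack = set(), set(), [path_node]
    while stack:
        node = stack.pop()
        # An IRI path (a plain predicate) needs no extra triples, and IRIs
        # reached during the walk are not expanded any further.
        if node in seen or not node.startswith("_:"):
            continue
        seen.add(node)
        for s, p, o in shapes_graph:
            if s == node:
                copied.add((s, p, o))
                stack.append(o)
    return copied


# Hypothetical shapes graph with an inverse path held in a blank node.
shapes_graph = {
    ("ex:Shape", "sh:path", "_:p"),
    ("_:p", "sh:inversePath", "ex:parent"),
    ("ex:parent", "rdfs:label", '"parent"'),
}

path_triples = copy_path_triples(shapes_graph, "_:p")
iri_path_triples = copy_path_triples(shapes_graph, "ex:parent")
```

For the blank-node path the sketch copies only the sh:inversePath triple and leaves the unrelated rdfs:label triple behind. For a plain IRI path it copies nothing, because the IRI alone already identifies the path.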

Filter role

"A filter is a shape in a shapes graph that can be used to limit the nodes that are validated against a given constraint or shape." Are there some filters that cannot be used in this way? Which ones?

  • Comment (HK): I don't understand this comment. The current statement does not exclude any filters from being used this way.
  • Comment (DK): This commit should fix this issue.

Incomplete table

"The following table enumerates variables that have special meaning in SPARQL constraints. When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." However, many other variables also need to be pre-bound, such as the variables corresponding to parameters.

  • Comment (HK): First, the statement above does not exclude other variables from being pre-bound. It doesn't claim that the table contains "all" variables. Second, this is in a chapter about SPARQL Constraints, where parameters have no meaning. So I don't think anything is wrong here.
  • Comment (DK): I think this commit helps more with this issue. I am not sure if we should move that table into the pre-binding section, since it affects pre-binding as a whole, not only SPARQL constraints.