Re: resolving ISSUE-47: Can SPARQL-based constraints access the shape graph, and how? from Dimitris Kontokostas on 2015-06-15 (public-data-shapes-wg@w3.org from June 2015)

From: Dimitris Kontokostas <jimkont@gmail.com>
Date: Mon, 15 Jun 2015 07:16:54 +0300
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a2QEuMdUHEdq_5zu5zYCu-xuhvrpVdWLGB0T6AE06yVAQ@mail.gmail.com>
On Jun 15, 2015 2:11 AM, "Holger Knublauch" <holger@topquadrant.com> wrote:
>
> On 6/14/15 10:56 PM, Dimitris Kontokostas wrote:
>>
>> Maybe I am wrong but I think you are aiming for corner cases. I asked if
there are use cases for access beyond core and you didn't say anything
besides a "nice-to-have feature" and that you "would be curious to observe
what our user base will produce with this".
>
>
> Why should the core language have a special status here? The constructs
of the core language are just one selection of common use cases - other
people will add their own. Currently the following core language constructs
are defined using ?shapesGraph: sh:allowedValues, sh:valueShape,
sh:NotConstraint, sh:AndConstraint, sh:OrConstraint, sh:XorConstraint,
sh:ClosedShape, plus the (not yet approved) qualified cardinality
constraints. Any recursive definition would require this. These are not
corner cases at all but just random examples of real-world scenarios.
>
> Here is another case, for named graph access in general. Assume you want
to validate that certain terms from a given query graph are also present as
SKOS concepts in some other graph. That SKOS named graph is not accessible
to the server running dbpedia. With the general SPARQL endpoint scenario
this is not implementable unless the endpoint can call out to external
named graphs. So endpoints are very limited already, but this limitation
shouldn't propagate into every use case.
>

This is just applying SHACL to the union of two graphs. It is not related to
the SHACL graph or ?shapesGraph.

>
>>
>> On the other hand, I think everyone agrees that allowing arbitrary
access is problematic in immutable RDF datasets.
>
>
> There are work-arounds for these cases, by wrapping the immutable
datasets with a virtual dataset. OTOH you didn't mention work-arounds for
the recursion issue yet,

(See later for details related to core.)
Sound and complete recursion is hard to optimize but a workaround would be
possible with precomputed prevalence and detection queries up to a fixed
level.

Regarding recursion, IMHO it might be convenient for some cases but there
are very few cases where it is actually needed and supporting it will not
be easy.

> nor SHACL functions nor blank node treatment.

I don't see anything special about blank nodes. OK, Jena has some additional
utility functions, but the spec shouldn't rely on third-party libraries.

> Instead you are suggesting a lowest-common-denominator approach in which
everyone can only use the features that the weakest link can also support.
This is IMHO a design mistake.
>
> We already have a separation between (ShEx) engines that don't want to
support SPARQL. In the worst case, we may also have modes that don't
support ?shapesGraph, while allowing others to use that feature.
>
>
>> We had a similar case for global constraints where we could find a
couple of examples but decided to drop them for the same reason.
>>
>> In case it wasn't clear, my suggested resolution refers only to the
SPARQL extension mechanism beyond the core language. What I suggest does
not refer to SHACL core or the spec.
>> Of course, I would be happy to re-consider if there is evidence that
this feature is indeed needed.
>
>
> So did you say that ?shapesGraph can still be used inside of the core
language, i.e. the engine would have to support it anyway? This would be
better, but then why not also allow it in general? Queries that use
?shapesGraph are easy to spot and engines can fall back to the default
(e.g. by using Jena's own SPARQL engine). Nobody is forced to use
?shapesGraph.

That is my suggestion.
I don't like the general use because validation might break due to a
dependency on a SHACL library that uses ?shapesGraph.
People might use this for the wrong reasons, so I would like to forbid it
if there is no actual need.
We can always reconsider later in the Rec process or in the next SHACL version.
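
To illustrate the dependency concern, here is a minimal Python sketch of how an
engine could check up front which SPARQL constraints mention ?shapesGraph. The
function names and the example constraint names are hypothetical and not part of
any SHACL draft. The check is purely textual, so it can also match the variable
name inside comments or string literals.

```python
import re

# Matches the SPARQL variable ?shapesGraph (or $shapesGraph) as a whole token,
# so that e.g. ?shapesGraphCopy is not reported.
_SHAPES_GRAPH_VAR = re.compile(r"[?$]shapesGraph(?!\w)")


def uses_shapes_graph(query: str) -> bool:
    """Return True if the query text mentions the ?shapesGraph variable."""
    return _SHAPES_GRAPH_VAR.search(query) is not None


def partition_constraints(queries: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split named queries into (portable, needs_shapes_graph) name lists."""
    portable: list[str] = []
    needs_shapes_graph: list[str] = []
    for name, text in queries.items():
        if uses_shapes_graph(text):
            needs_shapes_graph.append(name)
        else:
            portable.append(name)
    return portable, needs_shapes_graph


if __name__ == "__main__":
    # Hypothetical constraint library; the query bodies are illustrative only.
    library = {
        "ex:labelCheck": "SELECT ?this WHERE { ?this rdfs:label ?l }",
        "ex:shapeLookup": "SELECT ?this WHERE { GRAPH ?shapesGraph { ?s ?p ?this } }",
    }
    portable, dependent = partition_constraints(library)
    print("portable:", portable)
    print("needs ?shapesGraph:", dependent)
```

An engine that cannot expose the shapes graph could use such a check to reject
or skip the dependent constraints before validation starts, rather than failing
halfway through.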

>
>
>>
>> Also note that although SPARQL Endpoints are one case where access is
problematic, there can be many others when e.g. we apply SHACL in a
distributed processing system or a map-reduce step where we'll have the
same limitations.
>
>
> Even the distributed scenarios would require named graph support, e.g.
when someone has a GRAPH ?x statement in their SPARQL. So the main
differentiator seems to be immutability, i.e. the case in which it is
impossible to have the shapes graph as one named graph in the same dataset.

Support for named graphs is not mandatory, whereas support for the SHACL graph
will be.

>
> What do you think about my proposal to specify the SPARQL generation
rules in a separate document or chapter? Wouldn't this be a compromise that
addresses all use cases?

That would definitely make the spec more complete - if the WG agrees on
this addition.

Dimitris
>
> Thanks
> Holger
>
Received on Monday, 15 June 2015 04:17:24 UTC