Re: resolving ISSUE-47: Can SPARQL-based constraints access the shape graph, and how? from Holger Knublauch on 2015-06-11 (public-data-shapes-wg@w3.org from June 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 12 Jun 2015 09:57:30 +1000
To: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <557A206A.2020402@topquadrant.com>
On 6/12/2015 8:14, Dimitris Kontokostas wrote:
>
>
> On Thu, Jun 11, 2015 at 11:18 PM, Holger Knublauch 
> <holger@topquadrant.com <mailto:holger@topquadrant.com>> wrote:
>
>     On 6/12/15 5:51 AM, Dimitris Kontokostas wrote:
>
>
>         Summing up from the meeting, the whole core language can be
>         implemented without access to the constraints graph.
>
>
>     Could you clarify or give examples? I believe this requires a
>     substantial change from a template-based generic approach to a
>     solution in which these core templates require a different,
>     hard-coded mechanism to produce SPARQL queries. I would find the
>     latter very ugly and it adds to the implementation burden. 
>
>
> Implementers who do not want to support SPARQL Endpoints can do 
> optimizations like the ones in the current spec.

Yes, possibly. Still the overall spec becomes more complex because we 
need to introduce hard-coded patterns.

>     One of the very strengths of RDF and OWL is that people can use
>     the same mechanism to query data and ontology, and here we are
>     simply allowing that same principle.
>
>
> I agree and this is the reason we define SHACL in RDF. However, I 
> don't think ontologies are required to exist or be queried together with data.
>
>
>         Some parts of the spec would indeed be simpler with access to
>         the constraints graph but I don't see this alone as a reason
>         to require access.
>
>         I suggest we gather all the use cases where access is needed
>         beyond the core language and evaluate them.
>
>
>     One case is recursion (e.g. sh:valueShape). How could this be
>     implemented without a function such as sh:hasShape, which takes a
>     shape as a parameter?
>
>
> There is no resolution for recursion yet.

Yes, and here things become intertwined. We could try to vote on the 
recursion issue beforehand. If we vote in favor of recursion with 
sh:hasShape, then the implication is that we must also vote in favor of 
?shapesGraph access. This is my preferred outcome. However, due to these 
implications, some people may vote against recursion (like Peter did in 
the past) to make sure that a code-generation approach always remains a 
viable option.

To break this deadlock, I can only try my best to work with you on your 
performance concerns in the DBpedia use case. One obvious solution for 
you would be to wait for Virtuoso's native SHACL support - assuming 
OpenLink is committed to this, which I cannot comment on. AFAIK DBpedia 
is already running on their platform, so you would just need to create a 
named graph for your shapes on the same Virtuoso instance, and let the 
engine do all the rest natively. This would be by far the fastest 
solution anyway. Meanwhile, you may need to live with slower performance 
for certain cases.

Having said this, we should look at which cases we are really talking 
about now. We agree that any SHACL implementation could optimize the 
core templates such as sh:ClosedShape, so that they no longer require 
access to the shapes graph. This is relatively straightforward to 
implement in Java. So the only case where you wouldn't get the native 
execution speed would be for custom templates or constraints. Assuming 
you have these constraints under your control, you can easily make sure 
that they don't use ?shapesGraph. So the only problem case is if all of 
these hold:
- you need to talk to a SPARQL endpoint
- the SPARQL endpoint doesn't have the shapes graph as a named graph
- you have queries that need to access ?shapesGraph
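
To illustrate the kind of optimization mentioned above for core templates, here is a minimal Python sketch of the code-generation idea for a closed shape. It is only an illustration under stated assumptions, not the template from the spec: the function name closed_shape_query and the example property IRIs are hypothetical. The engine reads the allowed properties from the shapes graph once, on its own side, and inlines them into the query text, so the generated query never mentions ?shapesGraph.

```python
def closed_shape_query(allowed_properties):
    """Build a SPARQL query that reports triples with a disallowed predicate.

    Hypothetical helper: the allowed predicates are inlined into the query
    text, so the query does not have to look them up in the shapes graph.
    """
    # Render each IRI in SPARQL syntax, e.g. <http://example.org/name>.
    iri_list = ", ".join(f"<{iri}>" for iri in allowed_properties)
    # In a real engine ?this would be restricted to the focus nodes;
    # that part is left out of this sketch.
    return (
        "SELECT ?this ?predicate ?object\n"
        "WHERE {\n"
        "  ?this ?predicate ?object .\n"
        f"  FILTER (?predicate NOT IN ({iri_list}))\n"
        "}"
    )


if __name__ == "__main__":
    # Assumed example properties, not taken from any real shape.
    print(closed_shape_query([
        "http://example.org/name",
        "http://example.org/age",
    ]))
```

A custom constraint that itself needs ?shapesGraph cannot be rewritten this way in general, which is why the problem case above is limited to such queries.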

To me this feels rather like a corner case that shouldn't force our 
hands with the design of the whole spec. If SHACL is as successful as we 
all hope, then most vendors will sooner or later be forced to add native 
SHACL support, through market pressure. If it isn't successful, then, 
well, you have the fallback of using the same hand-written SPARQL 
queries that you currently have, bypassing the SHACL machinery.

Furthermore, did we even formally decide that SPARQL endpoints are 
supported? I believe we only ever talked about datasets, and in that 
case the SPARQL endpoint would have to be wrapped into a Graph (with 
SPO queries).
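
As a rough sketch of what such wrapping could look like, the Python below exposes an endpoint as a graph answering subject/predicate/object lookups. All names here (spo_query, EndpointGraph, run_select) are hypothetical, and the function that actually sends the query to the endpoint is passed in by the caller, so nothing is assumed about any particular client library.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]


def spo_query(s: Optional[str], p: Optional[str], o: Optional[str]) -> str:
    """Turn one triple pattern into a SELECT query; None means wildcard."""
    # Bound terms must already be in SPARQL syntax, e.g. "<http://example.org/a>".
    subject = s if s is not None else "?s"
    predicate = p if p is not None else "?p"
    obj = o if o is not None else "?o"
    return f"SELECT * WHERE {{ {subject} {predicate} {obj} }}"


class EndpointGraph:
    """Graph view over an endpoint, answering only SPO pattern lookups."""

    def __init__(self, run_select: Callable[[str], Iterable[Dict[str, str]]]):
        # run_select is supplied by the caller: it sends a query string to
        # the endpoint and returns one dict per solution (variable -> term).
        self._run_select = run_select

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        for row in self._run_select(spo_query(s, p, o)):
            # Bound positions are echoed back; wildcards come from the row.
            yield (
                s if s is not None else row["s"],
                p if p is not None else row["p"],
                o if o is not None else row["o"],
            )
```

Each lookup becomes a separate round trip to the endpoint, which is where the performance concern for large remote datasets comes from.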

> Besides this, are there any other use cases beyond the core language?

As discussed: any use case with sh:hasShape. But of course it's a 
nice-to-have feature. People will not even think about what they could 
do, if we don't allow this feature. I would be curious to observe what 
our user base will produce with this feature - it is very powerful!

Holger


>
>     As Peter stated, if we don't allow access to the shapes graph,
>     then a lot of things in the current design would need to change. I
>     think we should take a hard look at the counter arguments before
>     making such a change. I accept there may be performance
>     implications for scenarios such as remote SPARQL execution against
>     large databases, but in those cases there are work-arounds such as
>     generating optimized SPARQL queries. Many engines may decide to
>     implement optimizations for the core language anyway. But since
>     these are performance optimizations only, I don't think they
>     should limit what the general spec allows.
>
>
> Querying big SPARQL endpoints is already a slow process and with this 
> approach SHACL makes it even slower.
> We have a few SPARQL vendors in the WG, I am wondering about their 
> opinion on this.
>
> Dimitris
>
>
>
>     Holger
>
>
>
>
>
> -- 
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig & DBpedia 
> Association
> Projects: http://dbpedia.org, http://aligned-project.eu, 
> http://rdfunit.aksw.org
> Homepage:http://aksw.org/DimitrisKontokostas
> Research Group: http://aksw.org
>
Received on Thursday, 11 June 2015 23:59:43 UTC