Re: ISSUE-139: uniform descriptions and implementations of constraint components from Peter F. Patel-Schneider on 2016-06-06 (public-data-shapes-wg@w3.org from June 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Mon, 6 Jun 2016 07:54:28 -0700
To: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <0d48dca6-a49d-fd02-a467-5e93252b5941@gmail.com>
On 06/05/2016 11:14 PM, Holger Knublauch wrote:
> On 6/06/2016 6:36, Peter F. Patel-Schneider wrote:
>> Yes, each constraint component should not need more than one implementation,
>> whether it is in the core or otherwise.
> 
> While I share the same goal, I don't see how it can work in practice. You have
> not yet responded to how this would look in cases like sh:hasValue where being
> forced to use the query snippet for node constraints and property constraints
> would lead to abysmal performance.

I don't see how a different implementation of sh:hasValue is going to lead to
abysmal performance.   The difference is going to roughly be something like

SELECT $this WHERE { FILTER NOT EXISTS { $this $property $hasValue } }

vs

SELECT $this WHERE { FILTER NOT EXISTS { $this $property ?value
                                FILTER ( sameTerm(?value,$hasValue) ) } }
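
To make the comparison concrete, here is a rough sketch in Python rather than
SPARQL, assuming a data graph modelled as a plain set of (subject, predicate,
object) tuples.  The function names and the sample data are hypothetical and
this is not how any SHACL engine is required to work.  It only illustrates
that the two formulations select the same focus nodes, with the second one
doing one extra comparison per candidate value.

```python
# Hypothetical in-memory data graph: a set of (subject, predicate, object) triples.
GRAPH = {
    ("ex:alice", "ex:gender", "ex:female"),
    ("ex:bob", "ex:gender", "ex:male"),
    ("ex:bob", "ex:knows", "ex:alice"),
}


def violates_direct(graph, this, prop, has_value):
    """Mirrors the first query: no triple ($this, $property, $hasValue) exists."""
    return (this, prop, has_value) not in graph


def violates_via_value_nodes(graph, this, prop, has_value):
    """Mirrors the second query: bind each value node, then compare it to $hasValue."""
    value_nodes = [o for (s, p, o) in graph if s == this and p == prop]
    # String equality stands in for sameTerm in this simplified model.
    return not any(value == has_value for value in value_nodes)


def violations(check, graph, focus_nodes, prop, has_value):
    """Return the sorted focus nodes for which the given check reports a violation."""
    return sorted(n for n in focus_nodes if check(graph, n, prop, has_value))


FOCUS = ["ex:alice", "ex:bob"]
direct = violations(violates_direct, GRAPH, FOCUS, "ex:gender", "ex:female")
generic = violations(violates_via_value_nodes, GRAPH, FOCUS, "ex:gender", "ex:female")
```

In this toy model both checks report only ex:bob as violating the constraint.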

>>    Otherwise there are just that many more ways of introducing an error.
> 
> This is IMHO not a strong enough argument to *prevent* multiple queries. It's
> a nice-to-have. No need to cripple the language just for that. If an extension
> developer creates a poorly tested and broken extension then it's a bug that he
> or she needs to fix. That's the same as everywhere.

I don't see this as crippling the language.  I instead see this as improving
the language by making it more regular.

>> Yes, in the current setup each constraint component should be usable in node
>> constraints, in property constraints, and in inverse property constraints.
>> Otherwise there is an extra cognitive load on users to figure out when a
>> constraint component can be used.  The idea is to not have errors result from
>> these extra uses, though.  Just as sh:minLength does not cause an error when a
>> value node is a blank node neither should sh:datatype cause an error when used
>> in an inverse property constraint.  Of course, an sh:datatype in an inverse
>> property constraint will always be false on a data graph that is not an
>> extended RDF graph.
> 
> I completely disagree. Compared with OWL, this policy would mean that any
> constraint-like property should be applicable everywhere. For example in
> addition to
> 
> ex:Person
>     a owl:Class ;
>     owl:disjointWith ex:Animal ;
>     rdfs:subClassOf [
>         a owl:Restriction ;
>         owl:onProperty ex:gender ;
>         owl:maxCardinality 1 ;
>     ] .
> 
> the policy that you propose would also allow
> 
> ex:Person
>     a owl:Class ;
>     owl:maxCardinality 1 ;
>     rdfs:subClassOf [
>         a owl:Restriction ;
>         owl:onProperty ex:gender ;
>         owl:disjointWith ex:Animal ;
>     ] .
>
> Do you remember why the OWL WG did not apply the same policy that you describe?

In OWL there is a strong limitation on how property restrictions are formed.
The only allowable property restrictions are the object and data property
versions of AllValuesFrom, SomeValuesFrom, HasValue, HasSelf, and
cardinalities.  Each of these has a particular meaning defined in the OWL
specification.

So
[        a owl:Restriction ;
         owl:onProperty ex:gender ;
         owl:disjointWith ex:Animal ;
     ]
is semantically incomplete.  Is it all the ex:gender values that are being
restricted?  Is it some of them?  Is it some number of them?  Is it something
else?  There is no way of knowing.

Similarly in OWL the pieces of property restrictions are very constrained in
where they can occur: each different kind of property restriction has its own
syntax, and the properties that connect a property restriction to its pieces in
the RDF encoding for OWL are only used for this purpose.  So
owl:maxCardinality is only used in property restrictions and not elsewhere.

At one point during its early development OWL did have a syntax closer to that
of SHACL but this syntax was abandoned because of a desire for a more limited
syntax with clean boundaries.  There is no syntactic benefit in the current
OWL syntax to allowing, for example, cardinalities on classes so there was no
syntactic reason to think of doing so.

There are description logics that allow cardinalities on classes, but they do
so because of expressive needs, not syntactic concerns.   This extra expressive
power often affects the computational complexity of reasoning, so there are
few, if any, implementations of it.

> In the case of SHACL you would allow something like
> 
> ex:PersonShape
>     a sh:Shape ;
>     sh:maxCount 1 ;
>     sh:property [
>         sh:predicate ex:gender ;
>         sh:closed true ;
>     ] .
> 
> (Note the sh:maxCount would be utterly confusing to people, needlessly
> increasing the cognitive load.)

Currently in SHACL this would look like
 ex:PersonShape a sh:Shape ;
  sh:constraint [
     sh:maxCount 1 ;
     sh:shape [ sh:property [
         sh:predicate ex:gender ;
         sh:closed true ;
     ] ] ] .

From the SHACL spec:
- For node constraints the value nodes are the individual focus nodes, forming
a set of exactly one node.
- The property sh:maxCount restricts the number of value nodes.
This is perfectly well-behaved.
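
A minimal Python sketch of that reading, over a hypothetical set-of-triples
graph.  The function names and the context labels "node", "property" and
"inverse" are illustrative assumptions, not names from the SHACL spec.  The
idea is that the value nodes are computed once per kind of constraint and the
sh:maxCount check itself never needs to know which kind it is in.

```python
def value_nodes(graph, focus, context, prop=None):
    """Return the set of value nodes for one focus node.

    context is an illustrative label: "node", "property", or "inverse".
    """
    if context == "node":
        # Node constraint: the focus node itself, a set of exactly one node.
        return {focus}
    if context == "property":
        # Property constraint: objects of triples (focus, prop, ?o).
        return {o for (s, p, o) in graph if s == focus and p == prop}
    if context == "inverse":
        # Inverse property constraint: subjects of triples (?s, prop, focus).
        return {s for (s, p, o) in graph if o == focus and p == prop}
    raise ValueError(f"unknown context: {context}")


def max_count_ok(nodes, max_count):
    """A single maxCount check that works on any set of value nodes."""
    return len(nodes) <= max_count


GRAPH = {
    ("ex:alice", "ex:gender", "ex:female"),
    ("ex:alice", "ex:gender", "ex:other"),
}

node_result = max_count_ok(value_nodes(GRAPH, "ex:alice", "node"), 1)
prop_result = max_count_ok(value_nodes(GRAPH, "ex:alice", "property", "ex:gender"), 1)
```

Under these assumptions a maximum count of 1 always holds in a node
constraint, which is silly but has a well-defined result, while the same check
fails in the property constraint because ex:alice has two ex:gender values.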

Also from the SHACL spec
- For property constraints the value nodes are the objects of the triples that
have the focus node as subject and the given property as predicate.
Right now sh:closed is couched in terms of the focus node as it is only
allowed for node constraints.  Using value node instead gives a description
that is suitable here.
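
One way to read sh:closed in terms of value nodes, sketched in Python.  This
assumes that a closed check means "every triple with a value node as subject
uses an allowed predicate".  The graph, the function name, and that reading
of closedness are illustrative assumptions, not text from the SHACL spec.

```python
# Hypothetical data graph as a set of (subject, predicate, object) triples.
GRAPH = {
    ("ex:alice", "ex:gender", "ex:female"),
    ("ex:female", "rdfs:label", "female"),
    ("ex:female", "ex:code", "F"),
}


def closed_violations(graph, value_nodes, allowed_predicates):
    """Return (value node, predicate) pairs whose predicate is not allowed."""
    return sorted(
        (s, p)
        for (s, p, o) in graph
        if s in value_nodes and p not in allowed_predicates
    )


# Property constraint on ex:gender: the value nodes are the objects of ex:gender.
property_values = {o for (s, p, o) in GRAPH if s == "ex:alice" and p == "ex:gender"}
property_result = closed_violations(GRAPH, property_values, {"rdfs:label"})

# Node constraint: the only value node is the focus node itself.
node_result = closed_violations(GRAPH, {"ex:alice"}, {"ex:gender"})
```

The same function serves both cases here; only the set of value nodes passed
in differs.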

> There are good reasons why languages are designed to disallow certain
> nonsensical statements: they support "compile-time" checking of syntax errors,
> and enable input forms to suggest relevant properties.

There is a vast difference between nonsensical and silly.  A nonsensical
statement is one that cannot be assigned a meaning, as in an OWL property
restriction missing the part that says how to treat the property values.  A
statement can be silly for a number of reasons, but silly statements do have
well-defined meanings.

> I would also appreciate a response to the case of primary keys from my
> previous email. 

Are you proposing that there should be a constraint component for primary
keys?  I don't see any description of how this would work in property
constraints, so how can anyone determine how it would work in other
constraints?  If you are not proposing that there be a constraint component
for primary keys then I don't see any relevance to the discussion here.

> Using them in inverse property constraints would be
> meaningless and misleading. 

Well, that depends on what a primary key is supposed to be.  I don't see any
particular SHACL or RDF reason that primary keys can't be inverse properties
just as well as non-inverse properties.

> There must be a way for extension developers to
> indicate for which cases a constraint component can be used. 

Why?  When is this necessary?

> sh:context is
> playing that role. We could potentially use Shapes and sh:scopeClass instead,
> but then the meta-shapes would overlap with the actual data shapes.
> 
> I believe the general problem that we have again and again is that you (Peter)
> seem to focus on the constraint validation aspect only, while I (and hopefully
> others) also want workable support for the other use cases of SHACL such as
> form generation. From a pure validation perspective, it may indeed be possible
> to formulate queries for all three cases, even if they are dummies. But from a
> users' perspective this makes no sense. And even from an implementation point
> of view, forcing all the 3 cases everywhere is an extra burden. I believe
> cutting down to a maximum of two queries will be acceptable and is easily
> achievable without the drastic redesign you are proposing.

Well, I certainly do believe that validation is the primary use of SHACL.
If SHACL doesn't work, or doesn't work well, for validation then it is not
going to be used.

I see lots of problems with the current design of SHACL as a validation
language.  I view my proposals as improvements to SHACL.

> I wouldn't mind walking through the core vocabulary again to see if we can
> further generalize some components. For example sh:hasShape could be applied
> to node constraints too (pending a renaming probably). But there still needs
> to be a generic context mechanism for extension authors. As a tool developer I
> need that feature.

Where is this ability needed?  For a system that limits what kind of SHACL
constraints can be easily written?  No problem - just use your own property to
limit what the form generator of the system does.

> Holger

peter



> 
>>
>> peter
>>
>>
>> On 06/05/2016 09:57 AM, Dimitris Kontokostas wrote:
>>> So, this goes into the SPARQL extension mechanism, which also affects the
>>> definition of the  core language and, with what you propose,
>>> - there should be _only one_ SPARQL query that will address all three
>>> contexts, and any other contexts we may introduce in the future (e.g. for
>>> paths)
>>> - even if it doesn't make sense in some cases or even if it would result in an
>>> error when used, all contexts will be enabled for all components with this one
>>> generic SPARQL query, right?
>>>
>>> (apologies if you discussed this already on the last telco)
>>>
>>> Dimitris
>>>
>>>
>>> On Sun, Jun 5, 2016 at 6:06 PM, Peter F. Patel-Schneider
>>> <pfpschneider@gmail.com> wrote:
>>>
>>>      My recent messages have been about constraint components in general.  Of
>>>      course, the examples of constraint components that are easiest to
>>> discuss are
>>>      the core constraint components, and when discussing core constraint
>>> components
>>>      there are also issues related to how they are described in the SHACL
>>>      specification.
>>>
>>>
>>>      Right now constraint components in both the core and in the extension
>>> have the
>>>      potential for tripartite behaviour - one behaviour in node
>>> constraints, one
>>>      behaviour in property constraints, and one behaviour in inverse property
>>>      constraints.  No core constraint components actually work this way at
>>> present,
>>>      but the potential is there.  This should be changed so that constraint
>>>      components, both in the core and in the extension, have a common
>>> behaviour for
>>>      node constraints, property constraints, and inverse property constraints.
>>>
>>>      Not only should each constraint component have a single behaviour, but
>>>      constraint components should share common behaviours.  Right now it is
>>>      possible for a constraint component to do something completely
>>> different with
>>>      the property.  For example, a constraint component could decide to use
>>> the
>>>      constraint's property in an inverse direction in both inverse property
>>>      constraints and property constraints or could just ignore the property
>>>      altogether.
>>>
>>>      Further, there should also be a common description of this behaviour
>>> common to
>>>      constraint components.  Some core constraint components, e.g.,
>>> sh:class, are
>>>      already described using a terminology, namely value nodes, that can
>>> easily
>>>      apply to all constraint components.  Other constraint components, such as
>>>      sh:minCount and sh:equals, are described using different terminology that
>>>      describes the same thing.  This makes sh:minCount and sh:equals appear
>>> to be
>>>      quite different from sh:class.  Either the descriptions should align
>>> or there
>>>      should be different syntactic categories for sh:class and sh:minCount.
>>>
>>>      It is also possible to resolve this problem by using a different
>>> syntax for
>>>      SHACL.  ShEx does this by having a single property-crossing
>>> construct.  OWL
>>>      does this by having multiple property-crossing constructs, including
>>>      ObjectAllValuesFrom and ObjectSomeValuesFrom.  In both ShEx and OWL
>>> there are
>>>      many constructs, including the analogues of sh:class, that then just
>>> work on
>>>      single entities with no need to worry about focus nodes vs value nodes or
>>>      properties vs inverse properties.
>>>
>>>
>>>      Along with the problems of differing behaviour and description there
>>> is also
>>>      the problem of tripartite implementations, both of core and extended
>>>      constraint components.  Why should sh:class need three pointers to
>>>      implementations, even if they are the same?  Why should sh:minCount
>>> need two
>>>      (or three) implementations?
>>>
>>>      One could say that this doesn't matter at all because SHACL
>>> implementations
>>>      are free to implement core constructs however they see fit.  However,
>>> this
>>>      implementation methodology is completely exposed for constraint
>>> components in
>>>      the extension.  It would be much better if only a single simple
>>> implementation
>>>      was needed for each constraint component.  It would also be much
>>> better if the
>>>      implementations of constraint components did not need to worry about
>>> how value
>>>      nodes are determined.
>>>
>>>
>>>      So my view is that SHACL currently has the worst of all possible
>>>      worlds.
>>>      Its syntax is complex, because each constraint component has its own
>>> rules for
>>>      where it can occur.  Its behaviour is complex, because each constraint
>>>      component decides how to behave in each kind of constraint.  Its
>>> description
>>>      is complex, because different constraint components are described in
>>> different
>>>      ways.  Its implementation is complex, because constraint components
>>> can have
>>>      up to three different implementations each of which is often more
>>> complex than
>>>      necessary.
>>>
>>>      peter
>>>
>>>
>>>
>>>
>>>
>>>
>>>      On 06/05/2016 06:45 AM, Dimitris Kontokostas wrote:
>>>      > I was planning to ask for clarifications on this as well
>>>      >
>>>      > Is this thread about enabling all contexts in all SHACL Core
>>> components only
>>>      > or a suggestion to change the SPARQL extension mechanism in general?
>>>      > These two can be independent of each other imho.
>>>      >
>>>      > Best,
>>>      > Dimitris
>>>      >
>>>      > On Sun, Jun 5, 2016 at 10:10 AM, Holger Knublauch
>>>      > <holger@topquadrant.com> wrote:
>>>      >
>>>      >     Peter, could you clarify whether you were only talking about the
>>> core
>>>      >     constraint components and how the spec would define them, or
>>> about the
>>>      >     general mechanism? I am not too concerned about how we write things
>>>      in the
>>>      >     spec. There is only one SPARQL query per component right now in the
>>>      spec.
>>>      >
>>>      >     Thanks
>>>      >     Holger
>>>      >
>>>      >
>>>
>>>
>>>
>>>
>>> -- 
>>> Dimitris Kontokostas
>>> Department of Computer Science, University of Leipzig & DBpedia Association
>>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>> http://aligned-project.eu
>>> Homepage: http://aksw.org/DimitrisKontokostas
>>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>
> 
>
Received on Monday, 6 June 2016 14:55:00 UTC