W3C

RDF Data Shapes Working Group Teleconference

17 Dec 2015

Agenda

See also: IRC log

Attendees

Present
hknublau, Arnaud, Dimitris, aryman, simonstey, ericP, iovka, labra, pfps, kcoyle
Regrets
Chair
Arnaud
Scribe
simonstey, aryman, hknublau, kcoyle


<Arnaud> the teleconference bridge is now open

<simonstey> hm.. new access code?

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/F2F5#Remote_Participation

<simonstey> now I hear some music playing

<simonstey> nice though ;)

<Arnaud> hmm, you must have dialed the wrong number

<simonstey> ah now

<ericP> the dial-in codes are short; it may be a dense matrix

<simonstey> I guess I typed it in too quickly

<simonstey> scribe: simonstey

Arnaud: 2 main agenda items, ShEx & testsuite

SHACL & ShEx

Arnaud: when it comes to shex, you all know what shex is about
... we started the group with 3 proposals; we decided to go with holger's approach but wanted to use shex as our high-level language
... shex is continuously progressing; the shex guys feel they are not very welcome in the group and might even withdraw from it
... I wish we would have more collaboration on this
... the idea today is to get more visibility on what's currently going on with shex
... ericP will tell us whether we could still use shex; although we certainly would have to find compromises on both sides
... whenever semantics don't match, we have to decide which approach we should follow up on
... I wanted to have more discussion on how shex relates to shacl
... i.e. if and how we can use shex on top of shacl
... if it turns out that there is no common ground between shex&shacl, then that's still useful info

aryman: I would like eric to pick an example using some actual data and show constraints that can be expressed in shex but not (or at least not very elegantly) in shacl

Arnaud: I wanted to have a list of things that are actually blocking the use of shex on top of shacl

<ericP> http://www.w3.org/2015/Talks/1217-shexshacl-egp/

Arnaud: I hope ericP's presentation can help us identify those gaps

ShExC grammar: http://www.w3.org/2005/01/yacker/uploads/ShEx2?lang=perl&markup=html#productions ShExJ (JSON) grammar: http://shex.io/primer/ShExJ#schema tests: https://github.com/shexSpec/shexTest/tree/master/validation

ericP: goal is to allow shacl to use shex as high-level syntax iff (almost) entire semantics are shared
... if we only share parts of the semantics it's hard to argue any other use than transforming back and forth
... big differences are: in shex you have the shape and shape expressions; in shacl you e.g. have ORs of entire shapes instead of shape expressions
... shex defines shapes to be different from the actual shape expressions
... value expressions are also handled differently
... which all relates to what your grammar allows you to nest in what
... shex also supports stems
... the biggest difference is in the included middle constraints though
... [2nd bullet point slide 5]

<Arnaud> http://shex.io/primer/#issue-1

<pfps> included middle seems a strange way to describe this feature of ShEx

ericP: a tester has some particular constraints, and a programmer has some constraints
... we could write those constraints down by using QCR

aryman: what are the exact semantics here?

[in the example]

ericP: one of each
... well we could write that down in shacl (although quite verbose)
... but there are differences when someone is both a programmer and a tester
... in shex it passes if one person is a tester and another is both a tester and a programmer
... in shacl it raises a cardinality constraint violation
... because there is one programmer but two testers
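
The difference described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual algorithm of either language: people are modeled as sets of role names, each constraint is a (predicate, min, max) triple, and the function names are invented for this sketch. The "conjunctive" check counts every matching value for each constraint, the way qualified cardinality restrictions do; the "partition" check looks for some way to hand each value to exactly one constraint.

```python
from itertools import product

def conjunctive_ok(values, constraints):
    """Each constraint counts ALL values that match it (QCR-style)."""
    return all(
        lo <= sum(1 for v in values if pred(v)) <= hi
        for pred, lo, hi in constraints
    )

def partition_ok(values, constraints):
    """True if the values can be split so that each value is used by
    exactly one constraint and every cardinality is met."""
    for assignment in product(range(len(constraints)), repeat=len(values)):
        counts = [0] * len(constraints)
        for value, idx in zip(values, assignment):
            if not constraints[idx][0](value):
                break  # this value does not fit the constraint it was given to
            counts[idx] += 1
        else:
            if all(lo <= c <= hi for c, (_, lo, hi) in zip(counts, constraints)):
                return True
    return False

# Hypothetical data: one person is only a tester, the other is both.
people = [{"tester"}, {"tester", "programmer"}]
one_of_each = [
    (lambda p: "tester" in p, 1, 1),
    (lambda p: "programmer" in p, 1, 1),
]

print(conjunctive_ok(people, one_of_each))  # False: two values match "tester"
print(partition_ok(people, one_of_each))    # True: tester-only -> tester, other -> programmer
```

The brute-force search over assignments is exponential in the number of values, so it only serves to show the semantics.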

Arnaud: 1) question: there is one person who has two roles
... 2) question: there are two persons one of which has both roles

ericP:

[discussing example given in http://shex.io/primer/#issue-1 ]

ericP: we are not actually married to this semantics, but we can never know if such an example will show up (or not)

iovka: one can think of cases where you need the shex way of handling this and one might think of ways where shacl semantics are preferable
... static analysis that two shapes are disjoint is very difficult
... with simple computations you can find the stuff that's in the middle

ericP: we could e.g. say that if behavior differs we could define it as being undefined and implementers could decide which approach they want to follow (shex or shacl)

<pfps> is the "default" interpretation of ShEx shapes completely open ??

Arnaud: I think there is nothing that prevents one being programmer and tester at the same time, when looking at the example of issue 1

kcoyle: I don't really think that this is actually a deciding factor; what we are talking about now is whether we can directly translate shex to shacl
... I'm also not sure whether there is enough justification to use shex as high-level syntax of shacl anyway
... this particular decision should not be made by us but by the actual users

<pfps> I hadn't heard anything about ordering in ShEx before either

kcoyle: for example I just recently discovered that shex actually supports ordering
... we never had an actual high-level presentation of shex; just e.g. discussions on additive vs non-additive

aryman: [bringing up his partition strategy]

ericP: I wanted to have this on the top level; i.e. this being the default behavior
... however we are not married to the partition approach

aryman: if we would adopt the sh:partition operator in shacl we would bring shex&shacl semantics closer together

ericP: yes my goal was to enumerate those points
... the last one is filtershape
... in shex we basically say: there is a grammar for traversing the graph
... unfortunately, filtershape does not directly fit into the shex philosophy of checking constraints

you can have filtershapes also within constraints

rather than shapes

hknublau: a filtershape is narrowing the scope
... but there is an open issue for that

[discussing semantics of filtershape]

<Dimitris> http://www.w3.org/2014/data-shapes/track/issues/49

<pfps> a big problem I have throughout the ShEx discussion of bugs is that my assumptions about bug reporting are violated throughout. A bug that has been reported multiple times is just as good a bug as a bug that has been reported only once. However most of the ShEx shapes do not allow an arbitrary number of reportings. In my view a better example is needed for ShEx

hknublau: in fact, scoping and filtering work differently from an implementation perspective

<pfps> the other problem is that it seems strange to use just one property for both reporting and reproducing

ericP: the other use of filter shape (example 8) is validating an optional property, i.e. a property doesn't have to be there but if it is, it has to conform to some shape
... I'm wondering if there are any other examples of filtershapes that could not be translated to shex

hknublau: that's not quite right.. example 8 doesn't have an optional property

aryman: it's actually an if .. then ..
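
The "if .. then .." reading of a filter shape can be sketched as follows. This is only an illustration of scope narrowing as described by hknublau and aryman: the node data, property names and function name are invented here and are not taken from example 8 of the SHACL draft.

```python
def violations(nodes, filter_shape, constraint):
    """Nodes that pass the filter but fail the constraint.
    Nodes that do not pass the filter are out of scope, never violations."""
    return [n for n in nodes if filter_shape(n) and not constraint(n)]

# Hypothetical nodes modeled as dicts of property -> list of values.
nodes = [
    {"id": "a", "ex:status": ["closed"], "ex:closedBy": ["alice"]},
    {"id": "b", "ex:status": ["closed"]},  # in scope, constraint fails
    {"id": "c", "ex:status": ["open"]},    # filtered out, not checked
]

def is_closed(node):
    return "closed" in node.get("ex:status", [])

def has_closer(node):
    return len(node.get("ex:closedBy", [])) >= 1

print([n["id"] for n in violations(nodes, is_closed, has_closer)])  # ['b']
```

Under this reading, the "optional property" case is the special case where the filter is "the property is present".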

[more discussion on semantics of example 8 in http://w3c.github.io/data-shapes/shacl/#filterShape]

ericP: the question is, if we have a translation back and forth between shex&shacl, can shex serve as a high-level syntax for shacl or not?

<pfps> the question in my mind is just whether ShExC has syntactic constructs for most or all SHACL syntactic constructs, ignoring any semantic issues

ericP: I don't have any confidence that a construct e.g. using a disjunction in shex can actually be used to express e.g. example 8

aryman: I would already be happy if we can find a translation of shex to shacl but not necessarily the other way round

Arnaud: I think the shex people would not be very happy to have their syntax hijacked with a different semantics

pfps: shex has been proposed as input to the wg so the wg could do whatever it wants with that

Arnaud: technically true, but from a diplomatic point of view that may not be a very clever decision
... are you willing to give up your semantics for the sake of shacl using the shex guys' syntax?

ericP: if you say "can we have the shex syntax and commas are ands" then the answer is no

aryman: shex has its own syntax and rdf has its own syntax

<pfps> the funny thing about this is that the additive semantics was not part of ShEx initially

aryman: arguably writing rdf isn't very user-friendly
... it seems to me that as long as we can translate shex into shacl we are fine

<pfps> if the major construct is different between ShExC and SHACL-in-RDF then I don't see any benefit

aryman: I don't think it's a conflict; only where things are not translatable, or only in a very awkward manner, do we need to close that gap

pfps: aryman's suggestion is a non-starter; suppose we have ShExC having the most prominent combining constructor being additive
... and shacl in rdf, where that's conjunctive
... as a user I would be very confused

iovka: I don't see this being that problematic

aryman: what peter says is true, however the main conflict occurs if we want to express more than one constraint on one property (and there is an included middle)

<pfps> ShEx and SHACL diverge even when there are no repeated properties - there are different stances on openness

<ericP> true

<iovka> (i have to leave in 40 min)

<aryman> scribe: aryman

ShEx and SHACL Relation Continued

Arnaud: Let's be less divisive
... Let's be more collaborative

kcoyle: Rather than unify ShEx and SHACL, which have some major mismatches, should we simply define our own compact syntax?

Arnaud: It is desirable to unify the community. Failing that we can define a compact syntax for SHACL.

<pfps> my argument was supposed to be along the lines that karen made - that the major differences mean that using ShExC with ShEx semantics as the user-friendly syntax for SHACL creates something that doesn't work

iovka: From a technical point of view we can translate many ShEx documents to SHACL. If we add a sh:partition operator to SHACL then we can translate more. We may also be able to statically detect when we can't translate.

Arnaud: The ShEx community also brings valuable use cases from Mayo clinic. See ISSUE-92.

ericP: The ShEx community can work from a distance on the translation.

Arnaud: the danger is lack of visibility to the WG

kcoyle: Are we still working on SHACL, or is it ShEx? The work must be integrated into the WG specs.

Arnaud: We should have a document that specifies the compact syntax and the mapping to RDF SHACL.

<simonstey> deliverable: OPTIONAL - Compact, human-readable, non-RDF syntax for expressing constraints on RDF graph patterns (aka shapes), suitable for the use cases determined by the group.

kcoyle: Disagree that we should have two syntaxes. Why do we need two syntaxes?

Arnaud: The charter specifies both syntaxes.

<iovka> aryman: there are technical reasons why we could want both rdf based syntax and compact syntax

<iovka> aryman: for instance, rdf has different syntaxes

<iovka> aryman: the constraint language is not bound to the syntax(es)

kcoyle: Do they have to be exactly the same (in design philosophy)? Otherwise this would confuse users who moved between the two.
... Is SHACL unchangeable?

Arnaud: No. We should entertain any proposal that eliminates the differences.

kcoyle: We should look at which approach reflects the majority of use cases and move to that.

pfps: Adding an additive construct to SHACL still does not unify the languages

Arnaud: Let's list the main differences

<pfps> Another difference is the different kinds of closure and openness.

1) Additive vs Conjunctive

2) Open vs Closed

<pfps> I don't understand what aspect of ShEx is ordered

<kcoyle> Peter, I don't either, which is also why it surprised me

Arnaud: What motivated ShEx to be additive

ericP: originally ShEx used Resource Shape semantics (each property mentioned once). Then work with clinical data showed multiple uses of the same property, e.g. for different types of observations.

<pfps> The medical examples appear to me to all have no overlap between the shapes

Arnaud: Aren't these important use cases?

<hknublau> There is always SPARQL as a fallback.

pfps: In clinical data, is the problem caused by poor RDF design?

ericP: For example, blood pressure has two measurements (systolic and diastolic) which are distinguished by codes.
... SHACL can express this using qualified constraints but it is very awkward.

pfps: There are less awkward ways to say that in OWL. The Issue example is somewhat contrived.

ericP: We are not married to the current partition semantics but haven't found a better way.

iovka: In the blood pressure example both ShEx and SHACL have the same problem.

Arnaud: Are we victims of inertia and history?

pfps: The problem with the partitioning semantics is implementing partitioning

<pfps> I'm not saying that partitioning is impossible, just that it is not easy

iovka: Partitioning is only problematic with repeated properties. We have defined a "lookahead" algorithm that improves the performance of partitioning.

pfps: Agree that the problem only occurs with repeated properties. The worst case is tough, but in practice we often do not get the worst case.
... Still a major problem because we have little implementation experience.

iovka: Agree that implementing partitioning is more difficult. We do have some experience and we can help programmers do this better.

<pfps> I think that Arthur's proposal for partition does not have the same implementation complexity

Arnaud: Should additive be the default for repeated properties in SHACL?

<iovka> aryman: it seems that pfps's main concern is complexity, this is also mine

<iovka> aryman: maybe you can reduce this in practice, but the proposal I made allows testing them in order and reduces the complexity

ericP: I am fine with that proposal ...
... the proposal would be to use lists to order terms wherever we need it, and use the greedy algorithm for evaluation

<ericP> <S> a sh:Shape ; sh:expression ( [ a sh:PropertyConstraint .. ] [ ... ] )
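
One possible reading of "ordered lists plus a greedy algorithm" is sketched below. It is an assumption about how such an evaluation could work, not the agreed proposal: each value goes to the first constraint in the list that it matches, with no backtracking, and the cardinalities are checked at the end. Constraints are (predicate, min, max) triples and all names are invented for this sketch.

```python
def greedy_partition_ok(values, ordered_constraints):
    """Give each value to the FIRST constraint in the list that it matches,
    then check the cardinalities. One pass, no backtracking."""
    counts = [0] * len(ordered_constraints)
    for value in values:
        for idx, (pred, _, _) in enumerate(ordered_constraints):
            if pred(value):
                counts[idx] += 1
                break
        else:
            return False  # assumption: a value matching nothing is a failure
    return all(lo <= c <= hi
               for c, (_, lo, hi) in zip(counts, ordered_constraints))

def is_tester(person):
    return "tester" in person

def is_programmer(person):
    return "programmer" in person

# Hypothetical data: one person is only a tester, the other is both.
people = [{"tester"}, {"tester", "programmer"}]

print(greedy_partition_ok(people, [(is_programmer, 1, 1), (is_tester, 1, 1)]))  # True
print(greedy_partition_ok(people, [(is_tester, 1, 1), (is_programmer, 1, 1)]))  # False
```

The cost is linear in values times constraints, but the result depends on the order in which the constraints are listed, which is why the list ordering matters.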

<pfps> Another option would be to have an exclusive partition construct that is not ordered but fails if any value falls in more than one of the partitions.
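
The unordered, exclusive variant could look roughly like this. It is a sketch under assumptions: constraints are (predicate, min, max) triples, the names are invented, and treating a value that matches no partition as a failure is a choice made here, not something stated in the discussion.

```python
def exclusive_partition_ok(values, constraints):
    """Fail if any value matches more than one partition (or none);
    otherwise count each value in its single partition."""
    counts = [0] * len(constraints)
    for value in values:
        matches = [i for i, (pred, _, _) in enumerate(constraints) if pred(value)]
        if len(matches) != 1:
            return False  # overlapping (or unmatched) value
        counts[matches[0]] += 1
    return all(lo <= c <= hi for c, (_, lo, hi) in zip(counts, constraints))

one_of_each = [
    (lambda p: "tester" in p, 1, 1),
    (lambda p: "programmer" in p, 1, 1),
]

print(exclusive_partition_ok([{"tester"}, {"programmer"}], one_of_each))            # True
print(exclusive_partition_ok([{"tester"}, {"tester", "programmer"}], one_of_each))  # False
```

No ordering and no search are needed, at the price of rejecting data in which a value falls into the "included middle".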

discussion about implicit negation

iovka: can we proceed based on this proposal to add a partitioning construct to SHACL?

Arnaud: good discussion, let's keep it open, encouraging, use mailing list to continue it

<ericP> +1

refer to ISSUE-92 in emails on this topic

+1

<iovka> +1

aryman: we should also add stemming to SHACL

<iovka> (have to leave, good morning/afternoon/night to all)

bye iovka

<Arnaud> break 30mn

<hknublau> scribe: hknublau

Test Suite

Arnaud: What is the status of converting/reusing the ShEx test cases?

ericP: I am encouraged by the discussion we just had
... took care of 2 issues, including additive semantics, we have a couple of hundred test cases
... other tests are about syntax conversions (JSON etc)
... 200 validation tests, 800 transformation tests
... we still need to wait for the outcome of the alignment discussion

<pfps> as the major combining constructs in ShEx and SHACL are different, checking is needed to determine whether tests for one are applicable to the other

ericP: (discussion of some details on how to translate tests)...
... requires more investigation on how to proceed, but worth trying at least for a subset of the tests

Arnaud: I find it unfortunate that we have stalled on the test suite. Should not be limited to what ShEx people can provide.
... Discussion and resolution of ISSUEs should lead to test cases.

ericP: Fastest path is to work on the ShEx integration, then we can look at Jose's automatic translator

kcoyle: What format should the tests be?
... someone in our group created a SHACL file (validated using TopBraid)

<ericP> https://github.com/shexSpec/shexTest/blob/master/validation/manifest.ttl

<ericP> https://github.com/shexSpec/shexTest/blob/master/validation/manifest.jsonld

hknublau: I do have some additional tests but not nearly enough, and they are in a slightly different format

Arnaud: is there a need to simplify the test case file format?

ericP: JSON-LD has some advantages, better tool reuse

hknublau: I am open to using either Turtle or JSON-LD, but they should be in RDF, not compact syntax

ericP: The existing format has some tool support from Greg Kellogg

hknublau: I'd be happy to contribute more tests once I have a breather. But too busy right now.
... we have tool support in TopBraid now to create them easily, so I expect progress when I have time

ericP: Will help Karen's colleague with the format.

Arnaud: meta-comment that Face to Face meetings prevent dropping out

ISSUE-23

hknublau: Summary of current situation with ISSUE-23
... parallels with ShEx vs SHACL discussion

Arnaud: Holger dropped his sh:ShapeClass proposal, and presented a softer proposal
... I presented this to the group, I sensed philosophical difference, I thought there was not much point in continuing
... this forced a certain viewpoint on everyone
... we had 2 extremes, and I thought we shouldn't have a religious debate
... allow both views. Consequences are well-understood. We could highlight caveats in the spec.
... the vast majority was of the opinion that we should force the separated worldview
... Ted seems to be agreeing with Holger, but he is not present on the call
... meanwhile Dimitris's proposal came along; it is not fully understood yet.

<Arnaud> https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015Dec/0032.html

<pfps> in my view this boils down to the stance on whether ex:node rdf:type ex:Shape is to be supported in SHACL

pfps: Dimitris's proposal is even worse than the previous one.
... it has an explicit construct for the design pattern permitting a node to be an instance of a shape

<pfps> in the following data graph ex:node rdf:type ex:Shape . ex:Shape rdf:type sh:ShapeClass. the constraints on sh:Shape will be applied to ex:node

<pfps> In the following data graph ex:node rdf:type ex:Shape . ex:Shape rdf:type sh:ClassShape. the constraints on sh:Shape will be applied to ex:node

<pfps> In the following data graph ex:node rdf:type ex:Shape . ex:Shape rdf:type sh:ClassShape. the constraints on ex:Shape will be applied to ex:node

hknublau: sh:ClassShape is just syntactic sugar abbreviating one triple

Dimitris: My approach drops the metaclass.

<pfps> The only difference at all between Dimitris's proposal and the current editors' draft is that sh:ShapeClass has changed to sh:ClassShape

hknublau: I am happy to downplay this, drop any examples etc in the spec

Dimitris: Now the ontologies are completely separated from the shape, and this gives users a way to reuse the same URI for classes
... and it drops metaclasses, impacting inferencing and other complications.
... what I like about this is that it makes it clear that this is about the class

pfps: the only change I see from before is that it saves the reflexive sh:scopeClass triple
... 3 approaches:
... a) Classes and shapes must be disjoint, doing this is illegal
... b) Special kind of class that is also a shape that triggers validation
... c) Be agnostic, we don't say that shapes can be classes
... d) There could also be a design structure that makes something both a shape and a class, which then means validation

<pfps> The fourth approach would be to have nodes that are in the data graph and that are also shapes in the shapes graph trigger a kind of nodeshape validation

hknublau: I believe a main issue remains the difference between modeling (in Peter's terms: real world entities) and data modeling/constraining.

<Arnaud> foaf:Person a rdfs:Class. x:PersonShape a sh:Shape; sh:scopeClass foaf:Person. ex:Alice a foaf:Person .

<Arnaud> foaf:Person a rdfs:Class, sh:Shape; sh:scopeClass foaf:Person. ex:Alice a foaf:Person .

<Dimitris> foaf:Person a rdfs:Class, sh:ClassShape. ex:Alice a foaf:Person .

hknublau: Important that the triples above are typically split across two graphs
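
The "syntactic sugar abbreviating one triple" view can be sketched as a small expansion step. This is an illustration under assumptions: triples are plain tuples of prefixed names rather than a real RDF library, the function name is invented, and having sh:ClassShape imply sh:Shape plus a reflexive sh:scopeClass is one reading of the proposal based on the three snippets pasted above.

```python
RDF_TYPE = "rdf:type"

def expand_class_shapes(triples):
    """Return the triples plus, for every X typed sh:ClassShape,
    the triples 'X rdf:type sh:Shape' and 'X sh:scopeClass X'."""
    expanded = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE and o == "sh:ClassShape":
            expanded.add((s, RDF_TYPE, "sh:Shape"))
            expanded.add((s, "sh:scopeClass", s))  # the reflexive triple
    return expanded

# Short form, as in the sh:ClassShape snippet.
short_form = {
    ("foaf:Person", RDF_TYPE, "rdfs:Class"),
    ("foaf:Person", RDF_TYPE, "sh:ClassShape"),
}
# Long form, as in the snippet that spells out sh:scopeClass.
long_form = {
    ("foaf:Person", RDF_TYPE, "rdfs:Class"),
    ("foaf:Person", RDF_TYPE, "sh:Shape"),
    ("foaf:Person", "sh:scopeClass", "foaf:Person"),
}

print(long_form <= expand_class_shapes(short_form))  # True
```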

<pfps> I sure hope that I am not just a record in a document on the web

<kcoyle> at this moment, i'd rather be that

<pfps> the ShEx workarounds require use of SPARQL, which appear to me to be quite complex

<pfps> the workaround here involves a single triple, which appears to me to be very much simpler

that's your personal opinion, but based on our experience this won't fly at all and will sink the whole approach.

you can have your personal opinion, but I want to ask the market to decide, not us.

We are in the end just a handful of people.

(Sorry I could not write the remaining discussion of this session. It was largely about discussing whether this triple shortcut is relevant or not).

My honest impression right now is that people are intentionally preventing this trial on the market, for no technical reason.

<Arnaud> you're free to try it on the market with your products

<Arnaud> that's what companies do all the time

We already have that situation with SPIN, but people want W3C standards, and furthermore we need database vendors to support this too.

Exactly the same applies to ShEx. We could ignore them too.

<Arnaud> almost

<kcoyle> "7 participants on the call" -- haven't heard from most of us?

<Labra> * I'm here but I will have to leave in half an hour

<pfps> abstract syntax?

<kcoyle> scribe: kcoyle

<Labra> * I would like it if someone could write down what Peter said

<Labra> * It was surprising for me

talking about question of revisiting decision to use shacl as basis

pfps: main disagreement is over developing of a modeling language

<pfps> when the decision between SPIN and ShEx was made it was easy

no meeting next week

<pfps> a decision now would be much harder, because ShEx has improved

<pfps> but that's not a good reason to revisit the decision

nor the next (the 31st); next telecon will be on the 7th of January 2016

<pfps> a better reason is that SHACL may be changing

Arnaud: have we covered all of the topics on our list for this meeting? seems so

<pfps> a major aspect of this change to me is whether SHACL is a modelling language (modelling in the sense of logic, not in the sense of data schemas)

<pfps> going with ShEx would present a much clearer stance on this issue, which is very attractive to me

Arnaud: can we cover abstract syntax without Arthur?

ericP: offering an abstract syntax from shex

Labra: the work on translation from shex to shacl, for which he also volunteered, will require an abstract syntax, and he volunteers to work on this

ericP: our tests already map to abs.syn

Arnaud: encourages eric and jose to take this on, since arthur has a lot on his plate
... we'll wait for a first draft from eric and jose

ericP: hopes to get a few minutes of arthur's time at the beginning

<Arnaud> https://www.w3.org/2014/data-shapes/track/issues/open

ISSUE: 108

<trackbot> Created ISSUE-116 - 108. Please complete additional details at <http://www.w3.org/2014/data-shapes/track/issues/116/edit>.

oops

ISSUE-108

<trackbot> issue-108 -- Should operations be specified? -- open

<trackbot> http://www.w3.org/2014/data-shapes/track/issues/108

hknublau: can be dropped, if we decide what to do with value function

pfps: they were supposed to be optional
... something needs to say what kicks off validation (interface)
... spec could say: validation happens when you are given two graphs; everything else is optional; just one required interface

hknublau: looking at 10.3; clarifying role of filterShape. only validated if it matches the filter shape

<pfps> I am against having aspects of the spec only show up in java code

operation would be a formal answer to those questions

pfps: proposes: get rid of section 10; say "the interface is the graph interface" and make sure that rest of spec nails down what happens when you kick off validation
... given a shapes graph and a data graph

<Dimitris> what about instance validation?

hknublau: ok to close

<Dimitris> I am also fine to close it as suggested

<pfps> PROPOSAL: The one specified way to invoke SHACL validation is an interface that takes a data graph and a shapes graph.

+1

<Dimitris> +1

<pfps> +1

<hknublau> +1

(I didn't get that - Dimitris pls scribe your statement)

pfps: you can be conformant if the only interface is "give me two graphs"

<Dimitris> asked if validating an instance against a shape is supported by the proposal

Arnaud: You can always do more; this is the minimum

<Zakim> ericP, you wanted to ask if the manifest format makes a tester compliant

ericP: manifest has a schema doc and a data doc, a schema node and a focus node... implies an API that takes four arguments...
... manifest is a control structure

pfps: extra arguments are how you add control to shapes graph. testing environ is to set up shape and data graphs
... or have control to start validation in a different place
... just want something written so that we can define conformant

ericP: in the sparql working group we never specified how most queries are executed; assumed people would do the right thing

pfps: that's a pain point
... maybe there should be another interface that has an implicit data graph

<ericP> +0

ericP: not sure this is the best way, but a stake in the ground

RESOLUTION: Close ISSUE-108, the one specified way to invoke SHACL validation is an interface that takes a data graph and a shapes graph.
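
The shape of the resolved interface, two graphs in and violations out, can be sketched as below. This is a toy under loud assumptions: graphs are sets of string triples, `ex:requiredPredicate` is a hypothetical stand-in for a real property constraint, and the function name and return format are invented; only `sh:scopeClass` comes from the discussion.

```python
def validate(data_graph, shapes_graph):
    """Minimal interface: a data graph and a shapes graph in,
    a sorted list of (node, shape, missing predicate) out."""
    violations = []
    scopes = [(s, o) for s, p, o in shapes_graph if p == "sh:scopeClass"]
    # Hypothetical vocabulary: the shape requires at least one value.
    required = [(s, o) for s, p, o in shapes_graph if p == "ex:requiredPredicate"]
    for shape, cls in scopes:
        focus_nodes = [s for s, p, o in data_graph if p == "rdf:type" and o == cls]
        for node in focus_nodes:
            for req_shape, predicate in required:
                if req_shape != shape:
                    continue
                if not any(s == node and p == predicate for s, p, _ in data_graph):
                    violations.append((node, shape, predicate))
    return sorted(violations)

data = {
    ("ex:Alice", "rdf:type", "foaf:Person"),
    ("ex:Alice", "foaf:name", "Alice"),
    ("ex:Bob", "rdf:type", "foaf:Person"),
}
shapes = {
    ("ex:PersonShape", "sh:scopeClass", "foaf:Person"),
    ("ex:PersonShape", "ex:requiredPredicate", "foaf:name"),
}

print(validate(data, shapes))  # [('ex:Bob', 'ex:PersonShape', 'foaf:name')]
```

Validating a single focus node against a single shape would need extra arguments on top of this minimum.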

<Dimitris> issue 80 should be easy

ISSUE-80

<trackbot> issue-80 -- Constraint to limit IRIs against scheme/namespace, possibly with dereferencing -- open

<trackbot> http://www.w3.org/2014/data-shapes/track/issues/80

<pfps> ok by me

<Dimitris> ok by me too (Excluding the dereferencing)

pfps: doesn't like "valueScheme" because of "scheme"

ericP: this is stemming in shex

"valueStem" is consistent with SHACL

hknublau: isn't this already done with regex? diff here is checking to see if resource exists

ericP: semantics can't be matched to long list of 'or's
... each stem is a disjunct -- specialization of regex

<pfps> this appears to be a conjunction of IRI & pattern

kcoyle: two things - pattern matching and external resolution

ericP: many ways folks want to deal with value sets, from enumeration to provenance to dereference and diff kinds of dereference
... e.g. is subclass of X
... problem is that it's time dependent

Arnaud: this adds something we haven't had before now

ericP: not a version one feature

Arnaud: does sh:Pattern address this?
... having stem is syntactic sugar

<ericP> kcoyle: thinking about how [bark bark] you can test to see [bark bark] that you have a DOI

<Arnaud> PROPOSED: Close ISSUE-80, accepting the proposed addition of sh:valueScheme renamed as sh:valueStem, and leaving resource resolution/fetching to a future version of SHACL

<ericP> +1

<Dimitris> +1

<pfps> +0.5

hknublau: what are we gaining here? this saves one character?

ericP: user-friendliness. works when you have a value list
... stems different from uri's.

<ericP> [ a sh:PropertyConstraint ; sh:values ( [ a sh:Stem ; sh:stem foaf: ] [ a sh:Stem ; sh:stem dc:cre ] ) ]

hknublau: that's an or? need to make this clear; maybe not enough info to close this issue?

ericP: allowed values are a sequence of terms (?), so effectively an or
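
A list of stems read as "effectively an or" amounts to prefix matching against any entry, which is also why it can be seen as a special case of a regex. The sketch below assumes stems are compared as plain strings against full IRIs; the function names are invented, and expanding `dc:` to the Dublin Core elements namespace is an assumption about which vocabulary was meant.

```python
import re

def matches_any_stem(iri, stems):
    """True if the IRI starts with at least one of the stems."""
    return any(iri.startswith(stem) for stem in stems)

def stems_to_regex(stems):
    """Equivalent anchored regex for a non-empty list of stems."""
    return re.compile("^(?:" + "|".join(re.escape(s) for s in stems) + ")")

# Expanded forms of the stems 'foaf:' and 'dc:cre' (dc namespace assumed).
stems = [
    "http://xmlns.com/foaf/0.1/",
    "http://purl.org/dc/elements/1.1/cre",
]
pattern = stems_to_regex(stems)

for iri in ("http://xmlns.com/foaf/0.1/name",
            "http://purl.org/dc/elements/1.1/creator",
            "http://purl.org/dc/elements/1.1/title"):
    print(iri, matches_any_stem(iri, stems), bool(pattern.match(iri)))
```

The first two IRIs match a stem and the third does not, with both functions agreeing.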

Arnaud: eric, put proposal in an email so we have a clear description

ericP: discuss resolving resources

karen will talk to holger about using named graphs to resolve this

closing the call

<hknublau> Next meeting hopefully in Nuuk time zone.

next call January 7

<Arnaud> trackbot, end meeting

Summary of Action Items

Summary of Resolutions

  1. Close ISSUE-108, the one specified way to invoke SHACL validation is an interface that takes a data graph and a shapes graph.
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.143 (CVS log)
$Date: 2016/01/07 20:38:09 $