W3C

RDF Data Shapes Working Group Teleconference

21 May 2015

Agenda

See also: IRC log

Attendees

Present
aryman, pfps, Labra, Dimitris, hsolbrig, iovka, kcoyle, simonstey, ericP, Arnaud
Regrets
Chair
Arnaud
Scribe
iovka, pfps

<pfps> I can scribe for session 2.

<kcoyle> http://w3c.github.io/data-shapes/data-shapes-ucr/

<iovka> (how do I say zakim I'm scribe ?)

<pfps> scribenick: iovka

User Stories

<simonstey> +q

Arnaud: the aim is to discuss the status of the document

<pfps> OK, having the log is the most important thing

Arnaud: publish the document as public note

simonstey: my concern is that the current document is just the user stories renamed as use cases
... there is no standard way to model user stories
... for me, a user story is an organization describing what they want to do
... a use case is an agent with its role, and a particular thing that agent wants to do (e.g. cardinality constraints)
... I want to go through the current list and identify the use cases, and keep them separate from the user stories
... I have to admit that I haven't gotten to it yet
... maybe we fix a date for that

it would be ok if we clean up the document as it is now; it may not be necessary to separate use cases from user stories

Arnaud: should we publish the latest draft? is it worth it?

kcoyle: few things added, but 95% the same

Arnaud and kcoyle agree it is not worth re-publishing now

Arnaud: publishing documents used to be a tedious process for the wg
... now it has become easy
... it's almost a one-click process now, everything is in the control of the editors; if there are validation errors, it won't be published

<kcoyle> +q

Arnaud: we can set a deadline if simonstey wants one

kcoyle: there are user requirements that we never approved, now these are commented out, I think we should remove these
... use case 40 does not have an associated requirement. should I raise this as an issue?

Arnaud: yes, bring it as an issue

Arnaud: in the future, we can still add new use cases, and people vote for these

Arnaud and simonstey agree on end of june as deadline

Arnaud: if not finished by the deadline, it's not critical. having progress on that is already good

Test Suite

Arnaud: the usual approach is to have a test harness: implementers run the tests and report on these, we collect the results and report to the managers
... there is no single framework to do it, different approaches are possible
... we need to figure out how to do it in this wg
... what is the format of the test suites ? of the tests ? what the tests look like so that people can write tests ? who manages the test suite ?
... when we publish a draft, we invite people to implement these, implementers run the tests and post the results
... somebody has to be in charge of collecting the results, monitoring the mail list
... there are tools helping for doing that
... the difficulty for now is that we do not have a syntax, so it is difficult to define tests
... for now it seems like we will have 2 syntaxes: rdf and shex-like compact syntax

<Labra> http://www.slideshare.net/jelabra/data-shapestestsuite

Labra: here are some presentation slides

<aryman> rejoined - pls paste the test suite web page link

<Dimitris> http://www.slideshare.net/jelabra/data-shapestestsuite

<aryman> thx

Labra: for the format of the test suite, I propose a main folder, and subfolders for different cases

<pfps> I would very much prefer to have presentations to the working group available beforehand. This following along with new information is difficult and wasteful.

Arnaud and Jose discussing slide 6: add a link to the section of the formal specification to which a test is related

<Labra> https://github.com/w3c/data-shapes/blob/gh-pages/data-shapes-test-suite/tests/example/manifest.ttl

<simonstey> if both representations are equivalent?

Jose and Arnaud discussing slide 7: what is the aim of tests for the conversion from one schema format to another

Arnaud: how can we do the test ? is there a unique translation ?

aryman: take a lesson from ??? testsuite. the first thing is to decide which of the syntaxes is "default"
... the compact syntax might be evolving
... there are 2 kinds of tests: the one testing against data

<Dimitris> +q about well-formed shapes

aryman: the other is to check translation between the two syntaxes
... I would prefer to decouple the two

<ericP> in SPARQL, we originally had a turtle syntax for result sets, but it fell by the way side pretty quickly

Jose: compact syntax is simpler, but I agree we might have one default syntax

Arnaud: I do not necessarily agree with arthur's claim that rdf syntax should be default
... for instance sparql has 2 different test suites with different syntaxes

<ericP> iovka: checking against the data is different from checking against the syntax

<ericP> ... either there is a default syntax,

<ericP> ... .. or there is a provided conversion

<aryman> three kinds of test case (at least) 1) check that a shape is well-formed, 2) check that a data graph satisfies a shape, 3) check that conversion between rdf and compact syntax is correct
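
A minimal sketch of how the three kinds of test case aryman lists could be recorded, assuming a simple in-memory catalogue. The names `Kind`, `SuiteEntry`, `group_by_kind` and all file names are hypothetical; no test format was agreed in the meeting.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three kinds of test case named in the discussion."""
    WELL_FORMED = "shape is well-formed"
    VALIDATION = "data graph satisfies shape"
    CONVERSION = "RDF and compact syntax agree"


@dataclass(frozen=True)
class SuiteEntry:
    name: str      # hypothetical test identifier
    kind: Kind
    inputs: tuple  # hypothetical input file names


def group_by_kind(entries):
    """Map each kind to the names of the entries of that kind."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.kind].append(entry.name)
    return dict(groups)


entries = [
    SuiteEntry("shape-syntax-001", Kind.WELL_FORMED, ("shape.ttl",)),
    SuiteEntry("min-count-001", Kind.VALIDATION, ("data.ttl", "shape.ttl")),
    SuiteEntry("roundtrip-001", Kind.CONVERSION, ("shape.ttl", "shape.compact")),
]
grouped = group_by_kind(entries)
```

Keeping the kind explicit on each entry is one way to decouple validation tests from syntax-conversion tests, as aryman suggests.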

<ericP> Dimitris: in RDF we can write anything so i'm not sure how we can check the well-formedness

scribe doesn't hear well enough

<ericP> ... we can check, e.g. a bad SPARQL syntax, but other things are not easy to check

<ericP> ... we will need to do the graph isomorphism that Jose is doing to check the results
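
A rough sketch of the kind of graph isomorphism check Dimitris mentions for comparing an expected result graph with an actual one. It assumes triples are plain tuples of strings and that blank nodes are strings starting with `_:`; it tries every renaming of blank nodes, so it is only suitable for very small graphs. This is an illustration, not the approach used in Jose's test suite.

```python
from itertools import permutations


def is_blank(term):
    """Assumption: blank nodes are written as strings starting with '_:'."""
    return isinstance(term, str) and term.startswith("_:")


def blank_nodes(graph):
    """Return the blank nodes of a graph in a stable order."""
    return sorted({t for triple in graph for t in triple if is_blank(t)})


def isomorphic(g1, g2):
    """True if g1 equals g2 up to a one-to-one renaming of blank nodes."""
    g1, g2 = set(g1), set(g2)
    b1, b2 = blank_nodes(g1), blank_nodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    # Brute force: try each bijection from g1's blank nodes to g2's.
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False


expected = {("ex:report", "ex:violation", "_:a"), ("_:a", "ex:focusNode", "ex:bob")}
actual = {("ex:report", "ex:violation", "_:x"), ("_:x", "ex:focusNode", "ex:bob")}
other = {("ex:report", "ex:violation", "_:x"), ("_:x", "ex:focusNode", "ex:alice")}
```

Here `isomorphic(expected, actual)` holds because only the blank node label differs, while `isomorphic(expected, other)` does not.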

aryman: let me explain the default should be RDF
... the semantics of the compact is defined in terms of the rdf
... so the compact syntax might change

<Zakim> ericP, you wanted to talk about SPARQL test development

aryman: we do not want to change the test cases each time the compact syntax changes

ericP: in the sparql working group, the burden was that every implementer had to convert, or have systems that support parsers for all the formats
... we realized we didn't have a good idea of the coverage
... so we developed tools for measuring coverage
... it would be better to do this from the start
... also having a description of what every test is about
... in sparql we had positive/negative syntax and semantics
... the negative examples were things beyond the grammar

Arnaud: did folders relate to sections of the spec ?

ericP: to features, combinations of operators
... I would expect directory for testing the facets, the basic node types,
... people were proposing tests, with an expected answer, others were verifying and the wg approved
... sometimes people submitted 2 possible answers, the wg approved the one or the other

<ericP> coverage report

<ericP> (turtle coverage report)

kcoyle: we should be testing whether our standard meets the requirements

Arnaud: absolutely, and also that might help to define the coverage

aryman: in ??? wg we ensured that for every term in the vocabulary, there is a test that tests that term
... we automatically generated the lists of terms, I think that this would be easier to do with rdf syntax than with compact syntax
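
A minimal sketch of the per-term coverage check aryman describes, assuming each test declares the vocabulary terms it exercises. The term names (`ex:minCount` etc.), the test names and the function `uncovered_terms` are hypothetical placeholders.

```python
def uncovered_terms(vocabulary, tests):
    """Return the vocabulary terms that no test exercises, sorted.

    vocabulary: iterable of term names
    tests: mapping from test name to the terms that test uses
    """
    covered = set()
    for used in tests.values():
        covered |= set(used)
    return sorted(set(vocabulary) - covered)


# Hypothetical vocabulary and tests, for illustration only.
vocabulary = ["ex:minCount", "ex:maxCount", "ex:datatype"]
tests = {
    "min-count-001": ["ex:minCount"],
    "datatype-001": ["ex:minCount", "ex:datatype"],
}
missing = uncovered_terms(vocabulary, tests)
```

With the RDF syntax, the list of terms used by a test could be read off the predicates and classes in the shape graph, which may be why aryman expects this to be easier there than with the compact syntax.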

<Zakim> ericP, you wanted to talk about test case analysis

Arnaud: there are different ways in measuring coverage

<ericP> http://dvcs.w3.org/hg/rdf/raw-file/6f51ac509ff3/rdf-turtle/coverage/report-atomic-tests.html

<aryman> OSLC defines vocabularies for Requirements Management and Quality Management

<aryman> Here is the Requirements Management Spec http://open-services.net/bin/view/Main/RmSpecificationV2

<aryman> Here is the OSLC Requirements Management vocabulary: http://open-services.net/ns/rm#

<ericP> Turtle implementation report

<aryman> Here is the OSLC Quality Management spec: http://open-services.net/bin/view/Main/QmSpecificationV2

Arnaud: there are nice tools for producing reports where one can see which implementation covered what

<aryman> Here is the OSLC Quality Management vocabulary: http://open-services.net/ns/qm#

Arnaud: one of the questions we haven't discussed is about the process
... most practical would be to do something similar to what we did for user stories and requirements

<ericP> SPARQL implementation report

Arnaud: people submit whatever they want, we validate case by case
... we want everybody to be able to submit, have it very open

Jose: it sounds reasonable to me
... can I start adding tests ?

Arnaud: no reason to wait, except that the specification is still in progress
... you have the burden to modify the tests whenever the specification changes

<pfps> +1 to Arnaud, the difficulty of changing tests that cover unapproved things cannot be used as a reason to prefer one solution over another

aryman: it would be valuable for the data to be treated independently of the shapes
... people contributing data, with an informal description of what they want to test on these data sets

Arnaud: when testing that validation should fail, how do you catch the errors ? how do we test levels of severity ?

<Labra> https://github.com/w3c/data-shapes/tree/gh-pages/data-shapes-test-suite/contrib

Jose: I planned having a folder for not approved
... or with informal description of what we want to validate

kcoyle: there is a lot of test cases that I can produce and put in something like that
... most of these test something specific

Arnaud: it should be in the right format, so that it could be part of the test suite

Jose: this folder would be used to put data and informal requirements, and then people try to model this and add these as fully specified test cases

pfps: bizarre talking about tests whereas we do not have an api
... for invoking shacl

Arnaud: with XML Schema they didn't define anything like this.

<Labra> This is XML Schema test suite http://www.w3.org/XML/2004/xml-schema-test-suite/

aryman: an api is an additional level of specification, we define the intended meaning of a shacl document using tests
... defining an api would require to define an additional language

<pfps> Well, I was referring to an abstract API. (Does an abstract API even make sense.)

Arnaud: to be able to run the tests, people would need to develop a test harness

<aryman> we could define a standard test result format

<pfps> Aside from not having an API, I don't think that we even have a good notion of what kinds of arguments are input to SHACL.

<pfps> scribenick: pfps

Issue 3 - Graph Shape Association

pfps: ISSUE-3 is how to relate shapes and graphs

<simonstey> https://www.w3.org/2014/data-shapes/wiki/ISSUE-3:_Graph_Shape_Association

<aryman> Is this the current W3C standard for test results?: http://www.w3.org/2001/03/earl/

pfps: several possible associations
... shape in data graph, explicit links (like owl:import), implicit links (LD follow your nose), no links
... in the first three there is only one input to validation, in the last there are two inputs
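
A hedged sketch of the distinction pfps draws between one-input and two-input validation. It assumes graphs are sets of triples, a hypothetical link predicate `ex:instanceShape`, and a caller-supplied `load_shapes` function for fetching linked shape graphs; implicit "follow your nose" links are not modelled. None of these names were agreed by the group.

```python
LINK = "ex:instanceShape"  # hypothetical data-to-shapes link predicate


def resolve_shapes(data_graph, shapes_graph=None, load_shapes=None):
    """Decide which shapes graph applies to a data graph.

    - no links: the caller passes shapes_graph (two inputs)
    - explicit links: shapes are loaded from IRIs found in the data (one input)
    - embedded: the shapes are in the data graph itself (one input)
    """
    if shapes_graph is not None:
        return set(shapes_graph)
    linked = sorted({o for (_, p, o) in data_graph if p == LINK})
    if linked and load_shapes is not None:
        shapes = set()
        for iri in linked:
            shapes |= set(load_shapes(iri))
        return shapes
    return set(data_graph)


data = {
    ("ex:doc", "ex:instanceShape", "ex:shapes1"),
    ("ex:bob", "ex:name", "Bob"),
}
store = {"ex:shapes1": {("ex:PersonShape", "ex:property", "ex:name")}}

two_inputs = resolve_shapes(data, shapes_graph={("ex:S", "ex:property", "ex:age")})
linked = resolve_shapes(data, load_shapes=store.__getitem__)
embedded = resolve_shapes(data)
```

In this sketch an explicitly supplied shapes graph always wins, so links in the data are ignored when the caller passes two inputs.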

arnaud: SHACL could use several mechanisms
... which one to use has to be considered

pfps: two different kinds of invocations - one that is the no-linking one, and one that is one or more of the others
... I'm strongly in favour of the no-linking one

arnaud: XML schema did this and there was a lot of push-back
... to the effect that the data document should say which schema is in use

<aryman> OSLC defines explicit linking from data to shape via oslc:instanceShape

<Zakim> ericP, you wanted to ask how much <?schema=""> is used

I think that RDF is different from XML

ericp: how much is the XML linking mechanism actually used?

hsolbrig: we use XML schema linking a lot
... but RDF is different from XML
... a given chunk of RDF can be advantageously used with multiple shapes

aryman: the interface to services might want to have a link between the data and the shapes
... the link from a service is a claim that the document satisfies the shape information

scribe comment - the last scribing wasn't right

aryman: in a document produced by a service the link to a shape document is a claim that the data document satisfies the shape

kcoyle: one use - informational - the data-shape link is a claim of conformance
... another use - verification - run extra shapes over a data graph
... the first needs a link, the second can't have a link

<Dimitris> +q

arnaud: so the question is whether to define a data-shape linking mechanism

dimitris: this has all the drawbacks that were ascribed to using rdf:type to link resources to shapes

aryman: I don't think that the analogy goes through - node-shape links are different from graph-shape links

<aryman> A node (IRI) may appear in many different graphs with the same rdf:type, but with different shapes

arnaud: is this about using rdf:type??

<Dimitris> but the same resource may have different types in different graphs

<ericP> pfps

<Dimitris> and as soon as the shape can be dereferenced it will have the same implications as rdf:type

<ericP> pfps: i don't think so

<ericP> ... i was hesitant to have a link, but i see two good use cases:

<ericP> ... .. what control is or was applied to the document

no, this is not about the link

<Dimitris> and it is also information that would probably be included in closed shapes

<Zakim> ericP, you wanted to say that it seems like the use cases for linking range from "here are some shapes i might conform to" to "thou shalt always conform to X"

<Dimitris> I also pointed out that I wouldn't object to this but wouldn't favour it either

ericp: the linkage is saying either "I'm X otherwise I'm broken" or "there is something that I might conform to"

aryman: there is no "might"
... but there is no obligation to actually go and check
... applications can advertise what kind of data they want
... the interface contract involves instanceshape

hsolbrig: is the claim coupled to the data or to the service?

aryman: on a get there is a document that points to the shapes

<Dimitris> +q

aryman: for puts the shape is in a separate description document (or http header)

dimitris: if I put shapes on DBpedia resources then these shapes cannot be overruled

aryman: the result of a service invocation includes a shape link that describes the resultant graph, that graph might be merged with other information, in which case the shape might be no longer correct

<Dimitris> +q

<ericP> http://w3c.github.io/data-shapes/semantics/#associations

arnaud: is there any proposal that includes graph-graph linking

<ericP> Associating Data with Shapes: http://w3c.github.io/data-shapes/semantics/#associations

dimitris: I don't have any graph-graph linking

arnaud: does anyone do an embedding solution

pfps: yes, SPIN and I think Holger's proposal
... if there are graph-shape links then putting the graph in a different context could invalidate the constraints

<Zakim> ericP, you wanted to say that both Holger's and ShEx provide links

pfps: just because there is an instanceshape link doesn't mean that you have the embedding setup - you might ignore such links in the data graph

arnaud: the engine might only be sensitive to such links under certain circumstances

ericp: if the api is g1 is the data graph and g2 is the control graph then what happens if they are the same?

<Dimitris> embedding can be a special case of no linking where data & shapes graph are the same

aryman: a frequent case is that the graph is in a system that has multiple graphs where closed shapes might no longer be correct for the union of the graphs
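
A small sketch of the point aryman makes about closed shapes and unions of graphs, assuming a closed shape is simply a set of allowed predicates for a node. The function name `extra_predicates` and all `ex:` terms are hypothetical.

```python
def extra_predicates(graph, node, allowed):
    """Predicates used on node that a closed shape does not allow, sorted."""
    return sorted({p for (s, p, _) in graph if s == node and p not in allowed})


allowed = {"ex:name"}                # closed shape: only ex:name is permitted
g1 = {("ex:bob", "ex:name", "Bob")}  # graph returned by one service
g2 = {("ex:bob", "ex:age", "42")}    # graph from another source

alone = extra_predicates(g1, "ex:bob", allowed)
merged = extra_predicates(g1 | g2, "ex:bob", allowed)
```

`g1` conforms on its own, but the union `g1 | g2` does not, so a shape link that was correct for `g1` may no longer be correct once the graphs are merged.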

+1 to dimitris

aryman: user story 40 - use case 43 - inline vs remote - ....

<ericP> Dimitris, why wouldn't the resource just point at itself with an instanceShape link?

<Dimitris> ericP, the shape or the resource?

<aryman> see user story 40, use case 43

<aryman> there is a property oslc:representation, http://www.w3.org/Submission/shapes/#representation

<Dimitris> ericP, as I said earlier, this has the same implications you had with the rdf:type predicate, it stays in the data and when the shape can be dereferenced is global. It also may create problem when merging data from different sources where you'd have to remove these triples

kcoyle: will a group of SHACL requirements be identified with an IRI

<aryman> it has values oslc:Inline, oslc:Reference, oslc:Either

<Dimitris> ericP. rdf:type has a different meaning (although people use it for validation) but sh:instanceShape will be much stronger than this

pfps: karen - you mean that SHACL requirements are collected into documents that are referenceable by IRI

kcoyle: more or less

pfps: yes

kcoyle: then you can decide to use different SHACL controls by pointing at different documents

pfps: yes

arnaud: there is still work to be done here
... are there any other possibilities?
... should any of these be eliminated?

arthur: it doesn't cover service factories

arnaud: this appears to be outside the scope of the working group

<Zakim> ericP, you wanted to say that the API validate X as Y invocation seems to sometimes be an internal call invoked by the linked use cases, as well as an external API that will be

ericp: LDP can build on what is done here
... internal vs external use??
... sometimes validation triggers off a link in the data and sometimes validation is done from the outside

<Dimitris> +q

aryman: the highest priority is to have an invocation where you have two arguments - control and data

arnaud: I don't expect the wg to provide an explicit API
... the question is whether there should be a way of indicating what validation is to be done based on data in the data graph that points to the control graph
... is there a standard way of linking from the data graph to shape graph
... even if the wg specifies a link it may not be automatically enforced

dimitris: is the triple counted?

I'm willing to produce a proposal, and add it to the wiki page

<Dimitris> talked about the possibility to reverse the relation i.e. ex:shapeA sh:hasInstance ex:resource

arnaud: the wiki page is a good resource
... if you have a use case, please indicate which kind of linkage it needs
... the no-linking version seems like a given
... if someone wants something else, please speak up

arthur: should we add requirements to the wiki

kcoyle: yes

arnaud: if you see a gap, add something
... if you feel a need, speak up
... close
... next meeting *not* today
... next meeting is next week
... next week will have a vote on the way forward, so be prepared

bye

<aryman> bye

<iovka> bye

<Labra> bye

<Arnaud> meeting adjourned

<Arnaud> trackbot, end meeting

<iovka> good bye everyone, I'm leaving now

<Dimitris> good bye all

Summary of Action Items

Summary of Resolutions

    [End of minutes]

    Minutes formatted by David Booth's scribe.perl version 1.143 (CVS log)
    $Date: 2015/05/28 21:03:29 $