RDF Data Shapes Working Group Teleconference -- 19 Feb 2015

<pfps> scribe: pfps

Agenda

Arnaud: next F2F, remaining requirements
... also high-level issues

Next F2F

Arnaud: F2F meetings are important because of the concentration of effort

<SimonSteyskal> WU Vienna would be happy to host the next F2F. There is the SEMANTiCS Conference at WU on September 15th-17th, so we could maybe schedule it around that date.

<SimonSteyskal> (or earlier)

Arnaud: three months from now is May
... there is a W3C meeting in Paris in May, ESWC in Slovenia in June, WWW in Florence in May
... having a host helps a lot - it's not much money to be a host
... it is not necessary to be co-located with something, but it can help

iovka: Lille can probably host

<SimonSteyskal> +q

Arnaud: we don't need to confirm just now

Simon: Vienna can host, just about any time

Arnaud: is anyone going to any of the mentioned events?

I may be going to ESWC and/or DL, which is just after ESWC

Arnaud: Americans don't want to go to Europe
... organizations don't want to pay

<Dimitris> *I'll be in a project meeting 25/5-29/5*

I think that Nuance could host in Montreal or Waterloo just about any time over the summer

Arthur: IBM could host in Toronto

cygri: I don't know whether TQ can host in Raleigh

Arnaud: F2F in May in NA?
... A European meeting could be the next one (September?)

<Dimitris> *I will not be available in the last week of May*

Arnaud: May 19-21 in Toronto?

Ted can't make that week or early the next week

Arnaud: May 27-29 in NA?

<cygri> ACTION: cygri to check TQ F2F availability on May 19-21 or 27-29 [recorded in http://www.w3.org/2015/02/19-shapes-minutes.html#action01]

<trackbot> Created ACTION-13 - Check tq f2f availability on may 19-21 or 27-29 [on Richard Cyganiak - due 2015-02-26].

Arnaud: participants should determine their availability for these two weeks; potential hosts should determine whether they can host

Requirements

Arnaud: I worked on the requirements page to make it match the discussion of two days ago
... Property Datatype?

pfps: still waiting for semantic media wiki

karen: what's the change?

<Dimitris> *make a sub-heading*

pfps: it's whether int is an integer or not

<scribe> ACTION: pfps fix up Property Datatype in a way that doesn't confuse the wiki [recorded in http://www.w3.org/2015/02/19-shapes-minutes.html#action02]

<trackbot> Created ACTION-14 - Fix up property datatype in a way that doesn't confuse the wiki [on Peter Patel-Schneider - due 2015-02-26].

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Datatype_Property_Facets

Arnaud: Datatype Property Facets

Arthur: facets are things like the maximum length of a string

Arnaud: there is little support for this in the wiki

<Dimitris> I should have voted as well

<SimonSteyskal> isn't this just defining a shape for this property?

pfps: voting on the spot makes it hard to respond

eric: my objection is that I want to consider different facets separately
... if this is just the notion of facets, then sure

harold: is this strictly for forms?

arthur: no, these are validation-related

arnaud: If people want this they should vote or make additions

labra: how about the XML Schema facets?

<cygri> all mentioned examples can easily be written as a SPARQL expression (like used in FILTER)

pfps: OWL has vocabulary for many of the XML Schema facets

richard: all the mentioned ones are SPARQL filters

<ArthurRyman> the OSLC submission contains a proposal for facets: http://www.w3.org/Submission/2014/SUBM-shapes-20140211/#datatype-facets

<SimonSteyskal> we have quite a few use cases which require this feature so..

Arnaud: we should have been following the approved process, i.e., only look at ones that are under consideration

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Expressivity:_Aggregations

<cygri> I believe all XSD facets except whitespace can be handled by vanilla SPARQL expressions.
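
As an illustration of the facet discussion (not something shown at the meeting), here is a minimal Python sketch of a few facets written as simple checks on a literal's lexical form. The facet names follow XML Schema; the `FACETS` table and the `violated_facets` helper are hypothetical. Each comment gives a SPARQL 1.1 expression for the same test, in the spirit of the remark that these can be written as FILTER expressions.

```python
import re

# Hypothetical table: facet name -> function building a check for that facet.
# The trailing comments show a SPARQL 1.1 expression for the same test.
FACETS = {
    "minLength": lambda n: lambda v: len(v) >= n,       # STRLEN(STR(?v)) >= n
    "maxLength": lambda n: lambda v: len(v) <= n,       # STRLEN(STR(?v)) <= n
    # Unanchored match, like SPARQL REGEX; regex dialect differences are ignored.
    "pattern": lambda p: lambda v: re.search(p, v) is not None,  # REGEX(STR(?v), p)
    "minInclusive": lambda n: lambda v: float(v) >= n,  # ?v >= n
    "maxInclusive": lambda n: lambda v: float(v) <= n,  # ?v <= n
}


def violated_facets(value, facets):
    """Return the names of the facets that the lexical form `value` fails."""
    return [name for name, arg in facets.items()
            if not FACETS[name](arg)(value)]
```

For example, `violated_facets("abcdefghij", {"maxLength": 5})` reports `maxLength` as the only failed facet.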

Arnaud: Expressivity: Aggregations

<Dimitris> *ericP, jimkont*

this has 2.5 votes

<cygri> I like it

Arnaud: there are objections - is anyone arguing for it?

Richard: aggregations can be useful, but there is the consideration of expense

<Zakim> ericP, you wanted to ask if our requirements are for the core vocabulary or extensions

Eric: some of these can be solved as SPARQL so are these things that have to be done outside of directly using SPARQL?
... let's also vote whether we need vocabulary for these

Richard: I'm not sure that we have resolved that SPARQL is not needed

<labra> +q

Richard: i.e., is there a profile that does not include SPARQL?
... the core vocabulary is only for convenience

pfps: Is SPARQL the extension language?

<cygri> I’d like SPARQL to be *the* language, and then there’s syntactic sugar.

<ericP> pfps: i haven't heard that the division is on the core vocab vs. SPARQL extensions

<ericP> ... lots of talk about SPARQL being necessary, good, or not evil

<ericP> ... but nothing saying that there's a core and SPARQL which is the extension

Arnaud: there has been as of yet no division into a core language and SPARQL as the extension language

jose: we shouldn't mix the high-level language and SPARQL

<Zakim> ericP, you wanted to say that i'd still like to vote on what's in the core vocabulary

eric: we are defining a vocabulary and functionality of it - is voting for minlength voting for this vocabulary and functionality in the core?

richard: eric moved SPARQL stuff out of the primer which makes me think that the ability to do arbitrary SPARQL queries is not in the core language

<Arnaud> http://www.w3.org/2014/data-shapes/charter#deliverables

richard: taking SPARQL out of the core changes a lot of thinking about expressivity

<ericP> a litmus for whether SPARQL is part of the core is whether someone can have a conformant implementation without a SPARQL engine

Arnaud: the charter calls out some expressivity plus some extensibility
... that is the minimum, we could do more

eric: are we clear on whether voting is for inclusion in the core? without this I'm unclear as to how to vote

richard: in LDOM there are templates, so all the core stuff can be a macro library

eric: does it matter to you what is the core language?

richard: not much, because I can just write a macro

Arnaud: is there a requirement for an extension mechanism?
... what would an extension mechanism be? should we just have a procedural call-out? should we specify a particular mechanism (like SPARQL)?
... in HTML there is a standard mechanism to attach a style sheet, but it doesn't have to be CSS; similarly for scripting, which doesn't have to be javascript

<hsolbrig> +q

harold: it may be hard to test such extension mechanisms

eric: can we state which requirements are in the core?

richard: there is a requirement for macros, but do macros look like the other parts of the language?
... if macros don't look like the core then it becomes more important to determine the boundaries of the core

eric: the namespace would be different

richard: is this the only difference?

<Dimitris> *Thank you EricP*

richard: or is invoking macros syntactically difficult?

eric: even if there is little difference the division is needed

richard: if the macro facility is powerful then the core can be empty
... it is more important to get the macro facility set up
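
To make the "core as a macro library" idea concrete, here is a small Python sketch, offered as an illustration and not as LDOM's actual template mechanism. It assumes each core constraint is just a named template that expands to a SPARQL filter expression over a variable `?value`; the names `MACROS`, `expand` and `expand_all` are invented for this example.

```python
from string import Template

# Hypothetical macro library: each "core" constraint is only a named template
# that expands to a SPARQL filter expression over the variable ?value.
MACROS = {
    "minLength": Template("STRLEN(STR(?value)) >= $arg"),
    "maxLength": Template("STRLEN(STR(?value)) <= $arg"),
    "minInclusive": Template("?value >= $arg"),
}


def expand(name, arg):
    """Expand one macro call into the SPARQL expression it abbreviates."""
    return MACROS[name].substitute(arg=arg)


def expand_all(calls):
    """Combine several macro calls into a single FILTER clause."""
    return "FILTER (" + " && ".join(expand(n, a) for n, a in calls) + ")"
```

Under this reading, adding a new constraint type means adding one entry to the table, which is why the boundary of the core matters less when the macro facility is powerful.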

<ericP> pfps: in latex, there's no significant difference between macros and core language facilities

pfps: reiterate richard's comment

<ericP> ... so if the macro facility is like that, then who cares?

iovka: I want to be able to perform analysis on the core language

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Separation_of_structural_from_complex_constraints

Arnaud: there is supposed to be a separation between structural and complex constraints

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Expressivity:_Closed_Shapes

Arnaud: Closed Shapes

eric: a use case is determining whether all triples in a graph can be moved over into something that can't handle arbitrary information

eric - can you add that to the requirement description?

eric: people also just want to say "that's all folks!"
... closed shapes can also mediate into applications

Arthur: one use case for shapes was to advertise what can be generated
... if there is other stuff then a server could throw away what doesn't match
... closed shapes appears to be different as it signals a violation

eric: but closed shapes is needed to make the determination
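
The difference between the two behaviours in this exchange can be sketched in Python as follows. This is one possible reading, not an agreed definition: a shape is reduced to the set of properties it mentions, a graph is a list of (subject, predicate, object) tuples, and all function names are hypothetical.

```python
def unexpected_triples(triples, node, shape_properties):
    """Triples with `node` as subject whose predicate the shape does not mention."""
    return [(s, p, o) for (s, p, o) in triples
            if s == node and p not in shape_properties]


def check_closed(triples, node, shape_properties):
    """Closed reading: any unmentioned predicate on the node is a violation."""
    return not unexpected_triples(triples, node, shape_properties)


def keep_known(triples, node, shape_properties):
    """Open reading: keep what the shape mentions and drop the rest without failing."""
    return [(s, p, o) for (s, p, o) in triples
            if s != node or p in shape_properties]
```

Both functions rely on the same computation of the unexpected triples; they differ only in whether that result is reported as a violation or silently discarded.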

<ArthurRyman> an OSLC Shape serves to document the "known" or "expected" content - this is content that the service will do something useful with

richard: shape coverage is a related notion, and might be what eric needs

<ArthurRyman> an OSLC service should not reject requests that contain unknown content, but it should provide an informative message in the HTTP response

<labra> +q

ted: shapes are all about limiting what is said, so the general "say anything about anything" mantra does not apply
... servers should feel free to reject stuff that they don't want

<ericP> pfps: my objection was the philosophical one

<ericP> ... but the other is that someone has to come up with a definition for this

pfps: someone has to come up with a definition for this

<ericP> ... i don't understand what it's supposed to be or do, so i can't spec this

jose: i have a definition in the glossary
... it should not be required for shapes to be closed

<TallTed> can links to glossary definitions be included in UC/Reqs/etc? we have tools that handle links. treating their content as literals is unfortunate.

<Zakim> ericP, you wanted to say that use cases for mixed closed/and open shapes appear to be rare

eric: use cases for mixed closed/and open shapes appear to be rare

jose's definition of a closed shape is something completely different from this requirement

<cygri> “closed shape =

jose's definition of a closed shape is one that lives happily with open shapes

iovka: initially all shapes were closed

iovka's definition of a closed shape is like jose's

iovka: closed shapes are about the neighbourhood

pfps: there are two very different definitions of closed shapes

<iovka_> the paper I was talking about http://www.grappa.univ-lille3.fr/~staworko/papers/staworko-icdt15a.pdf

arthur: closed shapes considered harmful because services and clients change

<cygri> +1 to arthur

arthur: the open content model is a very desirable ideal
... there needs to be movement on this before it can be approved

richard: how can I put forward the notion of coverage?

<cygri> ok, thanks

Arthur: if there is a story to support it, then start with the story; else start with a story

jose: the portals user story might support coverage

<cygri> thanks, labra

Arnaud: 10 minute break

<labra> labra: What I said is that a user story could be the "validating and describing" linked data portals

<labra> labra: where you want to describe the contents of a data portal and you want to say that you only allow a fixed number of triples

<labra> labra: Previously, I had said that in my opinion, closed shapes should be a construct of the language, but they could be optional...by default we can have open shapes, and have some construct to define closed shapes

<cygri> ok

<cygri> scribe: cygri

Requirements cont’d

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Severity_Levels

Arnaud: Next: Severity levels

… Objections from ericP and labra

ericP: We can do without.

kcoyle: The intention is that one non-match could be a fatal error and another non-match could just be informational
... Matching is not just yes/no; I’d like to have different severities of no
... A server might just want to reply: if you changed this, then I could do more with your data
... Anything that you’re testing for in your shape, you want to attach a level

Arnaud: The question is, what’s the granularity

ericP: So do you want to attach it to the property, or to a particular facet on that property such as cardinality or string length?

<pfps> If you care about the distinction you can just have several shapes.

kcoyle: We want to return a message with a failure. That’s the needed granularity.

ArthurRyman: Attach comments/messages/severity to constraints
... deprecation as example; tolerate people using it, but want to include a message that says, use this other one

pfps: If you want part of the shape to be warning and other very fatal, create two shapes, and tag one as this and the other as that

Arnaud: So that would mean, granularity is the shape

ArthurRyman: There can be cases where you have “growing” shapes, there’s something strictly required, then more stuff that’s just gloss

Arnaud: If you reject this requirement, you don’t have a way of doing it at all

ericP: Well we wouldn’t get interop on severity levels, it can still be done outside the spec

labra: My objection was to the way the requirement is written. It could be done as metainformation on the shape level.
... But the validator should still be yes or no
... I wouldn’t object to put this as metainformation on the shape, as structured information

ericP: I’d be happy with that. My concern is only that doing it at facet granularity would require reifying the facets.

labra: If there’s a way to add metainformation, anyone could define their levels

Dimitris: We do this in RDFUnit. We have severity levels on shape granularity and property granularity and you can select a level at execution time. Constraints at lower levels are skipped in validation.

ericP: In C terms, this is like IFDEF?

Dimitris: Yes.
... Property granularity is useful
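
The selection of a severity level at execution time can be sketched as below. This is an illustration under stated assumptions, not RDFUnit's API: the `Severity` levels, the `CONSTRAINTS` table and the `validate` function are all hypothetical, and a node is simplified to a dictionary from property names to lists of values.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    ERROR = 3


# Hypothetical constraints: (property, severity, check on the list of values).
CONSTRAINTS = [
    ("dc:title", Severity.ERROR, lambda values: len(values) >= 1),
    ("dc:description", Severity.WARNING, lambda values: len(values) >= 1),
    ("ex:oldStatus", Severity.INFO, lambda values: len(values) == 0),  # deprecated
]


def validate(node, min_level):
    """Run only constraints at or above `min_level`; lower ones are skipped."""
    results = []
    for prop, severity, check in CONSTRAINTS:
        if severity < min_level:
            continue  # skipped, much like code excluded by an #ifdef
        if not check(node.get(prop, [])):
            results.append((prop, severity.name))
    return results
```

Attaching the severity to each property constraint, as here, corresponds to property granularity; attaching it to a whole shape would mean one level for the entire table.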

ericP: I could withdraw my objection. Not sure yet what the impact on complexity is.

Arnaud: There are different possible approaches. Let’s not get hung up on these details yet.

labra: I don’t object then.

kcoyle: I’m rewording the requirement.

<kcoyle> The language should allow the creation of error responses that can include severity levels as desired

pfps: This wording pulls in the next requirement, about human-readable error messages

<Arnaud> PROPOSED: Change description to: The language should allow the creation of error responses that can include severity levels as desired

<pfps> +1

<iovka> +1

<ArthurRyman> +1

<SimonSteyskal> +1

<Dimitris> +1

<TallTed> +1

<labra> +0

<kcoyle> +1

<ericP> +0

<hsolbrig> +1

RESOLUTION: Change description to: The language should allow the creation of error responses that can include severity levels as desired

<ArthurRyman> +1

s/change description to:/change description of “Severity Levels” to:/

<Arnaud> PROPOSED: Approve 2.10.1 Severity Levels

RESOLUTION: Approve 2.10.1 Severity Levels (hearing no objection)

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Human-readable_Violation_Messages

subtopic: Requirement “Human-readable Violation Messages”

Arnaud: Objections from labra and ericP

<kcoyle> +q

ericP: This comes down to granularity again. This requires ability to add custom error messages at every facet.
... So we’d need one text for violating minCardinality and another text for violating maxCardinality. This seems complex.
... And if we simplify it by only allowing it at the property label, then we could solve it just by giving a label/comment to the property.

<Dimitris> +q

ericP: To make this practical, you’d have to be able to provide custom error messages at the facet level of the property.

TallTed: If your shape is at the facet level, that’s where your message goes. If your shape is at the property level, that’s where your message goes.

ericP: What would be a useful error message at a level higher than facet?

kcoyle: This and the previous requirement go together. Indicate severity, and indicate message.
... We have to discuss granularity first.

Dimitris: In RDFUnit we already have this, with separate messages at the facet level.
... It was not hard to implement.

<ericP> cygri: for clarification, these error messages are generated by the system?

<ericP> ... in RDFUnit, the author can't customize the message, correct?

<ericP> Dimitris: correct

<ericP> cygri: kcoyle, are you content with system-generated?

<ericP> kcoyle: i think they have to be customized

<TallTed> anywhere I may want a human-readable error message, I should also be able to assign an error level... and vice versa.

<TallTed> both should support multiple langs.

<TallTed> some built-in tests (e.g., max length) might have default messages in system -- but person defining shape should be able to provide custom error therein.

Dimitris: If the types of constraints are hardcoded, why are customised messages needed?

Arnaud: In the case of SPIN and LDOM, you have variables and so on, so it’s easier to customise

ArthurRyman: In the proposals where atomic constraints are not addressable, it’s hard to attach custom messages, you’d have to reify
... But the system could generate good messages

labra: Separate concerns. Validator should generate a data structure. Turning that into something human-readable is a later step and out of scope.
... A report or data structure.

kcoyle: It needs to be specific to the test. Different shapes need different messages. As long as that can be done, it’s sufficient.

Arnaud: TallTed made the point that granularity for severity and messages should be the same
... It’s useful if the system doesn’t just tell you, “You failed”, but “You failed because you have ten characters instead of five”. So the message must be dynamically assembled at runtime. This is more complicated, variable substitution, etc.
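
A rough Python sketch of how these points could fit together: the validator emits a structured record, and a separate step assembles the message at runtime by variable substitution. Everything named here (`Violation`, `MESSAGES`, `render`, the template texts) is hypothetical and only illustrates the idea; it is not a proposal from the meeting.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Violation:
    """Structured result produced by the validator; no prose in here."""
    node: str
    prop: str
    facet: str
    expected: int
    actual: int


# Hypothetical per-facet templates, keyed by facet name and language tag.
MESSAGES = {
    ("maxLength", "en"): Template(
        "$prop of $node has $actual characters instead of at most $expected"),
    ("maxLength", "it"): Template(
        "$prop di $node ha $actual caratteri invece di un massimo di $expected"),
}


def render(v, lang="en"):
    """Later step: turn a structured violation into human-readable text."""
    template = MESSAGES.get((v.facet, lang))
    if template is None:
        # System-generated fallback when no custom message was provided.
        return f"{v.node}: {v.facet} violated on {v.prop}"
    return template.substitute(vars(v))
```

Keying the templates by facet is what facet-level granularity would require; keying them by property or by shape would be the coarser alternatives discussed above.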

kcoyle: We need to get an idea of the granularity issue.

ericP: What would be practical messages that we’d like to customize?

kcoyle: Gather examples of actual errors people would like to report.

<ericP> cygri: my impl of R2RML does complex validation of R2RML mappings

<ericP> ... it uses a lot of custom error messages

<ericP> ... i may be able to extract some examples of the sorts of error messages we'd like to automatically generate

<Dimitris> *cygri, can you try RDFUnit on an R2RML doc?, we did some further work there*

ericP: I’d like to see examples and that might change my vote.

<Dimitris> +q

labra: Split this into two requirements? 1. Data structure for violation reports, 2. Human-readable messages generated from them?

Arnaud: Let’s move on

<Arnaud> https://www.w3.org/2014/data-shapes/wiki/Requirements#Evaluating_Constraints_for_a_Single_Node_Only

Subtopic: Requirement “Evaluating Constraints for a Single Node Only”

kcoyle: I don’t understand why this is needed.

labra: The intention was that you select a single node to apply a shape to.

ericP: This is already done with oslc:instanceShape

Arnaud: We haven’t discussed the processing model much.

labra: We have more requirements in the “Selection of nodes” part
... This here is the same as “Selection by single node”

kcoyle: So drop this one here?

[discussion about what Holger might have meant]

labra: Can we approve 2.12.3 then?

<TallTed> TallTed: 2.12.x seem a better approach than 2.11.8, as written. 2.12.x may not be complete -- may not cover all necessary axes of selection -- yet.

kcoyle: So does this mean you don’t go beyond the first arc?

labra: Depends on how your shape is defined.

ericP: Might suck in the whole graph.

<ericP> cygri: when we talk about "selecting nodes", does that mean we want to trigger validation by something that's in the graph or that we want to trigger validation by an [external] API call?

<ericP> ... choice 1: we have a graph and the shapes processor looks at triples in that graph to invoke validation

<pfps> The control mechanisms have not been discussed very much at all, particularly the possibility of "in-band" control (i.e., the data graph itself includes validation control).

<ericP> ... choice 2: validation begins with a node supplied as an argument to the validator

<ericP> ... does the author of the graph say "i want shape x to apply to y" or is it that the person invoking validation says "i want to test node y against shape x"?

<ericP> TallTed: i may have either

<ericP> cygri: instanceShape is a way to trigger within the graph

<ericP> ... but you say you'd like an external mechanism. yes?

<ericP> TallTed: if we understand each other?...

<ericP> pfps: are we going to allow in-band control or out-of-band control?

pfps: We’re exposing a fault line. Do we allow in-band control or out-of-band control?
... in-band control is you include stuff in the data graph to trigger validation
... out-of-band control means the validator has a separate source of control information, or it’s controlled through API
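
The two control styles can be contrasted with a short Python sketch. It is a simplification for illustration only: a graph is a list of (subject, predicate, object) tuples, a shape is reduced to a set of required properties, and `SHAPES`, `validate_node` and `validate_in_band` are hypothetical names. Only `oslc:instanceShape` comes from the discussion.

```python
INSTANCE_SHAPE = "oslc:instanceShape"

# Hypothetical shapes: shape name -> set of required properties.
SHAPES = {"ex:BugShape": {"dc:title"}}


def validate_node(graph, node, shape):
    """Out-of-band control: the caller names both the node and the shape."""
    present = {p for (s, p, o) in graph if s == node}
    return SHAPES[shape] <= present


def validate_in_band(graph):
    """In-band control: the data graph itself says which shape applies to which node."""
    return {s: validate_node(graph, s, o)
            for (s, p, o) in graph if p == INSTANCE_SHAPE}
```

In this sketch the in-band entry point only discovers (node, shape) pairs and then delegates to the out-of-band one, so supporting both would not require two separate validation procedures.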

ArthurRyman: A service description document can mention a shape, and submitting something to the service triggers validation
... The other case is when you do a GET. The server may state with instanceShape that here’s a shape that, if you were to validate the response, it would validate against this shape.
... That’s how it works in OSLC, start with a single start node. You could do other ways like applying constraints to all nodes in a graph

Arnaud: We will have to get to the bottom of this, but not right now.

Adjourning

<iovka> -iovka

Arnaud: Thanks all, next meeting is next week
... We made good progress, no blood on the wall, that’s success!

pfps: No blood on the wall where you guys are!

<Arnaud> trackbot, end meeting
