SPARQL WG dedicated meeting on formal update model -- 30 Jul 2010

<LeeF> thanks, zakim

zakim, [IPCaller] is me

<LeeF> Scribenick: AndyS

LeeF: it's Friday ...
... intent: find the approach for a formal model of update
... goal - leave meeting with material for Paul and Alex (or other volunteer) to craft text

<LeeF> Lee: how does query's model work?

<LeeF> AndyS: you have a string in your hand - what is the right answer to come out?

<LeeF> ... 1. parse the syntax into an abstract syntax tree

<LeeF> ... (expands out syntax shortcuts, etc.)

<LeeF> ... 2. run a fixed algorithm to turn the abstract syntax into the algebra

<LeeF> ... 3. evaluate the algebra

<LeeF> ... somewhat repetitive - close relationship between the terms in the algebra and their evaluation
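
[Illustration, not from the meeting: a minimal sketch of the three-stage shape described above (parse, translate to algebra, evaluate). The toy pattern syntax and the names parse, to_algebra, match and evaluate are assumptions made for this example; this is not the SPARQL grammar or algebra.]

```python
# Toy three-stage pipeline: string -> abstract syntax -> algebra -> solutions.
# Everything here is hypothetical and far simpler than SPARQL itself.

def parse(query):
    """Stage 1: split '.'-separated triple patterns into a list of 3-tuples."""
    patterns = []
    for part in query.split("."):
        terms = part.split()
        if terms:
            if len(terms) != 3:
                raise ValueError("expected three terms: " + part)
            patterns.append(tuple(terms))
    return patterns

def to_algebra(ast):
    """Stage 2: a fixed translation from abstract syntax to an algebra term."""
    return ("BGP", ast)

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def evaluate(expr, graph):
    """Stage 3: evaluate the algebra term over a set of triples."""
    op, patterns = expr
    if op != "BGP":
        raise ValueError("unknown operator: " + op)
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in graph:
                b = match(pattern, triple, solution)
                if b is not None:
                    extended.append(b)
        solutions = extended
    return solutions
```

[Each algebra operator gets its own evaluation rule, which is why the definitions read as "somewhat repetitive".]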

LeeF - approaches include model theory, procedural style in SPARQL, other

scribe: suggestion: take the procedural style for SPARQL Update.

<AxelPolleres> in principle, sounds fine, depends on how we define "procedural"

<AxelPolleres> +1 to update mapping from one set/state to another

Paul: updates are a mapping from GS to GS, with variations of this (GS-1,2,3)
... implementations follow the data structure style
... there is potential disconnect there

LeeF: experience, prefer functional style

Input - http://www.w3.org/2009/sparql/wiki/Lees_Update_Graph_Model

<AxelPolleres> have put some draft of what I think, which apart from the redefinition of Dataset should work in my mail http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JulSep/0126.html I think Andy had some similar proposals in another mail... can't find it at the moment.

Input - http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JulSep/0127.html

Paul: Mulgara is immutable structures for transactionality (single writer)

<AxelPolleres> Andy, I think you had something more concrete on the semantics of various update operations in some earlier mail, can't find it at the moment, maybe I misremember

LeeF: Spec helps for test cases - links formal defns to practical implementation

/me could you find it? Was it http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JulSep/0125.html [scribing...]

LeeF: is an update operation from GS to GS?

<AxelPolleres> I thought "Graphstore" could also use the definition of dataset, meaning that a graphstore

<AxelPolleres> is defined by a sequence of datasets DS_0 to DS_n determined by a sequence of

<AxelPolleres> updates: Starting with DS_0 being the empty or initial dataset,

<AxelPolleres> the "current dataset" after the n-th update is simply DS_n.

Sandro: what about GS-state to GS-state?

<LeeF> service <-> graphstore <-> state1 -> state2 -> state3 etc.

Axel: key is define state DS(n-1) -> DS(n)

LeeF: what is the state of a GS? is it a DS?

<AxelPolleres> Axel: thinks that the state of a graph store should be representable as an RDF dataset

<LeeF> Is a GS-state the same as an RDF dataset? a pair of (graph, Set of Pairs of (Name, Graph))

<AlexPassant> "state of a GS is a DS at a particular time" might be better

<AxelPolleres> alex, works for me.

<AlexPassant> (graph, set of pairs, T)

<AlexPassant> so we have (graph, set of pairs, Ta) -> operation -> (graph, set of pairs, Tb)

<AxelPolleres> I used T_n T_n+1 to indicate that I mean that to be atomic, i.e time here being meant discrete, nothing happening in between

LeeF: do we agree that the state of a GS is a DS or similar?

<pgearon> I'm happy with that definition

<AlexPassant> agreed with that AxelPolleres (re. atomicity)

<sandro> +1 this meeting is gathering advice for editors, not constraints for them.

LeeF: What is the concept of the time here?

Alex: discrete time, point-wise transitions to new state

<LeeF> GS - graph store

<LeeF> GSS - graph store state

<LeeF> DS - rdf dataset

<LeeF> Op - Update operation

<LeeF> Req - Update request

<AxelPolleres> rather, a graphstore is a "state machine" that moves from one state to another by update operations.

<LeeF> Op - a function from GSS -> GSS

That is the service that is the state machine? Or is that the same?

<AxelPolleres> so, op is the state transition function, whose semantics we have to define in terms of pre and post state

<LeeF> Req is Op1, Op2, Op3 - GSS -> GSS - Op3(Op2(Op1(GSS0)))

<pgearon> +1

<sandro> +1

<AxelPolleres> looks ok

<AlexPassant> +1

(it says to me the Req is atomic ++)
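
[Illustration, not from the meeting: a minimal sketch of "Op is a function from GSS to GSS" and "Req is Op3(Op2(Op1(GSS0)))". The tuple layout (default graph, named graphs, T) follows the discussion above; make_state, insert_default and apply_request are hypothetical names.]

```python
from functools import reduce

# A graph store state (GSS) as an immutable triple:
#   (default_graph, named_graphs, t)
# default_graph is a frozenset of triples; named_graphs is a frozenset of
# (name, graph) pairs; t is a discrete time step.

def make_state(default=(), named=(), t=0):
    return (frozenset(default), frozenset(named), t)

def insert_default(triple):
    """Hypothetical Op: add one triple to the default graph."""
    def op(state):
        default, named, t = state
        return (default | {triple}, named, t + 1)
    return op

def apply_request(ops, state):
    """Req = Op1, Op2, Op3 applied left to right: Op3(Op2(Op1(state)))."""
    return reduce(lambda s, op: op(s), ops, state)
```

[Because every Op returns a new state and never mutates its input, the state before the request stays available, which is one way to read the remark that a Req looks atomic.]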

<LeeF> LOAD - http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#t417

<LeeF> LOAD <documentURI> [ INTO GRAPH <uri> ]

<AxelPolleres> LOAD <documentURI> [ INTO GRAPH <gnew> ]

<AxelPolleres> this means

<AxelPolleres> DS_{t+1} = DS_t union {(<gnew>_t, G_<gnew>)}

<AxelPolleres> something like that...

<SteveH> why <gnew>_t ?

DS1 is DS0 with G replaced by GET <uri>

<AxelPolleres> DS_{t+1} = DS_t union {(<gnew>, G_<gnew>)}

scribe: notation for graph in dataset? DS[g]?

<AxelPolleres> DS_{t+1} = DS_t union {(<gnew>, G_<documentURI>)}

DS_{t+1} = DS_t \ {(<v>, old)} union {(<gnew>, G_<gnew>)}

<AxelPolleres> if graph is already there...

<AxelPolleres> LOAD <documentURI> [ INTO GRAPH <g> ]

<AxelPolleres> means

just take the old one out regardless with \ {(<g>, any)}

<SteveH> +1

<AxelPolleres> DS_t { .... (<g>, G) ... }

<SteveH> some people have triplestore, not quadstores

<AxelPolleres> DS_t+1 = { (<g>, G merge G_<documentURI>) }

<LeeF> GRAPH ?g { } # test if graph exists

<AxelPolleres> or alike
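
[Illustration, not from the meeting: a minimal sketch of the two LOAD readings discussed above, "replace the old graph" versus "merge into it". Assumptions: the document has already been fetched and parsed into a set of triples (doc_graph); plain set union stands in for RDF merge, so blank nodes are ignored; omitting the graph name targets the default graph; the replace flag is hypothetical and only selects between the two readings.]

```python
# State layout assumed here: (default_graph, named_graphs, t), where
# named_graphs is a dict from graph name to frozenset of triples.

def load(state, doc_graph, name=None, replace=False):
    """Return the state after loading doc_graph; the input state is untouched."""
    default, named, t = state
    doc_graph = frozenset(doc_graph)
    if name is None:
        # No INTO GRAPH clause: add the triples to the default graph.
        return (default | doc_graph, named, t + 1)
    old = named.get(name, frozenset())
    # replace=True : DS_t \ {(<g>, any)} union {(<g>, G_doc)}
    # replace=False: (<g>, G merge G_doc), with union standing in for merge
    new_graph = doc_graph if replace else old | doc_graph
    new_named = dict(named)
    new_named[name] = new_graph
    return (default, new_named, t + 1)
```

[If <g> is not yet present, both readings give the same result, since the old graph is treated as empty.]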

<LeeF> Sandro: is the difference between pairs and quads the existence of empty graphs?

<LeeF> AndyS: yes

<LeeF> sandro: is this observable?

<LeeF> AndyS: yes

CREATE GRAPH <uri> should cause GRAPH ?g {} to show <uri>

<SteveH> AndyS, could, not should

<LeeF> sandro: a no-op for CREATE will pass whatever test we have since implementations are allowed to prune empty graphs, so we should be able to define the semantics using quads

<SteveH> I agree with sandro, about defining in terms of quads would work, but I'm not sure it's easier

<AxelPolleres> do we need to declare the behaviour of operations for both graph-aware and non-graph-aware stores?

Quads are an implementation technique we should respect but the defn should be graphs as RDF is about graphs.

Quads have problems with the default graph.

<LeeF> sandro: do we have any way in the test suite setup of saying that either of 2 results would be ok?

<LeeF> AndyS: we have 1 case of that right now in query

<LeeF> AndyS: there are 2 different tests to capture both possibilities

<LeeF> AndyS: we could divide places where they are different into 2 sets and implementers choose which set to run

<SteveH> +1 to not writing tests where they're different :)

<LeeF> ... other option is not to write the tests where they are different

<Zakim> AxelPolleres, you wanted to ask/talk about update test cases

Another matter for quads is the lack of std quads format.

LeeF: could have two tests, one with pruning, one without

Axel: tests defined in state-before, and state-after and mark tests "for graph aware GS"

<AxelPolleres> SELECT ?S ?P ?O WHERE { ?S ?P ?O OPTIONAL { GRAPH ?G {?S ?P ?O } } } would be an (ugly) way to query the graphstore state, if we assume graphstore tied to default dataset

<SteveH> ...what about if we said GRAPH ?g { } always returned nothing?

<iv_an_ru> +1

<LeeF> SteveH, that would be a nonstarter for me, I believe

<SteveH> ok

<LeeF> CREATE GRAPH G --> if G exists, error or nothing ; else if GSS0 = (UG0, NGS0, T0) then CREATE(GSS0, G) = (UG0, NGS0 union (G, {}), T0+1)
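
[Illustration, not from the meeting: a minimal sketch of the CREATE definition above, CREATE(GSS0, G) = (UG0, NGS0 union (G, {}), T0+1). The silent flag is a hypothetical way to choose between the "error" and "nothing" outcomes when G already exists; the meeting left that choice open.]

```python
class GraphExists(Exception):
    """Raised when CREATE targets a graph that is already in the store."""

def create_graph(state, name, silent=False):
    """state is (default_graph, named_graphs, t); named_graphs maps name -> graph."""
    default, named, t = state
    if name in named:
        if silent:
            return state           # "nothing": the state is unchanged
        raise GraphExists(name)    # "error"
    new_named = dict(named)
    new_named[name] = frozenset()  # the new graph starts empty
    return (default, new_named, t + 1)
```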

<AxelPolleres> in test cases we might need to distinguish the behavior of graph-aware and non-graph-aware update-endpoints

Confusing if GRAPH ?g { } = nothing but GRAPH ?G {?S ?P ?O } something

<SteveH> not really, but doesn't seem to be a popular idea

<iv_an_ru> AxelPolleres, agree re. tests.

<LeeF> AndyS: the way to do this is to define the operations with empty graphs, and then note that implicit pruning may go on
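
[Illustration, not from the meeting: a minimal sketch of "define the operations with empty graphs, then note that implicit pruning may go on". The names create_graph, prune and graph_names are hypothetical; graph_names stands in for the GRAPH ?g { } probe mentioned earlier.]

```python
# named_graphs is modelled as a dict from graph name to a frozenset of triples.

def create_graph(named, name):
    """Defined with empty graphs: the new name maps to an empty graph."""
    new_named = dict(named)
    new_named.setdefault(name, frozenset())
    return new_named

def prune(named):
    """What a pruning store may do implicitly: drop every empty named graph."""
    return {name: graph for name, graph in named.items() if graph}

def graph_names(named):
    """Models GRAPH ?g { }: the names of all graphs present in the store."""
    return set(named)
```

[In this model, a store that keeps empty graphs answers {"g"} after creating g, while a pruning store answers the empty set; graphs that contain triples are reported identically by both.]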

<SteveH> I don't like the term "graph aware" FWIW, I think it's confusing

Axel: section on "graph aware" and "graph pruning" stores

<SteveH> or two sets of results, either of which is ok

(better to mention graph graphs)

sandro: concern of interoperability in the market

LeeF: looks like there is some work to be done but general direction is appearing

<sandro> sandro: let's not formalize two classes of sparql end points (graph aware and non-graph aware), eg in the test suite, since doing so would fragment the market.

pattern -> solution sequence -> instantiate a set of triples for each GRAPH mentioned (and dft graph) -> make changes

<pgearon> yes

<AlexPassant> sounds ok for me
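
[Illustration, not from the meeting: a minimal sketch of the pipeline "pattern -> solution sequence -> instantiate a set of triples per graph -> make changes". The solution sequence is taken as given (a list of variable bindings); instantiate and apply_insert are hypothetical names, and None is used as the key for the default graph.]

```python
def instantiate(template, solutions):
    """template: list of (graph_name, (s, p, o)); returns graph_name -> set of triples."""
    out = {}
    for binding in solutions:
        for graph_name, pattern in template:
            triple = tuple(binding.get(term, term) for term in pattern)
            if any(term.startswith("?") for term in triple):
                continue  # a variable stayed unbound: skip this triple
            out.setdefault(graph_name, set()).add(triple)
    return out

def apply_insert(store, additions):
    """Return a new store with the instantiated triples added to each graph."""
    new_store = {name: set(graph) for name, graph in store.items()}
    for name, triples in additions.items():
        new_store.setdefault(name, set()).update(triples)
    return new_store
```

[A delete would follow the same shape, removing the instantiated triples from each graph instead of adding them.]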

<pgearon> missing considerations is my only concern at this point, but I can't think of any at the moment

<LeeF> AndyS, thanks for scribing

<pgearon> OK, thank you

<SteveH> bye

<AlexPassant> RRSAgent: create minutes
