Telecon 23.10.2015

Participants

Daniele Dell'Aglio
Robin Keskisärkkä
Bernhard Ortner
Jean-Paul Calbimonte (Chair)
Alejandro Llaves
Minh Dao Tran
Tara Athan

Agenda

Definition of window functions
Multiple triples in metadata of timestamped graph
Examples of streams in the doc
Timeline for this work

Resources

Data Model Document: https://github.com/streamreasoning/RSP-QL/blob/master/Semantics.md
Serialisation Document: https://github.com/streamreasoning/RSP-QL/blob/master/Serialization.md

Minutes

Jean-Paul: https://github.com/streamreasoning/RSP-QL/blob/master/Semantics.md

Tara Athan: I am having audio problems - I would like to point out that the example uses CURIEs where the prefixes have not been defined. Also, it would be good to specify what syntax is being used - is this JSON-LD or something else?

Tara Athan: I think it is fine to use prefixes, but it is just necessary to mention what is intended.

Jean-Paul: i agree

Tara Athan: I would like to see the definitions of stream, substream and window finalized before addressing the details of the window function, because there is a dependence.

Tara Athan: The current definition of stream says nothing about order.

Daniele (PoliMi): yes

Tara Athan: The definition of stream should, I believe, include a constraint about the timestamps. Not just any sequence of time-stamped graphs should be considered to be a stream.

Daniele (PoliMi): yes, in particular if it is possible to assign different timestamps (i don't know if that proposal is still valid)

Tara Athan: I think the last proposal was that each (time-stamping) predicate should have a partial order associated with it, that the time values must satisfy.

Minh: @Tara: yes

Jean-Paul: https://lists.w3.org/Archives/Public/public-rsp/2015Sep/0013.html

Daniele (PoliMi): actually ??, it was something more generic, to associate also other kind of annotations (e.g., generation time, transmission time, etc.)

Tara Athan: generation time, transmission time are the time-stamping predicates.

Daniele (PoliMi): indeed

Jean-Paul: https://lists.w3.org/Archives/Public/public-rsp/2015Sep/0012.html

Tara Athan: At present we only allow one such triple for each time-stamped graph. But it is possible to have the same graph have multiple occurrences in the stream.

Tara Athan: Example: :g1 {...}{:g1,prov:generatedAtTime,t1}

Tara Athan: :g1 {...}{:g1,prov:observedAtTime,t2}

Daniele (PoliMi): and is the content always the same

Daniele (PoliMi): ?

Daniele (PoliMi): content -> {...}

Tara Athan: If there is different content both given the same name :g1, then that is an inconsistency.

Daniele (PoliMi): ok
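
A minimal sketch of the consistency rule discussed above: the same graph name may occur several times in a stream with different time-stamping predicates, but only with identical content. The tuple layout `(name, content, predicate, timestamp)` and the function name `find_name_conflicts` are assumptions made for illustration, not part of the data model document.

```python
# Hypothetical element layout: (graph_name, content, predicate, timestamp),
# where content is a frozenset of (subject, predicate, object) triples.
def find_name_conflicts(elements):
    """Return the graph names that occur with more than one distinct content."""
    seen = {}
    conflicts = set()
    for name, content, _predicate, _timestamp in elements:
        if name in seen and seen[name] != content:
            conflicts.add(name)
        seen.setdefault(name, content)
    return conflicts


content = frozenset({(":s", ":p", ":o")})

# Same graph, same content, two different time-stamping predicates: consistent.
ok_stream = [
    (":g1", content, "prov:generatedAtTime", 1),
    (":g1", content, "prov:observedAtTime", 2),
]

# Reusing the name :g1 for different content is the inconsistency described above.
bad_stream = ok_stream + [
    (":g1", frozenset({(":s", ":p", ":other")}), "prov:generatedAtTime", 3),
]
```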

Tara Athan: Can we have a written proposal in the chat? Then we can vote on it.

Jean-Paul: Proposal: the partial order of timestamps in a stream should be specified on a predicate-by-predicate basis, as a way to allow greater generality of streams while still preserving the ability to merge arbitrary streams.

Jean-Paul: ordering in the stream is only with respect to timestamps of the same predicate.

Alejandro Llaves: But with this definition, if I have a data input with unordered observation time, is it not a stream?

Tara Athan: It is necessary to say that there is a unique partial order associated with each predicate. That is, it is not a user decision what partial order to use.

Tara Athan: @Alejandro - If the data violates that partial order associated with the predicate, then it is not a stream.

Alejandro Llaves: I.e., are we saying that a list of observations with unordered observation time is not a stream?

Tara Athan: It is theoretically possible to define a predicate with a trivial partial order - that nothing is comparable. Then you can have unordered data.

Minh: A stream S consists of a sequence of timestamped graphs whose elements sharing the same predicate are ordered by a partial order associated with this predicate on the timestamps.

Tara Athan: by a partial order => by the partial order
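
A rough sketch of how the proposed definition could be checked, assuming that "ordered by the partial order" means that no element is strictly earlier than a preceding element with the same predicate. That reading, the `(name, predicate, timestamp)` layout, the `ex:unordered` predicate and the function `is_stream` are all assumptions for illustration.

```python
def is_stream(elements, orders):
    """Check a sequence of (graph_name, predicate, timestamp) elements.

    orders maps each predicate to its single partial order, given as a
    function leq(a, b). Elements with different predicates are never compared.
    """
    for i, (_name, pred_i, t_i) in enumerate(elements):
        leq = orders[pred_i]
        for _other, pred_j, t_j in elements[i + 1:]:
            if pred_j != pred_i:
                continue
            # Violation: a later element is strictly earlier than a previous one.
            if leq(t_j, t_i) and not leq(t_i, t_j):
                return False
    return True


orders = {
    "prov:generatedAtTime": lambda a, b: a <= b,  # usual total order on numbers
    "ex:unordered": lambda a, b: a == b,          # trivial order: nothing is comparable
}

mixed = [
    (":g1", "prov:generatedAtTime", 1),
    (":g2", "ex:unordered", 9),
    (":g3", "prov:generatedAtTime", 2),
    (":g4", "ex:unordered", 3),
]
out_of_order = [
    (":g1", "prov:generatedAtTime", 2),
    (":g2", "prov:generatedAtTime", 1),
]
```

Under this reading, `mixed` is accepted because the trivial order on `ex:unordered` never compares 9 with 3, while `out_of_order` is rejected.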

Daniele (PoliMi): what do you mean by "where"?

Tara Athan: Yes, it is important to specify the properties of the partial ordering. I made a proposal in that email.

Daniele (PoliMi): maybe i misunderstood your question

Tara Athan: If it is possible to have different partial orders for the same predicate, then it may not be possible to merge those streams.

Tara Athan: It is impossible to define the scope of an "application".

Daniele (PoliMi): the time on which the data arrives

Tara Athan: If the timestamp is the time it arrives, then the data *becomes* a stream once that timestamp is associated with it.

Daniele (PoliMi): we have different use cases where the data is ordered in this way

Robin Keskisärkkä: In practice this strict ordering would often require an intermediate step in which the stream "becomes" ordered with respect to some predicate (outside RSP). A typical case would be when sensor streams are processed in a cluster (e.g. streams with different partitions in Kafka, where there is an order that can become partially unordered when there is some network latency).

Tara Athan: Perhaps we should qualify our terminology. Say "RDF stream" rather than "stream"

Tara Athan: or "RDF time-stamped stream"

Tara Athan: 1. The usual mathematical requirements of a partial order apply (http://mathworld.wolfram.com/PartialOrder.html):
a) Reflexivity: X <= X
b) Antisymmetry: X <= Y and Y <= X implies X = Y
c) Transitivity: X <= Y and Y <= Z implies X <= Z
2. The partial order must respect the natural order of time. In particular, if every time instant within the closure of temporal entity X is earlier than every time instant within the closure of temporal entity Y, then X <= Y (where closure of a time instant t is defined as the degenerate interval [t, t], and closure of an interval is defined in the usual way).

Tara Athan: Some formatting was lost in the copy-paste.
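
The requirements listed above can be tested by brute force on a finite set of temporal entities. The sketch below assumes closed intervals written as `(start, end)` pairs with `start <= end`, and one candidate order, `interval_leq`, that satisfies requirement 2 by construction. Both function names are hypothetical.

```python
from itertools import product


def is_partial_order(items, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on a finite set."""
    reflexive = all(leq(x, x) for x in items)
    antisymmetric = all(
        x == y or not (leq(x, y) and leq(y, x))
        for x, y in product(items, repeat=2)
    )
    transitive = all(
        leq(x, z) or not (leq(x, y) and leq(y, z))
        for x, y, z in product(items, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def interval_leq(x, y):
    """x <= y if they are equal, or every instant of x is earlier than every instant of y.

    A time instant t is represented by the degenerate interval (t, t).
    """
    return x == y or x[1] < y[0]


entities = [(1, 1), (1, 3), (2, 4), (5, 5), (4, 6)]
```

With this order, overlapping intervals such as `(1, 3)` and `(2, 4)` are simply incomparable, which is what makes the order partial rather than total.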

Robin Keskisärkkä: A minor thing, but what is the motivation for the variables used in the window function definition (i.e. l and d)?

Robin Keskisärkkä: Typically we speak of duration, upper/lower bound, step

Robin Keskisärkkä: so it's kindow confusing when step = d

Tara Athan: Any finite stream can be considered in its entirety as an RDF dataset. I think it is sufficient to define queries on RDF datasets - we shouldn't need something special for streams.

Robin Keskisärkkä: kind of*

Robin Keskisärkkä: it's a minor thing

Tara Athan: The output of the window function is still a stream.

Tara Athan: So it is just a two-step process - filter the original stream by the window function, then apply the query to resulting substream.
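
The two-step process can be sketched as follows, assuming a simple time-based window over the half-open range `[lower, upper)`. This is not the window function definition from the data model document; `window`, `query` and the element layout `(name, triples, predicate, timestamp)` are hypothetical.

```python
def window(stream, predicate, lower, upper):
    """Step 1: keep the elements time-stamped with `predicate` in [lower, upper).

    The result preserves the input order, so it is again a (sub)stream.
    """
    return [
        (name, triples, pred, t)
        for name, triples, pred, t in stream
        if pred == predicate and lower <= t < upper
    ]


def query(substream, pattern):
    """Step 2: toy query returning triples that match a pattern (None is a wildcard)."""
    results = []
    for _name, triples, _pred, _t in substream:
        for triple in sorted(triples):
            if all(p is None or p == v for p, v in zip(pattern, triple)):
                results.append(triple)
    return results


stream = [
    (":g1", frozenset({(":s1", ":temp", "20")}), "prov:generatedAtTime", 1),
    (":g2", frozenset({(":s1", ":temp", "21")}), "prov:generatedAtTime", 5),
    (":g3", frozenset({(":s2", ":temp", "19")}), "prov:generatedAtTime", 12),
]

substream = window(stream, "prov:generatedAtTime", 0, 10)
answers = query(substream, (None, ":temp", None))
```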

Tara Athan: OK, so if it is a matter of the query language, then that is syntax, not semantics.

Daniele (PoliMi): it's a matter of semantics

Daniele (PoliMi): let's for example say that we want to eval a bgp p over the output of a window function

Daniele (PoliMi): to follow the sparql definition, the bgp is applied to the active graph of the dataset

Tara Athan: "bgp p"?

Daniele (PoliMi): so we need a way to move from the output of the window function to a graph (that would be the active one in that case)

Daniele (PoliMi): p is a typo

Tara Athan: I still am not familiar with "bgp"

Daniele (PoliMi): basic graph pattern
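
One possible way to move from the output of a window function to an active graph is to take the union of the graphs in the window and evaluate the basic graph pattern against that union. This is only one option and was not agreed in the call. The names `active_graph` and `match_bgp` are hypothetical, and the matcher is a toy that ignores blank nodes and the rest of the SPARQL algebra.

```python
def active_graph(window_output):
    """Merge the triples of every graph in the window output into a single graph."""
    merged = set()
    for _name, triples, _pred, _t in window_output:
        merged |= triples
    return merged


def match_bgp(graph, patterns, binding=None):
    """Return all variable bindings that satisfy every triple pattern in `patterns`.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    solutions = []
    for triple in sorted(graph):
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check that it agrees with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            solutions.extend(match_bgp(graph, rest, candidate))
    return solutions


window_output = [
    (":g1", frozenset({(":s1", ":locatedIn", ":room1")}), "prov:generatedAtTime", 1),
    (":g2", frozenset({(":s1", ":temp", "21")}), "prov:generatedAtTime", 2),
]
bgp = [("?s", ":locatedIn", "?room"), ("?s", ":temp", "?v")]
solutions = match_bgp(active_graph(window_output), bgp)
```

The two patterns here are satisfied by triples from different time-stamped graphs, so the pattern only has a solution once the graphs are merged.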

Tara Athan: I would think that the query needs to be defined for an RDF dataset. How would you apply a bgp to an RDF dataset?

Daniele (PoliMi): sorry for the silly question, but is the rdf dataset the same dataset defined in the sparql spec?

Tara Athan: We are using this notion of RDF dataset: http://www.w3.org/TR/2014/NOTE-rdf11-datasets-20140225/#each-named-graph-defines-its-own-context

Daniele (PoliMi): if it is like this one: http://www.w3.org/TR/sparql11-query/#rdfDataset

Daniele (PoliMi): i can support minh on that

Tara Athan: I am not completely familiar with the SPARQL spec, but at first glance it looks like SPARQL does not commit to a particular semantics of RDF datasets, while we do.

Minh: ok, thanks everyone and have a nice weekend

Robin Keskisärkkä: bye

Alejandro Llaves: thanks, have a nice weekend!

Daniele (PoliMi): bye

Tara Athan: bye