Telecon 07.02.2014

From RDF Stream Processing Community Group

Participants

  • Alasdair Gray
  • Jean-Paul Calbimonte
  • Oscar Corcho
  • Marco Balduini
  • Daniele Dell'Aglio
  • Danh Le Phuoc
  • Robin Keskisärkkä
  • Roland Stühmer
  • Monika Solanki
  • Alessandra Mileo

Minutes

Use cases and query requirements

  • Oscar: Jean Paul showing some summary slides
  • Jean Paul: we only have query examples for two of the use cases. We should improve this
  • Oscar: ACTION (all): make sure to provide those query examples for the next meeting
  • Oscar: Thanks to Emanuele for providing an updated description of the RDF Stream Processors (wikified by Jean Paul)
  • Alasdair Gray: Can we add the table comparing stream languages to the wiki page?
  • Jean Paul: Objective: we need to check whether the current categories used to compare systems are sufficient, given the use cases provided and their queries
  • Alasdair Gray: It's a really nice summary
  • Monika: +1
  • Oscar: assign use cases to people
  • Oscar: General discussion about how to obtain more queries from the use cases. This is sometimes difficult because the data is not yet available.
  • Oscar: ACTION (Jean Paul): be more explicit on the assignment of the task of creating queries for use cases, so as to make sure that we can progress a bit quicker
  • Oscar: A natural language description of the queries could be enough.

Initial discussion on time window semantics

  • Oscar: Next item: let's take a look at the current queries
  • Oscar: [slide 4 from JeanPaul's presentation]: a summary of features obtained from the processors
  • Jean Paul: most of the queries seen in the wiki so far can be solved with time windows
  • Alasdair Gray: Raises hand
  • Oscar: Semantics of time windows described in slide 6.
  • Oscar: Probably they will not be very different from what we will need
  • Alasdair: were we considering a graph-based model instead of a triple-based model?
  • Alasdair: should we then consider streams of triples or streams of graphs?
  • Oscar: The current definition shown by Jean Paul should go into annotating graphs instead
  • Alasdair Gray: We need to extend the existing approaches from a stream of triples approach to a stream of graphs
  • Robin Keskisärkkä: +1
  • Oscar: ACTION: work on the provision of semantics for streams of graphs
  • Alasdair: how would you represent the timestamp for a graph? By <graphURI,propertyX,timestamp>?
  • Daniele: Can we have more than one graph at a given timestamp?
  • Roland Stühmer (FZI): Oscar: +1 (plus an option for adding another time stamp optionally for intervals)
  • Oscar: We should include this in the checklist that we will use for the validation of the use cases
  • Alasdair Gray: Reminder: a graph captures all triples about some event and carries the timestamp at which it was created/entered the system; the graph URI and timestamp need to be related by some predicate.
  • Alasdair Gray: There can be multiple graphs with the same timestamp
  • Roland Stühmer (FZI): agreed
  • Alasdair Gray: Happier to help validate, really don't have time to generate
  • Oscar: ACTION (Jean Paul, with Alasdair and Daniele validating): work on the timestamp annotation of graphs
  • Daniele: ok
  • Danh: I'm happy to contribute as well
  • Oscar: thanks, Danh
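
To make the graph-timestamp idea from this discussion concrete, here is a minimal Python sketch. It assumes each stream element is a named graph carrying a single timestamp, related to the graph URI by some predicate. The predicate URI, the class and function names, and the half-open window interval are illustrative assumptions, not something the group has agreed on.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicate relating a graph URI to its timestamp
# (the minutes only call it "propertyX").
TIMESTAMP_PREDICATE = "http://example.org/timestamp"


@dataclass(frozen=True)
class TimestampedGraph:
    graph_uri: str
    triples: FrozenSet[Triple]
    timestamp: int  # time at which the graph entered the system

    def annotation(self) -> Triple:
        """Return the <graphURI, propertyX, timestamp> statement for this graph."""
        return (self.graph_uri, TIMESTAMP_PREDICATE, str(self.timestamp))


def time_window(stream: List[TimestampedGraph], start: int, end: int) -> List[TimestampedGraph]:
    """Select whole graphs with start <= timestamp < end (half-open interval is an assumption)."""
    return [g for g in stream if start <= g.timestamp < end]


stream = [
    TimestampedGraph("http://example.org/g1", frozenset({("ex:s1", "ex:p", "ex:o1")}), 10),
    TimestampedGraph("http://example.org/g2", frozenset({("ex:s2", "ex:p", "ex:o2")}), 10),
    TimestampedGraph("http://example.org/g3", frozenset({("ex:s3", "ex:p", "ex:o3")}), 25),
]

window = time_window(stream, 0, 20)
# g1 and g2 share timestamp 10 and both fall in the window; g3 does not.
print([g.graph_uri for g in window])
```

Because the window selects whole graphs, all triples about one event stay together, and several graphs with the same timestamp are handled without special cases.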

Discussion on triple/event (counting) windows

  • Jean Paul: now moving to triple (tuple-count) based windows, which are more problematic
  • Daniele: we have to check how this would work with graphs instead of triples
  • Jean Paul: we need to think on the kind of queries that we have for this
  • Alasdair: the problem with tuple-based windows is that you do not necessarily have all the information needed
  • Alasdair Gray: In the graph-based model we would have all the information so we cannot throw it out on an inaccuracy argument
  • Daniele (PoliMI): +1
  • Alasdair Gray: Quad model does not solve triple-window
  • Alasdair Gray: unless you only allow the count over the context of the triple.
  • Alasdair Gray: only
  • Oscar: <<This discussion is going to be difficult to write as minutes ;-). I suggest that Danh provides later a further description over the mailing list, and also for Alasdair to provide a reference to the example that he has given>>
  • Oscar: Discussion: does it make sense to count the number of graphs/events?
  • Alasdair Gray: Example: quad stream where there are 10 elements in one context and 20 elements in another and a window size of 5, how many elements should there be in the window?
  • Oscar: thanks Alasdair
  • Alasdair Gray: I would argue that there should only be 2 elements
  • Monika: I agree, my use case supports exactly this scenario
  • Roland Stühmer (FZI): I would say there are 30 elements (in two contexts)
  • Oscar: ok, it seems that we have some good examples that we can use for this
  • Danh: nice use cases, I'll look into it
  • Oscar: very good discussion...
  • Daniele (PoliMI): I think that the point is: what is an element? a graph or a statement?
  • Alasdair Gray: @Roland: there would be 30 triples in the window of 2 elements
  • Roland Stühmer (FZI): yes
  • Oscar: So this means that we have to be very explicit, when we come into discussing tuple-based windows, on whether we want to count triples or graphs
  • Roland Stühmer (FZI): 2 graphs (less than the limit of 5)
  • Alasdair Gray: (Sorry, I mixed my terminology)
  • Daniele: what would happen if we realise from the use cases that we need to count in some cases triples and in some cases graphs?
  • Jean Paul: as far as I have seen so far it makes more sense to deal with graphs, but let's check with the use cases
  • Alasdair Gray: I'm not against counting the elements in the graph, there you can use the SPARQL count operator, but when we are putting things in windows, we need to ensure that we have all the elements included that relate to the same event
  • Roland Stühmer (FZI): +1
  • Oscar: ACTION (timeline to be decided): when we go into the tuple-based windows, we will have to resume this discussion. This should be based on the use cases
  • Robin Keskisärkkä: A graph could be any size (even a single triple). If necessary, could a graph be decomposed to create such a stream of single triples as graphs?
  • Jean Paul (slide 7): there are other features that we should not forget, since they come from the use cases
  • Daniele (PoliMI): In general, the "statement-based window" is a special case where each graph contains only one statement. So, I think that streams of graphs can also represent streams of statements
  • Danh: it's a good idea to link this issue/action to use cases
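
The counting example from this discussion (10 elements in one context, 20 in another, window size 5) can be sketched as follows. This is a minimal illustration only: the graph representation, the helper names, and the "last n elements" window policy are assumptions made for the sketch, not agreed semantics.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]
Graph = Tuple[str, List[Triple]]  # (graph URI, triples in that graph)


def make_graph(uri: str, size: int) -> Graph:
    """Build a hypothetical graph with `size` dummy triples."""
    return (uri, [(f"{uri}/s{i}", "http://example.org/p", f"o{i}") for i in range(size)])


def last_n(stream: List[Graph], n: int) -> List[Graph]:
    """Count-based window: keep the most recent n stream elements (graphs)."""
    return stream[-n:] if n > 0 else []


def count_triples(window: List[Graph]) -> int:
    return sum(len(triples) for _, triples in window)


def decompose(graph: Graph) -> List[Graph]:
    """Split a graph into single-triple graphs (the special case raised by Robin and Daniele)."""
    uri, triples = graph
    return [(f"{uri}#t{i}", [t]) for i, t in enumerate(triples)]


stream = [make_graph("http://example.org/g1", 10), make_graph("http://example.org/g2", 20)]

# Counting graphs: the window holds 2 elements (fewer than the limit of 5) with 30 triples.
graph_window = last_n(stream, 5)
print(len(graph_window), count_triples(graph_window))

# Counting statements: decompose first, then the same window operator keeps only 5 triples,
# all from the second graph, so the event it describes is cut apart.
triple_stream = [g for graph in stream for g in decompose(graph)]
triple_window = last_n(triple_stream, 5)
print(len(triple_window), count_triples(triple_window))
```

The same window operator covers both readings; what changes is only whether the stream elements are whole graphs or single-statement graphs, which is why the choice needs to be stated explicitly.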

Next meetings and others

  • Oscar: ACTION (Jean Paul): prepare the checklist so that it can be used to characterise the queries from the use cases
  • Oscar: Objective for the meeting in two weeks: work on the time-based windows
  • Oscar: plus the checklist
  • Oscar: Next meeting: in two weeks, 21st February

Actions

  • ACTION (Danh, Jean Paul, with Alasdair and Daniele validating): work on graph time window semantics
  • ACTION (Jean Paul): be more explicit on the assignment of the task of creating queries for use cases, so as to make sure that we can progress a bit quicker
  • ACTION (Jean Paul): prepare the checklist so that it can be used to characterise the queries from the use cases
  • ACTION (all): make sure to provide those query examples for the next meeting