Telecon 20.12.2013

From RDF Stream Processing Community Group

Participants

  • Alasdair Gray
  • Jean-Paul Calbimonte
  • Daniele Dell'Aglio
  • Josiane X. Parreira
  • Jesus Arias
  • Mikko Rinne
  • Emanuele Della Valle
  • Darko Anicic
  • Jim Smith
  • Danh Le Phuoc
  • Monika Solanki
  • Roland Stühmer
  • Robin Keskisärkkä


Minutes

Discussions on the Model: timestamps and graphs

  • Jean Paul: Discuss existing model in the wiki

http://www.w3.org/community/rsp/wiki/RDF_Stream_Model

  • Jean Paul: taking a look at the model
  • Josi: been working on the definition
  • Josi: sequence of graphs, not just triples
  • Josi: already agreed on graphs
  • Josi: propose to work on the timestamp
  • Jean Paul: intervals vs one timestamp
  • Josi: don't need to pick one over the other
  • Josi: but need a clear mapping between the two, so to convert from one to the other in unambiguous way
  • Darko: explain this mapping better
  • Josi: if you cannot map the two, there are problems in the evaluation semantics
  • Josi: example semantics: o :timeStamp ti exists, for all t1 <= ti <= t2
  • Darko: in favor of having the interval proposal. One timestamp and the other is optional
  • Darko: we can later define semantics
  • Darko: Mikko pointed to his time model. It says we may have n timestamps. Nothing against Mikko's proposal, but how practical is it? In favor of one time point and an optional one
  • Darko: we do not need a mapping from two to timestamps.
  • Josi: But we need to allow for both representations
  • Josi: what if we need to process both? The engine might allow both as input for a query
  • Darko: Discuss the semantics later
  • Josi: Agree, but keep it in mind. Maybe talk later.
  • Alasdair: Validity and other types of timestamp, should be discussed in the semantics
  • Emanuele: agree. You can timestamp the graphs at the receiving time. This is a realistic possibility. Some engines consider the arrival time as the timestamp
  • Darko: Agree
  • Danh: In some situations this is not realistic. You do not know when the input will end
  • Danh: you need to process the data as soon as it is available
  • Danh: You need to generalize the time label for the input
  • Alasdair: Danh, can you give a concrete example?
  • Danh: In some situations the time label can be both: single timestamp or interval
  • Danh: in some queries you don't need the exact timestamps.
  • Danh: you just need to check the validity over a time window with a given order
  • Darko: For example count windows
  • Josi: no conclusions in this particular point so far
  • Alasdair: Do you mean something like select COUNT(*) from <stream>[5];?
  • Danh: better to generalize, than to fix the timestamp
  • Josi: so how could we define this?
  • Danh: define abstract timestamp (general), later map to a particular timestamp
  • Jean Paul: too generic; then what's the interpretation of the evaluation semantics?
  • Danh: Agree, if too generic might be hard to define semantics
  • Josi: if the semantics is clear, it is an option
  • Danh: start with simple and interval. Might be enough for now. Later extendable
  • Alasdair: Is a time interval a shorthand for republishing the graph at every time point in the interval?
  • Josi: If we could map these timestamps, but if it's too generic and no mapping is clear, could be problematic
  • Josi: this is the interpretation we have for the moment
  • Alasdair: if that is the interpretation, an interval is as if the graph is republished at every time point
  • James Smith: in my mind, we can develop this with timestamps, identify the aspects of timestamps that are necessary for things to work, and then folks can map the model to their use case if they have something that isn't time that can satisfy all of the requirements that time satisfies -- e.g., a timestamp is a monotonically increasing sequence
  • Emanuele: this is consistent with the ISTREAM operator in CQL
  • Emanuele: keep repeating things over time, or not (under the hypothesis that the time is discrete)
  • Monika Solanki: Should there be an interpretation defined for the timestamp/time interval for an RDF graph and the individual triples in the graph, which may have their own timestamps?

  • Emanuele: question on aggregates
  • Emanuele: aggregate query example where we may have problems. Things can get more complex
  • Jean Paul: this may go into the querying telco
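
One way to read the interval interpretation discussed above (Josi's example semantics and Alasdair's question about republishing) is sketched below in Python. It is only an illustration under stated assumptions, not the group's agreed semantics: time is assumed to be discrete and integer-valued, a graph is modelled as a plain set of triples, and the names TimestampedGraph and expand are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class TimestampedGraph:
    """A graph with one mandatory timestamp and an optional second one.

    Hypothetical structure: 'start' is the single time point, 'end' turns
    the annotation into a closed interval [start, end].
    """
    graph: FrozenSet[Triple]
    start: int
    end: Optional[int] = None

    def interval(self) -> Tuple[int, int]:
        # A single timestamp t is treated as the degenerate interval [t, t].
        return (self.start, self.start if self.end is None else self.end)


def expand(item: TimestampedGraph) -> Iterator[Tuple[int, FrozenSet[Triple]]]:
    """Map an interval to point timestamps: the graph holds at every
    discrete time point ti with t1 <= ti <= t2."""
    t1, t2 = item.interval()
    for ti in range(t1, t2 + 1):
        yield (ti, item.graph)


# Usage: the same graph annotated once with a point, once with an interval.
g = frozenset({(":sensor1", ":hasValue", "42")})
point_view = list(expand(TimestampedGraph(g, 3)))
interval_view = list(expand(TimestampedGraph(g, 3, 5)))
```

Under these assumptions both representations reduce to the same point-based view, which is the kind of unambiguous mapping Josi asked for. With continuous time the expansion would not be enumerable, so a different definition would be needed.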

Vocabulary for timestamps

  • Jean Paul: Work to define Vocabulary for timestamps?
  • Emanuele: commit to work on this
  • Alasdair: This is the email message http://lists.w3.org/Archives/Public/public-rsp/2013Dec/0015.html
  • Monika Solanki: Happy to contribute to the vocabulary definitions
  • Danh Le Phuoc: I'll be happy to contribute to the vocabulary definitions as well
  • Alasdair: Happy to contribute but on holiday until 6 Jan
  • Mikko Rinne: The Event Processing ODP, as listed in the group background, is one approach to vocabularies. But are we going to specify "a vocabulary", or are we going to say how to describe a stream, listing the vocabularies that are being used in that specific stream?

Serialization issues

  • Danh: comment on vocabulary
  • Danh: transmission is important. If you do it repeatedly and use a vocabulary annotation for each observation, you may have bandwidth problems
  • Danh: consider serialization problems
  • James Smith: but something like json-ld can help with bandwidth
  • Darko: Binary RDF: the goal is not to send verbose data, but binary values, and possibly compress them. May be useful for this topic
  • Darko: maybe support binary formats for RSP
  • Danh: Compression is OK, but I still think adding triples for annotations might not be efficient
  • Danh: compression is orthogonal, can be a solution
  • Emanuele: somehow the Istream approach is smart, and good with a stable connection. If you don't have a stable connection, or don't keep listening, Rstream is better.
  • Emanuele: there is never one right thing to do; it depends on the case
  • Emanuele: other acquisition (active protocols) exist, and are already there, and try to reduce the information exchange
  • Emanuele: Serialization problem is another thing
  • James Smith: +1 for model / serialization / stream compression being orthogonal
  • Alasdair: +1
  • Jean Paul: +1
  • Monika Solanki: +1
  • Darko: +1
  • Darko: we will have the topic on serialization and reducing information exchange
  • Emanuele: Istream does not give you all information about the stream state, you need Dstream as well
  • Danh Le Phuoc: +1 for the topic on serialization and reducing information exchange
  • Darko: e.g. Binary RDF and EXI
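
Emanuele's remarks on Istream, Rstream and Dstream can be illustrated with a small Python sketch. It follows the usual CQL definitions over two consecutive states of a relation (Istream emits what was added, Dstream what was removed, Rstream the full current state). The function names and the set-of-triples representation are assumptions made for this example only.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]
State = FrozenSet[Triple]


def istream(previous: State, current: State) -> State:
    # Elements present now that were not present at the previous instant.
    return current - previous


def dstream(previous: State, current: State) -> State:
    # Elements present at the previous instant that are gone now.
    return previous - current


def rstream(previous: State, current: State) -> State:
    # The whole current state, repeated at every instant.
    return current


def apply_delta(previous: State, inserted: State, deleted: State) -> State:
    # A listener that saw the previous state rebuilds the current one
    # from the Istream and Dstream outputs.
    return (previous - deleted) | inserted


# Usage: two consecutive states of the same window.
a, b, c = (":s1", ":p", ":o1"), (":s2", ":p", ":o2"), (":s3", ":p", ":o3")
prev_state = frozenset({a, b})
curr_state = frozenset({b, c})
rebuilt = apply_delta(prev_state,
                      istream(prev_state, curr_state),
                      dstream(prev_state, curr_state))
```

In this sketch a consumer that only receives the Istream output cannot tell that a was removed, which matches the point that Dstream is also needed to track the stream state. Rstream avoids that at the cost of resending b at every instant, so it suits listeners that connect intermittently.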


Next steps: querying RDF Streams

  • Jean Paul: next steps for RSP, next call
  • Emanuele: work on the syntax, cover the current languages
  • Alasdair: Querying sounds good

Use cases

  • Darko: status of use case?
  • Emanuele: Had a look at use cases
  • Monika Solanki: On my to-do list to contribute one on supply chains
  • Emanuele: most are data stream processing. Maybe have something a bit different?
  • Emanuele: recommend to add more, so that other use cases are represented
  • Monika: requirements of querying
  • Monika: windows defined only over time period or length
  • Monika: in my use case windows defined differently, over rules
  • Alasdair: Monika, can you add your needs to the use cases http://www.w3.org/community/rsp/wiki/Use_cases?
  • Josi: this is on the query side
  • Emanuele: concern on the use cases, we need to create examples
  • Jean Paul: please add query examples in the use cases, will be useful for the semantics on the query evaluation model.

Other

  • Emanuele: thank you everyone. We are trying to build something stronger than before.
  • Emanuele: and merry Xmas
  • Roland: Introduce myself!
  • Alasdair: Bye all, have a nice Christmas break.
  • Robin Keskisärkkä: Happy holidays!
  • Monika Solanki: Thanks everyone, have a relaxing break!!
  • Mikko Rinne: Happy holidays! Until next year!
  • Danh Le Phuoc: Merry Xmas and Happy New Year !!!


Actions

  • JP, all: Update wiki page on RSP models, interval timestamps (one initial timestamp plus an optional second one)
  • All: update use cases: add query examples and/or query requirements, needed for next call on RSP querying.