Telecon 25.09.2013

From RDF Stream Processing Community Group

Participants

  • Jacopo Urbani
  • Manfred Hauswirth
  • Jean-Paul Calbimonte
  • Oscar Corcho (scribe)
  • Alasdair Gray
  • Emanuele Della-Valle
  • Thomas Scharrenbach
  • Marco Balduini
  • Daniele Dell'Aglio
  • Esko Nuutila
  • Lorenz Fischer
  • Robin Keskisärkkä
  • Jim Smith
  • Alessandro Margara
  • Mikko Rinne
  • Sebastian Käbisch

Agenda

  • RDF Stream Models overview
    • Order in RDF data
    • Timestamps on triples and graphs
    • Different types (and semantics) of timestamps
    • Point in time and interval timestamps
  • Data Streams and (complex) Events
  • Serialization
  • Discussion

Minutes

RDF Stream Models overview

  • Presentation by Daniele Dell'Aglio [1]

Oscar Corcho :Jean Paul presenting a summary of what will be presented in the RDF Stream tutorial at the next ISWC2013

Oscar Corcho: @Daniele, can you please input here the URL of the slides?

Oscar Corcho :Now Daniele taking over on the presentation

Oscar Corcho :slides at http://www.dellaglio.org/uploads/rsp-phone-call-0925.pdf

Mikko Rinne :Comment on the presentation: A data item may have multiple timestamps, e.g. time of observation, time of entering the stream (denoted application time in the presentation) and time of entering the stream processing system.

Oscar Corcho :@Mikko, I agree (in fact this is something that should be made clear as well)

Alasdair Gray :@Mikko, I agree also. This was widely studied in Relational DSMS.

Alasdair Gray :The idea of timestamp ranges was widely studied in temporal databases.
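
The point about multiple timestamps per data item can be sketched as a small data structure. This is a minimal illustration only, not a model agreed by the group: the class name `TimestampedItem`, its field names, and the `ex:` identifiers are hypothetical, and triples are represented as plain string tuples.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass(frozen=True)
class TimestampedItem:
    """Hypothetical stream item carrying the three kinds of time mentioned above."""
    triples: Tuple[Triple, ...]
    observation_time: datetime              # when the phenomenon was observed
    application_time: datetime              # when the item entered the stream
    system_time: Optional[datetime] = None  # when the processing system received it

    def stream_delay(self) -> timedelta:
        """Time between observation and entry into the stream."""
        return self.application_time - self.observation_time

    def processing_delay(self) -> Optional[timedelta]:
        """Time between stream entry and arrival at the processor, if known."""
        if self.system_time is None:
            return None
        return self.system_time - self.application_time


observed = datetime(2013, 9, 25, 14, 0, 0, tzinfo=timezone.utc)
item = TimestampedItem(
    triples=(("ex:obs1", "ex:hasValue", "21.5"),),
    observation_time=observed,
    application_time=observed + timedelta(seconds=2),
    system_time=observed + timedelta(seconds=5),
)
print(item.stream_delay(), item.processing_delay())
```

Keeping the system time optional reflects that it is only known once a processor has received the item, whereas the other two can be assigned by the producer.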


Event Processing ODP

  • Work on Event modeling by Aalto and Linköping.
  • Presentation by Mikko [2]

Oscar Corcho :Next presentation from Mikko

Oscar Corcho :Thanks Mikko

Preliminary work by UZH

  • Presentation by Thomas [?]

Oscar Corcho :Next presentation from Thomas

Oscar Corcho :ACTION: Thomas to upload the slides, if they are not in the wiki

Darko Anicic :+1 for Thomas definition on Events and Facts

Thomas: describing simple events, complex events and facts, and their characteristics

Oscar Corcho :Thomas: discussing serialization

Mikko Rinne :Comment on the definition of complex events: Complex event objects are synthesized events, triggered within the event processing system. Even though the triggering events may have occurred (or have been observed) at different times, there is always a singular point in time, with the precision available in the event processing system, at which all the required events and facts aligned to trigger the complex event. Any event may trigger a state transition, which will persist until another event reverses the transition. I have yet to see a good example of why events would have to have durations, but the state I referred to above could be considered a "synthetic fact" in the system.

James Smith :my vote is to develop with at least two different serializations in mind (e.g., XML and JSON-LD) to make sure we don't incorporate assumptions from the serialization format that might bite us later

Thomas: discussing whether we want to deal with graphs or just triples, pros and cons

Alasdair Gray :Can simple events always be captured in a single triple? Guess it depends on your modelling.

Lorenz Fischer :+1 to different serializations

Oscar Corcho :For example, with the event model that Mikko has presented, or using SSN (if we consider an Observation as a simple event) these simple events are always sets of triples

Alasdair Gray :For example, for a sensor reading you need to know which sensor, at what time, and in what units.

Jean-Paul :this has been a source of problems with things like triple based windows
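
The problem with triple-based windows can be illustrated with a short sketch. It assumes that each simple event (here, a sensor observation) is a set of three triples; the function names and the `ex:` vocabulary are hypothetical and are not taken from the event model or SSN. A count-based window over individual triples can cut an observation in half, while a window over whole graphs cannot.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings
Graph = List[Triple]           # one simple event = one small set of triples


def observation(n: int, value: str) -> Graph:
    """Build a hypothetical three-triple sensor observation."""
    obs = f"ex:obs{n}"
    return [
        (obs, "ex:observedBy", "ex:sensor1"),
        (obs, "ex:hasValue", value),
        (obs, "ex:hasUnit", "ex:Celsius"),
    ]


def triple_windows(stream: List[Triple], size: int) -> List[List[Triple]]:
    """Tumbling windows that count individual triples."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def graph_windows(stream: List[Graph], size: int) -> List[List[Graph]]:
    """Tumbling windows that count whole graphs (events)."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def subjects(window: List[Triple]) -> Set[str]:
    """Subjects that appear in a window of triples."""
    return {s for s, _, _ in window}


events = [observation(1, "21.5"), observation(2, "21.7")]
flat = [t for g in events for t in g]

by_triple = triple_windows(flat, 4)   # second observation is split across windows
by_graph = graph_windows(events, 1)   # each window holds one complete observation
print(len(by_triple[0]), subjects(by_triple[0]))
```

In the triple-based case the first window contains all of the first observation plus only the first triple of the second one, so a query evaluated on that window sees an incomplete event.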

Thomas Scharrenbach :yet, you need to have an event model only if you want to make use of this meta-information.

RDF Stream models:Discussion

Manfred: assumptions that we have to consider: 1) there are differences between different types of time (observation, system, etc.)

Manfred: with the time of generation, we have to consider that synchronisation between embedded systems is difficult

Manfred: interesting to see how different use cases/domains need different types and amounts of timestamps

Manfred (Oscar's own words now): be descriptive in the group, not prescriptive, when discussing which models we can use

Jean-Paul :how to define timestamps: should be flexible...

James Smith :some data models can embed some timestamps in the graph while others will be outside the graph. e.g., the generation time might be within the graph, but the consumption time by a processor might be outside the graph (thinking of Open Annotation and PROV-O)

Manfred: we need to tell outsiders explicitly which are the different types of models that we can apply, and how they could be used in different use cases

Mikko Rinne :@Jean-Paul: Exactly, but the flip side of flexibility is complexity. The trick is how to be flexible, but still enable stream processing systems to simply understand the format.

Oscar Corcho :ACTION (for the group): describe clearly the different types of design decisions about time/timestamps, and relate them to the use cases

Jean-Paul :+1

Representation and Serialization: Discussion

Manfred: the representation of time comes later (in embedded systems the representation may need to be very condensed, because of limitations in devices, while in other systems it may be more verbose)

Darko Anicic :+1 with Manfred about compact format representation

Sebastian: what about proposing a serialisation format that could be useful independently of the devices/systems to be used?

Manfred: it would be great, but be careful as it will be difficult

Manfred: we have to consider existing installations that are already out there

Manfred: comments on the SPITFire experience with tiny sensors and RDF, and the need for solutions that are more efficient

Sebastian: proposes to test/evaluate different serialisation options under the umbrella of the group

Oscar Corcho :ACTION (Jean Paul): create also another wiki page about serialisations

Thomas Scharrenbach :My point was that we should keep in mind serialization, one possible candidate being XML. Consuming data (DE-serialization) should be as flexible as possible.

Oscar Corcho :Both are important, in fact

Models: Semantics, metamodels and ontologies: Discussion

Darko: explains the importance of semantics and its relation with the model definition

Darko Anicic :e1 before (e2 before e3)

Darko Anicic :e1 before e2 before e3

Manfred Hauswirth :+1 on semantics of the model (Darko) - that is really important!

Darko Anicic : e2 before e1 before e3

Darko: need to be aware of the semantics of the language that is built on top of the selected data model for timestamps
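
The minutes do not record the full argument behind the three orderings typed into the chat. One possible reading, sketched below purely as an assumption, is that the meaning of a nested "before" depends on whether a complex event carries a single point timestamp (its detection time) or an interval. The `Event` class and the two operator functions are hypothetical and do not represent any specific language discussed in the call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    start: int  # point events have start == end
    end: int


def before_point(a: Optional[Event], b: Optional[Event]) -> Optional[Event]:
    """Point semantics: the complex event keeps only its detection time."""
    if a is None or b is None or not a.end < b.end:
        return None
    return Event(b.end, b.end)


def before_interval(a: Optional[Event], b: Optional[Event]) -> Optional[Event]:
    """Interval semantics: a must end before b starts; the result spans both."""
    if a is None or b is None or not a.end < b.start:
        return None
    return Event(a.start, b.end)


# Occurrence order in the stream: e2, then e1, then e3.
e1, e2, e3 = Event(2, 2), Event(1, 1), Event(3, 3)

# Under point semantics, "e1 before (e2 before e3)" matches although e2 came first,
# because the inner complex event is stamped with the time of e3 only.
point_nested = before_point(e1, before_point(e2, e3))

# Under interval semantics the same pattern is rejected: the inner event
# spans [1, 3], and e1 at time 2 does not end before that interval starts.
interval_nested = before_interval(e1, before_interval(e2, e3))

print(point_nested, interval_nested)
```

Under these assumptions the same pattern accepts or rejects the same stream depending on the timestamp model, which is why the language semantics and the data model for timestamps need to be considered together.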

Darko: on format/serialisation, it should be applicable at different levels of processing (from embedded systems to systems on the cloud)

Alessandro Margara :+1 It's important to think about the semantics of processing when defining the data/time model

Darko: importance of compact representation for being able to process streams efficiently

Manfred Hauswirth :I am sorry, I have to leave now - budget meeting - better not miss that :-) Will read the minutes and comment on the Wiki! Cheers, Manfred

Oscar Corcho :@all: in fact, as we normally schedule these meetings to last for one hour, we should try to wrap up… It is a good sign that we had 3 presentations prepared for today.

Oscar Corcho :Thomas: <<sorry, I missed the point that you were making. Please add it later>>

Oscar Corcho :Emanuele: We are now at the point of understanding the model, and then probably it is better to go later for the serialisation (if they are orthogonal, they can be run at different moments)

Jean Paul: triples vs graphs is a relevant discussion, timestamps is also a relevant discussion related to this. We should continue working on that

Robin Keskisärkkä :Regarding timestamps: If not accessible through the query language, how can we express intervals between events? (e.g. "If A happened within 5m of B ...")
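
The kind of condition in this question ("If A happened within 5m of B ...") can only be evaluated if the timestamps are visible to the query or processing layer. A minimal sketch, assuming each event exposes a single point timestamp; the function name `pairs_within` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def pairs_within(
    a_times: List[datetime],
    b_times: List[datetime],
    max_gap: timedelta,
) -> List[Tuple[datetime, datetime]]:
    """Return every (A, B) pair whose timestamps differ by at most max_gap."""
    return [
        (a, b)
        for a in a_times
        for b in b_times
        if abs(a - b) <= max_gap
    ]


base = datetime(2013, 9, 25, 14, 0, 0)
a_events = [base]
b_events = [base + timedelta(minutes=3), base + timedelta(minutes=10)]

matches = pairs_within(a_events, b_events, timedelta(minutes=5))
print(matches)
```

With interval timestamps the comparison would need an explicit choice of which endpoints to compare, which is one of the design decisions the group has to describe.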

Oscar Corcho :ACTION (all who have talked today): include more discussions on the group wiki

Jean Paul: in the following call, we will follow a similar approach to today's

Thomas: question to Mikko. should your event representation model be part of a potential standard, or should a standard allow you to define such a model?

Emanuele: Mikko's work seems to be more in the space of what was done with the SSN ontology. Whether this type of data should be represented as data or metadata is part of the discussion of the group. The same applies to Thomas' discussion on simple and complex events and facts.

Emanuele: it would be probably a bit out of context to go into specific ontologies/vocabularies, although those can be useful for the discussion on the data model

James Smith :I need to run... but I'll keep an eye on the wiki

Mikko: Ultimately I am looking for compatibility between streaming platforms. I'm not too trigger-happy to extend voluminous standards, which already suffer from internal incoherences, so other means should also be investigated. That could mean e.g. recommendations to apply certain conventions, allowing us to publish streams.

Emanuele, Marco, Daniele (Polimi) :+1

Thomas: should we be working on how to represent events (metamodels, etc.) or just share a common view?

Emanuele: we can expect to create some metamodel and a set of examples. Exchanging models is useful

Oscar Corcho :Sorry, I need to leave!! Please somebody else to continue scribing if possible...

Thomas: do we require meta information from meta-model events?

Emanuele: meta model: we put timestamps (flexible, one, two) to graphs. something you use to transmit the model

Alasdair Gray :Do we have a clear set of use cases that we are trying to support? This is essential to deciding what the model and the query language extensions we require are.

Lorenz: will this debate become clearer when we talk about query processing?

Thomas Scharrenbach :+1

Sebastian Käbisch :Sorry, I have to go! See u in the next web meeting

Robin Keskisärkkä :Bye bye.

Lorenz Fischer :thanks everyone. good bye!


Actions

  • ACTION (Thomas): upload the presentation.
  • ACTION (JP): Create wiki page for serializations
  • ACTION (all): Describe clearly the different types of design decisions about time/timestamps, and relate them to the use cases