Telecon 13.02.2015

From RDF Stream Processing Community Group

Participants

  • Alessandra Mileo (AM)
  • Bernhard Ortner
  • Danh Le Phuoc (DLP)
  • Daniele Dell'Aglio (DDA)
  • Darko Anicic
  • Fariz Darari
  • Jean-Paul Calbimonte (JPC)
  • Roland Stuhmer (RS)
  • Josi (Siemens)

Minutes taken by

  • Peter Wetz (PW)

Agenda

  • Open Actions from last time
    • ACTION(RS): make an example starting from the discussion
    • ACTION(DLP): to start a page on GitHub to begin the discussion about the schema
    • ACTION(EDV): to invite Danh in the GitHub RSP group

Minutes

  • Update regarding ESWC RSP Workshop
    • JPC: The submission deadline for the ESWC workshop has been postponed.
    • JPC: The results of the discussion are going to be published in the satellite proceedings. As a workshop, we are going to produce at least one publication for these proceedings.
    • PW: Since I did not attend last week's meeting, what is the DEBS dataset about?
    • JPC: A few weeks ago we had a long, complex query in which everything was inside one query. The idea of the DEBS dataset is to create new, simpler queries that may highlight features of an RSP query language.
  • DEBS dataset
    • JPC: Danh, did you have time to create a query based on the DEBS dataset mentioned last time?
    • (Technical difficulties: we cannot hear Danh.)
    • DLP: I sent an e-mail to JPC because I had problems committing to GitHub.
    • DLP: We have a stream of taxi trips: drop-off times, money paid for the trip, etc.
    • (Danh explains the data structure and possible queries based on the file shown by JPC.)
    • JPC: One issue I see is that we only have one stream. So, for example, if we want to query multiple streams, this won't be possible.
    • DLP: There are two streams: trip data and fare data.
    • AM: We may combine the taxi data with other available open data (weather, traffic) from the same time period.
    • AM: Okay, there are potentially a lot of data sets to use. We need to find queries/scenarios that build upon the other data sets we decide to use.
    • AM: 1) Which data set to use (it should be easily convertible to RDF); 2) then decide which datasets to use. Danh, do you already know some of these datasets?
    • DLP: They have weather, they have live traffic, etc.
    • DLP: People should write use cases and think about the data we need. This should be done in parallel.
    • JPC: We don't need to do much data wrangling. We should see if the data fits and then start defining query use cases, or write queries right away.
    • JPC: for instance, the query
    • ACTION (all): Each participant should come up with 2 queries. Each query should use at least 2 input sources (taxi data plus anything else, e.g., NYC open data such as weather or traffic).
    • We agree to post the queries on GitHub, together with comments/descriptions.
    • ACTION (JPC): Post the GitHub link for the queries and create a README file with possible data sources.
    • PW: What do we do after the queries?
    • JPC: We will then have queries and use cases. Next we need to talk about the semantics: we need to define the semantics of the syntax we have already defined. We have an idea of what a query and its operators mean, but we need to formalize it somehow.
  • Next phone call
    • 27 Feb 2015, 15:00 CET

Actions

  • New actions
    • ACTION (JPC): Post the GitHub link for the queries and create a README file with possible data sources.
    • ACTION (all): Each participant should come up with 2 queries. Each query should use at least 2 input sources (taxi data plus anything else, e.g., NYC open data such as weather or traffic).
  • Actions from last time
    • ACTION(RS): make an example starting from the discussion (Done, see Aggregate Clause)
    • ACTION(EDV): to invite Danh in the GitHub RSP group

Agreements
