Provenance Incubator Group Teleconference -- 25 Apr 2010

<trackbot> Date: 25 April 2010

<raphael> trackbot, start telecon

<trackbot> Meeting: Provenance Incubator Group Teleconference

1. Agenda review

<raphael> scribenick: raphael

Yolanda: I will collect fees for the workshop, $90 each or perhaps less; I have receipts

Chris: could we do a quick introduction each?

Sam: IBBT in Ghent, PhD, aggregating information, LOD, learning objects, social graphs ... need for provenance in all my projects

Yogesh: PostDoc, e-Science group in Microsoft, scientific workflow, how provenance can be used in clusters, clouds

Raphael: EURECOM, France, multimedia semantics, study semantics for multimedia annotation, multimedia content delivery, multimedia personalisation

Yolanda: interested in 3 aspects: provenance in workflows, provenance for capturing human decisions, provenance for content trust

Christine: Australia, Internet Policy, would be happy to bring a legal aspect to this group (privacy)

Olaf: Berlin (DB group), I'm trying to answer queries over distributed information, interested in information quality ... need provenance

Paul: VU Amsterdam, postdoc, worked on various provenance challenges, interested in provenance for mashups, particularly e-science

Satya: semantic web, provenance for the scientific community

Jun: Oxford, sw + web science, provenance for linked data (work with Olaf) and implement the OPM model for data.gov.uk

Chris: University of Berlin, started the LOD project, worked on Named Graph with Jeremy Carroll, work on information quality assessment

<pgroth> Jun = Jun

Andreas: Karlsruhe, I have been crawling the SW for 5+ years, bringing together databases, sensor data and SW technologies, need for provenance

Raphael: what is your policy for the documents you want to publish? There is the executive summary, self-contained, short ... but would you like to publish an exhaustive document containing all use cases and requirements?

Paul: not really readable at this moment for people outside of the group ... so not really planned

Yolanda: there is a single document for all use cases: http://www.w3.org/2005/Incubator/prov/wiki/Requirements
... no, http://www.w3.org/2005/Incubator/prov/wiki/Use_Case_Report

Andreas: it reads really well, now is the time to link to technical solutions, which I guess would happen in the next document

Yolanda: I'm interested in multimedia requirements and also digital signatures

Paul: I have talked to Dan Brickley from the SWXG about this
... a lot of technical people, who might cover just one aspect of the provenance problem space and come up with their own solution
... can be the liaison, but hard to get their attention

Chris: well, this is perhaps an indication for us to work more on the technical aspects too
... to appear on their radar
... state of the art document is good, but how to go forward now?
... feel there is consensus on the OPM model
... but we could go forward
... I think there is some frustration from people who would like to test things now

Yolanda: having a document that explains the mappings between various models such as Provenance Vocabulary, OPM, etc. is actually our plan
... we might not go further

Chris: we have technical people in the group, deployment people, and there is urgent pressure from the LOD community; what I suggest is to propose technical solutions: a simple vocabulary that we agree on, or mappings between vocabularies

Paul: we now have the requirements, but we haven't yet done the gap analysis, i.e. seeing which technology matches which requirements
... perhaps this could be our next step

Chris: I believe the state-of-the-art document should contain some mappings between the various models

Yolanda: we have the user requirements report ... our timeline schedules a state-of-the-art report (end of June) and a roadmap (end of September)

Paul: assuming OPM at the core, we have mappings between PML, Provenance Vocabulary, Provenir, PREMIS and OPM

Yolanda: perhaps we could start collecting these mappings in a single place; even if we do not cover the whole list of related technologies, we would have a start

Yogesh: should we not first do the gap analysis, i.e. check which model fits which use cases before working on the mappings?

Yolanda: I think we should do both

<kerry> All, I am afraid that I need to leave now. Bye and thanks.

<pgroth> thanks kerry for joining

<pgroth> hope it was useful

Raphael: take inspiration from the mapping tables of the Media Ontology, http://www.w3.org/TR/mediaont-10/
... another suggestion is to take a concrete example from the use case document and try to represent the provenance information with each model and ontology
... in order to compare the resulting representations or RDF graphs

Jun: yes, we intend to do that, and give feedback to the authors of the models on which use cases are realizable or not

Chris: shouldn't this be a task for the Provenance Challenge?

2. Gap analysis study

<scribe> scribenick: jun

Yolanda: update on the discussion of the state of art document

Yolanda: Luc proposed to use a matrix to do a technology gap analysis

Yolanda: not only the gaps but also the strengths and weaknesses of these technologies
... a matrix is a systematic way to organize things
... Luc also suggested that we should have some sense of goals/scope to anchor, e.g., what my needs are
... Paolo suggested to use the RDF next step paper as a starting example for the exercise
... we will use the three use case scenarios to see how they cover your goals
... Luc suggested a block diagram for each use case, an architecture picture, of how provenance pieces could fit together for a use case
... maybe there could be some themes in the use case scenarios, to reflect the goals, to help us have a better idea about "goals"
... we talked more about what the matrix could look like
... maybe we can look at one particular scenario, to go through the exercise
... take one scenario, articulate tech. requirements, the goals; to draw some architecture, provenance solution for the scenario; and then look at relevant technologies and systems

Yolanda: James also suggested to keep the content in the matrix brief, and link to the "why"s

Chris: suggested to define the matrix for aligning vocabularies
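
The matrix discussed above (requirements against technologies, with brief cell content and a link to the "why") could be sketched roughly as follows. This is only an illustrative sketch, not something the group agreed on: the requirement labels, the technology names `TechA`/`TechB`, and the `Cell` and `find_gaps` names are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One matrix cell: kept brief, with a link to the rationale."""
    supported: bool
    note: str = ""     # short strength/weakness remark
    why_url: str = ""  # link to the longer "why" (e.g. a wiki page)


def find_gaps(matrix, technologies):
    """Return the requirements that none of the given technologies support."""
    gaps = []
    for requirement, row in matrix.items():
        # A missing cell is treated as "not supported".
        if not any(row.get(t, Cell(False)).supported for t in technologies):
            gaps.append(requirement)
    return gaps


# Hypothetical placeholder data, for illustration only.
technologies = ["TechA", "TechB"]
matrix = {
    "R1: track original source": {
        "TechA": Cell(True, note="partial support"),
        "TechB": Cell(False),
    },
    "R2: check provider licenses": {
        "TechA": Cell(False),
        "TechB": Cell(False),
    },
}

gaps = find_gaps(matrix, technologies)
print(gaps)
```

A requirement that ends up in `gaps` is one that no listed technology covers, which is the kind of finding the gap analysis is meant to surface.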

3. Definition of provenance

Yolanda: everybody puts their def. on the wiki page: what is provenance

Chris: maybe someone has done a survey of provenance definitions?

Paul: Luc has a pretty good collection of definitions and will point to his chapter on the wiki

4. Define the matrix for provenance vocabulary mapping

<raphael> scribenick: Yolanda

<jun> exit

<raphael> scribenick: Yogesh

Satya: Start with a set of common terms? Four primary vocabularies

(Starting to discuss Provenance_Vocabulary_Mappings)

http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings

Paul: Listing provenance vocabularies at the wiki page...

Jun: FOAF has provenance vocabulary. We'll add that later.

Need to link the vocabularies with URL of technology in the wiki

Satya: Start listing a common set of terms rather than a complete listing, given limited time

Chris: Delegate description of vocabularies in the Wiki to experts on the model

Paul: List of provenance vocabularies is a subset of the list of technologies available at http://www.w3.org/2005/Incubator/prov/wiki/Relevant_Technologies

Chris: Do we have experts on the vocab list present here?

Satya: Yes

Raphael: Use mapping table similar to Media Ontology

Paul: Show relation between list of terms and their existence in the vocabulary

Chris: How do we get this list?

Satya: Manually go through the vocab schemas and list them out.

Chris: Do we distinguish between property and class for the terms?

Olaf: A property for one may be a class for another. So let's not distinguish.

Raphael: Add namespace prefix to each model
... Wiki editing tips for tables...

Paul: We may want a link to a document that describes the formal mapping, if it exists

Christine: Can we get a definition for the term? E.g. link to it

Chris: Can we have a temporary model term to map to?

Yolanda: Use OPM as the "standard" term and then map others to it?

Chris: Use the OPM term if available and if not, that identifies a gap in OPM

Christine: Do we need to explain why we choose OPM as the reference model?

Yolanda: Not right now

Jun: OPM's OWL serialization?

Paul: No, the OPM model

Yogesh: List all OPM terms in the wiki so they can be used as a template

Yolanda: Do Dublin Core terms map to agent?

Paul: Other vocabs will also have a high level equivalent of agent. Not as specific as Dublin Core agent.

<Yolanda> A good table to look at from the multimedia ontology mappings is #21: http://www.w3.org/TR/mediaont-10/EBUCore.html

Chris: Deadline for finishing the mapping?

Satya: I am in charge of this.
... 1 month to fill in first draft
... May 15th for first draft

Only take those terms relevant to provenance

Satya has final say on whether to add a new reference term
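
The mapping table agreed above (OPM term as the reference where one exists, one column per vocabulary keyed by namespace prefix, no class/property distinction) could be sketched roughly like this. It is a minimal illustration under assumptions: `vocA` and `vocB` and their terms are hypothetical placeholders rather than real vocabularies or agreed mappings, and the helper names are invented for the example.

```python
# Each row has a reference term (an OPM term, or None when OPM has no
# equivalent) and the corresponding term in each vocabulary, keyed by prefix.
# Classes and properties are deliberately not distinguished.
rows = [
    {"reference": "opm:Agent",
     "mappings": {"vocA": "vocA:Actor", "vocB": "vocB:Source"}},
    {"reference": "opm:Process",
     "mappings": {"vocA": "vocA:Activity"}},
    {"reference": None,  # no OPM term: a candidate gap in OPM
     "mappings": {"vocA": "vocA:License"}},
]


def reference_gaps(rows):
    """Terms that exist in some vocabulary but have no OPM reference term."""
    return [sorted(row["mappings"].values())
            for row in rows if row["reference"] is None]


def unmapped(rows, prefix):
    """Reference terms for which the given vocabulary has no equivalent."""
    return [row["reference"] for row in rows
            if row["reference"] is not None and prefix not in row["mappings"]]


print(reference_gaps(rows))
print(unmapped(rows, "vocB"))
```

Rows returned by `reference_gaps` correspond to the "gap in OPM" case raised by Chris; whether such a term becomes a new reference term would still be decided by hand.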

breaking for lunch

<pgroth> back from lunch

Yolanda: Expand the mappings tomorrow. Moving on.

<ssahoo2> Yolanda: discussing use case in the requirements document

<ssahoo2> Yolanda: use case 1: news data aggregation, case 2: disease outbreak, case 3: provenance in business contract

<ssahoo2> Yolanda: use case 1 selected by vote

<ssahoo2> Paul: identify goal of selecting the use case

User Requirements review

<ssahoo2> Yolanda: identify goals

<ssahoo2> Yogesh: what does "technology" mean - model or implementation?

<ssahoo2> Paul: who are the audience?

<ssahoo2> Raphael: The AFP news agency sells a provenance toolkit to collect provenance of news entries from tweets etc.

<ssahoo2> Raphael: The AFP toolkit makes it possible to identify the source author and time information associated with a news entity

<ssahoo2> Raphael: but it does not trace the exact path of the news entity

<ssahoo2> Paul: The news aggregator use case does not address this scenario

<ssahoo2> Paul: example scenario for use case 1: check the license of all providers for news aggregator

<ssahoo2> Yolanda: this use case is more about unstructured content integration

<ssahoo2> Satya: this scenario also includes RSS feed aggregators

<ssahoo2> Paul: query from user for this use case - how to track where the information is sourced from

<ssahoo2> Chris: provenance of a blog - ability to track back to the original source of the information

<ssahoo2> Jun: users care about freshness of the data and the available technology to achieve this

<ssahoo2> Jun: entry point for the use cases can be topics of the use case

<ssahoo2> Paul: entry points should balance between being too specific about provenance or use case domain

<ssahoo2> Paul: allows conveying the provenance of the content to user

<ssahoo2> Paul: Scalability of provenance systems for Web news application

<ssahoo2> Yolanda: entry points will overlap across use cases

<ssahoo2> Satya: the entry points may be driven by the context of the use case or domain

<ssahoo2> Yolanda: identify the technical requirements for the entry points of use case 1

<ssahoo2> Yogesh: start with existing user requirements and remove requirements that are not specific to the use case - to identify user requirements for the entry points of the use case

<ssahoo2> Paul: The technical requirements are not curated in the use cases

<ssahoo2> Yolanda: the user requirements are curated but technical requirements may need to be reviewed

<ssahoo2> Yogesh to review the technical requirements for the use cases

<ssahoo2> Yogesh: entry points can be separated from technical requirements

<ssahoo2> Jun: understanding technical requirement is easier in context of specific user requirements

<ssahoo2> Yogesh to edit the requirements page to reflect user requirement

<ssahoo2> Technical requirements added for entry points of use case 1

<ssahoo2> Paul/Satya: nesting of meta provenance - provenance of provenance grounds out according to requirements of the application
