The DataONE Working Group on Provenance

The DataONE Provenance Working Group

DataONE is a large NSF-funded project on data preservation.

The project is organized as a core effort, plus a number of Working Groups: "The Working Group model allows DataONE to conduct targeted research and education activities with a broad group of scientists and users. Working Groups are also designed to enable research and education activities to evolve over time. Each Working Group will have two co-leaders who organize the activity and propose solutions to particular research, education, and cyberinfrastructure problems."

The Provenance Working Group is led by Prof. Bertram Ludascher and Paolo Missier. The primary goal of the WG is to investigate the role of workflow-based provenance in the DataONE use cases, and to formulate a provenance data model and implement a management architecture that suit DataONE's needs.

As part of this effort, we are designing the "Data One Provenance Model", or D-OPM (pun intended). Inspired by the Open Provenance Model (OPM), D-OPM focuses specifically on workflow-based data products and their provenance. As such, the model explicitly includes a representation of the workflow-based processes that generate the provenance.
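To illustrate what it can mean for a provenance model to include the workflow itself, here is a minimal Python sketch. It is not D-OPM, which is not yet defined; every class, field, and file name below (Step, Workflow, Invocation, Trace, lineage, raw.csv, and so on) is a hypothetical choice made only for this example. The idea shown is that each trace record points back to the workflow step that produced it, so lineage questions can be answered against the workflow structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Hypothetical illustration only: none of these names come from D-OPM.
@dataclass(frozen=True)
class Step:
    """A processing step declared in a workflow specification."""
    name: str


@dataclass
class Workflow:
    """The workflow specification, i.e. the process that generates provenance."""
    name: str
    steps: list[Step] = field(default_factory=list)


@dataclass(frozen=True)
class Invocation:
    """One execution of a step, with the data items it used and generated."""
    step: Step
    used: tuple[str, ...]
    generated: tuple[str, ...]


@dataclass
class Trace:
    """The provenance trace of a single run, linked back to its workflow."""
    workflow: Workflow
    invocations: list[Invocation] = field(default_factory=list)

    def lineage(self, item: str) -> set[str]:
        """Return every data item that `item` transitively depends on."""
        seen: set[str] = set()
        frontier = [item]
        while frontier:
            current = frontier.pop()
            for inv in self.invocations:
                if current in inv.generated:
                    for source in inv.used:
                        if source not in seen:
                            seen.add(source)
                            frontier.append(source)
        return seen


clean = Step("clean")
plot = Step("plot")
workflow = Workflow("demo", [clean, plot])
trace = Trace(
    workflow,
    [
        Invocation(clean, ("raw.csv",), ("clean.csv",)),
        Invocation(plot, ("clean.csv",), ("figure.png",)),
    ],
)
print(sorted(trace.lineage("figure.png")))  # ['clean.csv', 'raw.csv']
```

Keeping the workflow and the trace as separate but linked objects is one possible design; it lets several runs of the same workflow share a single specification.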

As of June 2011, the WG has met only twice, and D-OPM is still at a preliminary stage. However, the WG did interesting work in 2010, within the scope of the DataONE summer internship program, to define a model for composing provenance over multiple heterogeneous workflow runs and traces. A summary of this work appears in this published paper.

More to come as soon as the D-OPM model is finalised.

Note from F2F1

    • Paolo M. can help connect and suggests having materials ready for their yearly Oct 18/20 meeting