W3C Provenance Incubator Group Wiki
Welcome to the Provenance Incubator Group Wiki.
Mission and Charter
The mission of the Provenance Incubator Group, part of the Incubator Activity, was to provide a state-of-the-art understanding and develop a roadmap in the area of provenance for Semantic Web technologies, development, and possible standardization. See the charter for more information. The group's activities were public, and recorded on the W3C Provenance Incubator Group wiki.
|Final Report, December 2010|
|Follow on Provenance Working Group, April 2011|
|"At the toolbar (menu, whatever) associated with a document there is a button marked "Oh, yeah?". You press it when you lose that feeling of trust. It says to the Web, 'so how do I know I can trust this information?'. The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons."||Tim Berners-Lee, W3C Chair, Web Design Issues, September 1997|
|"Provenance is the number one issue we face when publishing government data as linked data for data.gov.uk"||John Sheridan, UK National Archives, data.gov.uk, February 2010|
|"We need a paradigm that makes it simple [...] to perform and publish reproducible computational research. [...] A Reproducible Research Environment (RRE) [...] provides computational tools together with the ability to automatically track the provenance of data, analyses, and results and to package them (or pointers to persistent versions of them) for redistribution."||Jill Mesirov, Chief Informatics Officer of the MIT/Harvard Broad Institute, in Science, January 2010|
|"The number of publications on provenance is [...] a total of 425 [...] The first publication dates back to 1986, [...] with about half the papers published in the last two years."||Luc Moreau, University of Southampton, in The Foundations of Provenance on the Web, November, 2009|
|"The problem is - and this is true of books and every other medium - we don't know whether the information we find [on the Web] is accurate or not. We don't necessarily know what its provenance is. So we have to teach people how to assess what they've found. [...] there's so much juxtaposition of the good stuff and not-so-good stuff and flat-out-wrong stuff or deliberate misinformation or plain ignorance."||Vinton Cerf, Internet pioneer, in Smithsonian's "40 Things you need to know about the next 40 years" issue, July, 2010|
|"In content, as creation becomes overabundant and as value shifts from creator to curator, it becomes all the more vital to properly cite and link to sources [...]. Good curation demands good provenance. [...] Provenance is no longer merely the nicety of artists, academics, and wine makers. It is an ethic we expect."||Jeff Jarvis, media company consultant and associate professor at the City University of New York's Graduate School of Journalism, in The importance of provenance on his BuzzMachine blog, June, 2010|
|Provenance of a resource is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource. Provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility. Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance.|
|What is Provenance?|
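The working definition above can be read as a small data structure: a record about a resource that lists the entities and processes involved, and that may itself carry provenance. The following is a minimal sketch under that reading only. The class name `ProvenanceRecord`, its field names, the helper `provenance_depth`, and the example.org values are all hypothetical illustrations, not a vocabulary or API defined by the group.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical record type; field names are illustrative only."""
    resource: str                                        # the resource being described
    entities: List[str] = field(default_factory=list)    # entities involved in producing or delivering it
    processes: List[str] = field(default_factory=list)   # processes involved in producing or delivering it
    # Provenance assertions can have provenance of their own.
    own_provenance: Optional["ProvenanceRecord"] = None


def provenance_depth(record: ProvenanceRecord) -> int:
    """Count this record plus every nested record describing its own provenance."""
    depth = 1
    while record.own_provenance is not None:
        record = record.own_provenance
        depth += 1
    return depth


# Made-up example data: a report whose provenance record was itself reviewed.
report = ProvenanceRecord(
    resource="http://example.org/report",
    entities=["Example Agency"],
    processes=["survey", "aggregation"],
    own_provenance=ProvenanceRecord(
        resource="provenance record for http://example.org/report",
        entities=["Example Auditor"],
        processes=["manual review"],
    ),
)
print(provenance_depth(report))
```

The nested `own_provenance` field mirrors the last sentence of the definition: a provenance assertion is contextual metadata that can become a record with provenance of its own.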
|A summary of the group's findings and recommendations can be found in the Final Report and in this slide presentation.|
The first phase of the group's activities focused on requirements for provenance, categorizing and describing requirements based on use cases. This phase resulted in the following report:
- Report on "Requirements for Provenance on the Web", released April 9, 2010.
Based on the use cases raised in this report, the group jointly submitted a paper to the RDF Next Steps workshop:
- Report on Provenance Requirements for the Next Version of RDF, posted May 19, 2010. See also the slides from the presentation.
The group analyzed and compared current proposals for representing provenance on the Web:
- Report on "Provenance Vocabulary Mappings", released August 6, 2010.
The group also assembled a report on the state of the art in provenance, highlighting existing approaches and technology gaps:
- Report on the "State of the Art on Provenance", released October 20, 2010.
The draft of the final report of the group contained a summary of all the group's findings, a roadmap, and recommendations:
- Draft of final report of the W3C Provenance Incubator Group, released November 30, 2010.
The official W3C final report:
- Final report of the W3C Provenance Incubator Group, released December 14, 2010.
Timeline of Activities
See the overview of released reports for major products of the group's work.
- 2010-10-29: Published a presentation with motivation and activities of the Provenance Incubator Group
- 2010-10-29: Started group discussions on Provenance and the Web architecture
- 2010-10-15: Started drafting a final report
- 2010-10-15: Released a tagged bibliography collection
- 2010-10-01: Started to work on broad recommendations and priorities for a roadmap
- 2010-09-24: Agreed to a working definition of provenance
- 2010-08-31: Started State of the Art Report about Provenance on the Web
- 2010-08-06: Release of group report on Provenance Vocabulary Mappings
- 2010-06-26: A paper from the group, titled "Provenance Requirements for the Next Version of RDF", was presented at the RDF Next Steps workshop.
- 2010-06-18: Started in-depth analysis of Disease Outbreak scenario
- 2010-06-17: Started in-depth analysis of Business Contract scenario
- 2010-04-26: Started in-depth analysis of News Aggregator scenario
- 2010-04-26: Started defining mappings across existing provenance vocabularies
- 2010-04-25: First face to face meeting
- 2010-04-14: Announcement of the group's report for broad distribution on "Requirements for Provenance on the Web"
- 2010-03-12: Started discussing a working definition of provenance
- 2010-02-12: Started a series of presentations on state of the art work on provenance
- 2010-02-05: Collected extensive user and technical requirements for provenance
- 2010-01-12: Collated the use cases and the classification applied to them in a single report
- 2010-01-06: Designed an organization of use cases to illustrate requirements for provenance
- 2009-12-31: Group participants had contributed 30 use cases
- 2009-11-20: Proposed a set of provenance dimensions that capture major issues in this area
- 2009-11-13: Started to collect proposed use cases
- 2009-11-06: Agreed to a use case template to describe use cases of provenance
- 2009-10-30: Started compiling overviews of provenance work, relevant technologies, and related events
- 2009-10-30: First of the group's weekly telecons
- 2009-09-21: The W3C Provenance Incubator Group begins activities with this charter
See the group's planned timeline of activities.
Meetings and Discussions
- Weekly telecon information
- Past telecon agendas, minutes, and action items
- Mail List Archive
- Face to face meetings
- Tracker items - tracking action items for the group
The official tag for the group is #prov-xg. This should be used across social sites. You can find the latest tweets here.
Liaisons with Other Groups
- W3C Health Care and Life Sciences Interest Group: Several members of HCLS are also active participants of the Provenance XG. The following people have agreed to be official liaisons:
- W3C eGovernment Interest Group: The following people have agreed to be official liaisons:
- Dublin Core Metadata Initiative: The following people have agreed to be official liaisons:
- Social Web Incubator Group:
- Paul Groth is acting as the liaison.
- Information related to this cooperation can be found on the Social Web page.
If you would be interested in being a liaison with other groups please contact Yolanda Gil, the Provenance XG Group Chair.