Use Case Fulfilling Contractual Obligations

From XG Provenance Wiki
Revision as of 23:54, 21 January 2010 by Dmcguinn (Talk | contribs)



Fulfilling Contractual Obligations


Jim Myers

Provenance Dimensions

Primary: Accountability, Interoperability
Secondary: Attribution, Process, Entailment, Understanding, Imperfections

Background and Current Practice

In scientific collaborations and in business, individual entities often enter into some form of contract to provide specific services and/or to follow certain procedures as part of the overall effort. Proof that work was performed in conformance with the contract (expectations) of the project leadership is often required in order to receive payment and/or to settle disputes. Such proof consists of basic information about what was done by whom, identification of the witnesses asserting that the information is true, and information such as signatures (written or digital) that makes it suitable as a legal record. Documentation today includes laboratory notebooks, invoices detailing procedures performed, shipping receipts, etc.


Provide strong proof that work was performed that meets contractual requirements. Such proof must:

  • document work that was performed on specific items (samples, artifacts)
  • provide a variety of evidence that would preclude various types of fraud,
  • allow combination of evidence from multiple witnesses, and
  • remain robust when only partial information is provided, e.g. when disclosure is limited to what is required to address contractual concerns in order to protect privacy or trade secrets.

Use Case Scenario

An organisation agrees to perform a process, under a set of requirements on how that process should be performed. The organisation is later asked to provide proof that no obligation or prohibition was violated, deliberately or accidentally, by action or omission. The specific motivating case from which this general scenario derives is as follows.

Foo Corp. accepts a contract to perform an analysis of the contents of several chemical samples provided by Bar Corp. as part of their effort to meet government safety regulations. The contract specifies how the samples are to be handled and requires the use of a technique validated for the class of chemicals involved. When the results indicate contamination that may force a broad recall, Bar Corp. sues Foo Corp. claiming error in their processing. Foo Corp. has not done anything wrong but needs to defend itself against several claims:

  • That the work was never done
  • That the work was done by an untrained technician
  • That equipment was improperly calibrated
  • That one or a few Foo Corp. technicians tampered with the samples
  • That samples were left at room temperature too long
  • That samples were accidentally or intentionally swapped during a transfer between processing steps
  • That records were tampered with to remove evidence of improper work

Problems and Limitations

Foo Corp. has significant experience working in a regulated industry and has purchased equipment that produces exportable provenance and has electronic systems for employees to record their work, as well as internal process checks performed by employees who do not know the identity of samples. It also has electronic records documenting employee training, instrument calibration, etc.

In responding to the Bar Corp. suit, Foo has several concerns:

  • Samples from multiple companies are processed in the same runs and the information about those samples and even the identity of those companies should remain private. However, the fact that samples from those companies do not show evidence of contaminants is useful evidence.
  • Foo has developed enhanced techniques that it feels reduce the error in its analyses and that it wants to keep secret. For example, Foo Corp. is able to keep sample temperature constant to a fraction of a degree but only wishes to prove that it maintained the sample temperature within the 2-degree range specified in the contract.
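
The temperature concern can be illustrated with a minimal Python sketch. It assumes the contract expresses the allowed range as a lower and an upper bound; the function name, the bounds, the readings, and the salt are all hypothetical and do not come from any real system. The summary states only whether the contractual range was respected, plus a salted digest that ties the summary to the undisclosed raw log.

```python
import hashlib
import json


def summarize_temperature(readings_c, low_c, high_c, salt):
    """Build a summary that states contract compliance without listing readings."""
    # An empty log proves nothing, so it is treated as non-compliant.
    within_range = bool(readings_c) and all(low_c <= r <= high_c for r in readings_c)
    # Salted digest of the raw log: lets a trusted party later confirm that the
    # summary was derived from a specific log, without publishing the log itself.
    raw = json.dumps({"salt": salt, "readings": readings_c}).encode("utf-8")
    return {
        "claim": f"all readings between {low_c} and {high_c} degrees",
        "holds": within_range,
        "raw_log_sha256": hashlib.sha256(raw).hexdigest(),
    }


# Hypothetical log: much tighter than the contract requires, but not disclosed.
summary = summarize_temperature([4.01, 4.02, 3.99, 4.00], 3.0, 5.0, salt="keep-private")
print(summary["claim"], "->", summary["holds"])
```

The salt is kept private alongside the raw log; without it, a short list of low-precision readings could be guessed from the digest. In practice the summary would itself need to be signed by whoever asserts it.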

In developing its system, Foo had to find a provenance, metadata, and records management solution that addressed numerous challenges:

  • The provenance of a given sample has to be assembled from records provided by multiple independent systems that have their own internal IDs, which have to be matched to the global sample IDs. (A similar issue exists in that one of the instruments has only internal IDs for user accounts.)
  • The required evidence involves finding the provenance of 'related' samples (those processed the same day, on the same instrument, by the same technician, of the same chemical type, etc.) and showing that results across these collections are consistent (e.g. an instrument processed other samples in the same batch correctly and without contamination; a technician was performing analyses all day, with only one at a time and no odd gaps in their work schedule).
  • Each account of processing is signed and dated as close to the source as possible and with minimal delay. (Foo Corp. primarily uses instruments that produce cryptographically signed provenance statements directly using an internal certificate; others send unsigned statements directly to a central server for signing. All records are given a signed timestamp by a third-party clock (a notary service) as early as possible.)
  • The system is capable of providing 'derived' signed records that include provenance for any of the 'related' sets of samples required (see above) and further to do so
    • while anonymizing some samples,
    • producing summary information, e.g. a signed statement that temperature remained constant within 2 degrees across all processing steps in the record, that hides detail without simply removing metadata of a certain type, and
    • producing less granular records that can be understood by humans (e.g. reducing gigabytes or more of provenance to a printable summary).
  • The system is capable of providing raw provenance records to a trusted third party to generate derived anonymized and summarized records, and of documenting that it is the third party who is responsible/liable for asserting that the derived records are valid (i.e. correctly reflect the content of the original records).
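
As a rough illustration of the first and fourth challenges above, the following Python sketch groups records from independent systems under global sample IDs and then produces an anonymized view. The alias table, system names, record fields, and function names are all hypothetical; a real deployment would also have to sign the derived view and record who produced it.

```python
from collections import defaultdict

# Hypothetical alias table: (system name, local ID) -> global sample ID.
ALIASES = {
    ("freezer", "F-17"): "SAMPLE-001",
    ("spectrometer", "run42/a"): "SAMPLE-001",
    ("spectrometer", "run42/b"): "SAMPLE-002",
}


def assemble(records, aliases):
    """Group per-system records by global sample ID; keep unmatched ones apart."""
    by_sample = defaultdict(list)
    unmatched = []
    for rec in records:
        global_id = aliases.get((rec["system"], rec["local_id"]))
        if global_id is None:
            unmatched.append(rec)  # no known correspondence: flag, do not guess
        else:
            by_sample[global_id].append(rec)
    return dict(by_sample), unmatched


def anonymize(by_sample, disclosed):
    """Relabel samples outside `disclosed` and drop their system-local IDs."""
    view = {}
    counter = 0
    for global_id in sorted(by_sample):
        if global_id in disclosed:
            view[global_id] = by_sample[global_id]
        else:
            counter += 1
            view[f"anon-{counter}"] = [
                {k: v for k, v in rec.items() if k != "local_id"}
                for rec in by_sample[global_id]
            ]
    return view


records = [
    {"system": "freezer", "local_id": "F-17", "event": "stored"},
    {"system": "spectrometer", "local_id": "run42/a", "event": "analysed"},
    {"system": "spectrometer", "local_id": "run42/b", "event": "analysed"},
    {"system": "spectrometer", "local_id": "run42/z", "event": "analysed"},
]
by_sample, unmatched = assemble(records, ALIASES)
view = anonymize(by_sample, disclosed={"SAMPLE-001"})
```

Keeping unmatched records in a separate list, rather than discarding them, preserves the evidence that something was processed even when its identity cannot yet be established.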

This scenario highlights a few related issues that appear across other scenarios:

  • provenance subsystems often have different identifier schemes and end-to-end provenance management will require means to assert known aliases/correspondences.
    • The relations may not be true aliases, e.g. they could be subpart relationships (e.g. only a part of a sample is run through an instrument). Provenance systems may also need to document when measurements on one thing reflect a property of another (e.g. the chemical composition of the subsample is assumed to be that of the whole sample; we expect a person's blood type to match that of a sample of their blood, the temperature at a thermostat to reflect that in the room, etc.).
  • witnesses and end-users of provenance information have different concerns and perspectives, so it is important that provenance systems be able to combine accounts, extract subaccounts, and shift to different levels of granularity. Managing aliases, synchronizing witness clocks, and mapping across part-of and type-of relationships are all required.
    • Trusted third parties may play an important role and the concept of an 'interpreter' (or 'judge') may be needed along with that of 'witness' to describe how 'derived' records are created.
  • Records-related information (direct signatures, signatures notarizing and timestamping other signatures, and mechanisms that provide evidence of completeness, e.g. numbering pages in a bound notebook) will need to be maintained for provenance and propagated to derived records to create chains of evidence.
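
One way to picture the electronic analogue of numbered notebook pages is a hash chain, where each record carries a sequence number and the digest of its predecessor. The Python sketch below is an illustration under stated assumptions, not a description of any system named here: the record layout and function names are hypothetical, and an HMAC with a shared key stands in for a real digital signature, which would normally use a certificate and a public-key scheme.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _body(seq, prev, payload):
    """Canonical bytes covered by both the hash and the stand-in signature."""
    return json.dumps({"seq": seq, "prev": prev, "payload": payload},
                      sort_keys=True).encode("utf-8")


def append_record(chain, payload, key):
    """Append a record linked to its predecessor, like the next numbered page."""
    seq = len(chain)
    prev = chain[-1]["hash"] if chain else GENESIS
    data = _body(seq, prev, payload)
    entry = {
        "seq": seq,
        "prev": prev,
        "payload": payload,
        "hash": hashlib.sha256(data).hexdigest(),
        "sig": hmac.new(key, data, hashlib.sha256).hexdigest(),
    }
    chain.append(entry)
    return entry


def verify_chain(chain, key):
    """Return True only if no record was altered, reordered, or cut from the middle."""
    prev = GENESIS
    for seq, entry in enumerate(chain):
        if entry["seq"] != seq or entry["prev"] != prev:
            return False
        data = _body(seq, prev, entry["payload"])
        if hashlib.sha256(data).hexdigest() != entry["hash"]:
            return False
        expected = hmac.new(key, data, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["sig"]):
            return False
        prev = entry["hash"]
    return True


key = b"hypothetical-signing-key"
chain = []
append_record(chain, {"step": "received"}, key)
append_record(chain, {"step": "analysed"}, key)
append_record(chain, {"step": "reported"}, key)
```

A chain like this does not by itself reveal that records were dropped from the end; that gap is what an external, signed timestamp on the latest hash (the notary role in the scenario) would help close.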

Unanticipated Uses (optional)

The use case as presented is oriented toward business and legal defense. Analogous cases could be created for an academic setting and the defense of work as part of peer review, ethics inquiries, etc.

Existing Work (optional)

There is work across the electronic records, e-notebook, LIMS and asset management systems, workflow, e-Science, and semantic web communities that addresses parts of this scenario. I've drawn from experience as part of the Collaborative Electronic Notebook Systems Association (1998-2008), where many requirements for documentation of scientific research and analytical sample processing in the chemical and pharmaceutical industries were discussed in the context of FDA regulations, patent policies, and rules of legal evidence.