Monthly Archives: March 2009

Below is the call for participation for the Third Provenance Challenge, which I’m helping to organize. If you have any questions about it, contact me. We are obviously looking for participants, but I’d also be interested to hear comments on the approach of having a common format for provenance.

The Third Provenance Challenge – Call for Participation

Data products are increasingly being produced by the composition of services and data supplied by multiple parties using a variety of data analysis, management, and collection technologies. This approach is particularly evident in e-Science, where scientists combine sensor data and shared Web-accessible databases using a variety of local and remote data analysis routines to produce experimental results. In such environments, provenance (also referred to as audit trail, lineage, and pedigree) plays a critical role as it enables users to understand, verify, reproduce, and ascertain the quality of data products.

Because of the importance of provenance, many areas have developed techniques and tools for determining provenance, including scientific and business process workflow, visualization, digital libraries, and semantic web technologies. An important challenge in the context of heterogeneous compositional applications is how to integrate the provenance produced by these techniques in order to construct the full provenance of complex data products. To that end, the community has endeavored to develop a common understanding and model of provenance to aid interoperability through the Open Provenance Model (OPM).

Help chart the future of provenance interoperability by participating in the Third Provenance Challenge.


You can find information on the challenge definition and how to participate at the Third Provenance Challenge Wiki.

To keep up-to-date, subscribe to the Provenance Challenge mailing list.

Key Dates:

  1. March 2 – The Third Provenance Challenge Starts
  2. Make the workflow work with individual teams’ systems [Mar. 2 – Mar. 30]
  3. Generate provenance for the challenge workflow & run queries on it [Mar. 30 – Apr. 13]
  4. Export OPM graphs and import from others [Apr. 13 – May 4]
  5. Run queries on imported OPM graphs [Apr. 27 – Jun. 1]
  6. Prepare slides for challenge [Jun. 1 – Jun. 8]
  7. PC3 Workshop June 10 – 11 held in Amsterdam.


For details or questions, contact Paul Groth (pgroth  -at-


  • Paul Groth, ISI / University of Southern California
  • Yogesh Simmhan, Microsoft Research
  • Luc Moreau, University of Southampton

Local Organizers:

  • Adam Belloum, University of Amsterdam
  • Zhiming Zhao, University of Amsterdam


At the 2006 International Provenance and Annotation Workshop (IPAW), the community agreed to hold the First Provenance Challenge, which emphasized understanding the commonalities and differences between existing approaches. Held in Washington DC in September 2006, the 17-team workshop identified several commonalities and resulted in agreement that a Second Provenance Challenge focusing on interoperability would be beneficial. At the Second Provenance Challenge workshop, held at the High Performance Distributed Computing conference on June 26, 2007, teams presented results demonstrating the ability to interoperate between several systems. Discussions at this challenge led to the specification of a common data model, the Open Provenance Model (OPM). This model was further discussed and developed at a subsequent workshop held at IPAW’08. Discussions at that workshop led to this Third Provenance Challenge, which focuses on interoperability using OPM. More information can be found at

It’s been a while since I’ve posted… I still owe a post on the rest of Borgman’s book. I’ve been a bit sidetracked. I got engaged a couple of weeks ago, and I started reading another fascinating book on the role of paper in knowledge work.

I’ll be in Germany and the Netherlands next week (Mar. 8 – 17). If you’re interested in meeting up, send me an email.
