Bullard Spotlight: Barbara Lerner and Scientific Data Provenance

May 7, 2014
Lerner in the forest

Each month, we feature research by one of Harvard Forest's Charles Bullard Fellows. This month, we're highlighting Barbara Lerner, Associate Professor of Computer Science at Mount Holyoke College.

During her year-long Bullard Fellowship, Lerner has been working closely with Harvard Forest (HF) information manager Emery Boose to record how scientific data is processed from the point of collection to the point of dissemination. This entails collecting data provenance information from scripts written in R, a statistical programming language widely used by scientists to analyze their data.

To express the importance of this work, Lerner makes a comparison to the world of art: "Much as provenance in the art world helps authenticate the legitimacy of a piece of artwork, so, too, should scientific data provenance help authenticate the legitimacy of a scientific dataset or results derived from a dataset. Beyond this, scientific data provenance helps scientists understand their work and results, replicate the work of others, compare not just results but also scientific processes, and ensure the correctness of their data collection and analysis processes."

Lerner first connected with Harvard Forest as a mentor in the Harvard Forest Summer Research Program, a role she has held for the past five summers. She explains, "Coming from an undergraduate college, it is difficult to find time for research during the academic year. The Bullard Fellowship has allowed me to focus on this research and make great progress."

Lerner and Boose hope to soon release software for scientists who analyze data in R. They will also present a paper at TAPP 2014 and a poster at IPAW 2014, and are writing a chapter for a book on replication in science, edited by HF ecologist Aaron Ellison and Ayelet Shavit.
