Ellison, A. M., L. J. Osterweil, J. L. Hadley, A. Wise, E. Boose, L. Clarke, D. R. Foster, A. Hanson, D. Jensen, P. Kuzeja, E. Riseman, and H. Schultz. 2006. Analytic webs support the synthesis of ecological datasets. Ecology 87: 1345-1358.


A wide variety of datasets produced by individual investigators are now synthesized to address ecological questions that span a range of spatial and temporal scales. It is important to facilitate such syntheses so that "consumers" of datasets can be confident that both input datasets and synthetic products are reliable. Necessary documentation to ensure the reliability and validation of datasets includes both familiar descriptive metadata and formal documentation of the scientific processes used (i.e., process metadata) to produce usable datasets from collections of raw data. Such documentation is complex and difficult to construct, so it is important to help "producers" create reliable datasets and to facilitate their creation of required metadata. We describe a formal representation, an analytic web, that aids both producers and consumers of datasets by providing complete and precise definitions of scientific processes used to process raw and derived datasets. The formalisms used to define analytic webs are adaptations of those used in software engineering, and they provide a novel and effective support system for both the synthesis and the validation of ecological datasets. We illustrate the utility of an analytic web as an aid to producing synthetic datasets through a worked example: the synthesis of long-term measurements of whole-ecosystem carbon exchange. Analytic webs are also useful validation aids for consumers because they support the concurrent construction of a complete, internet-accessible audit trail of the analytic processes used in the synthesis of the datasets. Finally, we describe our early efforts to evaluate these ideas through the use of a prototype software tool, SciWalker. We indicate how this tool has been used to create analytic webs tailored to specific dataset synthesis and validation activities, and suggest extensions to it that will support additional forms of validation. The process metadata created by SciWalker is readily adapted for inclusion in Ecological Metadata Language (EML) files.

