Methods: Statistical Models of Carbon Exchange
We used statistical models to estimate whole-ecosystem carbon exchange and one of its major components, soil respiration. Data were collected continuously during the summer and fall of 2002 in and above a relatively young deciduous forest and in and above a hemlock forest. We modeled these data by multiple regression on commonly measured environmental factors: photosynthetically active radiation above the canopy, above-canopy air temperature, soil temperature, and soil moisture.
Two examples of statistical models are the following, for whole-forest carbon exchange during light and dark periods in June 2002:
Daytime: FCO2 = -3.87 + 1.8217*Tair - 1.2667*Tsoil - 0.02478*PAR + 3.880*ln(PAR) + 0.000768*Tair*PAR - 0.3395*Tair*ln(PAR)
Nighttime: FCO2 = 2.742*10^(0.015*Tsoil)
where FCO2 is whole-ecosystem carbon exchange, Tair is air temperature above the canopy, Tsoil is soil temperature at 10 cm depth, and PAR is photosynthetically active radiation above the canopy. Each model was developed from more than 150 half-hourly measurements of CO2 flux above the canopy, made by the eddy covariance method, plus an equal number of half-hourly averages of Tair, Tsoil and PAR. Similar models were created for each month from July through December. Only environmental parameters or cross-products with a significance level (p-value) less than or equal to 0.01 were retained in the models; in the nighttime example above, only Tsoil met this condition.
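As a minimal sketch, the two June 2002 example models can be evaluated directly in Python. The coefficients are copied from the equations above; the function and argument names are illustrative, and inputs are assumed to be in the same units as the original measurements, which are not stated here.

```python
import math


def daytime_fco2(t_air, t_soil, par):
    """Daytime whole-forest carbon exchange, June 2002 example model."""
    ln_par = math.log(par)  # natural logarithm; requires par > 0
    return (-3.87
            + 1.8217 * t_air
            - 1.2667 * t_soil
            - 0.02478 * par
            + 3.880 * ln_par
            + 0.000768 * t_air * par
            - 0.3395 * t_air * ln_par)


def nighttime_fco2(t_soil):
    """Nighttime whole-forest carbon exchange, June 2002 example model."""
    return 2.742 * 10 ** (0.015 * t_soil)
```

Because the daytime model contains ln(PAR), it is defined only for PAR greater than zero, which is consistent with fitting separate models for light and dark periods.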
A challenge inherent in these data is that only some of the measurements are judged reliable for measuring carbon flux; the rest are unreliable because of insufficient turbulence or other conditions that prevent adequate mixing of air from within and above the forest. From the reliable data, a predictive model can be created based on environmental measurements, and this model is then used to impute estimates of carbon flux for the periods when the measurements are unreliable.
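The fit-then-impute idea can be sketched as follows, assuming the reliability judgment is already available as a boolean mask. This sketch uses ordinary least squares from NumPy on whatever predictor columns are supplied; it is an illustration, not the project's actual fitting procedure, and the function name is hypothetical.

```python
import numpy as np


def fit_and_impute(flux, predictors, reliable):
    """Fit a linear model on reliable rows and fill in the unreliable ones.

    flux:       (n,) measured carbon flux
    predictors: (n,) or (n, k) environmental measurements
    reliable:   (n,) boolean mask, True where the flux measurement is trusted
    """
    flux = np.asarray(flux, dtype=float)
    reliable = np.asarray(reliable, dtype=bool)
    # Design matrix with an intercept column followed by the predictors.
    X = np.column_stack([np.ones(len(flux)), np.asarray(predictors, dtype=float)])
    # Least-squares fit using only the reliable measurements.
    coef, *_ = np.linalg.lstsq(X[reliable], flux[reliable], rcond=None)
    # Keep reliable values as measured; replace the rest with predictions.
    filled = flux.copy()
    filled[~reliable] = X[~reliable] @ coef
    return filled, coef
```

Terms such as ln(PAR) or the Tair*PAR cross-product can be passed as additional predictor columns, so the same routine covers models of the daytime form.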
Process Modeling
We used a prototype analytic web produced by SciWalker. The analytic web first identifies and separates reliable measurements of carbon exchange from unreliable ones. It then applies an integrated model of the carbon exchange process, based on the two equations described above, to the reliable data and imputes values to replace the unreliable data. Finally, the analytic web computes net (monthly or annual) carbon flux using statistical models applied to the combination of the reliable and imputed data. The same analytic web can be used to model data from other eddy covariance towers.
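The final step, combining reliable and imputed values into a net flux, can be illustrated with a simple sum over half-hour intervals. This is a simplified sketch, not the analytic web's actual computation: it assumes fluxes are rates in micromoles of CO2 per square meter per second (the units are not stated above) and converts the total to grams of carbon per square meter. All names are hypothetical.

```python
SECONDS_PER_HALF_HOUR = 1800
# Assumed conversion: 1 micromole of CO2 contains 12.011e-6 g of carbon.
GRAMS_C_PER_MICROMOLE = 12.011e-6


def net_carbon_flux(measured, imputed, reliable):
    """Sum half-hourly fluxes, using the imputed value where the measurement is unreliable."""
    total_micromoles = 0.0
    for m, i, ok in zip(measured, imputed, reliable):
        rate = m if ok else i
        total_micromoles += rate * SECONDS_PER_HALF_HOUR
    return total_micromoles * GRAMS_C_PER_MICROMOLE
```

Passing one month of half-hourly values gives a monthly total; passing a full year gives an annual total.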
In these models, we distinguish between activity-centered process representations, such as our process language Little-JIL, and more familiar data-centered representations. The data-centered representation includes both a dataflow graph (type model) that describes the legal and expected relationships between the types of published artifacts in the web, and a data derivation graph that describes the actual web of interconnected artifact instances.
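The distinction between the two data-centered graphs can be sketched in a few lines: a type model lists which artifact types may be derived from which, and a derivation graph records which artifact instances were actually derived from which. The artifact type names and the checking function below are hypothetical illustrations, not SciWalker's data structures.

```python
# Type model (dataflow graph): source type -> types that may be derived from it.
TYPE_MODEL = {
    "RawFlux": {"ReliableFlux"},
    "ReliableFlux": {"FittedModel", "GapFilledFlux"},
    "FittedModel": {"GapFilledFlux"},
    "GapFilledFlux": {"NetFlux"},
}


def illegal_derivations(instance_types, derivations):
    """Return derivation edges between instances that the type model does not allow.

    instance_types: dict mapping artifact instance name -> type name
    derivations:    list of (source_instance, derived_instance) pairs
    """
    bad = []
    for source, target in derivations:
        allowed = TYPE_MODEL.get(instance_types[source], set())
        if instance_types[target] not in allowed:
            bad.append((source, target))
    return bad
```

Here the derivation graph is the list of instance pairs, and the type model constrains it: an edge is legal only if the corresponding edge between types exists.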
We have developed a more generic user interface for SciWalker to allow the creation of analytic webs for other scientific processes. The interface and its associated analytic webs are based on data-centered process representations, constrained by activity-centered process representations. The two kinds of representation are used in combination as a mechanism for selecting the relevant subset of the possible user activities and artifact instances.