Available data files, software, and code (2nd edition)
N.J. Gotelli & A.M. Ellison (2012) A primer of ecological statistics, 2nd edition. Sinauer Associates, Sunderland, Massachusetts.
Update history:
- 1 December 2012 - created (AME)
Notes:
- The book can be ordered directly from Sinauer Associates.
- The S-plus code from the first edition of the Primer is no longer being maintained, but can be found here.
- Data are in space-delimited ASCII text. Code is provided either as R script files (.R) or as ASCII text files (.txt) that can be imported into and run with WinBUGS version 1.4. The code files can be opened and read with any text editor (e.g., Notepad, WordPad, Emacs, vi).
- Errata are also available.
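For readers working outside R, a space-delimited text file can be parsed in a few lines. The following is a minimal Python sketch, assuming each file has a single header row; the column names and values shown are hypothetical and are not taken from any of the files listed here.

```python
import io


def read_space_delimited(handle):
    """Parse whitespace-delimited text with a header row into a list of dicts."""
    lines = [line.split() for line in handle if line.strip()]
    header, rows = lines[0], lines[1:]
    records = []
    for row in rows:
        record = {}
        for name, value in zip(header, row):
            # Store numeric fields as floats; leave everything else as text.
            try:
                record[name] = float(value)
            except ValueError:
                record[name] = value
        records.append(record)
    return records


# Hypothetical file contents; the real files may use different columns.
sample = io.StringIO("island area richness\nA 10.5 12\nB 3.2 7\n")
records = read_space_delimited(sample)
```

To read a downloaded file instead, pass an open file handle, for example `open("datafile.txt")`, where the file name is a placeholder.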
Please let us know if you are using the Primer or these data for teaching purposes!
- Tibial spine data (Table 3.1). These are simulated, not actual, data.
- R script for illustrating the Law of Large Numbers and frequentist confidence intervals. The code is modified from that provided by Blume & Royall (2003). The modifications simply make it "generic"; their published code was specific to their published example.
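The two ideas named in the script above can be sketched by simulation. This Python sketch is not the Blume & Royall code or the R script linked here; the normal distribution, its parameters, the sample sizes, and the use of a z-based 95% interval are all assumptions chosen for illustration.

```python
import random
import statistics


def running_means(draws):
    """Return the cumulative mean after each successive draw."""
    total = 0.0
    means = []
    for i, x in enumerate(draws, start=1):
        total += x
        means.append(total / i)
    return means


def ci_coverage(mu, sigma, n, trials, rng, z=1.96):
    """Fraction of approximate 95% confidence intervals that contain mu."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / trials


rng = random.Random(1)  # fixed seed so the run is repeatable

# Law of Large Numbers: the running mean settles near the true mean (10.0).
draws = [rng.gauss(10.0, 2.0) for _ in range(5000)]
means = running_means(draws)

# Frequentist interpretation: about 95% of repeated intervals cover mu.
coverage = ci_coverage(10.0, 2.0, n=50, trials=2000, rng=rng)
```

Plotting `means` against the draw number shows the early fluctuations damping out, and `coverage` should land close to, though not exactly at, 0.95.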
- Photosynthetic rates of 15 mangrove leaves. These are part of a larger dataset published by Farnsworth and Ellison (1996).
- Ant nest data (Table 5.1) used for illustrating simple frequentist, Monte Carlo, and Bayesian one-way ANOVA. These are simulated, not actual, data.
- R code for the Monte Carlo analysis of the ant nest data.
- WinBUGS code for the Bayesian analysis of the ant nest data. These analyses are illustrated in Figures 5.6-5.9 and Table 5.7.
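As a rough illustration of what a Monte Carlo one-way ANOVA involves, the Python sketch below reshuffles observations among groups and compares the observed F-ratio with the reshuffled ones. It assumes a simple randomization test on the F-ratio, which may differ in detail from the linked R code, and the three groups are made-up numbers, not the ant nest data of Table 5.1.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F-ratio: among-group over within-group mean square."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_among = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_among / (k - 1)) / (ss_within / (n - k))


def randomization_p_value(groups, n_perm=2000, seed=0):
    """Reshuffle observations among groups to build a null distribution of F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            extreme += 1
    # Count the observed arrangement as one of the possible outcomes.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical counts for three groups (illustration only).
groups = [[1, 2, 3, 2, 1], [8, 9, 10, 9, 8], [15, 16, 17, 16, 15]]
f_obs, p_value = randomization_p_value(groups)
```

The p-value is the share of reshuffled datasets whose F-ratio is at least as large as the observed one, so no assumption about the shape of the sampling distribution is needed.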
- Morphological measurements of 25 Darlingtonia californica pitchers with three added outliers (Table 8.1). These unpublished data were collected by Aaron Ellison, Rebecca Emerson, and Hedda Steinhoff in July 2000, and should not be used in a publication without permission.
- Plant species richness and island area for 17 Galápagos Islands (Table 8.2). The data provided here were originally published in Preston (1962). We retain the island names given by Preston, but have converted island area from square miles to square kilometers.
- Plant species richness and island area for 17 Galápagos Islands plus the added outlier (used for Figure 9.9). These are the same data used in Chapter 8, with the addition of an artificial outlier in row 16.
- R script for the Monte Carlo analysis of the slopes of the log-log transformed species-area relationship. The results of this analysis are illustrated in Figure 9.8.
- WinBUGS code for the Bayesian analysis of the Galápagos species-area relationship.
- Acorn data used to illustrate quantile regression (Figure 9.10). These data were published by Schroeder and Vangilder (1997), and discussed further by Cade et al. (1999). Data were kindly provided by Brian Cade.
- The Darlingtonia data used to illustrate logistic regression (Figure 9.11). These data are part of a larger study published by Dixon et al. (2005).
- Ant species density in forest plots in New England used to illustrate multiple regression and collinearity (Figures 9.12-9.14). The data were published by Gotelli & Ellison (2002).
- The data used for the path analysis (Figure 9.15). The full analysis of these data was published by Gotelli and Ellison (2006).
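The Monte Carlo analysis of the log-log species-area slope listed above can be sketched as follows. This Python sketch assumes the power-function model S = cA^z, fitted by least squares as log10(S) = log10(c) + z log10(A), with a randomization test that reshuffles richness among islands; the linked R script may be organized differently. The areas and richness values are idealized, hypothetical numbers, not the Galápagos data.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def slope_randomization(area, richness, n_perm=2000, seed=0):
    """Observed log-log slope z and a two-tailed randomization p-value."""
    rng = random.Random(seed)
    log_area = [math.log10(a) for a in area]
    log_rich = [math.log10(s) for s in richness]
    observed = ols_slope(log_area, log_rich)
    shuffled = log_rich[:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association between area and richness
        if abs(ols_slope(log_area, shuffled)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical islands that follow S = 5 * A**0.3 exactly (illustration only).
area = [1, 10, 100, 1000, 10000, 5, 50, 500]
richness = [5 * a ** 0.3 for a in area]
z_hat, p_value = slope_randomization(area, richness)
```

Because the hypothetical data follow the power function exactly, the fitted slope recovers z = 0.3; real data scatter around the line and give a wider null comparison.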
- Growth of mangrove roots with living or artificial sponges, used to illustrate a priori contrasts and a posteriori multiple comparisons among means (Tables 10.12 - 10.15, and Figure 10.5). These data are a subset of a larger dataset published by Ellison et al. (1996).
- Download the complete dataset, from the randomized block design described by Ellison et al. (1996).
- Frequencies of rare plant populations that are declining or not; invaded or not; protected or not; and ordinal light level at each population. Species identities are not given to protect these plants. The data were published by Farnsworth (2004), and are based on compilations from Conservation and Research Plans developed by the New England Wild Flower Society.
- Morphology, mass, and nutrient content of Darlingtonia californica, used for multivariate analyses described in Tables 12.1 - 12.3, Tables 12.7 and 12.8, and Figures 12.2, 12.3, and 12.5 - 12.8. These data are part of a larger dataset described by Ellison and Farnsworth (2005).
- R script for testing multivariate normality. This code is based on algorithms provided by Doornik and Hansen (1994, 2008).
- Although it's not used in the book, Fisher's iris data is a common dataset used for multivariate analyses. Doornik & Hansen (1994, 2008) benchmark their test for multivariate normality on a subset of Fisher's iris data - the data for I. setosa. This version of Fisher's iris data was copied from The Data and Story Library. It is also included in the Modern Applied Statistics with S (MASS) library of R (Venables & Ripley 2002).
- Ant presence-absence data used for Principal Coordinates Analysis, Correspondence Analysis, and non-metric multidimensional scaling (Tables 12.9 - 12.10; Figures 12.9 - 12.12). These data were aggregated from data published by Gotelli and Ellison (2002) and Ellison et al. (2002).
- Snail shell data used for cluster analysis and redundancy analysis:
- Reduced dataset (Table 12.11) used for the examples in the book (Tables 12.12 - 12.14, Figures 12.13 - 12.16).
- The full dataset that was used by Merkt and Ellison (1998). Thanks to Ontrack Data Recovery, these data were recovered in January 2006 from a tape backup made in 1997. This was a good lesson in the importance of keeping students' lab notebooks and maintaining copies of datasets on paper, and the need for timely transfers of files from obsolete to new media.
- Basic rarefaction functions. NOTE: This file needs to be "sourced" in R before running additional R scripts for this chapter.
- Spider diversity data and analysis (Tables 13.1, 13.2, 13.4; Figures 13.1 - 13.3, 13.7). Additional analyses of these data can be found in Sackett et al. (2011).
- Complete spider diversity dataset, as illustrated in Table 13.1. This file is in "long" form, with each row being a single observation on a specific date.
- Spider data in matrix format - species [rows] x treatments [columns] - for use with rarefaction script. You can also generate this file by running this R script to convert the long-form data to the matrix format.
- R script for plotting individual-based rarefaction curves. This script will generate an example column for a random subsample (as in Table 13.2); the asymptotic species-richness estimators and their confidence intervals (Table 13.4); Figure 13.1 (histogram of species richness counts for 1000 random subsamples); Figure 13.2 (individual-based rarefaction curve for the logged treatment); and Figure 13.3 (individual-based rarefaction curves for all treatments).
- Individual-based rarefaction plots of spider data in the hardwood treatment for Hill numbers q = {0, 1, 2, 3} (Figure 13.7):
- R script to calculate values for different Hill numbers (be aware of hard-coding for q in the code!);
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for q = 0;
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for q = 1;
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for q = 2;
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for q = 3;
- R script to plot Figure 13.7.
- Ant diversity data and analysis (Tables 13.3, 13.5; Figures 13.4 - 13.6).
- Complete ant diversity dataset, in "long" form. This file includes abundances of ants collected in multiple sites within three habitats in Massachusetts: cultural grasslands, oak-hickory-white pine forests, and successional shrublands.
- Species (rows) x sites (columns) of each habitat (you can also generate these three files by running this R script to convert the long-form data to the matrix format):
- Cultural grassland matrix (as illustrated in Table 13.3);
- Oak-hickory-white pine matrix;
- Successional shrubland matrix.
- R script for plotting sample-based rarefaction curves. This script calls the three species x sites matrices and generates: a single sample-based rarefaction curve (Figure 13.4); the three sample-based rarefaction curves rescaled to the number of individual ant nests per sample (Figure 13.5); and the data needed to recreate Table 13.5 (asymptotic species-richness estimators and their confidence intervals).
- Input files for Figure 13.6:
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for Cultural grassland data;
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for Oak-hickory-white pine data;
- Expected number of species, for different samples (i.e., species accumulation curve), with confidence bounds, for Successional shrubland data.
- R script for plotting interpolated and extrapolated sample-based rarefaction curves of the ant data.
- An alternative set of scripts for Figure 13.6:
- First, load this data file (site x habitat format, without species labels);
- Next, source these subroutines for sample-based rarefaction (code provided by Anne Chao);
- Then, run sample-based rarefaction routines (code provided by Anne Chao);
- Finally, graph the output using this R script.
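Two quantities that recur in the scripts for this chapter are the expected species richness of a random subsample (individual-based rarefaction) and Hill numbers of order q. The Python sketch below is not the rarefaction functions file or the code provided by Anne Chao. It assumes the standard hypergeometric expectation for rarefaction and computes plug-in Hill numbers from observed counts only, without the interpolated or extrapolated curves and confidence bounds described above. The abundance vector is hypothetical.

```python
import math


def rarefied_richness(abundances, n):
    """Expected number of species in a random subsample of n individuals."""
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total abundance")
    denom = math.comb(total, n)
    # For each species, 1 minus the probability that it is missed entirely.
    return sum(1 - math.comb(total - a, n) / denom for a in abundances if a > 0)


def hill_number(abundances, q):
    """Plug-in Hill number of order q from observed abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if q == 1:
        # Limit as q approaches 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


# Hypothetical abundances for four species (illustration only).
counts = [50, 30, 15, 5]
curve = [rarefied_richness(counts, n) for n in range(1, sum(counts) + 1)]
profile = [hill_number(counts, q) for q in (0, 1, 2, 3)]
```

Here `curve` rises from 1 species at n = 1 to the observed richness at the full sample size, and `profile` declines as q increases because higher orders give more weight to the common species.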
- Analysis of bog ant species abundances (Tables 14.1 - 14.6; Figures 14.2 - 14.5)
- Collection histories of Dolichoderus pustulatus
- Occurrence data (presence/absence) - data as presented in Table 14.1 and plotted in Figure 14.2
- Abundance data (number of pitfalls)
- R script for analysis of occurrence data for estimates of occupancy and detection probability (Table 14.2)
- Complete dataset of ant species in New England bogs (as described in Table 14.3)
- R script for calculating asymptotic Chao2 estimators and confidence intervals of bog ant data (plotted in Figure 14.3)
- Hierarchical model (Tables 14.3 - 14.6):
- Ant detection data (Table 14.3);
- Site covariates data;
- Calculations of local species richness:
- Source files required: GetDetectionMatrix.R and GetSiteCovariates.R;
- R script (includes JAGS code) (from Supplementary Material of Dorazio et al. 2011) for hierarchical modeling.
- Summary file including Chao2 estimators and confidence intervals along with median estimates and 95% credibility intervals from hierarchical model (as plotted in Figures 14.4 and 14.5 using this R script).
- Occupancy modeling of hemlock woolly adelgid populations in central and western Massachusetts (Tables 14.7 - 14.8; Figure 14.7)
- Complete data file (extracted in Table 14.7);
- R script for multi-season occupancy model of hemlock woolly adelgid data (Table 14.8, Figure 14.7).
- Mark-recapture modeling of Lady's slipper orchids at the Harvard Forest (Tables 14.9 - 14.12; Figure 14.8)
- Locations of orchids encountered on three sampling dates (plotted in Figure 14.8 using this R script);
- Capture histories of orchids (Tables 14.9, 14.10);
- Input file (binary) for mark-recapture analysis of the orchid data using program MARK (Tables 14.11, 14.12).
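The asymptotic Chao2 estimator mentioned above for the bog ant data can be computed from a species x sites incidence matrix. The Python sketch below assumes the commonly used form S_obs + ((m - 1)/m) Q1^2 / (2 Q2), where m is the number of sites and Q1 and Q2 are the numbers of species found at exactly one and exactly two sites, with the bias-corrected variant when Q2 = 0. It is not the linked R script, it omits the confidence intervals, and the incidence matrix is hypothetical.

```python
def chao2(incidence):
    """Chao2 richness estimate from a species x sites presence-absence matrix."""
    m = len(incidence[0])  # number of sampling sites
    site_counts = [sum(1 for cell in row if cell) for row in incidence]
    s_obs = sum(1 for c in site_counts if c > 0)
    q1 = sum(1 for c in site_counts if c == 1)  # species at exactly one site
    q2 = sum(1 for c in site_counts if c == 2)  # species at exactly two sites
    correction = (m - 1) / m
    if q2 > 0:
        return s_obs + correction * q1 ** 2 / (2 * q2)
    # Bias-corrected form, used when no species occurs at exactly two sites.
    return s_obs + correction * q1 * (q1 - 1) / (2 * (q2 + 1))


# Hypothetical incidence matrix: 5 species (rows) x 4 sites (columns).
incidence = [
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
estimate = chao2(incidence)
```

With many species seen at only one site and few seen at two, the estimate rises well above the observed richness, which is the signal that sampling is incomplete.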
- Blume, J. D., and R. M. Royall. 2003. Illustrating the Law of Large Numbers (and confidence intervals). American Statistician 57: 51-57.
- Cade, B. S., J. W. Terrell, and R. L. Schroeder. 1999. Estimating effects of limiting factors with regression quantiles. Ecology 80: 311-323.
- Dixon, P. M., A. M. Ellison, and N. J. Gotelli. 2005. Improving the precision of estimates of the frequency of rare events. Ecology 86: 1114-1123.
- Doornik, J. A., and H. Hansen. 1994. An omnibus test for univariate and multivariate normality. Working paper, Nuffield College, Oxford University.
- Doornik, J. A., and H. Hansen. 2008. An omnibus test for univariate and multivariate normality. Oxford Bulletin of Economics and Statistics 70 (s1): 927-939.
- Dorazio, R. M., N. J. Gotelli, and A. M. Ellison. 2011. Modern methods of estimating biodiversity from presence-absence surveys. Pages 277-302 in: G. Venora, O. Grillo, and J. Lopez-Pujol, editors. Biodiversity loss in a changing planet. InTech - Open Access Publisher, Croatia.
- Ellison, A. M., and E. J. Farnsworth. 2005. The cost of carnivory for Darlingtonia californica (Sarraceniaceae): evidence from relationships among leaf traits. American Journal of Botany 92: 1085-1093.
- Ellison, A. M., E. J. Farnsworth, and N. J. Gotelli. 2002. Ant diversity in pitcher-plant bogs of Massachusetts. Northeastern Naturalist 9: 267-284.
- Ellison, A. M., E. J. Farnsworth, and R. R. Twilley. 1996. Facultative mutualism between red mangroves and root-fouling sponges in Belizean mangal. Ecology 77: 2431-2444.
- Farnsworth, E. J. 2004. Patterns of plant invasion at sites with rare plant species throughout New England. Rhodora 106: 97-117.
- Farnsworth, E. J., and A. M. Ellison. 1996. Sun-shade adaptability of the red mangrove, Rhizophora mangle (Rhizophoraceae): changes through ontogeny at several levels of biological organization. American Journal of Botany 83: 1131-1143.
- Gotelli, N. J., and A. M. Ellison. 2002. Biogeography at a regional scale: determinants of ant species density in New England bogs and forest. Ecology 83: 1604-1609.
- Gotelli, N. J., and A. M. Ellison. 2006. Food-web models predict species abundance in response to habitat change. PLoS Biology 4(10): e324.
- Merkt, R. E., and A. M. Ellison. 1998. Geographic and habitat-specific morphological variation of Littoraria (Littorinopsis) angulifera (Lamarck, 1822). Malacologia 40: 279-295.
- Preston, F. W. 1962. The canonical distribution of commonness and rarity: Part I. Ecology 43: 185-215.
- Sackett, T. E., S. Record, S. Bewick, B. Baiser, N. J. Sanders, and A. M. Ellison. 2011. Response of macroarthropod assemblages to the loss of hemlock (Tsuga canadensis), a foundation species. Ecosphere 2: art74.
- Schroeder, R. L., and L. D. Vangilder. 1997. Tests of wildlife habitat models to evaluate oak mast production. Wildlife Society Bulletin 25: 639-646.
- Venables, W. N., and B. D. Ripley. 2002. Modern applied statistics with S, 4th edition. Springer-Verlag, New York.