*Ecological Archives* M085-006-A3

Jacob E. Allgeier, Craig A. Layman, Peter J. Mumby, and Amy D. Rosemond. 2015. Biogeochemical implications of biodiversity and community structure across multiple coastal ecosystems. *Ecological Monographs* 85:117–132. http://dx.doi.org/10.1890/14-0331.1

Appendix C. Methodological details for hierarchical models.

**Hierarchical Mixed Effects Models**

We used hierarchical mixed effects models to explore the relationship between community assembly and the aggregate supply and storage of nutrients, and multifunctionality (*M*). To do so we ran six separate models, one for each of the five ecosystem processes of interest and one for *M*, on 82 independent locations, averaged from 172 fish communities. All models included the same six parameters: Species Richness (Richness), Species Diversity (SD), Functional Group Diversity (FGD), mean Trophic Level (mean TL), mean Maximum Size of each species within the community (L_{max}), calculated following Nicholson and Jennings (2004), and skewness of the size frequency distribution of the community (S_{size}). Richness was a simple measure of the number of species within a community. SD and FGD were both measured by the reciprocal Simpson's Diversity Index (SD = 1 / Σ( P_{i}² ), where P_{i} is the abundance of species *i* divided by the total richness at that site) (Simpson 1949) at the species level and functional group level, respectively, whereby the greater the number, the higher the diversity or evenness within the community. Functional group classifications were based on discrete trophic delineations following Newman et al. (2006) (i.e., piscivore, piscivore-invertivore, macroinvertivore, microinvertivore, herbivore, omnivore, planktivore). Much ecological research has relied on classifications based on discrete trophic levels, and while recently developed continuous measures have merit (Naeem and Wright 2003), we chose this more traditional metric as it has proven to be a useful ecological level of organization in previous research (Naeem and Wright 2003, Floeter et al. 2004, Micheli and Halpern 2005). Mean TL and L_{max} were calculated following Nicholson and Jennings (2004), using trophic level values from Harborne et al. (2008).
S_{size} was calculated by determining the skewness of the size frequency distribution of the community, whereby the further the value deviates from zero, either positively or negatively, the more small or large individuals dominate the community, respectively. There were six response variables of interest: N and P supply, C, N, and P storage, and *M*. In all cases, the response variable represented an aggregate value of all species contributions within a given fish community.
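
As an illustration of the two community metrics defined above, the following minimal Python sketch computes a reciprocal Simpson index and a size-distribution skewness. It assumes that P_{i} is the proportional abundance of species *i* (its abundance divided by the summed abundance of all species, the standard form of the index) and that skewness is the moment-based estimator; the function names and all input values are hypothetical and are not taken from the survey data.

```python
def reciprocal_simpson(abundances):
    """Reciprocal Simpson index, 1 / sum(p_i^2).

    Assumption: p_i is the proportional abundance of each species
    (or functional group), i.e. abundance / summed abundance.
    """
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return 1.0 / sum(p * p for p in proportions)


def skewness(values):
    """Moment-based (population) skewness, m3 / m2^1.5.

    Assumption: the text does not state which skewness estimator
    was used; this is one common choice.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical communities: four equally abundant species versus
# one dominant species with three rare ones.
even_community = [10, 10, 10, 10]
uneven_community = [97, 1, 1, 1]
print(reciprocal_simpson(even_community))    # equals the number of species
print(reciprocal_simpson(uneven_community))  # close to 1 (low evenness)

# Hypothetical body sizes (cm): many small fish and one large fish
# give a positively skewed size distribution.
sizes_cm = [5, 5, 6, 6, 7, 40]
print(skewness(sizes_cm) > 0)
```

For a perfectly even community the index equals the number of species (or functional groups), which is why larger values indicate higher diversity or evenness.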

We modeled data from 172 communities across 82 sites within 6 different ecosystem types (*Acropora* reef, Gorgonian Plains, Mangroves, *Montastraea* reef, Patch Reef, Seagrass) across 7 different islands in The Bahamas and Turks and Caicos. Community estimates consisted of multiple (typically 8–10) transects, which were averaged per area following Mumby et al. (2006) and Harborne et al. (2008). Communities were then averaged at the site level (*n* = 82). Site and ecosystem type were held as random effects in all models to control for the confounding effects that may be present due to site or habitat differences. In all cases, random intercept only, random slope only, and random intercept + slope models were tested, and the random intercept model was ultimately selected using Akaike's information criterion (Burnham and Anderson 2002). Models were run using the "lme4" package in R (R Core Development Team 2012). All response variables, as well as Richness, SD, FGD, and L_{max}, were log transformed, and in all cases model assumptions of normality and homogeneity of variance were met. Because the calculation for SD inherently includes Richness (SD = 1 / Σ( P_{i}² ), where P_{i} is the abundance of species *i* divided by the total richness at that site), collinearity might be expected. However, these variables were never correlated more than *r* = 0.51, satisfying standard permissibility of collinearity (Gelman and Hill 2007). We additionally tested for collinearity by calculating variance inflation factors, a simple diagnostic for collinearity, for each model. In all cases, the models met the relevant assumptions (Heiberger and Holland 2003).
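
For readers unfamiliar with variance inflation factors, a brief sketch may help. The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all other predictors in the model. In the special case of only two predictors, R² reduces to the squared pairwise correlation, so the maximum correlation reported above (*r* = 0.51) can be converted to an approximate VIF. This two-predictor simplification is our assumption for illustration only; the models here contain six predictors, and no VIF values are reported in this appendix.

```python
def variance_inflation_factor(r_squared):
    """VIF = 1 / (1 - R^2), where R^2 is from regressing one
    predictor on the remaining predictors."""
    return 1.0 / (1.0 - r_squared)


# Two-predictor simplification (an assumption for illustration):
# R^2 equals the squared pairwise correlation.
max_pairwise_r = 0.51  # largest correlation reported in the text
print(round(variance_inflation_factor(max_pairwise_r ** 2), 2))  # 1.35
```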

We further tested for relationships in the data structure that might confound our overall findings and found none that were significant. For example, there was no significant relationship between the total area surveyed per site (1020–2800 m²) and any predictor variable (*p* > 0.1).

To examine the relative importance of the different predictor variables for ecosystem processes, we used a multi-model inference approach (Burnham and Anderson 2002, Johnson and Omland 2004). This approach uses information theory to assess the probability that a given model most appropriately describes the data (Burnham and Anderson 2002, Johnson and Omland 2004). We calculated AICc, a value that corrects for the number of terms in the model, whereby the lowest AICc value indicates the model with the best fit to the data (Burnham and Anderson 2002, Johnson and Omland 2004). For each model we also calculated the ΔAICc, representing the difference in AICc between that model and the best model. Values above seven indicate that a model has a poor fit relative to the best model, and values below two indicate that models are indistinguishable (Burnham and Anderson 2002, Johnson and Omland 2004). We also calculated Akaike weights (w_{i}), a parameter that provides further evidence for the best explanatory model (Burnham and Anderson 2002) (Table 1).
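
The quantities used in this model comparison can be sketched in a few lines of Python. The formulas follow the standard definitions in Burnham and Anderson (2002): AICc = −2 ln(L) + 2k + 2k(k + 1) / (n − k − 1), ΔAICc as the difference from the lowest AICc, and Akaike weights as normalized exp(−ΔAICc / 2). The function names and the AICc values below are hypothetical and do not correspond to any model in Table 1.

```python
from math import exp


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k estimated
    parameters fitted to n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_and_weights(aicc_values):
    """Return (delta AICc, Akaike weights) for a set of candidate models."""
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    relative_likelihoods = [exp(-0.5 * d) for d in deltas]
    total = sum(relative_likelihoods)
    weights = [r / total for r in relative_likelihoods]
    return deltas, weights


# Hypothetical AICc values for three candidate models.
candidate_aicc = [100.0, 101.5, 108.2]
deltas, weights = delta_and_weights(candidate_aicc)
for d, w in zip(deltas, weights):
    print(f"delta AICc = {d:.1f}, weight = {w:.3f}")
```

In this hypothetical set, the second model falls within two AICc units of the best model and would be treated as indistinguishable from it, whereas the third exceeds seven units and would be considered a poor fit by the thresholds given above.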

*M and simulation models*

The multifunctionality metric (*M*) was also applied in our simulation models by quantifying a *z* score for each species within a given community (as determined from survey data). Because this metric centers values around zero (and thus creates negative values), and we were specifically interested in using this metric for quantifying aggregate community effects, we added 3 (a value greater in magnitude than the lowest negative value) to each *z* score. In doing so, we did not alter the relative values or the shape of the distribution of the metric, but instead simply shifted all values to be centered around 3.
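
The effect of this shift can be illustrated with a short Python sketch. It assumes a conventional *z* score (value minus mean, divided by the population standard deviation); the appendix does not state which standard deviation estimator was used, and the input values and function name are hypothetical.

```python
from statistics import mean, pstdev


def shifted_z_scores(values, shift=3.0):
    """Standardize values to z scores, then add a constant shift.

    Assumption: population standard deviation (pstdev) is used.
    """
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd + shift for v in values]


# Hypothetical per-species values for one process in one community.
raw_values = [2.0, 4.0, 6.0, 8.0]
shifted = shifted_z_scores(raw_values)

print(all(s > 0 for s in shifted))  # the shift removes negative values
print(round(mean(shifted), 6))      # values are now centered on the shift
```

Because the same constant is added to every score, differences between species and the spread of the scores are unchanged; only the location of the distribution moves.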

Literature cited

Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information theoretic approach. Page 488. Second Edition. Springer-Verlag, New York, New York, USA.

Floeter, S. R., C. E. L. Ferreira, A. Dominici-Arosemena, and I. R. Zalmon. 2004. Latitudinal gradients in Atlantic reef fish communities: trophic structure and spatial use patterns. Journal of Fish Biology 64:1680–1699.

Gelman, A., and J. Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, New York, New York, USA.

Harborne, A. R., P. J. Mumby, C. V. Kappel, C. P. Dahlgren, F. Micheli, K. E. Holmes, J. N. Sanchirico, K. Broad, I. A. Elliott, and D. R. Brumbaugh. 2008. Reserve effects and natural variation in coral reef communities. Journal of Applied Ecology 45:1010–1018.

Heiberger, R. M., and B. Holland. 2003. Statistical Analysis and Data Display: An Intermediate Course with Examples in S-plus, R, SAS. Springer Science, New York, New York, USA.

Johnson, J. B., and K. S. Omland. 2004. Model selection in ecology and evolution. Trends in Ecology & Evolution 19:101–108.

Micheli, F., and B. S. Halpern. 2005. Low functional redundancy in coastal marine assemblages. Ecology Letters 8:391–400.

Mumby, P. J., C. P. Dahlgren, A. R. Harborne, C. V. Kappel, F. Micheli, D. R. Brumbaugh, K. E. Holmes, J. M. Mendes, K. Broad, J. N. Sanchirico, K. Buch, S. Box, R. W. Stoffle, and A. B. Gill. 2006. Fishing, trophic cascades, and the process of grazing on coral reefs. Science 311:98–101.

Naeem, S., and J. P. Wright. 2003. Disentangling biodiversity effects on ecosystem functioning: deriving solutions to a seemingly insurmountable problem. Ecology Letters 6:567–579.

Newman, M. J. H., G. A. Paredes, E. Sala, and J. B. C. Jackson. 2006. Structure of Caribbean coral reef communities across a large gradient of fish biomass. Ecology Letters 9:1216–1227.

Nicholson, M. D., and S. Jennings. 2004. Testing candidate indicators to support ecosystem-based management: the power of monitoring surveys to detect temporal trends in fish community metrics. ICES Journal of Marine Science 61:35–42.

R Core Development Team. 2012. R: A language and environment for statistical computing. http://www.r-project.org.

Simpson, E. H. 1949. Measurement of diversity. Nature 163:688.