Ecological Archives E086-062-A3

Daniel F. Doak, Kevin Gross, and William F. Morris. 2005. Understanding and predicting the effects of sparse data on demographic analyses. Ecology 86:1154–1163.

Appendix C. An approximation for sampling variance in stochastic growth rate estimates.

A PDF version of this appendix is also available (please note that Eqs. A.1 through A.11 of the PDF file should read as C.1 through C.11, respectively).

Our approximation for the sampling variation in population growth estimates is based on Tuljapurkar’s approximation for log λs, written in terms of vital rates and expressed with correlations and standard deviations, rather than the usual covariances (e.g., Caswell 2001):

(C.1)

where the summations are across all vital rates. To approximate the variance in this quantity, we start with a simple delta approximation (a first-order Taylor expansion; Oehlert 1992), taking log λs as a function of all the vital rate means, v̄i, standard deviations, σi, and correlations, ρi,j. Note that while the vital rate means do not appear explicitly in Eq. C.1, they exert their influence through the value of λ1, the dominant eigenvalue of the mean matrix, and the sensitivities in the double summation. This approximation results in Eq. 2 in the main text, into which are substituted the sensitivities of log λs to each vital rate mean and the variances shown in Eq. 3. We now show how to estimate each of the terms in the equation, and then end with a discussion of how the influence of correlations in estimated means and variances of beta-distributed vital rates can be incorporated into Eq. 2.
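For orientation, the vital-rate form of Tuljapurkar's approximation that matches the description of Eq. C.1 (a double summation over vital rates of correlations, standard deviations, and sensitivities) can be sketched as the first line below, followed by the generic first-order delta approximation for the variance of a function of estimated parameters. The notation is an assumption and is not copied from the original equations: λ1 is taken to be the dominant eigenvalue of the mean matrix, S_i the sensitivity of λ1 to vital rate i, and θ_k stands for any one of the vital rate means, standard deviations, or correlations. The second line ignores covariances among the parameter estimates.

```latex
\log \lambda_s \;\approx\; \log \lambda_1 \;-\; \frac{1}{2\lambda_1^{2}} \sum_{i}\sum_{j} \rho_{i,j}\,\sigma_i\,\sigma_j\,S_i\,S_j

\operatorname{Var}\!\left(\log \lambda_s\right) \;\approx\; \sum_{k} \left( \frac{\partial \log \lambda_s}{\partial \theta_k} \right)^{2} \operatorname{Var}\!\left(\hat{\theta}_k\right)
```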

Calculation of the derivative of log λs

The sensitivity of log λs to the mean rate v̄i is:

(C.2)

 

The sensitivity of λ1 to a vital rate mean, v̄i, is obtained using the sensitivity of λ1 to matrix elements and applying the chain rule (Caswell 2001):

(C.3)

 

where a and b are indices of the matrix elements ea,b, Sa,b is the sensitivity of λ1 to the element ea,b, and the summation is over all matrix elements. The second-order vital rate sensitivities (derivatives of one vital rate sensitivity with respect to another vital rate) that occur in the double summation are more difficult. To estimate these terms, we begin with Eq. C.3. Taking the derivative with respect to vi yields:

(C.4)

 

Finally, we again use the chain rule, this time to find the derivative of Sa,b with respect to vi:

(C.5)

 

where the last term in the summation is the second derivative of λ1 with respect to two matrix elements and is obtained using Caswell's (2001) method. Substitution of Eq. C.5 into Eq. C.4, and then of Eqs. C.4 and C.3 into Eq. C.2, yields the sensitivity of log λs to each vital rate mean in terms of estimable quantities.
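The first-order step of this calculation (the chain rule of Eq. C.3) can be illustrated with a minimal sketch. It assumes the standard eigenvector formula for matrix-element sensitivities and approximates the derivatives of matrix elements with respect to vital rates numerically. The function names, the step size h, and the two-stage life cycle are hypothetical and serve only as illustration. The second-order terms of Eqs. C.4 and C.5 are not implemented here.

```python
import numpy as np

def element_sensitivities(A):
    """Return (lambda1, S), where S[a, b] = d(lambda1)/d(e_ab) = v_a * w_b / <v, w>."""
    vals, W = np.linalg.eig(A)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = np.abs(W[:, k].real)                        # right eigenvector (stable structure)
    vals_t, V = np.linalg.eig(A.T)
    v = np.abs(V[:, np.argmax(vals_t.real)].real)   # left eigenvector (reproductive values)
    return lam, np.outer(v, w) / (v @ w)

def vital_rate_sensitivity(build_matrix, rates, i, h=1e-6):
    """Chain rule: d(lambda1)/d(v_i) = sum over a,b of S_ab * d(e_ab)/d(v_i)."""
    rates = np.asarray(rates, dtype=float)
    _, S = element_sensitivities(build_matrix(rates))
    up, down = rates.copy(), rates.copy()
    up[i] += h
    down[i] -= h
    # Central-difference estimate of d(e_ab)/d(v_i) for every matrix element
    dE = (build_matrix(up) - build_matrix(down)) / (2.0 * h)
    return float(np.sum(S * dE))

def two_stage(rates):
    """Hypothetical life cycle: juvenile survival, adult survival, fecundity."""
    s_j, s_a, f = rates
    return np.array([[0.0, f * s_a],
                     [s_j, s_a]])

rates = [0.3, 0.8, 1.5]
sens = [vital_rate_sensitivity(two_stage, rates, i) for i in range(3)]
print(sens)
```

Because adult survival appears in two matrix elements of this hypothetical model, its sensitivity sums contributions from both, which is the reason for working with vital rates rather than matrix elements.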

Calculation of the variances of vital rate parameters

The variance of vital rate means we show in Eq. 3 is a standard result accounting for the effect of variability among individuals on each year's measurement (Morris and Doak 2002):

(C.6)

 

Here we use as estimates of the environmental variance our guesses as to the true variation in each vital rate from year to year, and we estimate the within-year sampling variance according to the sampling distribution that governs each rate (see How Many Data are Enough? in the main text).
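As a rough numerical check of this kind of result, the sketch below compares a predicted variance of a multi-year mean against simulation. It rests on assumptions not stated in the appendix: the rate is a probability with beta-distributed year-to-year variation, each year's estimate comes from binomial sampling of M individuals, and years are independent. The function names and parameter values are hypothetical.

```python
import numpy as np

def predicted_var_of_mean(mean, sd_env, n_years, m_indiv):
    """(environmental variance + expected within-year binomial variance) / years."""
    env_var = sd_env ** 2
    # E[p(1 - p)] / M for a rate p that varies among years with this mean and SD
    within_var = (mean * (1.0 - mean) - env_var) / m_indiv
    return (env_var + within_var) / n_years

def simulated_var_of_mean(mean, sd_env, n_years, m_indiv, n_sims=20_000, seed=1):
    """Variance, across simulated studies, of the multi-year mean of observed rates."""
    rng = np.random.default_rng(seed)
    k = mean * (1.0 - mean) / sd_env ** 2 - 1.0   # sum of beta shapes (method of moments)
    true_rates = rng.beta(mean * k, (1.0 - mean) * k, size=(n_sims, n_years))
    observed = rng.binomial(m_indiv, true_rates) / m_indiv
    return observed.mean(axis=1).var(ddof=1)

pred = predicted_var_of_mean(0.6, 0.1, n_years=5, m_indiv=25)
sim = simulated_var_of_mean(0.6, 0.1, n_years=5, m_indiv=25)
print(pred, sim)
```

With these hypothetical values the within-year sampling term is nearly as large as the environmental term, so adding individuals per year and adding years both reduce the variance of the mean.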

To obtain the sampling variance of σi, the estimated standard deviation of vital rate i that is caused by environmental variability, we begin with the formula for the corrected variance estimate:

(C.7)

 

Here, the first term on the right-hand side is the total observed variance in vital rate i from year to year. This variance includes variability caused by sampling errors within years (the second term), meaning that the two terms on the right-hand side of Eq. C.7 are correlated with one another. Using the delta approximation, we estimate that

(C.8)

 

We can use Eq. C.7 to re-express in terms of and , and using relationships between sums of variables can also show that the covariance in the final term of Eq. C.8 is equal to . Assuming that vital rate variation is normally distributed, results from Stuart and Ord (1994) can be modified to approximate the variance of a sampled variance, , as . Using this result with Eq. C.8 yields:

(C.9)

 

Finally, we need to convert this result to the variance of , which equals .
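The building blocks of this section can be sketched in a few lines, under stated assumptions. The sketch uses the textbook normal-theory variance of an unbiased sample variance and the delta method for a square root. It does not reproduce Eq. C.9, which also accounts for the covariance between the two terms of Eq. C.7, and the authors' modified forms may differ. All function names are hypothetical.

```python
import numpy as np

def corrected_env_variance(yearly_rates, yearly_sampling_vars):
    """Total among-year variance minus the mean within-year sampling variance."""
    total_var = np.var(yearly_rates, ddof=1)
    corrected = total_var - np.mean(yearly_sampling_vars)
    # Truncating at zero is an assumption of this sketch, not part of the appendix
    return max(float(corrected), 0.0)

def var_of_sample_variance(sigma2, n_years):
    """Normal-theory variance of an unbiased sample variance: 2 * sigma^4 / (n - 1)."""
    return 2.0 * sigma2 ** 2 / (n_years - 1)

def var_of_sd(var_of_variance, sigma2):
    """Delta method for the square root: Var(s) is about Var(s^2) / (4 * sigma^2)."""
    return var_of_variance / (4.0 * sigma2)

env_var = corrected_env_variance([0.5, 0.6, 0.7], [0.005, 0.005, 0.005])
vv = var_of_sample_variance(0.04, n_years=11)
vsd = var_of_sd(vv, 0.04)
print(env_var, vv, vsd)
```

The last function shows why the conversion matters: when the environmental variance is small, dividing by 4σ² inflates the uncertainty in the standard deviation relative to that in the variance.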

We again modify a result from Stuart and Ord (1994) for the sampling variance in estimates of correlation coefficients. Assuming normality (which is not a good assumption much of the time for survival and growth rates),

(C.10)
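As an illustration of the kind of result involved, the sketch below compares the classical large-sample variance of a correlation coefficient under bivariate normality, (1 − ρ²)²/n, with a simulated value. This is the textbook form, not necessarily the modified expression used in Eq. C.10, and the function names and parameter values are hypothetical.

```python
import numpy as np

def approx_var_of_r(rho, n_years):
    """Classical large-sample approximation under bivariate normality."""
    return (1.0 - rho ** 2) ** 2 / n_years

def simulated_var_of_r(rho, n_years, n_sims=20_000, seed=0):
    """Variance of the sample correlation across simulated sets of yearly values."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    draws = rng.multivariate_normal([0.0, 0.0], cov, size=(n_sims, n_years))
    # Center each simulated series, then compute Pearson's r row by row
    x = draws[..., 0] - draws[..., 0].mean(axis=1, keepdims=True)
    y = draws[..., 1] - draws[..., 1].mean(axis=1, keepdims=True)
    r = (x * y).sum(axis=1) / np.sqrt((x ** 2).sum(axis=1) * (y ** 2).sum(axis=1))
    return r.var(ddof=1)

approx = approx_var_of_r(0.5, n_years=50)
sim_r = simulated_var_of_r(0.5, n_years=50)
print(approx, sim_r)
```

The approximation is asymptotic, so it can be expected to perform less well for the short time series typical of demographic studies, and worse still when the normality assumption fails.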

 

The effects of correlated means and variances for some vital rates

Eq. 2 is a delta approximation that assumes no correlations in the sampling variation of different vital rate means, variances, or correlations. In fact, numerous biological, logistical, and statistical effects could lead to correlations in the sampling errors we would make in estimating different rates. While most of these effects will be idiosyncratic (and hopefully small), for vital rates that are probabilities, such as growth and survival rates, which are bounded by 0 and 1, there is a general and predictable correlation between the estimated mean and variance of each vital rate. The maximum variance for a beta-distributed vital rate equals v̄i(1 – v̄i), where v̄i is, as above, the mean of the rate. This relationship applies either to the true mean and variance or to the mean and variance estimated from a sample (Morris and Doak 2003). This means that a sampled set of values with an intermediate mean can have a high, medium, or low variance, while a sample with a low or high mean must have a low variance. As a consequence, survival and growth rates with true means less than 0.5 will tend to have a positive correlation between estimated means and variances, while rates with true means greater than 0.5 will show negative correlations between estimated means and variances. If the true variance of a vital rate is low relative to the mean, this correlation will be trivial, but if mean rates are quite high or quite low, the effects of these correlations can be substantial relative to the other terms in the approximation for the sampling variance of log λs. As a result, we suggest that when using this approximation, you modify Eq. 2 to include these effects for all survival and other vital rates that are probabilities:

(C.11)

 

Here, is the covariance of and for samples of M individuals in each year.

We do not have a closed-form solution for this covariance, and thus have used simulations of 10,000 values for each vital rate to estimate each of these covariances numerically.
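A simulation of this kind might look like the minimal sketch below. It is not the authors' code. It assumes beta-distributed yearly rates, binomial sampling of M individuals per year, and that the covariance of interest is between the estimated mean and the estimated among-year standard deviation (the raw, uncorrected estimate). The function names and parameter values are hypothetical.

```python
import numpy as np

def beta_params(mean, sd):
    """Convert a mean and SD to beta shape parameters (method of moments)."""
    var = sd ** 2
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("need 0 < sd^2 < mean * (1 - mean)")
    k = mean * (1.0 - mean) / var - 1.0
    return mean * k, (1.0 - mean) * k

def mean_sd_covariance(mean, sd, n_years, m_indiv, n_sims=10_000, seed=0):
    """Covariance between the estimated mean and estimated SD of a vital rate."""
    rng = np.random.default_rng(seed)
    a, b = beta_params(mean, sd)
    true_rates = rng.beta(a, b, size=(n_sims, n_years))        # yearly true rates
    observed = rng.binomial(m_indiv, true_rates) / m_indiv     # yearly estimates
    est_mean = observed.mean(axis=1)
    est_sd = observed.std(axis=1, ddof=1)
    return float(np.cov(est_mean, est_sd)[0, 1])

cov_low = mean_sd_covariance(0.1, 0.08, n_years=5, m_indiv=20)
cov_high = mean_sd_covariance(0.9, 0.08, n_years=5, m_indiv=20)
print(cov_low, cov_high)
```

In this sketch the covariance is positive for the rate with a mean below 0.5 and negative for the rate with a mean above 0.5, in line with the pattern described for probabilities bounded by 0 and 1.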

Testing the accuracy of the approximation

Putting all the pieces together allows the prediction of the sampling variation in log λs during the planning of a field study and also allows estimation of the accuracy of predictions coming from studies already conducted. However, is this admittedly rather elaborate approximation accurate? To find out, we plotted the sampling standard deviation (the square root of MSE) in log λs from our simulations against the predictions from Eq. 2 made with the correct vital rates (Table 1) and the simulated values of N and Mi. Across all sets of vital rates and sampling regimes, the two estimates of sampling variability are correlated with r = 0.96. The estimation equation is highly accurate for Low and Medium variability, while for High variability simulations it somewhat underestimates true sampling variation (Fig. C1). In short, Eq. 2 seems to do an excellent job of predicting the sampling variation we could expect for a given life history and sampling regime, with the caveat that it will tend to give optimistic estimates of accuracy for populations with very high real environmental variability in vital rates. Thus, it can be used to set a minimum bound on the amount of sampling needed to achieve a given accuracy of results: while the data collected may give less certain answers, they are very unlikely to do any better.

LITERATURE CITED

Caswell, H. 2001. Matrix population models: Construction, analysis and interpretation. Second Edition, Sinauer, Sunderland, Massachusetts, USA.

Kendall, B. E. 1998. Estimating the magnitude of environmental stochasticity in survivorship data. Ecological Applications 8:184–193.

Morris, W. F., and D. F. Doak. 2002. Quantitative conservation biology: the theory and practice of population viability analysis. Sinauer, Sunderland, Massachusetts, USA.

Oehlert, G. W. 1992. A note on the delta method. American Statistician 46:27–29.

Stuart, A., and J. K. Ord. 1994. Kendall's advanced theory of statistics: v.1: Distribution theory. Sixth Edition. E. Arnold / Halsted, London, UK.

White, G. C. 2000. Population viability analysis: data requirements and essential analyses. Pages 288–331 in L. Boitani and T. K. Fuller, editors. Research Techniques in animal ecology: controversies and consequences. Columbia University Press, New York, New York, USA.

 

FIG. C1. Comparing the predicted sampling variation of log λs from simulated data sets and our analytical approximation (Eq. 2). We plot the standard deviation in predicted log λs values for each combination of vital rate parameters and sampling regimes. Symbols indicate results from models with high, medium, and low variance in vital rates (Table 1). Simulation data come from the results described in the section "The costs and benefits of including stochasticity in demographic models" of the main article.


