Appendix E. More on measuring deviation.
How might fluctuations best be quantified, assuming that we have determined comparable intervals of time and a curve against which to compare the species involved in the lottery we have described? Let $n_i$ be the number expected in bin $i$ according to, for example, an exponential distribution and $N_i$ the number actually observed. One measure of deviation from the expected smooth curve is

$$\chi^2 = \sum_i \frac{(N_i - n_i)^2}{n_i} \qquad \text{(E.1)}$$

This quantity is well defined but will not have a $\chi^2$ distribution unless $N_i$ is normally distributed about $n_i$. In Kelly et al. (2001), the measure

$$2 \sum_i \left[ n_i - N_i + N_i \ln \frac{N_i}{n_i} \right]$$

was employed, which is an analog of $\chi^2$ for Poisson distributed $N_i$ (Baker and Cousins 1984).

A measure of the fractional deviation from an expected (or hypothesised) smooth curve is

$$\sum_i \left( \frac{N_i - n_i}{n_i} \right)^2 \qquad \text{(E.2)}$$
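The three measures discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: (E.1) is taken to have the usual Pearson form, the Poisson analog is taken to be the likelihood-ratio statistic of Baker and Cousins (1984), and (E.2) is taken to be the sum of squared fractional deviations. The function names and the bin counts are hypothetical and are not data from the paper.

```python
import math


def pearson_deviation(observed, expected):
    """Sum over bins of (observed - expected)^2 / expected (assumed form of E.1)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def poisson_likelihood_deviation(observed, expected):
    """Poisson analog of chi-square: 2 * sum(e - o + o * ln(o / e)).

    The o * ln(o / e) term is taken to be zero for an empty bin (o == 0).
    """
    total = 0.0
    for o, e in zip(observed, expected):
        total += e - o
        if o > 0:
            total += o * math.log(o / e)
    return 2.0 * total


def fractional_deviation(observed, expected):
    """Sum over bins of ((observed - expected) / expected)^2 (assumed form of E.2)."""
    return sum(((o - e) / e) ** 2 for o, e in zip(observed, expected))


# Hypothetical counts in four bins, for illustration only.
expected = [40.0, 20.0, 10.0, 5.0]
observed = [44, 15, 12, 4]

print(pearson_deviation(observed, expected))
print(poisson_likelihood_deviation(observed, expected))
print(fractional_deviation(observed, expected))
```

All three functions return zero when the observed counts equal the expected counts, and the Poisson analog stays non-negative bin by bin.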
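Both (E.1) and (E.2) can be divided into interesting and uninteresting pieces, as follows. Let

$$N_i = n_i + \Delta_i + s_i$$

where $N_i$ is the observed and $n_i$ the expected number in bin $i$, $\Delta_i$ is an interesting deviation (in this paper, a fluctuation in recruitment caused by an environmental effect) and $s_i$ is a statistical fluctuation (which is not interesting). Then to the extent that $\Delta_i$ and $s_i$ are uncorrelated

$$\sum_i \frac{(N_i - n_i)^2}{n_i} = \sum_i \frac{\Delta_i^2}{n_i} + \sum_i \frac{s_i^2}{n_i} \qquad \text{(E.3)}$$

$$\sum_i \left( \frac{N_i - n_i}{n_i} \right)^2 = \sum_i \left( \frac{\Delta_i}{n_i} \right)^2 + \sum_i \frac{s_i^2}{n_i^2} \qquad \text{(E.4)}$$

If $N_i$ is Poisson distributed about $n_i + \Delta_i$, then $\langle s_i^2 \rangle = n_i + \Delta_i \approx n_i$ when the fractional deviation is small, and $s_i^2$ itself fluctuates about this mean. Thus $\left\langle \sum_i s_i^2 / n_i \right\rangle \approx k$ and $\left\langle \sum_i s_i^2 / n_i^2 \right\rangle \approx \sum_i 1/n_i$, where $k$ is the number of bins.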
In (E.3) the second (statistical) term is independent of the sample size or shape of the expected distribution (but depends on the number of bins) but the first (interesting) term depends on the shape and the sample size. In (E.4) the situation is reversed: the first (interesting) term is independent of the shape or sample size, whereas the second (statistical) term depends on both – but can be calculated for a given expected distribution.
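The scaling described in this paragraph can be checked numerically. The sketch below assumes Poisson statistics, so that the mean statistical term in (E.3) is the number of bins and the mean statistical term in (E.4) is the sum of reciprocal expected counts. The exponential shape, the decay constant and the sample sizes are hypothetical choices made only for illustration.

```python
import math


def expected_exponential(total, n_bins, decay):
    """Expected counts proportional to exp(-i / decay), scaled to sum to `total`."""
    weights = [math.exp(-i / decay) for i in range(n_bins)]
    norm = sum(weights)
    return [total * w / norm for w in weights]


def mean_statistical_terms(expected):
    """Mean Poisson contributions to (E.3) and (E.4).

    With <s^2> = n in each bin, sum(s^2 / n) averages to the number of bins
    and sum(s^2 / n^2) averages to sum(1 / n).
    """
    return float(len(expected)), sum(1.0 / n for n in expected)


# Same shape and number of bins, two hypothetical sample sizes.
small = mean_statistical_terms(expected_exponential(100.0, 5, 2.0))
large = mean_statistical_terms(expected_exponential(1000.0, 5, 2.0))

print(small)
print(large)
```

The first entry is the same for both sample sizes, while the second falls tenfold when the sample grows tenfold, which is the dependence on sample size that the text attributes to the statistical term in (E.4).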
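A measure of the $\chi^2$ type can be used to establish deviation from some hypothesized distribution beyond the merely statistical, but care may be necessary in interpreting differences in the estimator of the interesting term $\sum_i \Delta_i^2 / n_i$ of (E.3), which is

$$\sum_i \frac{(N_i - n_i)^2}{n_i} - k \qquad \text{(E.5)}$$

where $k$ is the number of bins. The measure we are after for comparison with the lottery model is the measure of fractional deviation $\sum_i (\Delta_i / n_i)^2$, for which we took the estimator

$$\sum_i \left( \frac{N_i - n_i}{n_i} \right)^2 - \sum_i \frac{1}{n_i} \qquad \text{(E.6)}$$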
There are two caveats that should be borne in mind. The first is that the statistical terms have a distribution and the corrections in (E.5) and (E.6) are the mean values of the statistical terms. The second caveat is that the populations of individual bins may not be Poisson distributed.
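The first caveat can be made concrete with a small sketch. It assumes that the corrections in (E.5) and (E.6) are the Poisson means of the statistical terms, namely the number of bins and the sum of reciprocal expected counts. The function names and the counts are hypothetical.

```python
def corrected_pearson(observed, expected):
    """Pearson-type sum minus the number of bins (assumed form of E.5)."""
    raw = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return raw - len(expected)


def corrected_fractional(observed, expected):
    """Fractional-deviation sum minus sum(1 / expected) (assumed form of E.6)."""
    raw = sum(((o - e) / e) ** 2 for o, e in zip(observed, expected))
    return raw - sum(1.0 / e for e in expected)


# Hypothetical counts in four bins, for illustration only.
expected = [40.0, 20.0, 10.0, 5.0]
observed = [44, 15, 12, 4]

print(corrected_pearson(observed, expected))
print(corrected_fractional(observed, expected))
```

For these counts both estimators come out negative (about -1.75 and -0.2225). Because only the mean of the statistical term is subtracted, an individual sample can fall below zero even though the quantity being estimated cannot.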
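The variance on (E.5) due to statistical fluctuations about the expected counts $n_i$ which are Poisson distributed is

$$\sum_i \left( 2 + \frac{1}{n_i} \right)$$

and the variance on (E.6) due to such statistical fluctuations is

$$\sum_i \left( \frac{2}{n_i^2} + \frac{1}{n_i^3} \right)$$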
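The per-bin variance of a squared Poisson deviation can be checked by direct summation over the Poisson probabilities. The sketch below assumes the count in a bin is Poisson distributed about its expected value, with no interesting deviation present. The function names, the cutoff rule and the mean of 10 are hypothetical choices for illustration.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of observing n, computed in log space to avoid overflow."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def moments_of_squared_deviation(mean, cutoff=None):
    """Mean and variance of (N - mean)^2 for Poisson N, by direct summation.

    The sum is truncated at `cutoff`; the default leaves a negligible tail.
    """
    if cutoff is None:
        cutoff = int(mean + 20.0 * math.sqrt(mean) + 50)
    first = 0.0   # accumulates E[(N - mean)^2]
    second = 0.0  # accumulates E[(N - mean)^4]
    for n in range(cutoff + 1):
        p = poisson_pmf(n, mean)
        d2 = (n - mean) ** 2
        first += p * d2
        second += p * d2 * d2
    return first, second - first ** 2


m, v = moments_of_squared_deviation(10.0)
print(m, v)
```

For a mean of 10 the sums give, to within rounding, a mean of 10 and a variance of 210, that is $n + 2n^2$. Dividing the variance by $n^2$ gives $2 + 1/n$ per bin, and dividing by $n^4$ gives $2/n^2 + 1/n^3$ per bin.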
We do not think it possible (or necessary) to extract a better estimate of the variance $V$ on the estimator than

[equation (E.7) is missing in the source]

without a rather complete understanding of the details of each two-component system and the larger forest environment within which they operate.
Baker, S., and R. D. Cousins. 1984. Clarification of the use of chi-square and likelihood functions in fits to histograms. Nuclear Instruments and Methods in Physics Research 221:437–442.
Kelly, C. K., H. Banyard Smith, Y. M. Buckley, R. Carter, M. Franco, W. Johnson, T. Jones, B. May, R. Perez Ishiwara, A. Perez-Jimenez, A. Solis Magallanes, H. Steers, and C. Waterman. 2001. Investigations in commonness and rarity: a comparative analysis of co-occurring, congeneric Mexican trees. Ecology Letters 4:618–627.