## boot.glmm.R: R code for estimating P-values by applying the bootstrap to a GLMM likelihood ratio statistic.
## David I. Warton and Francis K. C. Hui
## School of Mathematics and Statistics
## The University of New South Wales
## Last modified: 27/7/10
## This code could be modified to calculate any other statistic by modifying the definition of “p.obs” (the observed P-value) and “ps[i]” (the P-value for the ith resample) – the key point is that the same code needs to be used to calculate each, except “p.obs” uses the original dataset (“dataset”), and “ps[i]” uses the resampled dataset (“data.i”).
## The code has been written so that it can be applied to data with the same structure as that of the supplementary file glmmeg.R:
## “dataset” is a dataframe containing:
## - “success” (the binomial counts)
## - “sample.size” (the number of trials for each count, which need not be constant)
## - “location” (the explanatory variable being used in testing, whether a continuous variable, a factor describing treatment levels, ...)
## - “species” (indicating the different species used in analysis, for which different random effects will be fitted).
## Bootstrapping GLMM's is computationally intensive - if we calculate 1000 GLMM’s, for a dataset of the size of that generated in the glmm.R supplementary file, it might take a couple of minutes to run. For a smaller dataset (for which resampling is more likely to actually be needed), it will take less time.
####################################################################
#First calculate a test statistic for the original dataset:
n.bootstrap=100
fit = glmer(cbind(success, sample.size - success) ~ location + (1|species), family = binomial, data=dataset)
fit2 = glmer(cbind(success, sample.size - success) ~ 1 + (1|species), family = binomial, data=dataset)
p.obs = anova(fit2, fit)$"Pr(>Chisq)"[2] # LR test
n.obs = dim(dataset)[1]
#Now generate references for bootstrap samples:
boot.ref = sample(n.obs, n.bootstrap*n.obs, replace=T)
boot.ref = matrix(boot.ref, n.obs, n.bootstrap)
p.count = 0
eps = 1.e-8
data.i=dataset
ps = rep(NA, n.bootstrap)
#Recalculate the test stat for each bootstrap sample:
for( i in 1:n.bootstrap )
{
i.ref = boot.ref[,i]
data.i$success = dataset$success[i.ref]
data.i$sample.size = dataset$sample.size[i.ref]
fit = glmer(cbind(success, sample.size - success) ~ location + (1|species), binomial, data = data.i)
fit2 = glmer(cbind(success, sample.size - success) ~ 1 + (1|species), binomial, data = data.i)
ps[i] = anova(fit2, fit)$"Pr(>Chisq)"[2] # LR test
}
ps[is.na(ps)]=1
p.boot = mean(ps<(p.obs+eps))
print(paste("Bootstrap P-value:",p.boot))