Appendix B. Performance of regression quantile rankscore tests for models with hidden bias.
Confidence intervals for regression
quantile estimates commonly are computed by inverting rankscore testing procedures.
Interval coverage for estimates made with hidden bias in the models was estimated
with a simulation experiment evaluating Type I error rates of the asymptotic
Chi-square T, permutation F, and double permutation F rankscore
tests (Cade 2003). One thousand random samples for n
= 20, 30, 60, 90, 150, and 300 were drawn without replacement from the finite
population of N = 10,000 blocks for the interference interaction generating
model with no spatial structure, no correlation between measured (X1)
and unmeasured (X2) variables, and
β0 = 1.0, β1 = 0.41, β2 = 0.0, and β3 = -0.0001
as in Fig. 1. Data were generated with random number functions available in
S-Plus 2000 (Insightful Corporation, Seattle, Washington, USA). Estimates of
β0(τ) and β1(τ) were made for each sample, and null hypotheses
H0: β0(τ) = ξ0(τ) and H0: β1(τ) = ξ1(τ) were evaluated, where ξ0(τ)
were the parameter values 5.0662, 2.7230, 2.0720, 1.4503, 1.1368, 0.9186, 0.7304,
0.5935, and 0.3784 and ξ1(τ)
were the parameter values 0.3445, 0.3533, 0.3464, 0.3034, 0.2139, 0.1170, 0.0628,
0.0431, and 0.0227 corresponding to τ = 0.99, 0.95, 0.90,
0.75, 0.50, 0.25, 0.10, 0.05, and 0.01 quantiles, respectively (Fig. 1). This
simulation approach evaluated whether the confidence interval coverage estimated
by inverting the rankscore tests included the parameter values
ξ0(τ) and ξ1(τ) with the stated confidence level (1 - α)
given that the error distribution included effects of the unmeasured variable.
We previously established the degree to which ξ0(τ) and ξ1(τ)
were biased relative to β0(τ) and β1(τ).
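The Type I error estimate described above is the proportion of samples in which a true null hypothesis is rejected, and one minus that proportion estimates interval coverage. The following is a minimal sketch of such a harness in Python, not the procedure run with Blossom. The population is synthetic, the function names are hypothetical, and an exact binomial test of a single population quantile stands in for the rankscore tests, which are not implemented here.

```python
import random
from math import comb


def binomial_quantile_pvalue(sample, xi, tau):
    """Exact two-sided binomial p-value for H0: the tau-th quantile equals xi.

    Stand-in test (an assumption of this sketch), not a rankscore test.
    """
    n = len(sample)
    k = sum(1 for y in sample if y <= xi)  # observations at or below xi
    pmf = [comb(n, j) * tau**j * (1 - tau) ** (n - j) for j in range(n + 1)]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in pmf if p <= pmf[k] * (1 + 1e-9)))


def type1_error_rate(population, pvalue_fn, n, n_sims=1000, alpha=0.05, seed=1):
    """Proportion of samples (drawn without replacement) that reject a true H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = rng.sample(population, n)  # sampling without replacement
        if pvalue_fn(sample) <= alpha:
            rejections += 1
    return rejections / n_sims


# Synthetic finite population of N = 10,000 values (hypothetical data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(0.0, 1.0) for _ in range(10_000)]
tau = 0.50
# Population tau-th quantile, so that H0 is true by construction.
xi = sorted(population)[int(tau * len(population)) - 1]

rate = type1_error_rate(
    population, lambda s: binomial_quantile_pvalue(s, xi, tau), n=30
)
coverage = 1.0 - rate  # estimated coverage of the interval from test inversion
print(f"Type I error rate: {rate:.3f}, coverage: {coverage:.3f}")
```

Any test that returns a P value for a sample can be passed as `pvalue_fn`, so the same loop could be repeated over each n and each quantile.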
Rankscore tests for hypotheses on parameters were evaluated either with a Chi-square
distribution (T) or by permutation (F) and were conducted with
routines available in the Blossom statistical package (http://www.fort.usgs.gov/products/software/software.asp).
Unweighted estimates and rankscore
tests provided liberal error rates for H0: β1(τ) = ξ1(τ) for τ < 0.90,
consistent with simulations when the model form was completely specified
and heterogeneity was >5 standard deviations across the domain of X
(Cade 2003). Only at higher quantiles (τ ≥ 0.90), where the rate of change
in ξ1(τ) was reduced (see Fig. 1), did unweighted estimates and rankscore tests provide reasonable
coverage (Fig. B1). The permutation F rankscore test maintained better
Type I error rates than the T rankscore test for smaller samples at more
extreme quantiles (Fig. B1), similar to simulations without hidden bias and
a location-scale form of heterogeneity (Cade 2003). Unweighted
estimates and T rankscore tests provided good Type I error rates for
H0: β0(τ) = ξ0(τ) across all but the most
extreme quantiles (τ = 0.01 and 0.99), whereas F rankscore
tests had slightly liberal error rates because the permutation structure used
did not account for all the sampling variability when null models were forced
through the origin (Fig. B2), similar to simulations in Cade
(2003). A double permutation scheme (Cade 2003) improved
Type I error rates for the permutation F test as demonstrated below for
weighted estimates of β1(τ).
Weighted estimates and rankscore
tests were simulated by constructing weights based on the regression quantile
parameters for the N = 10,000 finite population (Fig. 1). The pattern
of increasing ξ1(τ) with increasing τ was not a simple location-scale form because
changes in ξ1(τ) did not mirror those of ξ0(τ)
across all quantiles, although they both increased linearly for 0.20 ≤ τ ≤ 0.80 (Fig. 1).
A variant of the bandwidth approach based on changes in ξ0(τ) and ξ1(τ) near the quantile
(τ) of interest (Koenker and Machado 1999)
was used to provide weights for weighted estimates and rankscore tests in simulations.
Weights were, thus, based on the N = 10,000 population and not estimated
for different samples to avoid undue complexity in the simulation experiment.
Weights were computed by taking the average pairwise difference between values of
ξ0(τ) and between values of ξ1(τ) in an interval of τ ± 0.01 for
τ = 0.05, 0.10, 0.90, and 0.95 and in an interval of τ ± 0.005 for
τ = 0.01 and 0.99. For τ = 0.25, 0.50, and 0.75 there were similar rates of change
in the parameters, and weights were computed based on pairwise differences in the interval
τ = [0.25, 0.75]. Weights, w(τ), were the reciprocal
of the average pairwise differences divided by the associated interval width
used in their computation (0.01, 0.02, or 0.50): w(0.99) = (48.825 + 0.377X1)^-1,
w(0.95) = (9.195 - 0.041X1)^-1, w(0.90) = (3.270 + 0.051X1)^-1,
w(0.75) = w(0.50) = w(0.25) = (0.286 + 0.110X1)^-1,
w(0.10) = (0.900 + 0.129X1)^-1, w(0.05) = (0.774 + 0.169X1)^-1,
and w(0.01) = (2.589 + 0.288X1)^-1. Weights, w(τ),
were then multiplied by y and X to compute weighted regression
quantile estimates and their associated rankscore tests.
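The weighting step can be illustrated with a short Python sketch. The coefficients are copied from the weight functions listed above; the function names and the example data are hypothetical, and the sketch assumes that X includes the intercept column, so the weighted intercept column equals w(τ) itself. It stops at the weighted variables and does not fit the regression quantiles.

```python
# (intercept, slope) of the linear function of X1 whose reciprocal is w(tau).
WEIGHT_COEFS = {
    0.99: (48.825, 0.377),
    0.95: (9.195, -0.041),
    0.90: (3.270, 0.051),
    0.75: (0.286, 0.110),
    0.50: (0.286, 0.110),
    0.25: (0.286, 0.110),
    0.10: (0.900, 0.129),
    0.05: (0.774, 0.169),
    0.01: (2.589, 0.288),
}


def weight(tau, x1):
    """Return w(tau) = 1 / (a + b * X1) for one observation."""
    a, b = WEIGHT_COEFS[tau]
    scale = a + b * x1
    if scale <= 0:
        raise ValueError("a + b * X1 must be positive to give a valid weight")
    return 1.0 / scale


def weight_observations(tau, y, x1):
    """Multiply y, the intercept column, and X1 by w(tau)."""
    w = [weight(tau, x) for x in x1]
    wy = [wi * yi for wi, yi in zip(w, y)]
    w_intercept = list(w)  # w * 1 for the intercept column (assumed part of X)
    wx1 = [wi * xi for wi, xi in zip(w, x1)]
    return wy, w_intercept, wx1


# Hypothetical observations, for illustration only.
y = [3.0, 5.5, 9.0]
x1 = [2.0, 10.0, 20.0]
wy, w_intercept, wx1 = weight_observations(0.50, y, x1)
print(wy, w_intercept, wx1)
```

Because the intercept column is itself multiplied by w(τ), the weighted model has no constant column, which is one way to see why a weighted null model is effectively forced through the origin.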
Type I error rates for H0: β1(τ) = ξ1(τ)
were maintained for the weighted T test across all quantiles except for
τ = 0.01 (Fig. B3). The weighted F test required a
double permutation scheme to maintain correct Type I errors because under the
null model the estimate is implicitly forced through the origin (Cade
2003). At extreme quantiles and smaller n the weighted T test
became extremely conservative compared to the weighted F test. An example
of the effect of the standard permutation compared to the double permutation scheme
for weighted regression quantile estimates is in Fig. B4. Weighting
improved error rates for H0: β0(τ) = ξ0(τ) for the T test
and double permutation F test for most quantiles (Fig. B5). Both weighted
T and F rankscore tests had Type I error rates that deviated more
from nominal values at smaller sample sizes (n < 150) and more extreme
quantiles (τ = 0.01, 0.05, 0.95, and 0.99).
Type I error rates for the cubic
polynomial trend surface were evaluated for the interference interaction model
with no spatial structuring (Fig. 1). Estimates of β0(τ), β1(τ), β2(τ), ..., β10(τ)
were made for each sample and the null hypothesis H0: β2(τ) = β3(τ) = ... = β10(τ) = 0
was tested, where β2(τ) through β10(τ) were parameters
(all zero) for the nine terms of the full cubic polynomial trend surface. Here
Type I error rates were well maintained by both the T and F rankscore
tests because under the alternative model there was no relation between the
spatial trend surface and the response for any quantile (Fig. B6). The permutation
evaluation of the F statistic provided better Type I error rates than
the asymptotic Chi-square evaluation of the T statistic for smaller n
at more extreme quantiles, as also observed for models without hidden bias (Cade
2003).
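The nine terms of a full cubic polynomial trend surface are all products of the two spatial coordinates with total degree 1 through 3. A minimal Python sketch that enumerates them follows. The coordinate names LAT and LONG, the function name, and the ordering of terms are assumptions for illustration, since the appendix does not state which parameter index belongs to which term.

```python
def cubic_trend_terms(lat, lon):
    """Return (label, value) pairs for the nine cubic trend surface terms."""
    terms = []
    for degree in range(1, 4):            # total degree 1, 2, 3
        for i in range(degree, -1, -1):   # power of LAT
            j = degree - i                # power of LONG
            terms.append((f"LAT^{i}*LONG^{j}", lat**i * lon**j))
    return terms


terms = cubic_trend_terms(2.0, 3.0)
for label, value in terms:
    print(label, value)
```

Degree 1 contributes two terms, degree 2 three, and degree 3 four, which gives the nine parameters tested jointly against zero.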
A small simulation experiment was
conducted to evaluate power of the regression quantile estimates and rankscore
tests to detect spatial trend surfaces. Samples (n = 1,000) were taken
from the spatially structured, interference interaction population of N = 10,000
blocks (Fig. 3), and the model y = β0(τ)X0 + β1(τ)X1 + β2(τ)X1×LAT + β3(τ)X1×LONG
+ β4(τ)X1×LAT^2 + β5(τ)X1×LONG^2 + β6(τ)X1×LONG^3
was estimated and rankscore tests conducted for H0: β2(τ) = β3(τ) = ... = β6(τ) = 0.
Because the simulated effect of the spatial trend surface was homogeneous
across quantiles, no weighting was used in the simulations. Power greater than
80% with α = 0.05 was achieved for n ≥ 150 for τ = 0.05 - 0.90.
Power was 52% for τ = 0.95, 30% for τ = 0.01, and 7% for τ = 0.99
at n = 150. The F test had slightly greater power than
the T test for τ = 0.01 and 0.99 at n
< 150 and equivalent power otherwise.
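As a rough illustration of the power simulation, the sketch below builds one row of the design matrix for the trend surface model above and estimates power as the proportion of simulated P values at or below α. It assumes X0 is a constant column of ones; the function names and the P values in the example are hypothetical, and no rankscore test is carried out.

```python
def power_model_row(x1, lat, lon):
    """Design row: X0, X1, X1*LAT, X1*LONG, X1*LAT^2, X1*LONG^2, X1*LONG^3."""
    return [
        1.0,            # X0, assumed to be the intercept column
        x1,
        x1 * lat,
        x1 * lon,
        x1 * lat**2,
        x1 * lon**2,
        x1 * lon**3,
    ]


def estimated_power(p_values, alpha=0.05):
    """Proportion of simulated tests of a false H0 that reject at level alpha."""
    return sum(1 for p in p_values if p <= alpha) / len(p_values)


row = power_model_row(2.0, 3.0, 4.0)
power = estimated_power([0.01, 0.20, 0.04, 0.50])  # made-up P values
print(row, power)
```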
Cade, B. S. 2003. Quantile regression models of animal habitat relationships. Dissertation, Colorado State University, Fort Collins, Colorado, USA.
Koenker, R., and J. A. F. Machado. 1999. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association 94:1296-1310.
FIG. B1. Estimated Type I error rates for ...
FIG. B2. Estimated Type I error rates for ...
FIG. B3. Estimated Type I error rates for ...
FIG. B4. Cumulative distributions for 1,000 estimated Type I errors for the Chi-square
distributed T (dashed line), permutation F (square dotted line),
and double permutation F (solid line) weighted rankscore tests for H0: ...
FIG. B5. Estimated Type I error rates for ...
FIG. B6. Estimated Type I error rates for ...