Ecological Archives A025-041-A1

Sigrid D. P. Smith, Peter B. McIntyre, Benjamin S. Halpern, Roger M. Cooke, Adrienne L. Marino, Gregory L. Boyer, Andy Buchsbaum, G. A. Burton Jr., Linda M. Campbell, Jan J. H. Ciborowski, Patrick J. Doran, Dana M. Infante, Lucinda B. Johnson, Jennifer G. Read, Joan B. Rose, Edward S. Rutherford, Alan D. Steinman, and J. David Allan. 2015. Rating impacts in a multi-stressor world: a quantitative assessment of 50 stressors affecting the Great Lakes. Ecological Applications 25:717–728. http://dx.doi.org/10.1890/14-0366.1

Appendix A. Additional methods and complete survey instrument to assess the ecological impact of environmental stressors in the Laurentian Great Lakes.

Choice of elicitation approach:

Many decision-making techniques have been used for risk assessment and other environmental decision-making applications. Group discussions by panels of experts allow the experts to exchange knowledge and take joint responsibility for decisions. However, such discussions can be overly cautious, giving too much weight to alternative viewpoints (Aspinall 2010, Martin et al. 2012), and outcomes can be swayed by "charismatic, confident personalities" (Aspinall 2010) and by the tendency to adjust responses based on knowledge of others' choices (Bikhchandani et al. 1992). Reports also rarely document how decisions or consensus were reached, thereby seeming opaque for users not involved in the original process and lacking scientific rigor due to unrepeatability (Kappel et al. 2012). The Delphi method is one of several group-based decision-making techniques with more structure, where consensus is built by iteratively collecting panelists' opinions via questionnaires and sharing answers anonymously to let participants incorporate others' views into their opinions (Aspinall 2010). However, these techniques can be fraught with unproductive group dynamics and design challenges, and outcomes can be misconstrued as carrying more certainty than may be warranted (Aspinall 2010, Cole et al. 2013).

Elicitation of quantitative ratings is an alternate means of assessing expert opinion to guide environmental decision-making. Individually-administered questionnaires or interviews can describe the full range of opinions on a topic, avoiding artificial consensus from group discussions (e.g., Kappel et al. 2012). Questions can be constructed to elicit the information needed for decision-making (e.g., rating the risk of a given event) either directly or indirectly. Direct elicitation questions (e.g., asking the risk of the event without breaking down the judgment) can be simpler to administer, analyze, and report. On the other hand, indirect elicitation, in which experts answer more focused questions indirectly related to the final information needed and data analysts process the responses to derive the overall judgment, also can be effective (Weber and Borcherding 1993, Low Choy et al. 2009, Martin et al. 2012). For example, in risk assessments, multiple individual lines of evidence can be blended together mathematically through multicriteria decision analysis to understand the relative importance of different risks (Keeney and Raiffa 1976, Linkov et al. 2011). These indirect weight-of-evidence methods foster objectivity and facilitate process documentation for repeatability and transparency by breaking down a judgment into explicit, logical steps and documenting the specific data used in each step (Linkov et al. 2011).

Multicriteria decision analysis and validation:

The scenario comparisons (Part III of survey) used a particular form of multicriteria decision analysis from economics called multi-attribute utility theory, since it offered robust methodology and validation procedures (Neslo et al. 2008), and it had been used in similar ecosystem assessments previously (see main text). Since the scenario comparisons required significant time investments for both survey designers and survey respondents, it could be useful to evaluate the relative benefits of relying on other methods for collecting this information (e.g., direct elicitation or alternative multicriteria decision analysis frameworks). However, we caution that strong differences among these alternatives must be considered carefully (e.g., Forman and Gass 2001, Neslo et al. 2008). For our project, we felt that the indirect elicitation from utility theory offered compelling philosophical and methodological foundations, robust validation options, and a meaningful framework for conceptualizing the complicated task of comparing a large number of disparate stressors.

To analyze the scenario comparisons, the joint distribution of weights of the components was calculated to best predict the proportion of respondents ranking a given scenario as having the most impact. A custom-built program called Universe was used (2011 version, R. Cooke, http://risk2.ewi.tudelft.nl/oursoftware) to produce a weighting matrix for the 5 components of impact. This analysis was run for all respondents together and for each primary work activity separately. For all respondents together, data from the first-ranked scenarios and other scenarios that were ranked in the top 5 by at least 5% of respondents were used to give maximal accuracy for the overall population. However, when the data were analyzed for individual respondent groups, only data from the first-ranked scenarios were used to fit the model. This was because the small sample sizes and large number of zero frequencies for the two smaller respondent groups would have compounded errors during normalization.
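The forward step that such a fit relies on can be illustrated with a minimal sketch. It is not the Universe program and does not perform the inversion itself. It assumes that a scenario's score is the weighted sum of its five component levels, draws weight vectors from a uniform distribution on the simplex (a placeholder, not the fitted joint distribution), and tallies how often each scenario scores highest. The scenarios and the function names `sample_weights` and `first_rank_proportions` are hypothetical.

```python
import random

COMPONENTS = ["spatial extent", "temporal frequency", "ecological scope",
              "magnitude of change", "recovery time"]

def sample_weights(rng):
    """Draw one weight vector uniformly from the simplex (weights sum to 1)."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in COMPONENTS]
    total = sum(draws)
    return [d / total for d in draws]

def first_rank_proportions(scenarios, n_samples=10000, seed=0):
    """Share of sampled weight vectors for which each scenario scores highest."""
    rng = random.Random(seed)
    wins = [0] * len(scenarios)
    for _ in range(n_samples):
        weights = sample_weights(rng)
        # Assumed additive score: sum of weight * impact level over components.
        scores = [sum(w * level for w, level in zip(weights, s)) for s in scenarios]
        wins[scores.index(max(scores))] += 1
    return [count / n_samples for count in wins]

# Hypothetical scenarios: one impact level (1-5) per component.
scenarios = [
    (1, 2, 3, 4, 5),
    (5, 4, 3, 2, 1),
    (3, 5, 1, 2, 4),
]
proportions = first_rank_proportions(scenarios)
```

A probabilistic inversion would then adjust the distribution of weights until these predicted proportions match the proportions of respondents who ranked each scenario first.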

We performed out-of-sample validation with lower ranked scenarios for all experts by splitting the experts' rankings into a test set and a validation set. The components of impact were weighted by fitting the test set and then these weights were used to predict the proportions of experts giving each scenario a particular rank in the validation set. Comparing the experts' relative frequencies of lower ranked scenarios (test set) with those emerging from the probabilistic inversion, agreement was quite good. For example, the orderings of the probabilities of first ranked scenarios based on the experts' actual answers and on the model's predictions were strongly correlated (Spearman rank correlation test of proportions of experts ranking each scenario first, n = 20 scenarios, ρ = 0.74, p < 0.001). Thus, the model could predict reasonably well which scenarios were top ranked by the most experts, even though it did not perfectly predict the percentages of experts making those attestations. Other multicriteria decision model methods provide no out-of-sample validation, so the results cannot be compared with other methods of analyzing the scenario data. However, we were able to compare rankings of stressors derived via direct elicitation (survey Part IIA) to this indirect method (survey Parts III-IV), and we found strong agreement between the methods (see Results of main text).
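The rank correlation used in this comparison can be sketched as follows. This is a minimal illustration, assuming the statistic is computed as the Pearson correlation of average ranks; the `observed` and `predicted` proportions are hypothetical values for six scenarios, not the survey data, and the function names are our own.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical proportions of experts ranking each scenario first.
observed = [0.30, 0.22, 0.18, 0.12, 0.10, 0.08]
predicted = [0.26, 0.24, 0.12, 0.15, 0.13, 0.10]
rho = spearman_rho(observed, predicted)
```

Because only ranks enter the statistic, a high value indicates that the model orders the scenarios much as the experts did, even where the predicted percentages themselves differ.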

We found that Great Lakes experts were united in viewing all 5 components of ecosystem impact as important. This contrasted with findings from both the California Current and Massachusetts' coastal waters, where magnitude of change and ecological scope accounted for most of the total impact of stressors (Teck et al. 2010, Kappel et al. 2012). Since the weights of the components of ecosystem impact were different in each study, eliciting this information for new systems appears more necessary than previously thought (Kappel et al. 2012) and is likely more realistic than assuming equal weights (as in Halpern et al. 2007).

Survey distribution and responses:

Scientific researchers: Active researchers identified from literature searches and peer recommendations formed our largest survey group (n = 455 addressees). Individuals were primarily research scientists from government agencies, universities, consulting firms, and nonprofit organizations. Most were found using ISI Web of Science, searching the phrase "Great Lakes" and keywords associated with each stressor. Authors were defined as experts if they had published at least 3 articles on a stressor with a Great Lakes focus between January 2000 and June 2010. Separately, members of our working group identified 76 key individuals for inclusion as researchers. More than half (52.6%) of these scientists also were identified through the literature searches. In addition, the survey was distributed to our 15 working group members and 2 leaders of the project, since these personnel are recognized for their Great Lakes expertise. We have no reason to expect that the working group's exposure to earlier draft(s) of the survey or other project experiences affected their answers, and our survey pool was large enough that any biases would have minimal effects.

Managers: Natural resource managers were targeted from 3 sources (n = 203 additional addressees). The Great Lakes Fishery Commission provided a list of 82 fisheries managers in the region. The Great Lakes Commission provided a list of contacts associated with Great Lakes Areas of Concern (designated management units from the Great Lakes Water Quality Agreement), from which we included 17 individuals affiliated with state resource management agencies. Finally, we included 143 individuals from a list of participants in the Great Lakes Regional Collaboration in 2005. We chose individuals from this list associated with 4 relevant themes of the Collaboration (invasives, habitat and species, nonpoint pollution, and persistent bioaccumulating toxics) with ".gov," ".state," or ".ca" email addresses, but we removed political representatives from this list. We corrected errors in names and email addresses from this older list when possible, but we were unable to find current email addresses in 39 cases.

NGO representatives: NGO survey respondents were identified from lists provided by the National Wildlife Federation and The Nature Conservancy (TNC) (n = 129 additional addressees). The former included 116 organizations from the Healing Our Waters Coalition, an umbrella group for environmental, conservation, and outdoor recreation organizations, zoos, aquariums, and museums working on Great Lakes protection, restoration, and education. The TNC list comprised TNC representatives from each Great Lakes state, along with two representatives from its Great Lakes Project office, one from Nature Conservancy Canada, and one from Conservation Ontario. These individuals held positions as directors of science, directors of stewardship, or program managers.

Survey delivery: The survey was built and distributed online using the Qualtrics web-based platform (http://www.qualtrics.com/). The 3 survey versions were assigned to contacts randomly and evenly. We collected responses primarily from November 2010 through January 2011, and we sent survey invitations and reminders via email. We were notified of 114 undeliverable invitations out of 787 intended recipients; we attempted to resend 48 of these for which we identified an alternate address (Appendix Table B2).
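The addressee counts reported above can be cross-checked with simple arithmetic, and a random but even assignment of survey versions can be sketched in a few lines. The manager total is consistent with the three source lists less the 39 contacts without a current email address, although the text does not state that derivation explicitly. The assignment procedure shown (shuffle, then deal round-robin) is an assumption for illustration; the source does not describe how the versions were actually assigned, and `assign_versions` is a hypothetical helper.

```python
import random

# Addressee counts reported for the three survey groups.
addressees = {"researchers": 455, "managers": 203, "NGO representatives": 129}
total = sum(addressees.values())  # equals the 787 intended recipients

# Manager source lists (82, 17, 143) less 39 contacts with no current address.
manager_total = 82 + 17 + 143 - 39

def assign_versions(contacts, n_versions=3, seed=0):
    """Shuffle contacts, then deal round-robin so group sizes differ by at most 1."""
    rng = random.Random(seed)
    shuffled = list(contacts)
    rng.shuffle(shuffled)
    return {contact: i % n_versions + 1 for i, contact in enumerate(shuffled)}

# Contacts are represented here by placeholder integer IDs.
assignment = assign_versions(range(total))
sizes = [sum(1 for v in assignment.values() if v == k) for k in (1, 2, 3)]
```

Dealing from a shuffled list keeps the assignment random with respect to the contacts while guaranteeing nearly equal numbers per version.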

Survey responses: Among respondents who opened the survey, completion rates were lower for Parts III and IV: 46% (141 respondents) completed portions of the last part of the survey as instructed (Appendix Table B2), versus 64% (196 individuals) who completed the direct stressor ranking exercise as instructed in Part IIA. We found no significant differences between those finishing and not finishing the survey in gender, position type, or age (gender: CTA, df = 1, Χ² = 0.51, p = 0.47; position type: CTA, df = 3, Χ² = 1.02, p = 0.80; year of birth: Kruskal-Wallis, df = 1, Χ² = 2.35, p = 0.13), suggesting no response bias with respect to these characteristics. Respondents were primarily male (73.1% of 264 people stating gender in Part I) and had a wide age range (born 1932-1983; Appendix Fig. B1). Examining demographics by self-declared primary work activity, respondents doing "other" activities were more often female, but age did not significantly differ among groups (gender: CTA, df = 2, Χ² = 15.00, p < 0.001; year of birth: Kruskal-Wallis, df = 2, Χ² = 0.23, p = 0.89; Appendix Fig. B1).
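The reported p-values can be checked against the reported test statistics using closed-form upper-tail probabilities of the chi-square distribution for 1, 2, and 3 degrees of freedom. This is a minimal sketch, assuming each statistic is referred to a chi-square distribution with the stated df (the usual approximation for the Kruskal-Wallis test); `chi2_sf` is our own helper, and small discrepancies are expected because the statistics are rounded to two decimals.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for df = 1, 2, or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("closed form implemented only for df in {1, 2, 3}")

# (label, reported statistic, df) as given in the text above.
reported = [
    ("gender, finishing vs. not finishing", 0.51, 1),
    ("position type, finishing vs. not finishing", 1.02, 3),
    ("year of birth, finishing vs. not finishing", 2.35, 1),
    ("gender, by work activity", 15.00, 2),
    ("year of birth, by work activity", 0.23, 2),
]
p_values = {label: chi2_sf(stat, df) for label, stat, df in reported}

for label, p in p_values.items():
    print(f"{label}: p = {p:.4f}")
```

Each recomputed value agrees with the reported p-value to within rounding of the test statistic.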

Survey instrument:

Early drafts of the survey instrument were revised based on testing within our project working group and feedback from 3 regional experts not involved in survey development.

The scenario comparisons (Part III) were refined in several ways from previous studies (e.g., Teck et al. 2010). We named the scenarios numerically rather than with arbitrary stressor labels (Kappel et al. 2012), and we crafted each scenario to include all 5 levels of intensity within the ratings of the 5 components (to ensure no scenarios were better or worse than others in all components) but in different combinations. For example, a hypothetical stressor could have very low (impact level "1") spatial extent, be of low ("2") frequency, have moderate ("3") ecological scope, be of high ("4") magnitude, and have a very long ("5") recovery time. We also explicitly recommended that respondents decide which components of ecosystem impact they considered most important prior to selecting scenarios, and we color-coded the ratings.
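The design constraint described above, in which every scenario uses each of the 5 intensity levels exactly once across the 5 components, can be illustrated with a short sketch. Under that constraint a scenario is a permutation of the levels 1 to 5, so no scenario can be at least as high as another on every component unless the two are identical. Drawing 20 scenarios at random is an assumption made for illustration only; the source does not state how the particular combinations were chosen.

```python
from itertools import permutations
import random

COMPONENTS = ("spatial extent", "temporal frequency", "ecological scope",
              "magnitude of change", "recovery time")
LEVELS = (1, 2, 3, 4, 5)

# Every way of assigning each level to exactly one component.
all_scenarios = list(permutations(LEVELS))

def dominates(a, b):
    """True if a is at least as high as b on every component and higher on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Hypothetical selection of 20 distinct scenarios from the candidates.
rng = random.Random(0)
chosen = rng.sample(all_scenarios, 20)
any_dominated = any(dominates(a, b) for a in chosen for b in chosen)

# The example from the text: levels 1 through 5 assigned in component order.
example = dict(zip(COMPONENTS, (1, 2, 3, 4, 5)))
```

Because all scenarios share the same total of levels, respondents must trade off components against one another rather than simply pick the scenario that is highest overall.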

Although such effects are uncommon, survey respondents can favor items at the beginning (primacy effects) or end (recency effects) of lists in their responses (see Krosnick and Alwin 1987). To avoid any net effect of such tendencies, we varied the order of items within the 20- and 50-item lists of stressors and scenarios using 3 versions of the survey.

Printouts of 1 of the 3 versions of the online (originally html-formatted) survey follow. Note that some terminology in this study differs from that originally proposed in Halpern et al. 2007 [components of ecosystem impact (here) = vulnerability factors (Halpern et al. 2007); temporal frequency = frequency; ecological scope = functional impact; magnitude of change = resistance].

View / download GLEAM survey.

Literature cited

Aspinall, W. 2010. A route to more tractable expert advice. Nature 463:294–295.

Bikhchandani, S., D. Hirshleifer, and I. Welch. 1992. A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy 100:992–1026.

Cole, Z.D., H.M. Donohoe, and M.L. Stellefson. 2013. Internet-based Delphi research: Case based discussion. Environmental Management 51:511–523.

Forman, E.H., and S.I. Gass. 2001. The analytic hierarchy process—an exposition. Operations Research 49:469–486.

Halpern, B.S., K.A. Selkoe, F. Micheli, and C.V. Kappel. 2007. Evaluating and ranking the vulnerability of global marine ecosystems to anthropogenic threats. Conservation Biology 21:1301–1315.

Kappel, C.V., B.S. Halpern, K.A. Selkoe, and R.M. Cooke. 2012. Chapter 13: Eliciting expert knowledge of ecosystem vulnerability to human stressors to support comprehensive ocean management. Pages 253–277 in: A.H. Perera, C.A. Drew, and C.J. Johnson. (eds.), Expert Knowledge and Its Application in Landscape Ecology. Springer Science + Business Media, New York, New York, USA.

Keeney, R.L., and H. Raiffa. 1976. Decisions with multiple objectives: Preferences and value tradeoffs. J. Wiley, New York, New York, USA.

Krosnick, J.A., and D.F. Alwin. 1987. An evaluation of a cognitive theory of response-order effects in survey measurement. The Public Opinion Quarterly 51:201–219.

Linkov, I., P. Welle, D. Loney, A. Tkachuk, L. Canis, J.B. Kim, and T. Bridges. 2011. Use of multicriteria decision analysis to support weight of evidence evaluation. Risk Analysis 31:1211–1225.

Low Choy, S., R. O'Leary, and K. Mengersen. 2009. Elicitation by design in ecology: using expert opinion to inform priors for Bayesian statistical models. Ecology 90:265–277.

Martin, T.G., M.A. Burgman, F. Fidler, P.M. Kuhnert, S. Low-Choy, M. McBride, and K. Mengersen. 2012. Eliciting expert knowledge in conservation science. Conservation Biology 26:29–38.

Neslo, R., F. Micheli, C.V. Kappel, K.A. Selkoe, B.S. Halpern, and R.M. Cooke. 2008. Modeling stakeholder preferences with probabilistic inversion: application to prioritizing marine ecosystem vulnerabilities. Pages 265–284 in: I. Linkov, E. Ferguson, and V.S. Magar (eds.), Real-Time and Deliberative Decision Making. Springer Science + Business Media, Netherlands.

Teck, S.J., B.S. Halpern, C.V. Kappel, F. Micheli, K.A. Selkoe, C.M. Crain, R. Martone, C. Shearer, J. Arvai, B. Fischhoff, G. Murray, R. Neslo, and R. Cooke. 2010. Using expert judgment to estimate marine ecosystem vulnerability in the California Current. Ecological Applications 20:1402–1416.

Weber, M., and K. Borcherding. 1993. Behavioral influences on weight judgments in multiattribute decision making. European Journal of Operational Research 67:1–12.
