Alexander Y. Karatayev, Lyubov E. Burlakova, Sergey E. Mastitsky, and Dianna K. Padilla. 2015. Predicting the spread of aquatic invaders: insight from 200 years of invasion by zebra mussels. Ecological Applications 25:430–440. http://dx.doi.org/10.1890/13-1339.1
Supplement
R code and the data set necessary to conduct the Random Forest analysis.
Ecological Archives A025-027-S1.
Authors
Alexander Y. Karatayev
Great Lakes Center, SUNY Buffalo State, Buffalo, NY, USA
E-mail: [email protected]
Lyubov E. Burlakova
Great Lakes Center, SUNY Buffalo State, Buffalo, NY, USA
The Research Foundation of The State University of New York
SUNY Buffalo State, Office of Sponsored Programs, Buffalo, New York, USA
E-mail: [email protected]
Sergey E. Mastitsky
RNT Consulting, Ontario, Canada
E-mail: [email protected]
Dianna K. Padilla
Department of Ecology and Evolution, Stony Brook University,
Stony Brook, New York 11794-5245 USA
E-mail: [email protected]
File list
dreissena_in_lakes_of_belarus.csv (MD5: 3dc2d2f89af3064223358983c785771d)
r_script_random_forest.R (MD5: af1295890d60bc832955e940889e4575)
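The MD5 checksums listed above can be used to confirm that both files were downloaded intact. The following is a minimal sketch, assuming the two files sit in the current R working directory under the names given in the file list; check_md5() is a hypothetical helper written for illustration and is not part of the supplement.

```r
# Expected MD5 checksums, copied from the file list above.
expected_md5 <- c(
  "dreissena_in_lakes_of_belarus.csv" = "3dc2d2f89af3064223358983c785771d",
  "r_script_random_forest.R"          = "af1295890d60bc832955e940889e4575"
)

# check_md5() is a hypothetical helper, not part of the supplement.
# For each file it returns TRUE (checksum matches), FALSE (mismatch),
# or NA (file not found or unreadable in 'dir').
check_md5 <- function(expected, dir = ".") {
  paths  <- file.path(dir, names(expected))
  actual <- unname(tools::md5sum(paths))  # NA for files that do not exist
  setNames(actual == unname(expected), names(expected))
}

print(check_md5(expected_md5))
```

A FALSE for either file suggests that the download was altered or incomplete (for example, line endings converted by a text editor), and an NA means the file was not found in the directory searched.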
Description
This Supplementary material contains two files necessary to fully reproduce the results obtained using the Random Forest classifier. The first of these files, dreissena_in_lakes_of_belarus.csv, is a plain text table that has 553 records, each described with the following variables:
1. Lake_Code: numeric codes uniquely identifying each lake (for reference only, not used explicitly in the analysis).
2. ZMpresence: indicator of whether a lake is infested with zebra mussels (0 = non-infested, 1 = infested).
3. LAREA: lake area
4. LVOL: lake volume
5. MAXD: maximal depth
6. AVED: average depth
7. SPECWATSHED: specific watershed (i.e., drainage area)
8. TRANSP: Secchi depth
9. COLOR: water color
10. pH: water pH
11. HCO3: HCO3 content
12. SO4: SO4 content
13. Cl: Cl content
14. Ca: Ca content
15. Mg: Mg content
16. TDS: total dissolved solids
17. Fe: Fe content
18. Si: Si content
19. NH4: NH4 content
20. NO2: NO2 content
21. PO4: PO4 content
22. PermOx: permanganate oxidizability
23. N: latitude (decimal degrees)
24. E: longitude (decimal degrees)
Missing values in the data set are denoted as NA.
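The structure described above can be checked right after the table is read into R. The following is a minimal sketch, not the authors' script: it assumes that the file is comma-separated and has a header row whose column names match the variable names listed above exactly. load_lakes() and expected_cols are hypothetical names introduced for illustration.

```r
# Column names as listed in the description above (assumed to match the
# header row of the CSV file exactly).
expected_cols <- c(
  "Lake_Code", "ZMpresence", "LAREA", "LVOL", "MAXD", "AVED",
  "SPECWATSHED", "TRANSP", "COLOR", "pH", "HCO3", "SO4", "Cl", "Ca",
  "Mg", "TDS", "Fe", "Si", "NH4", "NO2", "PO4", "PermOx", "N", "E"
)

# load_lakes() is a hypothetical helper: it reads the table, treats the
# string "NA" as a missing value, and stops if the layout is unexpected.
load_lakes <- function(path) {
  lakes <- read.csv(path, na.strings = "NA")
  missing_cols <- setdiff(expected_cols, names(lakes))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  if (!all(lakes$ZMpresence %in% c(0, 1, NA))) {
    stop("ZMpresence should contain only 0, 1, or NA")
  }
  lakes
}

path <- "dreissena_in_lakes_of_belarus.csv"
if (file.exists(path)) {
  lakes <- load_lakes(path)
  cat("Records:", nrow(lakes), "\n")          # the description states 553
  print(table(lakes$ZMpresence, useNA = "ifany"))
  print(colSums(is.na(lakes)))                # missing values per variable
}
```

Comparing the printed record count with the 553 records stated in the description is a quick way to confirm that the whole table was read.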
The second file, r_script_random_forest.R, loads the data into R (assuming that the file dreissena_in_lakes_of_belarus.csv is stored in the current R working directory), fits the Random Forest model, and plots the results. The analysis relies on four add-on packages: caret, geosphere, randomForest, and ggplot2. All of these packages are assumed to be already installed on the user's computer. If they are not, they can be downloaded freely from the Comprehensive R Archive Network (cran.r-project.org) or installed directly from within R using the following command: install.packages(c("caret", "geosphere", "randomForest", "ggplot2")).
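Before running the script, it may help to check which of the required packages are already installed, so that only the missing ones are downloaded. The following is a minimal sketch; missing_packages() is a hypothetical helper and not part of the supplement, and the install.packages() call is left commented out so that nothing is downloaded unintentionally.

```r
# Packages named in the description above.
required <- c("caret", "geosphere", "randomForest", "ggplot2")

# missing_packages() is a hypothetical helper: it returns the names of
# the packages in 'pkgs' that cannot be loaded on this computer.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All required packages are installed.")
}
```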