Ecological Archives E091-045-D1

James A. Falcone, Daren M. Carlisle, David M. Wolock, and Michael R. Meador. 2010. GAGES: A stream gage database for evaluating natural and altered flow conditions in the conterminous United States. Ecology 91:621.


Identifying the attributes of streamflow required to sustain aquatic ecosystems is critical to developing goals for riverine management or restoration (Richter et al. 1997, Arthington et al. 2006). Riverine biota have evolved in the context of a "natural flow regime" – the quantity, timing, and variability of flow unaffected by human influences over many years – and quantifying that flow regime is important for maintaining ecosystem function and natural biodiversity (Richter et al. 1996, Poff et al. 1997, Poff et al. 2009). Knowledge of the natural flow regime would facilitate assessment of which hydrologic attributes have been altered by humans in a particular stream and the establishment of specific goals for streamflow restoration (Richter et al., 1997). To that end, stream gage-specific expectations or benchmarks for specific hydrologic indicators (e.g., magnitude, frequency, duration, or timing of high and low flows) are desirable for the study and management of river ecology.

Data describing streamflow characteristics are obtained primarily from stream gages, which record river flow every day at predetermined intervals ranging from minutes to hours. Knowing the characteristics of natural streamflow for any specific stream or river is in most cases not possible because most streams are either ungaged or have been altered by human influences. The natural flow regime must therefore usually be estimated. A viable approach for quantifying the natural flow regime for a particular place is to consider data from nearby or regional gaged locations of reference quality, i.e., stream gages whose watersheds are least disturbed for a particular region (Vogel et al. 1999).

There are at least two requirements for designing a reference stream gage data set for predicting natural flow regimes. First, we need a population of sites having adequate long-term streamflow data from watersheds influenced primarily by natural processes (e.g., climate, topography). Second, this population of reference stream gages should encompass a wide range of natural conditions (e.g., watershed size) in order to maximize the generality of model predictions.

Natural flow has been modeled or estimated at various scales in the United States (Thomas and Benson 1970, Orsborn 1974, Wallis et al. 1991, Vogel et al. 1999, Holmes et al. 2002, Sanborn and Bledsoe 2006, Carlisle et al. 2009). Several older national-scale data sets include stream gages or streamflow records which had been screened for minimal human influence (Wallis et al. 1991, Slack and Landwehr 1992, Poff 1996); however, they were derived before easy access existed to comprehensive digital geospatial data (e.g., high-resolution imagery, 30-m land cover data, national streamline data set, national dams database). The most comprehensive of these national data sets is the United States Geological Survey’s (USGS) hydro-climatic data network (HCDN; Slack and Landwehr 1992). The HCDN, whose data evaluation ended with water year 1988, identifies stream gages which at some point in their history had periods which represented natural flow, and the years in which those natural flows occurred were identified (Slack and Landwehr 1992). The HCDN watersheds remain a valuable source of historic streamflow data. Because technology and databases have evolved considerably since the late 1980s, however, we believe the identification of streams and rivers in the United States that currently have near-natural flow conditions would be a valuable addition. Most importantly, there is a critical need to publish a comprehensive set of geospatial characteristics of gaged watersheds so that the research community and public can use these data for a variety of purposes.


A. Data set identity: GAGES (Geospatial Attributes of Gages for Evaluating Streamflow): A stream gage database for evaluating natural and altered flow conditions in the conterminous United States.

B. Data set identification code:

C. Data set description: Zipped file containing 25 tab-delimited ASCII text files, representing watershed and site characteristics, one ASCII tab-delimited text file containing variable descriptions, and one MS Excel spreadsheet showing the hydrologic disturbance index score calculation.

Each of the 25 text data files contains 6,785 records, one for each stream gage, identified in each file by the unique identifier "GAGE_ID". The 25 files are: basinid.txt, bas_classif.txt, bas_morph.txt, census_block.txt, census_county.txt, climate.txt, geology.txt, hydro.txt, hydromod_dams.txt, hydromod_other.txt, infrastructure.txt, landscape_pat.txt, lc01_basin.txt, lc_change92_01.txt, lc01_mains100.txt, lc01_mains800.txt, lc01_rip100.txt, lc01_rip800.txt, nutrient_app.txt, pest_app.txt, prot_areas.txt, reach.txt, regions.txt, soils.txt, and topo.txt.
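Because every file shares the GAGE_ID key, the tables can be combined into one record per gage. The following is a minimal sketch using only the Python standard library. The function names, the choice to prefix each column with its file name, and the decision to read GAGE_ID as text are assumptions made for illustration and are not part of the data set.

```python
import csv
from pathlib import Path


def read_table(path):
    """Read one tab-delimited GAGES file into {GAGE_ID: row dict}."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # Keep GAGE_ID as text so any leading zeros are preserved.
        return {row["GAGE_ID"]: row for row in reader}


def merge_tables(paths):
    """Combine several GAGES files into one attribute dict per gage."""
    merged = {}
    for path in paths:
        stem = Path(path).stem  # e.g. "climate" for climate.txt
        for gage_id, row in read_table(path).items():
            record = merged.setdefault(gage_id, {"GAGE_ID": gage_id})
            for column, value in row.items():
                if column != "GAGE_ID":
                    # Prefix with the file stem to avoid column-name clashes.
                    record[f"{stem}.{column}"] = value
    return merged
```

Calling merge_tables with the paths of, say, climate.txt and topo.txt would return one dictionary per gage whose keys look like "climate.<variable>" and "topo.<variable>". All values remain strings, so numeric variables need to be converted before analysis.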

The variable descriptions are provided in gages_variable_desc_sept3_09.txt. It contains 375 records (one for each variable), ASCII text, tab-delimited.

The Excel spreadsheet, disturb_index6785_sept3_09.xls, is provided as a convenience only, as a demonstration of how the disturbance index scores were assigned.

Principal Investigator(s):

James A. Falcone
National Water-Quality Assessment Program
U.S. Geological Survey
Reston, Virginia 20192 USA

Daren M. Carlisle
National Water-Quality Assessment Program
U.S. Geological Survey
Reston, Virginia 20192 USA

David M. Wolock
National Water-Quality Assessment Program
U.S. Geological Survey
Lawrence, Kansas 66049 USA

Michael R. Meador
National Water-Quality Assessment Program
U.S. Geological Survey
Reston, Virginia 20192 USA

Abstract. Streamflow is a controlling element in the ecology of rivers and streams. Knowledge of the natural flow regime facilitates the assessment of whether specific hydrologic attributes have been altered by humans in a particular stream and the establishment of specific goals for streamflow restoration. Because most streams are ungaged or have been altered by human influences, characterizing the natural flow regime is often only possible by estimating flow characteristics based on nearby stream gages of reference quality, i.e., gaged locations that are least-disturbed by human influences. The ability to evaluate natural streamflow, that which is not altered by human activities, would be enhanced by the existence of a nationally consistent and up-to-date database of gages in relatively undisturbed watersheds.

As part of a national effort to characterize streamflow effects on ecological condition, data for 6785 U.S. Geological Survey (USGS) stream gages and their upstream watersheds were compiled. The sites comprise all USGS stream gages in the conterminous United States with at least 20 years of complete-year flow record from 1950–2007, and for which watershed boundaries could reliably be delineated (median size = 578 km²). Several hundred watershed and site characteristics were calculated or compiled from national data sources, including environmental features (e.g., climate, geology, soils, topography) and anthropogenic influences (e.g., land use, roads, presence of dams or canals).

In addition, watersheds were assessed for their reference quality within nine broad regions for use in studies intended to characterize streamflows under conditions minimally influenced by human activities. Three primary criteria were used to assess reference quality: (1) a quantitative index of anthropogenic modification within the watershed based on GIS-derived variables, (2) visual inspection of every stream gage and drainage basin from recent high-resolution imagery and topographic maps, and (3) information about man-made influences from USGS Annual Water Data Reports. From the set of 6785 sites, we identified 1512 as reference-quality stream gages. All data derived for these watersheds as well as the reference condition evaluation are provided as an online data set termed GAGES (Geospatial Attributes of Gages for Evaluating Streamflow).

D. Key words: aquatic ecology; dams; hydrologic condition; hydrologic modification; natural flow regime; reference stream gages; streamflow; stream gage network; stream gages; water withdrawal.


A. Overall project description

Identity: GAGES (Geospatial Attributes of Gages for Evaluating Streamflow): A stream gage database for evaluating natural and altered flow conditions in the conterminous United States.

Originator: Original concept and rationale for this project are from Daren M. Carlisle, David M. Wolock, and Michael R. Meador.

Period of Study: Streamflow records for the stream gages in this data set are from the period 1950–2007.

Objectives: Our primary objective was to delineate watersheds and compile geospatial information for USGS stream gages in the conterminous United States that have relatively long periods of record since 1950. Secondarily, we sought to identify watersheds with USGS stream gages that could be used to characterize streamflows which are minimally affected by hydrologic alteration for major environmental settings within the conterminous United States. Specifically, our intention was to apply a consistent set of criteria to identify watersheds with adequate streamflow records, minimal anthropogenic influences based on the most recent data available, and that represented the natural hydroclimatic and land use regions encountered in the conterminous United States. To do this we assembled a large database describing the natural and anthropogenic conditions and features of these watersheds. The primary purpose of this report is to describe the methods for assembling those data and the results, and to provide links for accessing the data, posted as an online data set. This project was undertaken and supported by the USGS’s National Water-Quality Assessment Program (NAWQA Program; U.S. Geological Survey, 2008a).

Abstract: same as above.

Source(s) of funding: USGS National Water-Quality Assessment Program.

B. Specific subproject description

Site description: Conterminous United States.

Research methods:

Gage screening and watershed boundary delineation

We applied several screens for including stream gages in the database. The first screen was a minimum period of record. Although no definitive guideline exists, the HCDN network was based on a 20-year minimum period of record (Slack and Landwehr, 1992), and other researchers suggest that at least 20 years of records are required to reliably characterize hydrologic conditions (Poff, 1996; Richter et al., 1997; Schaake et al., 2000). We therefore limited the stream gages selected to those USGS gages with at least 20 years of complete record (365 or 366 daily values in each of the 20 years) in the period 1950–2007. The 20 years did not have to be continuous. Both active and inactive stream gages were considered.
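The period-of-record screen can be sketched as follows. This is an illustrative sketch only, not the code used for GAGES: the function names are hypothetical, the input is assumed to be the calendar dates on which a daily value exists, and years are assumed to be calendar years because the text does not say whether calendar or water years were used.

```python
import calendar
from collections import Counter


def complete_years(dates, start=1950, end=2007):
    """Return the years in [start, end] that have a value for every day.

    `dates` is an iterable of datetime.date objects, one per daily value.
    """
    # set() guards against duplicate entries for the same day.
    per_year = Counter(d.year for d in set(dates) if start <= d.year <= end)
    return sorted(
        year
        for year, n_days in per_year.items()
        if n_days == (366 if calendar.isleap(year) else 365)
    )


def passes_record_screen(dates, min_years=20):
    """True if at least `min_years` complete years exist.

    The complete years do not have to be consecutive.
    """
    return len(complete_years(dates)) >= min_years
```

Under these assumptions, a gage with complete records for 1950–1959 and 1980–1989 passes the screen, whereas the same gage with a single missing day in one of those years has only 19 complete years and fails.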

A further screen for including stream gages was that their watershed boundaries had to be located within the conterminous United States. This was due to the unavailability of consistent ancillary data (for example, land cover, dams, streamlines) for areas outside the conterminous United States. We also excluded stream gages known to be on man-made channels, canals, springs, or flumes. Finally, stream gages with contributing drainage areas greater than 50,000 km² were excluded because of the time-consuming nature of visually evaluating very large watersheds. Approximately 7400 stream gages met the above criteria.

The final screen was the ability to delineate and ensure the accuracy of watershed boundaries. Boundaries for 689 of the watersheds identified above already existed in the NAWQA Program’s GIS database (nominal scale 1:24 k – 100 k). For these stream gages we used available boundaries because they had previously been validated and quality-assured. For the remaining records we initially delineated watershed boundaries using NHDPlus 30-m resolution flow-accumulation and flow-direction grids (U.S. Environmental Protection Agency 2008) and ArcInfo© Grid commands. We compared the drainage basin size calculated from the digital boundaries with the drainage areas as listed in the USGS National Water Information System (NWIS; U.S. Geological Survey, 2008b), which are based on a variety of sources. Drainage areas exist for most, but not all, records in NWIS. We visually checked the watershed boundaries of all records that differed from NWIS by greater than 25%, or for which no drainage area existed in NWIS, and we also visually checked a random sample of stream gages (approximately 1,400) that differed from NWIS by less than 25%, comparing watershed boundaries to NHDPlus 100 k streamlines, gage location, and topography, to identify those that were in error. Those stream gages whose boundaries were deemed to be incorrect were either eliminated (because we believed there was no reliable way to delineate the drainage basin, such as in areas of poor relief), or marked for manual re-delineation. Of the random sample of 1,400 stream gages which were within 25% of the NWIS drainage area, fewer than 2% were in error. We manually re-delineated watershed boundaries on a one-by-one basis as necessary, using the USGS Elevation Derivatives for National Applications (EDNA; U.S. Geological Survey, 2008c) watershed delineation tool, and with assistance from the EDNA program. The final data set comprised 6785 stream gages. Of these, 5951 watershed boundaries were delineated from 30-m NHDPlus data, 689 taken from the NAWQA database, and 145 from EDNA 30-m data.
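The rule for deciding which delineated boundaries received a visual check can be expressed as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and the percentage difference is assumed to be computed relative to the NWIS drainage area, which the text does not specify.

```python
def needs_visual_check(delineated_km2, nwis_km2, tolerance=0.25):
    """Flag a gage for a mandatory visual check of its watershed boundary.

    A gage is flagged when NWIS lists no drainage area, or when the
    delineated area differs from the NWIS area by more than `tolerance`
    (assumed here to be measured relative to the NWIS value).
    """
    if nwis_km2 is None or nwis_km2 <= 0:
        return True
    relative_diff = abs(delineated_km2 - nwis_km2) / nwis_km2
    return relative_diff > tolerance


# Tally of boundary sources reported in the text.
boundary_sources = {"NHDPlus": 5951, "NAWQA": 689, "EDNA": 145}
total_gages = sum(boundary_sources.values())
```

With the figures given above, the three boundary sources sum to the 6785 gages in the final data set. Gages that are not flagged by this rule were still subject to the random-sample check described in the text.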

Calculation of watershed characteristics

Watershed boundaries were used to calculate watershed and site characteristics from nationally available GIS data sets. We used the NHDPlus 100 k scale stream data set to characterize streamlines (U.S. Environmental Protection Agency 2008), based primarily on NHDPlus data sets available in late 2006. The variables calculated included environmental (climate, soils, geology, hydrology, and topography), landscape (land cover, ecoregions), and anthropogenic influence characteristics (infrastructure, population, current and historical presence of dams, canals, presence of pollution discharge sites). The file gages_variable_desc_sept3_09.txt provides detailed metadata for all variables calculated.

As part of this project we delineated the mainstem streamline for each watershed. This was defined as the primary drainage channel as far upstream from the stream gage as could be determined from the NHDPlus stream data set (Fig. 1), i.e., until the main channel upstream from the gage encountered an ambiguous branching where it was not clear which branch had greater drainage. The NHDPlus "streamlevel" attribute provided the basis for an automated process that was generally correct in defining the main channel streamline. This process was supplemented with manual checks on those streamlines that were likely to be incorrect (based on their length) as well as numerous random checks throughout. Approximately 700 (10%) of the mainstem lines were adjusted or corrected manually based on a visual examination of the streamline and on-screen display of the "streamorder" attribute. A number of mainstem-scale characteristics were based on these streamlines.
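One way such an automated trace could work is sketched below. This is not the procedure used for GAGES, whose details are not given here. It assumes that the main channel keeps the same "streamlevel" value from one segment to the next upstream while tributaries carry a different value, and it uses hypothetical inputs: a dictionary of upstream neighbours per segment and a dictionary of streamlevel values per segment.

```python
def trace_mainstem(start, upstream, streamlevel):
    """Follow the main channel upstream from the segment at the gage.

    start:       ID of the segment on which the gage sits
    upstream:    {segment ID: [IDs of segments immediately upstream]}
    streamlevel: {segment ID: streamlevel value}

    Tracing stops at a headwater (no upstream segment on the same level)
    or at an ambiguous branching (more than one candidate).
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        candidates = [
            seg
            for seg in upstream.get(current, [])
            if streamlevel[seg] == streamlevel[current] and seg not in seen
        ]
        if len(candidates) != 1:
            break  # headwater reached or branching is ambiguous
        current = candidates[0]
        seen.add(current)
        path.append(current)
    return path
```

A trace that ends much earlier than expected for the size of the watershed is the kind of case that a length-based manual check, as described above, would pick up.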