Watershed Modeling
Travis Zalesky
Raster Analysis - Part 1
As part of UA GIST 602A
Travis Zalesky
Raster Analysis - Part 1
As part of UA GIST 602A
Figure 1. Olympic Peninsula study area locator map. Esri, et. al., Charted Territory base map.
The Olympic Peninsula is a highly complex, mountainous region containing over 9000Km2 of mixed conifer forests. It is, largely rural, with much of the land belonging to either the Olympic National Park or National Forest. Additionally, it is a temperate rainforest, and one of the rainiest regions of the US. Therefore, it contains many, large rivers and is comprised of many watersheds, including the second-largest watershed in the state. Using the Olympic Peninsula as a case study, I will detail the workflow and methods of creating a watershed model. The complexity and use cases of watershed models will be discussed, and the data derived herein will be further utilized in additional research presented in Raster Analysis - Part 2.
In recent decades the explosive growth in computer, as well as remote sensing, technology has enabled the creation and sharing of increasingly high resolution raster models of space. One of the most widely utilized and well know raster data sets available today is the digital elevation model (DEM). Projects such as U.S. Geological Survey (USGS) 3D Elevation Program (3DEP) have resulted in millions of miles of high quality DEMs, available for free to the public (USGS, 2021). Despite the immense value of these products, they frequently contain errors and are not immediately suitable for all applicable research. One specific use case for DEMs is the creation of a runoff, or watershed model, which I will demonstrate herein.
The area chosen for this case study is the Olympic Peninsula of WA. It is a region of over 9000Km2 which consists primarily of mixed conifer forest. Largely a rural, mountainous, and inaccessible region, its primary distinction is its forest recreation opportunities, including the Olympic National Park, the 5th largest National Park outside of Alaska, as well as its 60 named glaciers (National Park Service, [n.d]). It is a temperate rainforest, one of only two in the conterminous US and one of the rainiest regions in North America. Due to its complex topography and high levels of precipitation, the Olympic Peninsula contains many major rivers and watersheds. Furthermore, the study area was extended S to the banks of the Columbia River to maximize coverage of the Chehalis River watershed, one of the most important watersheds on the Olympic Peninsula and the second-largest watershed in WA (WA Department of Ecology, [n.d.]).
The creation of a watershed model broadly follows three steps. First, data is preprocessed. This includes such tasks as identifying the study area, locating and downloading the appropriate data, and merging and/or cropping the data as necessary to fit the proposed study area. Second, errors must be identified and rectified within the DEM, a step often referred to as "conditioning". Lastly, the watershed model is constructed from a combination of hydrologically conditioned DEM flow direction and outflow "pour points". Much of the processing and analysis detailed in this section is dedicated to the location and creation of these pour points.
Figure 2: The nine USGS 1/3 arc second 3DEM data files that cover the study area, merged to a common vertical scale. The study area is outlined in red.
Nine, 1/3 arc second DEMs were obtained from the USGS representing a 3x3 grid with a continuous extent from 47° to 49° N and -123° to -125° W. Raw GeoTIFF files were imported into R v4.3.1 (R Core team, 2021) using the terra package (Hijmans, 2023). Each individual raster image was merged with the others to create a large SpatRaster object with a common vertical scale, which was then exported as a TIFF file (Figure 2). The merged DEM was then imported into ArcGIS Pro v3.1.0 and the raster was clipped to the study area extent shown in Figures 1 and 2 (Figure 3).
The clipped DEM was then explored using the Flow Direction and Sink tools. The flow direction was determined throughout the study area using an eight direction (D8) flow regime (Figure 4). This initial flow direction raster is subsequently used to identify sinks using the Sink tool. A sink in a DEM is defined as an area which does not have any outflow (Figure 5). In other words, it is a pixel which is entirely surrounded by pixels of equal or greater height (elevation). Considering that we want to model water flow across space, these sinks will be problematic in future analysis if not dealt with. Using the Fill tool on the original DEM, I created a hydrologically conditioned DEM by raising (or filling) all the identified sinks by the minimum amount required to allow the sinks to drain (Figure 3).
Figure 3. ArcGIS pro workflow model for preprocessing, and conditioning a DEM for watershed modeling. Other surface parameter calculations derived from the conditioned DEM also shown.
Figure 4. A graphical representation of the D8 flow regime utilized in the ArcGIS Pro Flow Direction tool.
Figure 5. A simple example of a sink in a DEM, where the center cell is surrounded by cells of greater elevation. Water (red arrows) flows into the center cell and has no outlet.
Figure 6. A simple example of a flow accumulation raster. Water (red arrows) flows downhill into adjacent cells and the number of cells flowing into each cell is totaled (blue numbers).
Figure 7. A graphical example of the Strahler stream order method. Figure obtained from ESRI ArcGIS Pro documentation, available at pro.arcgis.com.
Using the hyrologically conditioned DEM I ran the Flow Direction tool again, as before, and forcing edge cells to flow outward. This corrected flow direction layer will be used in multiple future processing steps. The Flow Accumulation tool was then used to count the number of pixels which flow into ever pixel across the study area (Figure 6).
Streams and rivers were identified by applying a conditional statement to the flow accumulation raster, selecting only cells with a value ≥ 250. The minimum accumulation value was chosen by comparison with regional base maps. The corrected flow direction raster was then combined with the streams raster to calculate the Strahler stream orders using the Stream Order tool (Figure 7; ESRI, [n.d]). Final stream output was converted to a vector based object using the Stream to Feature tool with the stream order preserved as an attribute of the new polyline layer.
The stream vectors with order < 5 were graphically hidden from the output using dynamic scale ranges to visually declutter the output and help with the next processing step. A new point feature class was created for pour points and snapping was turned on. Pour points were manually determined for all rivers of order ≥ 5 that drain into one of, (1) the Pacific Ocean, (2) the Salish Sea, or (3) the Columbia river, which define the bounds of our study area. These point features were then rasterized to match the conditioned DEM, and points were snapped to the cell with the highest flow accumulation within a snap radius of 1.0 x 10-4 degrees (about 11m, 1.1 x pixel size).
Finally the snapped and rasterized pour points were combined with the corrected flow direction model to generate a large watershed model covering the Olympic Peninsula and SW WA to the Columbia River (Figure 8).
Figure 8. ArcGIS pro workflow model for development of a watershed model from a hydrologically conditioned DEM. Continued from Figure 3.
Two additional surface parameters, slope and aspect, were derived for the study area which are not directly relevant to the watershed analysis, but which are derived from the hydrologically conditioned DEM and are likely to be utilized in other studies (see Raster Analysis - Part 2). Both of these layers were calculated using the Surface Parameters tool. The slope was calculated in degrees and both layers were calculated with a local surface fitted with a quadratic interpolation function (Figure 3).
From the initial USGS DEMs, across this large study area, more than 1,000,000 sinks were identified. Some of these sinks may have represented legitimate topographical features, however the majority of these features will have been errors. Additionally, many of these sinks corresponded to the region's numerous lakes as determined by comparison to regional basemaps (data not shown).
Within the study area, flow accumulation values range from 0 to 87.6 million. At a raster resolution of ~10m2, this corresponds to a maximum drainage area of roughly 8,760Km2. Additionally, more than 400,000 first order streams were identified, and stream order ranged from 1 to 9. There were a total of 227 streams of order 5 or greater which intersect the study area coastline (Figure 9). Pour points were manually defined for each of these streams before being snapped, which resulted in 227 unique watersheds within the study area (Figure 10).
Figure 9. The major rivers of the Olympic Peninsula, and their tributaries. Stream order calculated using the Strahler method, with minimum stream order ≥ 5.
Figure 10. Two hundred and twenty-seven distinct watersheds across the Olympic Peninsula. Primary drainage of identified watersheds must be of Strahler stream order ≥ 5.
Established workflows for the generation of a watershed model from a DEM requires a minimum of 7 processing steps which can be generally described in three stages, preprocessing, conditioning, and watershed modeling.
Preprocessing is the simplest aspect of this workflow, from a technical perspective, however it can be far from straightforward. Determining a study extent and finding the appropriate data sets can be one of the most difficult challenges of any study. Selecting a study area that is either too large or too small can severely limit what questions can be answered by your analysis. In the context of a watershed analysis, it may not be immediately obvious how large your study area needs to be without prior knowledge. For example, while the primary focus of this project was to define the watersheds of the Olympic Peninsula, the study area had to be substantially extended to the South to sufficiently capture the Chehalis River watershed (Figure 10, large teal colored watershed). Even so, a segment of the watershed was unintentionally cut off in the East by the study extent. If future research were to focus on the Chehalis River specifically, significant data re-processing would be required. Additionally, while it may be tempting to select the highest resolution data available, this comes at an additional cost. File storage and downloading times are directly correlated to data resolution, and, perhaps most significantly, data processing times can increase exponentially with increased resolution. Using ArcGIS Pro software on a Windows 11 desktop with an AMD Ryzen 7 5700G processor and 16 GB of RAM, multiple of the processing steps outlined above took greater than 1 hour to run, and several processing steps had to be strategically scheduled to run overnight. Depending on the study extent and the resources available, using the highest resolution data may not be advisable, or indeed even possible.
Hydrologically conditioning a DEM is perhaps the most critical part of creating a watershed model. Simply put, conditioning comprises identifying and subsequently filling sinks. A sink is a pixel (or group of pixels) which are lower in elevation than all the pixels surrounding it. They are so-named because they resemble a sink without a drain. Natural sinks can and do exist, but they are more likely to be errors within the DEM. While it is true that lake bottoms are natural sinks, it is not typically true that there is no outflow for most natural lakes. Moreover, the objective of a hydrologically conditioned DEM is not to accurately model lake bathymetry, but instead to model water flow, which is more accurately represented by water surface elevation rather than lake bottom. Therefor, it is critical that sinks are identified and filled prior to performing a watershed analysis. In some instances, it may not be appropriate to fill every sink, and it is important to know the context of your specific study area.
Only once you have conditioned your DEM can you proceed to developing your watershed model. At minimum, a watershed model requires a flow direction raster and pour points, which describe the flow of water across a surface and river outlets respectively. In some cases, you may know (or be given) pour points a priori. However, it is often not immediately apparent where the appropriate pour points lie. The majority of the workflow described in figure 8 is dedicated to locating and creating pour points. Assuming that no additional data is available beyond the input DEM it is possible, although not trivial, to find the pour points.
Firstly, flow direction and flow accumulation intermediate layers must be calculated. Then, from the flow accumulation layer, a rasterized model of streams can be created using a conditional statement. Depending on the study area and resolution of the flow accumulation raster, the minimum accumulation value for a stream will vary. For instance, a stream in an arid region with a high resolution dataset will require a much higher accumulation value before forming than a very wet area using low resolution data. Depending on the precision needed for your study it may be possible to compare your stream raster to existing basemaps using a variety of threshold values (as shown here), otherwise detailed ground truthing through field surveys may be required. Once you have determined your threshold flow accumulation value and created your streams raster, you may optionally calculate the stream order using either the Strahler or Shreve methods. Both methods have their benefits, however the Shreve method requires substantially longer processing times, therefor I opted to use the Strahler method for this analysis. While not strictly necessary, this is a highly recommended processioning step because it adds additional detail to the streams raster which can be used to easily identify tributaries, further categorizing and differentiating streams and rivers. Similarly, while not mandatory, it is highly advisable to convert your streams raster to a vector feature class. Vectors are not only a more pleasing and intuitive graphical representation of streams, they also support snapping, which will massively aid in the creation and placement of pour points. Unfortunately, in the absence of a highly detailed coastline polygon or polyline layer, it is not easy to automate the generation of pour points. This step requires careful examination of the streams data layer, and comparison to existing base maps will likely be required. Compromises and simplifying assumptions will likely be required to limit the number of pour points in a given study. For example, watersheds in this study area were limited to outlets of Strahler stream order 5 or greater. This reduced the number of pour points to be created from thousands to 227, while still providing adequate coverage of the study area. Nevertheless, creation of 227 precisely placed pour points was not a trivial undertaking. It required hours of careful, detail oriented work with substantial fine-tuning of point placement. Too few pour points may not sufficiently cover the study area, while too many pour points will be arduous to create and may result in errors in the subsequent watershed analysis. Once pour points have been created, they are then rasterized and snapped to the associated flow direction raster. Finally, the pour points raster is combined with the flow direction raster to generate a watershed model.
The Olympic Peninsula is a highly complex, mountainous region dominated by mixed conifer forests. It is over 9000Km2, largely rural, and much of the land is federally protected. Notably, it is home to the Olympic National Park, which alone consists of 18 different watersheds (not including the park beaches). Among the watersheds identified in this study are the Chehalis River watershed, the second largest in the state, and the Skokomish watershed, which will be explored in detail in Raster Analysis - Part 2.
The watershed model developed in this exercise will be utilized and further evaluated to create a flood risk analysis within the Skokomish River watershed, and is expected to be of ongoing use in future research projects.
U.S. Geological Survey, 20230608, USGS 1/3 Arc Second n47w123 20230608: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20220505, USGS 1/3 Arc Second n47w124 20220505: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20230608, USGS 1/3 Arc Second n48w123 20230608: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20220505, USGS 1/3 Arc Second n48w124 20220505: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20200213, USGS 1/3 arc-second n47w125 1 x 1 degree: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20200109, USGS 1/3 arc-second n48w125 1 x 1 degree: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20200107, USGS 1/3 arc-second n49w123 1 x 1 degree: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20200109, USGS 1/3 arc-second n49w124 1 x 1 degree: U.S. Geological Survey, downloaded from the nationalmap.gov.
U.S. Geological Survey, 20200109, USGS 1/3 arc-second n49w125 1 x 1 degree: U.S. Geological Survey, downloaded from the nationalmap.gov.
ESRI. (n.d.). How Stream Order works [Technical Manual]. How Stream Order Works—ArcGIS Pro | Documentation. Retrieved February 29, 2024, from https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-analyst/how-stream-order-works.htm
Hijmans, R. J. (2023). terra: Spatial Data Analysis. https://CRAN.R-project.org/package=terra
National Park Service. (n.d.). Olympic Fun Facts [Government Website]. Olympic Fun Facts - Olympic National Park (U.S. National Park Service). Retrieved February 29, 2024, from https://www.nps.gov/olym/learn/management/olympic-fun-facts.htm
R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
U.S.G.S., 3D Elevation Program (3DEP) (2021), National Elevation Dataset (NED) 1/3 arc-second.
WA Department of Ecology. (n.d.). Chehalis Basin Strategy [Government Website]. Strategy - Washington State Department of Ecology. Retrieved February 29, 2024, from https://ecology.wa.gov/water-shorelines/shoreline-coastal-management/chehalis-basin/strategy