Data and Methods
This section describes the structure of our data, how we processed it, and the two regression approaches applied in the study, each of which uses a different parallelization strategy. We discuss the technical details of these strategies here; the broader challenges and opportunities for parallelism in this project are described in the following section.
Data Structure
All data for this project were obtained from the ASO team as raster files in GeoTIFF format. We use four raw "snowoff" rasters (Figure \ref{998564}), which we assume are static over time: a digital elevation model (DEM) and maps of vegetation height, canopy density (the percentage of a pixel covered by canopy), and land surface class (rock/soil, vegetation, water, or glacier). In addition, we use 44 "snowon" snapshots of snow depth and SWE, each taken at a different time, and a shapefile describing the basin boundaries. Both the snowoff and snowon rasters are derived from raw LiDAR point clouds, which the ASO team processes into gridded products before we analyze them. For a detailed description of this process, see \cite{Painter_2016}. Snow depth grids are provided at 3-meter and 50-meter resolutions, while SWE is available only at 50-meter resolution because the underlying snow density model used to produce these datasets has a coarser resolution. Each 3 m raster is $\sim$1.2 GB and each 50 m raster is $\sim$4.3 MB. The exception is the land surface class raster, which is stored as an integer type rather than single-precision float and is half as large at each resolution.
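The quoted file sizes can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming uncompressed storage, 4 bytes per single-precision pixel, and a 16-bit integer type for the land surface class raster (our inference from "half as large", not something stated by the data provider). The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the raster sizes quoted in the text.
# Assumptions (not from the source): rasters are stored uncompressed, and
# the land surface class raster uses a 16-bit integer type.
BYTES_FLOAT32 = 4
BYTES_INT16 = 2


def pixels_from_size(size_bytes, bytes_per_pixel):
    """Approximate pixel count implied by an uncompressed raster size."""
    return size_bytes / bytes_per_pixel


def resampled_size(size_bytes, fine_res_m, coarse_res_m):
    """Uncompressed size of the same extent at a coarser resolution.

    The pixel count shrinks with the square of the resolution ratio.
    """
    return size_bytes / (coarse_res_m / fine_res_m) ** 2


size_3m = 1.2e9  # ~1.2 GB float raster at 3 m resolution
n_pixels_3m = pixels_from_size(size_3m, BYTES_FLOAT32)
size_50m = resampled_size(size_3m, 3.0, 50.0)
class_size_3m = n_pixels_3m * BYTES_INT16

print(f"pixels at 3 m:          {n_pixels_3m:.3g}")
print(f"float raster at 50 m:   {size_50m / 1e6:.2f} MB")
print(f"class raster at 3 m:    {class_size_3m / 1e9:.2f} GB")
```

Under these assumptions a 3 m raster holds roughly 300 million pixels, and resampling the same extent to 50 m gives about 4.3 MB, which agrees with the sizes reported above.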
A raw DEM provides elevation, an important feature in predicting snow depth and SWE distribution (see Section \ref{272805}), but it also provides the basis for extracting numerous other relevant features through spatial transformations. Using a variety of existing, well-optimized algorithms provided within the GRASS GIS software package \cite{Neteler_2012}, we additionally extracted the following six features:
- Slope angle
- Slope aspect (direction)
- Topographic Position Index (TPI, a proxy for the convexity of the slope at a given point)
- Daily-Integrated Clear-sky Irradiance (estimated no-vegetation, no-clouds irradiance received by a pixel on April 1)
- Standard deviation of slope in neighboring pixels (a proxy for terrain roughness)
- Maximum upwind horizon angle (MUHA, measured based on the prevailing wind direction, serving as a proxy for the propensity for wind deposition/scouring)
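To make the simpler DEM-derived features concrete, the sketch below computes slope angle, slope aspect, TPI, and the neighborhood standard deviation of slope with NumPy. It is not the GRASS GIS implementation used in this study. The function names, the square window, the edge padding, and the convention that row 0 is the northern edge of the grid are all assumptions made for illustration. Clear-sky irradiance and MUHA need solar geometry and horizon tracing and are not sketched here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def slope_aspect(dem, cell_size):
    """Slope angle (degrees) and downslope aspect (degrees clockwise from north).

    Assumes row 0 is the northern edge and columns increase eastward.
    """
    # Finite-difference derivatives along rows (southward) and columns (eastward).
    dz_drow, dz_dcol = np.gradient(dem.astype(float), cell_size)
    slope = np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))
    # Downslope vector: east component is -dz_dcol, north component is +dz_drow.
    aspect = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0
    return slope, aspect


def focal_windows(arr, radius):
    """View of the (2*radius+1)^2 neighborhood around every pixel (edge-padded)."""
    padded = np.pad(arr.astype(float), radius, mode="edge")
    size = 2 * radius + 1
    return sliding_window_view(padded, (size, size))


def tpi(dem, radius=1):
    """Elevation minus the mean elevation of the surrounding pixels."""
    win = focal_windows(dem, radius)
    n_neighbors = win.shape[-1] * win.shape[-2] - 1
    neighbor_mean = (win.sum(axis=(-2, -1)) - dem) / n_neighbors
    return dem - neighbor_mean


def slope_roughness(slope, radius=1):
    """Standard deviation of slope within the neighborhood of each pixel."""
    return focal_windows(slope, radius).std(axis=(-2, -1))


# Tiny synthetic example: a plane rising eastward by 1 m per 1 m cell.
rows, cols = np.mgrid[0:5, 0:5]
plane = cols.astype(float)
slope, aspect = slope_aspect(plane, cell_size=1.0)
roughness = slope_roughness(slope)
```

On this synthetic plane the slope is 45 degrees everywhere, the aspect is 270 degrees (the surface faces west), and the roughness is zero. The windowed standard deviation creates large temporary arrays, so for rasters of the size used here a tiled or otherwise memory-aware implementation would be needed.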