Abstract

Features of terrain and vegetation are often used to estimate the distribution of snow within a watershed using statistical modeling approaches, yet little is known of the stability/instability of these models across space, time, and model scale. This is largely due to a lack of the repeated, high-resolution measurements necessary to develop models at multiple points in time, and it inhibits the use of these statistical models in improved water resource forecasts. In this study, we use a novel timeseries to investigate temporal nonstationarity of such statistical models. The dataset consists of gridded, 3m-resolution, LiDAR-derived products depicting elevation, canopy height, canopy density, land cover classification, snow depth, and snow water equivalent (SWE) from the Tuolumne River Basin in California. Each feature is represented by a ~1 GB raster image and roughly 1,500 regressions are run to generate model parameters comparable across a variety of dimensions. Multiple levels of parallelism allow for time-efficient generation of these models using several existing data analysis packages written in high-level languages. While the parallelism exploited in this analysis is, for the most part, embarrassingly parallel or otherwise fairly straightforward, several challenges emerge when using high-level data analysis languages in a High Performance Computing (HPC) environment. These challenges, as well as the benefits of HPC in these scenarios, are discussed and compared. Our scientific results agree with previous consensus in the dominance of elevation and relative insolation as predictors of snow depth. Furthermore, we show that the importance of elevation decreases sharply moving from late winter to mid-summer, that the importance of insolation peaks in spring, and that feature importance is comparable when predicting snow depth and SWE. 

Introduction

Access to clean, dependable water sources is a prerequisite for healthy societies, and one-sixth of the world’s population depends on snowmelt for this water supply \cite{Barnett_2005}. Efficient water use policies and reservoir operation in snowmelt-driven regions thus depend largely on accurate predictions of the magnitude and timing of snowmelt. These are both functions of the spatial distribution of snow water equivalent (SWE), a metric combining snow depth and density. To date, even the most advanced methods for estimating snowmelt-driven water supply rely on sparse SWE observations and their correlations to spring/summer streamflow. By assuming that these correlations will stay constant into the future, we are assuming a stationary climate; however, as climate change warms mountain regions, more winter precipitation will fall as rain instead of snow, and snow will be confined to higher elevations, thus altering spatial patterns and rendering these historical point observation-based correlations less useful. Since future SWE distributions may not mirror previous patterns, there is a growing effort to better observe snow distributions and to improve modeling techniques for interpolating between point observations \cite{Raleigh_2017,Wrzesien_2017,Cai_2017}.
Many such efforts rely on statistical models built on static features of the terrain and vegetation within a given watershed to perform this interpolation \cite{Geddes_2005}. This often takes the form of kriging, or a similar approach; however, little is known of the spatiotemporal stability of these models or the effect of model scale (i.e. the resolution at which the terrain and vegetation features are represented). In the case of spatial and temporal nonstationarity, the knowledge gap is largely a function of a lack of data. In order to draw inference on changes in model parameters across years or across dates within a season, a dataset of SWE measurements with (1) a large enough temporal extent to cover these timeframes, (2) a large enough spatial extent to generate a robust sample representative of the watershed, and (3) a resolution small enough to represent relative features of the terrain (e.g. slope angle) at physically meaningful scales is needed. The recent use of airborne scanning LiDAR for snow depth measurement has made the production of such datasets possible, and the Airborne Snow Observatory (ASO) mission from the Jet Propulsion Laboratory represents the first systematic, repeated snow depth mapping effort over large spatial extents \cite{Painter_2016}. When using LiDAR measurements, snow depth is measured directly and then multiplied by observationally-constrained models of snow density to produce maps of SWE. This accuracy of this joint measurement-modeling approach relies on the fact that snow density displays an order of magnitude less spatial variability than snow depth and is thus well approximated by models \cite{L_pez_Moreno_2013}. While ASO SWE estimates provide operationally relevant information to water resource forecasters, water managers, and reservoir operators, it is not yet a scalable observation method due to high costs of such flights and the logistics of supplying the mobile data/compute centers necessary to produce real-time information for each mapped watershed. However, the estimates produced by these flights can be studied to better understand the degree to which statistical snow depth and SWE models might vary over time.
To accomplish that goal, we apply two simple yet commonly used \cite{L_pez_Moreno_2005,Jonas_2009} regression approaches to the gridded snow depth and SWE timeseries provided by ASO flights over the Tuolumne River Basin in Central California (Figure \ref{961135}), relying on underlying raster images of land cover class, canopy height, canopy density, elevation, and several terrain-relevant transformations of the elevation raster as features. These two regression approaches are a (linear) Spatial Error Model - which takes the form of an Generalized Least Squares regression for which the error term has a spatial autoregressive structure - and a (nonlinear) Random Forest. To understand changes in model structure over time, we leverage the National Center for Atmospheric Research (NCAR)'s distributed computing resources to perform these regressions on 39 different "snapshots" of snow depth and SWE at different dates during the late Winter to early Summer range, across 2013 through 2016. To understand the impact of model scale, we repeat each model at 7 different model resolutions. In total, we execute >1,500 regression models. We compare the changing importance of individual features with "feature importance" metrics and assess the operational significance of temporal model nonstationarity by comparing the root mean squared error (RMSE) of models trained on one snow depth/SWE from one date and tested on another. Spatial nonstationarity, while an integral part of the larger spatiotemporal analysis, is beyond the scope of this report.