Data Analysis
Kendall’s correlation τ was used to quantify the monotonic relationships between independent and dependent variables. The Kendall correlation used a two-sided p-value, which we accepted as statistically significant below an α value of 0.05. We also conducted a multivariate analysis using multiple linear regression to quantify multivariate relationships with the dependent variables. All dependent and independent variables, and their base 10 logarithms, were tested for normality using a Shapiro-Wilk test. If the non-transformed variable was deemed normal (p-value greater than 0.05), it was used. If it was non-normal, the log-transformed variable was used only if it was deemed to be more normal than the non-transformed variable. We used the Bayesian Information Criterion (BIC) to choose the best combination of variables driving the relationships (Burnham and Anderson 2004) and accepted a p-value below 0.05 as statistically significant.
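The workflow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the use of NumPy and SciPy, the function names, and the synthetic storm data are all hypothetical. "More normal" is interpreted here as a larger Shapiro-Wilk p-value, the candidate models are searched exhaustively, and BIC is computed in its least-squares form, n ln(RSS/n) + k ln(n).

```python
import itertools

import numpy as np
from scipy import stats


def choose_scale(x, alpha=0.05):
    """Pick the raw or log10 version of a variable using Shapiro-Wilk p-values."""
    x = np.asarray(x, dtype=float)
    _, p_raw = stats.shapiro(x)
    # Keep the raw variable if it passes, or if a log transform is impossible.
    if p_raw > alpha or np.any(x <= 0):
        return x, "raw"
    log_x = np.log10(x)
    _, p_log = stats.shapiro(log_x)
    # Assumption: "more normal" means a larger Shapiro-Wilk p-value.
    return (log_x, "log10") if p_log > p_raw else (x, "raw")


def ols_bic(X, y):
    """BIC of an ordinary least-squares fit with an intercept (least-squares form)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1]  # number of fitted coefficients, including the intercept
    return n * np.log(rss / n) + k * np.log(n)


def best_subset_by_bic(predictors, y):
    """Exhaustively search predictor subsets and return (lowest BIC, names)."""
    best = (np.inf, ())
    names = list(predictors)
    for r in range(1, len(names) + 1):
        for combo in itertools.combinations(names, r):
            X = np.column_stack([predictors[name] for name in combo])
            bic = ols_bic(X, y)
            if bic < best[0]:
                best = (bic, combo)
    return best


# Synthetic, purely illustrative storm data (not values from the study).
rng = np.random.default_rng(0)
n_storms = 28
p_total = rng.lognormal(mean=2.0, sigma=0.6, size=n_storms)
intensity = rng.lognormal(mean=0.5, sigma=0.5, size=n_storms)
unrelated = rng.normal(size=n_storms)
d_sc = 5.0 * np.log10(p_total) + rng.normal(scale=0.5, size=n_storms)

# Two-sided Kendall correlation (the SciPy default).
tau, p_value = stats.kendalltau(p_total, d_sc)

# Apply the normality rule to each predictor, then select a model by BIC.
raw_predictors = {"P_T": p_total, "I_P": intensity, "unrelated": unrelated}
predictors = {name: choose_scale(v)[0] for name, v in raw_predictors.items()}
best_bic, best_vars = best_subset_by_bic(predictors, d_sc)
print(f"tau={tau:.3f}, p={p_value:.3g}, best model={best_vars}")
```

An exhaustive search is feasible here only because the number of candidate predictors is small; with many predictors, a stepwise or penalized approach would be a more practical choice.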
RESULTS
Assessing the Algorithm
We visually assessed the accuracy of the algorithm by plotting the algorithm-generated chemograph characteristics with the actual precipitation and SC data. Each analyzed storm was evaluated graphically, including the distribution of estimated FSC and DSC values (Figures 5 and 6). The median values of the slopes of both the PST and RSC were matched with their median intercepts so that each could be plotted. This resulted in a total of 34 days on which a storm began, 29 of which showed a flushing response and 28 of which showed a dilution and recovery. For one storm event, the algorithm did not detect the chemograph responses that we were assessing. Each of the remaining 33 storms had either a flush or dilution and subsequent recovery, or both.
Univariate Correlations
The correlations between the FSC and the independent variables revealed statistically significant relationships with IP, IP,max, and PT (Table 2, Figure 7). These three relationships were drawn from 27 algorithm-identified flushes in response to precipitation events. This number differed from the 29 available flushes due to a gap in the VPD data record. The Kendall’s τ value for IP was 0.502 (p<0.001), for IP,max was 0.486 (p<0.001), and for PT was 0.422 (p=0.001).
The correlations between the DSC and the independent variables also showed statistically significant relationships with IP, IP,max, and PT (Figure 8). These three relationships were drawn from 28 algorithm-identified dilutions in response to precipitation events. The Kendall’s τ value for IP was 0.407 (p=0.003), for IP,max was 0.38 (p=0.006), and for PT was 0.628 (p<0.001).
The correlations between the RSC and the independent variables resulted in a statistically significant relationship with PT (Figure 9). This relationship was drawn from 28 algorithm-identified recoveries in response to precipitation events. The Kendall’s τ value for PT was 0.564 (p<0.001). There was also a significant relationship with another dependent variable: DSC (τ=0.45; p<0.001).
Multiple Linear Regressions
The multiple linear regression between the log-transformed FSC and the independent variables showed three variables driving the relationship (Table 3). The correlation with log-transformed IP, log-transformed IP,max, and log-transformed ΣVPD resulted in p<0.001 and an adjusted R2 of 0.54. The equation of the regression was
The multiple linear regression between the DSC and the independent variables showed two variables driving the relationship. The correlation with log-transformed PT and the slope of the PST resulted in p<0.001 and an adjusted R2 of 0.78. The equation of the regression was
The multiple linear regression between the RSC and the independent variables showed a single variable driving the relationship. The correlation with log-transformed PT resulted in p<0.001 and an adjusted R2 of 0.39. The equation of the regression was .
DISCUSSION
The algorithm was effective at extracting patterns of solute flushing, dilution, and recovery. Quantifying these chemograph characteristics and comparing them to the environmental variables revealed that the three precipitation variables exerted the greatest influence on these patterns. The methodology we used here can be expanded to other catchments in order to characterize and compare them using the easily obtained, high-frequency data sources that have been produced and collected in recent years. Given the ability of the algorithm to extract both solute flush and dilution characteristics, we expect that other temporal behaviors that occur in other streams could be extracted with the general approach that we used.
Efficacy of the Algorithm
There has been a major increase in in situ sensor-obtained streamwater data (e.g., Pellerin et al. 2010, Rode et al. 2016, Duncan et al. 2018, Fovet et al. 2018), creating a need for automated analyses that explore the data, reveal patterns, and aid in interpretation. There are numerous methods to analyze water quality time series (Hirch et al. 1982, Cun and Vilagines 1997, Faruk 2009); however, these were not well suited to extracting the characteristic hydrologic patterns that were our focus, because of the thresholds and conditional decisions needed to identify each pattern. Our methodology presents a novel way to extract useful information from the deluge of data provided by modern high-yield sensors. While we demonstrate the approach for SC, the general approach may be useful for data from other types of water quality sensors. Our algorithm’s extraction of SC patterns was assessed to be accurate and similar to what would be obtained from human estimation (Figures 5 and 6). The rare instances in which the algorithm was not accurate are acceptable given the benefits of automation: the process is highly robust because of the Monte Carlo approach, which has previously been shown to accurately extract complicated patterns from time series (e.g., Contosta et al. 2016). Similar approaches using machine learning could possibly outperform our algorithm, and we see this as a likely next step in analyzing streamwater sensing data.
Environmental Controls on Chemograph Characteristics
Both flushing and dilution are strongly correlated with rainfall-based variables. This suggests that both are primarily driven by the mixing of different sources of water, rather than by more nuanced controls such as seasonality of solute generation or variation in storage (as represented by antecedent moisture indicators). The importance of source water mixing, and the relative contributions of new and old water to streamflow, have been previously discussed at length (Pinder and Jones 1969, Sklash and Farvolden 1979, Kirchner 2003) and are consistent with our results involving SC dilution. However, the functional relationships we show (Figure 8), obtained solely from SC and precipitation data, may be useful in characterizing hydrologic systems for comparison.
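The new and old water mixing referred to above is commonly expressed as a two-component mass balance, in which the new-water fraction of streamflow is (SC_stream - SC_old) / (SC_new - SC_old). The short Python sketch below illustrates that standard calculation. It is not part of the analysis in this study, and the function name and the SC values (in µS/cm) are hypothetical.

```python
def new_water_fraction(sc_stream, sc_old, sc_new):
    """Fraction of streamflow that is new (event) water under two-component mixing.

    Assumes SC mixes conservatively and that each end member has a single,
    constant SC value during the storm.
    """
    if sc_old == sc_new:
        raise ValueError("End members must differ in SC to separate them.")
    return (sc_stream - sc_old) / (sc_new - sc_old)


# Hypothetical values in uS/cm: old (pre-event) water 100, rainwater 10.
diluted = new_water_fraction(sc_stream=70.0, sc_old=100.0, sc_new=10.0)

# During a flush, stream SC rises above the old-water value.
flushed = new_water_fraction(sc_stream=110.0, sc_old=100.0, sc_new=10.0)

print(f"dilution phase: {diluted:.2f}, flush phase: {flushed:.2f}")
```

A fraction outside the range 0 to 1, as in the flush case, indicates that stream SC cannot be reproduced by mixing only these two end members.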
In the case of flushing events, our results suggest that a third source of solute-laden water (aside from the typically assumed two members: groundwater and rainwater) is entering the stream system before precipitation is transported to the stream, resulting in SC dilution (Robson et al. 1992, Creed and Band 1998, Inamdar et al. 2009). This third source, however, may quickly exhaust its supply, resulting in water following the same pathway but exhibiting a lower concentration as a storm progresses. For example, at the nearby Hubbard Brook Experimental Forest, the upper intermittent stream reaches are characterized by high dissolved organic carbon and aluminum, and low pH, due to eluvial processes in the bedrock-controlled soils that predominate in the upper catchment (Bailey et al. 2019). As the catchment wets up, the active portion of the stream network may quickly expand into the upper catchment, causing a flush of solutes from near-stream soils that may then be quickly diluted by precipitation or by lower SC water as deeper soils lower in the catchment increase their contribution to runoff. Analyses of flushing would benefit from the use of three-end-member mixing models in future studies, which would require sensing of additional chemical tracers (Burns et al. 2001, Hooper 2003). SC has been an effective catchment hydrology tracer in the past (Davis et al. 1980, Pellerin et al. 2008, Cox et al. 2007); however, the role of solute flushing in mixing analysis needs to be addressed more thoroughly.
The univariate results indicate that this third source of water is driven into the system not simply by the amount of rainwater added, but perhaps more importantly by the intensity at which it is added. Rainwater can cause flushing events in two ways. First, it can displace stored high SC water into the stream during precipitation events. Second, it can reach the stream as high SC water after gaining solutes via mixing during transport (Robson et al. 1992). Because FSC is more strongly correlated with IP and IP,max than with PT, we hypothesize that the flush of high SC water comes from near-stream soils, ephemeral streambeds, or the development of saturated conditions in shallow-to-bedrock soils. These pathways allow more rapid movement of water to the watershed outlet, whether that water is precipitation that picked up solutes or displaced, ionically enriched stored water.
If this is the case, we predict that the solutes responsible for this flush are those from shallow sources (e.g., dissolved organic C and nitrate) rather than those that are weathering-derived (e.g., Ca, Na). If piston flow later takes over as the major contributor of high SC water, as has been shown to be the case in systems without flushing events (Sklash and Farvolden 1979), then we hypothesize that the solutes responsible would be from deeper, weathering-derived sources. Sample collection during precipitation events would indicate which specific solutes are entering the stream system and allow an evaluation of which of these processes drives SC dynamics.
The RSC relationship with PT is expected, given that it is driven by the dissipation of a pulse of low SC rainwater through higher SC stored water; however, we see potential to further explore this chemograph characteristic to understand catchment hydrologic function. As more rainwater enters the system and increases the deviation from the catchment’s normal, high-SC state (shown by a higher DSC), the system responds by recovering more quickly. As the new water leaves the system, the SC will move towards pre-event concentrations. The rate at which this rebound occurs may be used to characterize the catchment’s export of new water. It may be possible to derive hydrogeologic characteristics using RSC, similar to previous hydrograph recession analysis (Brutsaert and Nieber 1977).
The best multiple linear regression models we produced for DSC and RSC had PT as a primary driver, accompanied in the case of DSC by PST, which is due to the manner in which the algorithm quantifies the DSC. The current analysis for these two SC patterns suggests that the system was so influenced by precipitation that other controls, while physically relevant, were either very difficult to detect, relatively unimportant, or restricted by the constraints of the available data. Similarly, no seasonal variables significantly influenced flushing, dilution, or recovery behaviors, suggesting that seasonal variation in runoff generation was either masked by precipitation characteristics or not captured because we limited our analysis to the snow-free season.
The lack of correlation between antecedent conditions (ΣVPD and ISP) and DSC does not agree with previous studies and is therefore unexpected, as the drying pressure exerted on a catchment influences the hydrologic storage capacity of its soils, and thus runoff generation (Biron et al. 1999, Detty and McGuire 2010, Grand-Clement et al. 2014). It is possible that there is a threshold system response to drying, rather than a gradient response, which would yield no correlations with ΣVPD or ISP in our analyses. Alternatively, the effect of drying pressure may be masked by the importance of precipitation amount and intensity. We expect that this result might change if we had more storms to analyze and could then apply data-intensive, non-parametric multivariate models (e.g., regression trees), which are better suited to the complex nature of hydrologic systems.
Cumulative vapor pressure deficit did, however, emerge as a driving factor in our multiple linear regression model for FSC, indicating that the drying pressure exerted on the catchment is a factor in flushing responses. The antecedent conditions here likely influence the connectivity of soil water to the stream, with soils becoming more disconnected as ΣVPD increases. As rainfall connects soils to the stream, a stronger pathway is formed, allowing ions to mobilize more freely. Again, this relationship may become stronger and more clearly important to FSC with the study of additional storms.
Climate change is causing more intense precipitation events with longer inter-storm periods (Allan and Soden 2008, Yu et al. 2016). Our results indicate a likelihood of higher variability in streamwater SC as these climate-driven effects take hold. Greater precipitation amounts and intensities will cause higher spikes in the concentration of solutes at the beginning of a precipitation event and greater dilution of stream solutes later in events. This may have a negative effect on lotic organisms that rely on certain ranges of SC, or concentrations of the ions that SC represents, to carry out biological functions (Daley et al. 2009). Depending on the solutes making up the ionic composition of the streamwater, these changes may also affect waters used for human consumption. Thus, monitoring of SC and automated detection of responses provide an effective tool for surface water assessments.
CONCLUSION
Our algorithm extracted three chemograph patterns that routinely appear in response to rain events. Our comparison of these patterns to independent environmental conditions suggested that the major factors driving the variability in flushing, dilution, and recovery behavior were the precipitation amount and intensity. Seasonal variables were largely unimportant at this catchment or were overshadowed by the heavy influence of rainfall. When analyzed using multiple linear regression, antecedent moisture was shown to be a factor in driving flushing responses, but not in driving dilution or recovery responses. Our methodology can be useful for analysis of other catchments, revealing functional relationships between independent environmental variables and streamwater SC responses. The results of such investigations could help characterize catchments in a robust, objective, and repeatable manner, with implications for guiding water resource management decisions.