APPROACHES AND METHODS
Water data source. Nitrate data from 228 sites across the contiguous United States were analyzed. This includes 147 sites from the National Water-Quality Assessment (NAWQA) project, a network established to monitor stream water quality since 1991 (Spahr et al., 2010). The NAWQA has daily discharge records and monthly water quality data across four land use categories: Agriculture, Mixed, Undeveloped, and Urban. The NAWQA sites have median and mean drainage areas of 150 and 294 km2, respectively. The USGS Water Quality Watch (WQW) database (https://waterwatch.usgs.gov/wqwatch/) includes 71 sites and continuous water quality data (5 minutes to hourly) since 2014. The WQW sites are primarily located in the Midwestern “Corn Belt” (e.g., Iowa, Illinois, Indiana) with sizes ranging from 1 to > 29,000 km2. The USGS discharge and nitrate data were retrieved from the National Water Information System (https://waterdata.usgs.gov/nwis).
We also examined intensively-measured data in 6 undeveloped and 4 urban sites from research network sites, including Critical Zone Observatories (CZOs, http://criticalzone.org/national/), Long Term Ecological Research Network (LTER, https://lternet.edu/), and Watershed Function Scientific Focus Area (SFA, https://watershed.lbl.gov/). The 6 undeveloped sites include Shale Hills watershed (CZO, Pennsylvania) (Brantley et al., 2018), East River and Coal Creek watersheds (Colorado) (Zhi et al., 2019), Sleepers River Research Watershed (Vermont) (Armfield et al., 2018; Sebestyen et al., 2008), Hubbard Brook Experimental Forest W1 and W6 watershed (LTER, New Hampshire) (Dittman et al., 2007; LaBaugh et al., 2013). The four urban sites are Dead Run, Glyndon, Gwynnbrook, and Carroll Park (Duncan et al., 2017; Kaushal et al., 2008), all located within the Gwynns Fall Watershed (Baltimore Ecosystem Study LTER, Maryland).
Land use classification . The NAWQA sites were classified based on the percentage of major land use according to the USGS criteria (Spahr et al., 2010). Agricultural sites have > 50% agricultural land and <= 5% urban land; urban sites have > 25% urban and <= 25% agricultural land; undeveloped sites have <= 5% urban and <= 25% agricultural land; all other combinations of urban, agricultural, and undeveloped land are classified as mixed. Land use in the WQW sites was compiled from multiple sources including EPA water quality reports, USGS watershed management plan, USDA watershed assessment report, and the National Land Cover Database (Hyer et al., 2016; Machesky et al., 2010; Tedesco et al., 2005). Out of the 228 sites, the numbers of Agriculture, Mixed, Undeveloped, and Urban sites are 61, 78, 49, and 40, respectively.
Concentration-discharge (C-Q) data analysis and estimation of shallow and deep water concentrations. A variety of metrics have been proposed for characterizing export regimes, including load-discharge relationship, temporal inequality of load and discharge, and various forms of concentration and discharge relationships (Basu et al., 2010; Jawitz & Mitchell, 2011; Thompson et al., 2011; Zhang, 2018). Here we used the power law form of \(C=aQ^{b}\), a common form for cross-site comparison (Godsey et al., 2009; Herndon et al., 2015). Here \(C\) is the nitrate concentration, \(Q\) is discharge, and \(a\) and \(b\) are fitting parameters with b quantifying the C-Q slopes. For the NAWQA sites with monthly data, C-Q slopes were calculated using all available paired daily data over multiple years. Duplicate and incomplete records and concentrations below the Method Detection Limit (MDL) of 0.01 mg/L were excluded for C-Q fitting (Nolan and Stoner, 2000). Flow-weighted stream concentrations (Cstream in Figures 2 and 3) were calculated by dividing total nitrate load by total discharge for the entire data period.
The WQW sites have data up to 15 min frequency for multiple years (2-4 years typically). High-frequency data were aggregated to daily means of discharge and nitrate concentrations. They were then used to calculate the slope \(b\) for each year so that the b values were at consistent time scale as those from NAWQA. The mean and standard deviation of slope b are reported in Figure 5 and 6. In addition, these high-frequency stream data were used to infer end-member concentrations in shallow and deep waters. Shallow soil water often dominates at high flow conditions, whereas deeper base flow predominates in low flow conditions. Based on this, deep water concentrations (Cdw) were estimated as the average of stream concentrations during low flows at the 5th percentile of annual discharge. Shallow water concentrations (Csw) were estimated as the average of stream concentrations during high flows (95th percentile of annual discharge). The concentration ratio (Cratio) was calculated as Cratio = Csw / Cdw.
Validation of shallow and deep water chemistry estimation.For shallow water concentrations, we used a recent study that compiled measured stream water and soil water concentration from 94 sites with diverse land cover (e.g., agriculture, forest, disturbed forest, grassland, urban) across the globe (Sudduth et al., 2013). To be consistent in land use classification, we grouped “forest”, “disturbed forest”, and “grassland” in Sudduth et al. (2013) as “undeveloped” in this work. With limited measurements of Csw, we cannot directly compare estimated and measured Csw in individual sites. Instead, we examined whether the relationship between estimated Csw and Cstream from this study is close to the derived relationship between measured Csw and Cstream from Sudduth et al. (2013). The relationships would be similar if the end-member estimation could provide a realistic estimation of Csw (Figure 4a).
For deep water chemistry, we used the National Ground-Water Monitoring Network (NGWMN), a compiled database for groundwater level and water quality (https://cida.usgs.gov/ngwmn/). Based on well location and data availability, groundwater nitrate data were retrieved from the Midwest (i.e., Minnesota, Wisconsin, Iowa, Nebraska, Illinois), New Jersey, and Florida for the maximum coverage. Based on the proximity of sites, a number of sub-regions were selected for comparing estimated vs. measured Cdw (Figure S5). Within each sub-region, estimated and measured Cdw were averaged and compared (Figure 4b).
Watershed reactive transport modeling. Watershed modeling was carried out using a recently developed code BioRT-Flux-PIHM, to explore the effects of subsurface vertical profile for N (Zhi et al., 2019). A base model was set up at the Conewago Creek watershed, a tributary to the Susquehanna River in the Chesapeake Bay. It drains an area of 136 km2 and is dominated by agricultural (47%), forested (43%), and developed (17%) lands. The base model reproduced the temporal discharge and the slightly flushing C-Q pattern as observed at the site (Figure S6 and S7). Monte-Carlo simulations of 500 cases were carried out to cast the base model results to broader conditions spanning a range of N vertical distribution and reaction kinetics representing different land use conditions (Figure 6). We then derived a general relationship for the dependence of C-Q b slope on the ratio of shallow versus deep water concentration (Csw / Cdw). This simple relationship was further verified byb values estimated based on stream water chemistry from 81 sites, and a few intensively monitored sites with detailed soil and groundwater measurements. Other details of model setup and calibration, and Monte-Carlo simulation are in the Supporting Information (SI).