APPROACHES AND METHODS
Water data source. Nitrate data from 228 sites across the
contiguous United States were analyzed. This includes 147 sites from the
National Water-Quality Assessment (NAWQA) project, a network established
to monitor stream water quality since 1991 (Spahr et al., 2010). The
NAWQA has daily discharge records and monthly water quality data across
four land use categories: Agriculture, Mixed, Undeveloped, and Urban.
The NAWQA sites have median and mean drainage areas of 150 and 294
km2, respectively. The USGS Water Quality Watch (WQW)
database (https://waterwatch.usgs.gov/wqwatch/) includes 71 sites
and continuous water quality data (5 minutes to hourly) since 2014. The
WQW sites are primarily located in the Midwestern “Corn Belt” (e.g.,
Iowa, Illinois, Indiana) with sizes ranging from 1 to >
29,000 km2. The USGS discharge and nitrate data were
retrieved from the National Water Information System
(https://waterdata.usgs.gov/nwis).
We also examined intensively-measured data in 6 undeveloped and 4 urban
sites from research network sites, including Critical Zone Observatories
(CZOs, http://criticalzone.org/national/), Long Term Ecological
Research Network (LTER, https://lternet.edu/), and Watershed
Function Scientific Focus Area (SFA, https://watershed.lbl.gov/).
The 6 undeveloped sites include Shale Hills watershed (CZO,
Pennsylvania) (Brantley et al., 2018), East River and Coal Creek
watersheds (Colorado) (Zhi et al., 2019), Sleepers River Research
Watershed (Vermont) (Armfield et al., 2018; Sebestyen et al., 2008),
Hubbard Brook Experimental Forest W1 and W6 watershed (LTER, New
Hampshire) (Dittman et al., 2007; LaBaugh et al., 2013). The four urban
sites are Dead Run, Glyndon, Gwynnbrook, and Carroll Park (Duncan et
al., 2017; Kaushal et al., 2008), all located within the Gwynns Fall
Watershed (Baltimore Ecosystem Study LTER, Maryland).
Land use classification . The NAWQA sites were classified based
on the percentage of major land use according to the USGS criteria
(Spahr et al., 2010). Agricultural sites have > 50%
agricultural land and <= 5% urban land; urban sites have
> 25% urban and <= 25% agricultural land;
undeveloped sites have <= 5% urban and <= 25%
agricultural land; all other combinations of urban, agricultural, and
undeveloped land are classified as mixed. Land use in the WQW sites was
compiled from multiple sources including EPA water quality reports, USGS
watershed management plan, USDA watershed assessment report, and the
National Land Cover Database (Hyer et al., 2016; Machesky et al., 2010;
Tedesco et al., 2005). Out of the 228 sites, the numbers of Agriculture,
Mixed, Undeveloped, and Urban sites are 61, 78, 49, and 40,
respectively.
Concentration-discharge (C-Q) data analysis and estimation of
shallow and deep water concentrations. A variety of metrics have been
proposed for characterizing export regimes, including load-discharge
relationship, temporal inequality of load and discharge, and various
forms of concentration and discharge relationships (Basu et al., 2010;
Jawitz & Mitchell, 2011; Thompson et al., 2011; Zhang, 2018). Here we
used the power law form of \(C=aQ^{b}\), a common form for cross-site
comparison (Godsey et al., 2009; Herndon et al., 2015). Here \(C\) is
the nitrate concentration, \(Q\) is discharge, and \(a\) and \(b\) are
fitting parameters with b quantifying the C-Q slopes. For the
NAWQA sites with monthly data, C-Q slopes were calculated using all
available paired daily data over multiple years. Duplicate and
incomplete records and concentrations below the Method Detection
Limit (MDL) of 0.01 mg/L were excluded for C-Q fitting (Nolan and
Stoner, 2000). Flow-weighted stream concentrations
(Cstream in Figures 2 and 3) were calculated by dividing
total nitrate load by total discharge for the entire data period.
The WQW sites have data up to 15 min frequency for multiple years (2-4
years typically). High-frequency data were aggregated to daily means of
discharge and nitrate concentrations. They were then used to calculate
the slope \(b\) for each year so that the b values were at
consistent time scale as those from NAWQA. The mean and standard
deviation of slope b are reported in Figure 5 and 6. In addition,
these high-frequency stream data were used to infer end-member
concentrations in shallow and deep waters. Shallow soil water often
dominates at high flow conditions, whereas deeper base flow predominates
in low flow conditions. Based on this, deep water concentrations
(Cdw) were estimated as the average of stream
concentrations during low flows at the 5th percentile
of annual discharge. Shallow water concentrations (Csw)
were estimated as the average of stream concentrations during high flows
(95th percentile of annual discharge). The
concentration ratio (Cratio) was calculated as
Cratio = Csw / Cdw.
Validation of shallow and deep water chemistry estimation.For shallow water concentrations,
we used a recent study that compiled measured stream water and soil
water concentration from 94 sites with diverse land cover (e.g.,
agriculture, forest, disturbed forest, grassland, urban) across the
globe (Sudduth et al., 2013). To be consistent in land use
classification, we grouped “forest”, “disturbed forest”, and
“grassland” in Sudduth et al. (2013) as “undeveloped” in this work.
With limited measurements of Csw, we cannot directly
compare estimated and measured Csw in individual sites.
Instead, we examined whether the relationship between estimated
Csw and Cstream from this study is close
to the derived relationship between measured Csw and
Cstream from Sudduth et al. (2013). The relationships
would be similar if the end-member estimation could provide a realistic
estimation of Csw (Figure 4a).
For deep water chemistry, we used
the National Ground-Water Monitoring Network (NGWMN), a compiled
database for groundwater level and water quality
(https://cida.usgs.gov/ngwmn/). Based on well location and data
availability, groundwater nitrate data were retrieved from the Midwest
(i.e., Minnesota, Wisconsin, Iowa, Nebraska, Illinois), New Jersey, and
Florida for the maximum coverage. Based on the proximity of sites, a
number of sub-regions were selected for comparing estimated vs. measured
Cdw (Figure S5). Within each sub-region, estimated and
measured Cdw were averaged and compared (Figure 4b).
Watershed reactive transport modeling. Watershed modeling was
carried out using a recently developed code BioRT-Flux-PIHM, to explore
the effects of subsurface vertical profile for N (Zhi et al., 2019). A
base model was set up at the Conewago Creek watershed, a tributary to
the Susquehanna River in the Chesapeake Bay. It drains an area of 136
km2 and is dominated by agricultural (47%), forested
(43%), and developed (17%) lands. The base model reproduced the
temporal discharge and the slightly flushing C-Q pattern as observed at
the site (Figure S6 and S7). Monte-Carlo simulations of 500 cases were
carried out to cast the base model results to broader conditions
spanning a range of N vertical distribution and reaction kinetics
representing different land use conditions (Figure 6). We then derived a
general relationship for the dependence of C-Q b slope on the
ratio of shallow versus deep water concentration (Csw /
Cdw). This simple relationship was further verified byb values estimated based on stream water chemistry from 81 sites,
and a few intensively monitored sites with detailed soil and groundwater
measurements. Other details of model setup and calibration, and
Monte-Carlo simulation are in the Supporting Information (SI).