3 Methodology

3.1 Spatial data

We performed the spatial analysis with Geographic Information System (GIS) techniques and developed a Python code (SWATpy) to semi-automate the analyses. The code is still under development, with the aim of fully automating the entire modeling process. Python 3 and ArcSWAT version 2012.10.19 (SWAT v2012, rev. 667) were used. The soil, land use, and elevation (Digital Elevation Model, DEM) data are in gridded format. The US soil database includes two types of soil data: the State Soil Geographic Database (STATSGO) and the gridded Soil Survey Geographic Database (gSSURGO) (SoilSurvey 2019). The former is built into the SWAT soil database, and the latter was obtained from the United States Department of Agriculture (USDA) (Winchell, Srinivasan et al. 2013). gSSURGO was selected for the modeling because of its greater detail and better performance (SoilSurvey 2019). The UCS soil map has more than 250 soil classes. The elevation and land use data sets were obtained from the US Geological Survey (USGS) and the Multi-Resolution Land Characteristics (MRLC) Consortium, respectively. The DEM has a resolution of 1/3 arc-second (approximately 10 m for the study area) and is a 3DEP (3D Elevation Program) product (U.S.GeologicalSurvey 2017). The data extent is 1° × 1°. The land use data set is NLCD 2016 at 30 m resolution (Yang, Jin et al. 2018), and the corresponding 2001-2006 lookup table was used.
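
The correspondence between the 1/3 arc-second DEM resolution and a ground distance of roughly 10 m can be checked with a short calculation. The following is a minimal sketch, not part of SWATpy; it assumes a spherical Earth, and the latitude of 31° is a hypothetical value chosen only for illustration.

```python
import math

# Mean Earth radius in metres (spherical approximation; an assumption of this sketch).
EARTH_RADIUS_M = 6_371_000.0


def cell_size_m(arc_seconds, latitude_deg):
    """Return the (north-south, east-west) ground size in metres of a grid cell
    whose sides span `arc_seconds` of latitude and longitude."""
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    # Meridians converge towards the poles, so the east-west size shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Hypothetical latitude, used only to illustrate the calculation.
ns, ew = cell_size_m(1.0 / 3.0, latitude_deg=31.0)
print(f"{ns:.1f} m north-south, {ew:.1f} m east-west")
```

The north-south cell size does not depend on latitude, whereas the east-west size becomes smaller at higher latitudes, which is why the resolution in metres is only approximate.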

3.2 Hydro-meteorological data

SWAT uses five types of daily weather data as input: precipitation, temperature, solar radiation, wind speed, and humidity (Neitsch, Arnold et al. 2011). For precipitation and temperature, measured data from weather stations were used. For the other climate variables, we used the model’s Weather Generator (Neitsch, Arnold et al. 2011), which generates values from statistics calculated from the monthly average values of the climatic variables in the SWAT weather databases (Neitsch, Arnold et al. 2011; Arnold, Kiniry et al. 2013). The data used in this study are daily and span 1998 to 2013.
Large gaps between land-based weather stations (LBWSs) can be problematic (Fuka, Walter et al. 2014). Because there were noticeable gaps between some of the LBWSs, two weather data sets were tested: Climate Forecast System Reanalysis (CFSR) data from the National Centers for Environmental Prediction (NCEP) and LBWS data. The preliminary results for the latter were better; therefore, the LBWS data were selected. We believe this is due to the coarser resolution of the CFSR data compared with the distances between the weather stations. The CFSR resolution is 0.5° latitude × 0.5° longitude (Saha, Moorthi et al. 2010). A few studies have carried out the same comparison and reached the same conclusion (Dile and Srinivasan 2014; Roth and Lemann 2016). Even though the LBWSs were irregularly distributed, in contrast to the gridded CFSR data, the land-based data performed better. SWAT assigns the data of one weather station to each subbasin, namely the LBWS closest to the subbasin centroid (Masih, Maskey et al. 2011; Winchell, Srinivasan et al. 2013). There are two methods for linking LBWSs to subbasins: the centroid method and time-dynamic Voronoi tessellation (Neitsch, Arnold et al. 2011; Andersson, Zehnder et al. 2012; Winchell, Srinivasan et al. 2013; Tuo, Duan et al. 2016). The former was used here; it has both advantages and disadvantages (Cho, Bosch et al. 2009; Galván, Olías et al. 2014; Tuo, Duan et al. 2016). Considering the procedure for assigning gauge stations in SWAT, we carried out a sensitivity analysis on our original LBWSs and then removed three LBWSs due to false allocation of the weather data.
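
The centroid method described above can be illustrated with a short sketch. It is not the ArcSWAT or SWATpy implementation: the function names, the use of great-circle (haversine) distance, and all coordinates are assumptions made for illustration, and the actual distance calculation inside ArcSWAT may differ.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    radius_km = 6371.0  # mean Earth radius (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def assign_stations(centroids, stations):
    """Map each subbasin id to the id of the station closest to its centroid."""
    assignment = {}
    for sub_id, (clat, clon) in centroids.items():
        assignment[sub_id] = min(
            stations,
            key=lambda sid: haversine_km(clat, clon, *stations[sid]),
        )
    return assignment


# Hypothetical subbasin centroids and station locations (lat, lon), for illustration only.
centroids = {1: (31.40, -85.60), 2: (31.10, -85.90)}
stations = {"A": (31.45, -85.55), "B": (31.05, -85.95), "C": (32.00, -86.50)}

assignment = assign_stations(centroids, stations)
# Stations that are never the closest one to any centroid supply no data to the model.
unused = set(stations) - set(assignment.values())
print(assignment, unused)
```

Listing which subbasins each station ends up serving, and which stations are never used, is one simple way to inspect how the station network is allocated before a model run.
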
The observed streamflow data were obtained from the National Water Information System (NWIS) of the U.S. Geological Survey (USGS). Of the four stream sites within the UCS, the two with representative locations and adequate data were selected. The selected sites, Newton (NW) (USGS2361000) and Bellwood (BL) (USGS2361500), have mean daily discharge data since 1921/12/01. The observed discharge data used in this study span 1998 to 2013.
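
Restricting a long daily discharge record to the study period and checking it for missing days can be sketched as follows. This is a minimal illustration rather than the procedure used in this study: the function names are hypothetical, the record is assumed to be a list of (date, discharge) pairs, and the study period is assumed to cover full calendar years.

```python
from datetime import date, timedelta

# Assumption: the 1998 to 2013 study period is taken as full calendar years.
STUDY_START = date(1998, 1, 1)
STUDY_END = date(2013, 12, 31)


def trim_to_period(records, start=STUDY_START, end=STUDY_END):
    """Keep only the (date, discharge) pairs that fall within [start, end]."""
    return [(d, q) for d, q in records if start <= d <= end]


def missing_days(records, start=STUDY_START, end=STUDY_END):
    """Return the dates in [start, end] that have no entry in the record."""
    present = {d for d, _ in records}
    n_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in all_days if d not in present]


# Synthetic record for illustration: one week of values with one day left out.
first_day = date(1997, 12, 30)
records = [
    (first_day + timedelta(days=i), 10.0 + i)
    for i in range(7)
    if first_day + timedelta(days=i) != date(1998, 1, 3)
]

window_end = date(1998, 1, 5)
trimmed = trim_to_period(records, STUDY_START, window_end)
gaps = missing_days(trimmed, STUDY_START, window_end)
print(len(trimmed), gaps)
```

A check of this kind gives a quick count of the days available at each site within the study period.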