Results

A genetic algorithm (GA) was used to optimize the three scaling factors for the RDII impulse response functions (IRFs). The same method was used to calibrate the total sewer flow simulated by the SWMM RTK method for comparison. The efficiency of the two RDII estimation methods is compared using the modified Nash-Sutcliffe coefficient.
\(E_{j}=1-\frac{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-Q_{m}^{t}\right)^{2}}{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-\overline{Q_{0}}\right)^{2}}\) (15)
where \(Q_{0}^{t}\) is the observed discharge at time t [L3T-1], \(Q_{m}^{t}\) is the modeled discharge at time t [L3T-1], and \(\overline{Q_{0}}\) is the average of the observed discharge [L3T-1]. The coefficient ranges from -∞ to 1, and E = 1 corresponds to a perfect match between the observed and the modeled discharge. Wj is a weighting factor with index j (j = 1, 2, and 3): j = 1 is applied to low flows, j = 2 to medium flows, and j = 3 to peak flow values. In the conventional Nash-Sutcliffe method, all three weighting factors are identical (W1 = W2 = W3). In the modified Nash-Sutcliffe method, larger weights are assigned to high flows, so smaller runoff values are de-emphasized and larger peaks are emphasized.
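Equation (15) can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function name `weighted_nse` and the choice to pass one weight per time step are assumptions made here for clarity.

```python
def weighted_nse(observed, modeled, weights=None):
    """Weighted Nash-Sutcliffe efficiency in the form of Eq. (15).

    `weights` holds one weight per time step (a hypothetical interface);
    leaving it as None applies equal weights, which gives the
    conventional Nash-Sutcliffe coefficient.
    """
    if len(observed) != len(modeled):
        raise ValueError("observed and modeled must have the same length")
    if weights is None:
        weights = [1.0] * len(observed)

    mean_obs = sum(observed) / len(observed)
    # Weighted squared error between observed and modeled discharge.
    num = sum(w * (o - m) ** 2 for w, o, m in zip(weights, observed, modeled))
    # Weighted squared deviation of the observations from their mean.
    den = sum(w * (o - mean_obs) ** 2 for w, o in zip(weights, observed))
    return 1.0 - num / den
```

A model that reproduces the observations exactly returns 1, and a model that always predicts the observed mean returns 0, regardless of the weights.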
The calibration period was from May 9, 2009 to June 7, 2009 and the validation period was from June 9, 2009 to July 8, 2009. The IRF method has three parameters to calibrate: roof connection scaling factor (R), sump pump connection scaling factor (S), and leaky lateral scaling factor (L). The RTK method has nine parameters to calibrate: R1, R2, R3, T1, T2, T3, K1, K2, and K3. R is a ratio of I&I discharge volume to the rainfall volume: R1 is for a fast inflow element, while R2 and R3 represent slower infiltration elements. T is the time to peak in each hydrograph (typically expressed in hours), and K is the ratio of time of recession to the time to peak.
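To make the roles of R, T, and K concrete, the following sketch evaluates one triangular unit hydrograph. It assumes the usual triangular form: a linear rise to the peak at time T, a linear recession lasting K·T, and an enclosed area equal to R. The function name `rtk_ordinate` and the example values are hypothetical and are not calibrated values from this study.

```python
def rtk_ordinate(t, R, T, K):
    """Ordinate of one triangular RTK unit hydrograph at time t (hours).

    R: fraction of rainfall volume entering the sewer as RDII
    T: time to peak (hours)
    K: ratio of recession time to time to peak
    The result is the response per unit rainfall volume, in 1/hour.
    """
    base = T * (1.0 + K)      # total duration of the triangle
    peak = 2.0 * R / base     # peak height chosen so the area equals R
    if t <= 0.0 or t >= base:
        return 0.0
    if t <= T:
        return peak * t / T                   # rising limb
    return peak * (base - t) / (base - T)     # falling limb
```

With the illustrative values R = 0.1, T = 2 h, and K = 3, the hydrograph lasts 8 h and peaks at about 0.025 per hour at t = 2 h.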
For the GA optimization, the population size was set to 100 and the maximum number of generations to 300 for both modeling approaches. The probability of crossover was set to 0.95 and the probability of mutation to 0.06 for both the IRF and RTK calibrations.
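As a rough illustration of how these settings enter a GA, the sketch below implements a small real-coded GA. Only the four default values (population 100, 300 generations, crossover probability 0.95, mutation probability 0.06) come from the text. The operators used here (binary tournament selection, arithmetic crossover, uniform reset mutation, and single-individual elitism) are assumptions, as are the name `run_ga` and the toy objective; the study's actual GA implementation may differ.

```python
import random


def run_ga(objective, bounds, pop_size=100, generations=300,
           p_crossover=0.95, p_mutation=0.06, seed=0):
    """Maximize `objective` over box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def tournament(pop, fits):
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
        return pop[a] if fits[a] >= fits[b] else pop[b]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fits = [objective(ind) for ind in pop]

    for _ in range(generations):
        best_idx = max(range(pop_size), key=fits.__getitem__)
        new_pop = [pop[best_idx][:]]  # elitism: carry the best forward
        while len(new_pop) < pop_size:
            p1, p2 = tournament(pop, fits), tournament(pop, fits)
            if rng.random() < p_crossover:
                a = rng.random()  # arithmetic crossover stays inside bounds
                child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mutation:
                    child[i] = rng.uniform(lo, hi)  # reset one gene
            new_pop.append(child)
        pop = new_pop
        fits = [objective(ind) for ind in pop]

    best_idx = max(range(pop_size), key=fits.__getitem__)
    return pop[best_idx], fits[best_idx]


def toy_objective(params):
    """Hypothetical stand-in for model efficiency; maximum 0 at (1, 2, 3)."""
    return -sum((x - t) ** 2 for x, t in zip(params, (1.0, 2.0, 3.0)))


best_params, best_fit = run_ga(toy_objective, [(0.0, 5.0)] * 3, generations=50)
```

In a calibration setting, `toy_objective` would be replaced by a function that runs the RDII model for a candidate parameter set and returns its Nash-Sutcliffe coefficient.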
The calibrated parameter solutions for the IRF and RTK methods are presented in Table 1. The Nash-Sutcliffe model efficiency coefficient of the IRF solution was 0.534 in the calibration period and 0.560 in the validation period. The modified Nash-Sutcliffe coefficients for the IRF solution were 0.892 for the calibration period and 0.866 for the validation period when the Nash-Sutcliffe weighting factors were set as W1 = 3 for Q > 90th percentile, W2 = 2 for 80th percentile < Q < 90th percentile, and W3 = 1 for Q < 80th percentile. Assigning larger weighting factors to high flows improved the model fit substantially. The Nash-Sutcliffe coefficient of the best RTK solution was 0.848 in the calibration period and 0.795 in the validation period.
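The percentile-based weighting described above can be sketched as follows. The weights 3, 2, and 1 and the 90th and 80th percentile thresholds come from the text. The nearest-rank percentile definition, the handling of values exactly equal to a threshold, and the function names are assumptions, since the text does not specify them.

```python
def percentile_nearest_rank(values, p):
    """p-th percentile (integer p, 1 to 100) by the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling of p*n/100
    return ordered[rank - 1]


def flow_weights(observed, p_high=90, p_mid=80,
                 w_high=3.0, w_mid=2.0, w_low=1.0):
    """Assign one weight per observed flow based on percentile class."""
    q_high = percentile_nearest_rank(observed, p_high)
    q_mid = percentile_nearest_rank(observed, p_mid)
    weights = []
    for q in observed:
        if q > q_high:
            weights.append(w_high)   # peak flows
        elif q > q_mid:
            weights.append(w_mid)    # medium flows
        else:
            weights.append(w_low)    # low flows (ties fall here by assumption)
    return weights
```

The resulting list has one weight per time step and can be used as the W term in the weighted sums of Eq. (15).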
Although the modified Nash-Sutcliffe method improved the model fit, the model efficiency of the RTK method was higher because the RTK method has three times as many adjustable parameters (nine instead of three). However, from the calibration period to the validation period, model efficiency increased for the IRF solution while it decreased for the RTK solution. This may point to a pitfall of the RTK method: its performance is not consistent and may not be robust.
The optimal solution of the IRF scaling factors using the GA is R = 3,359 for roof connections, S = 22,653 for sump pump connections, and L = 19,985 for leaky laterals. These values can be interpreted as the RDII volume contribution of each RDII source (Table 1). The contributing flow volume of each RDII source is derived by multiplying the per-unit-area flow volume of each IRF by the corresponding IRF weighting coefficient (Table 2). The contributing RDII volumes from the roof, sump pump, and lateral sources are then 9,710 m3, 22,653 m3, and 32,543 m3, respectively, which are 15%, 35%, and 50% of the total estimated RDII flow volume. This simple calculation shows that the IRF result can be interpreted as the RDII volume contribution of different RDII sources, which identifies the most problematic RDII contributor in the system by volume. These values need to be interpreted with caution, because the IRF model application in this study is only one realization of a real system and each sewershed is unique in terms of the factors that contribute to RDII. Nevertheless, the result can provide insight into the RDII behavior of the system by giving physical meaning to the solutions.
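The percentage shares follow directly from the three contributing volumes reported above, as this short check shows. The variable and function names are illustrative only.

```python
# Contributing RDII volumes reported in the text, in cubic meters.
volumes_m3 = {"roof": 9710.0, "sump pump": 22653.0, "lateral": 32543.0}


def volume_shares(volumes):
    """Return each source's share of the total volume, in percent."""
    total = sum(volumes.values())
    return {source: 100.0 * v / total for source, v in volumes.items()}


shares = volume_shares(volumes_m3)
for source, pct in shares.items():
    print(f"{source}: {pct:.0f}%")
# roof: 15%
# sump pump: 35%
# lateral: 50%
```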
The IRF approach tends to be more robust because its three parameters scale three IRFs that represent physically based processes. Each IRF shape is defined independently using physics-based models, and the weighting parameters reflect the contribution from each of the three IRFs. The IRF calibration converges to a unique solution no matter how randomly the initial population is selected. In contrast, the RTK method gives a different solution every time the model runs. As an example, 30 sets of three RTK hydrograph solutions display widely variable results, as presented in Figure 5. Within the user-specified range for each hydrograph, the solution can be vastly different for each run. The Nash-Sutcliffe coefficient was 0.848 for the best case and 0.681 for the worst case. Depending on the user-specified ranges of each parameter, the results can differ vastly and the performance is not guaranteed.
The RTK method has many locally optimal solutions, which indicates that the nine coefficients are not independent. The starting points or constraints of some parameters therefore cause other parameters to adjust toward a local optimum that performs similarly well on the calibration data. Box plots of the nine RTK parameters from the 30 model runs are presented in Figure 6. Greater variability is observed in the RTK parameters for the second and third triangular hydrographs, especially the third one. This is because the model adjusts these parameters according to the given constraints of the earlier parameters. Technically, different local RTK solutions can result in the same model fitness: a change in one hydrograph is compensated by the other two hydrographs simply to achieve the best fitness. This indicates a problem with the RTK method, namely that physical processes are not reflected in the modeling.
Figure 7 shows the prediction of the monitored flow hydrograph using the IRF solution and the best case of the RTK solutions during the calibration period (Figure 7(a)) and the validation period (Figure 7(b)). On June 24, both methods predict a flow peak that is not observed in the monitored flow record. The peak might have occurred over such a short time that the flow monitor failed to capture it. Overall, the RTK method tends to follow the monitored hydrograph well, especially on the falling limbs of peaks, while the IRF method tends to underestimate the flow on the falling limbs.
The volume and the peak flow values for the estimated DWF, observed sewer flow, IRF model result, and RTK model result are summarized in Table 3. A flow rate of 0.3 m3/s was selected as the threshold defining the beginning and the end of each storm. The observed sewer flow, IRF results, and RTK results are compared to the estimated DWF using the following equation.
\(\text{Compare to DWF}=\frac{\text{Observed sewer}}{\text{Estimated DWF}}\times 100\) (16)
During the storms, the observed sewer flow is three to four times the DWF in volume and three to six times the DWF in peak flow. Considering that the monitoring location is sanitary only, a great deal of RDII exists in the area.
The IRF result and RTK result are compared to the observed sewer flow using the following equation.
\(\text{Compare to observed RDII}=\frac{\text{Predicted RDII}-\text{Observed RDII}}{\text{Observed RDII}}\times 100\) (17)
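Equations (16) and (17) amount to a ratio and a signed relative difference, both expressed in percent. The sketch below shows the arithmetic with made-up numbers. The function names and input values are hypothetical and are not taken from Table 3.

```python
def percent_of_dwf(flow, estimated_dwf):
    """Eq. (16): a flow volume or peak as a percentage of the estimated DWF."""
    return 100.0 * flow / estimated_dwf


def percent_difference(predicted, observed):
    """Eq. (17): signed difference of predicted from observed, in percent.

    Negative values mean the model underestimates the observation.
    """
    return 100.0 * (predicted - observed) / observed


# Illustrative values only (not from the study).
print(percent_of_dwf(300.0, 100.0))      # 300.0, i.e. three times the DWF
print(percent_difference(91.0, 100.0))   # -9.0, i.e. a 9% underestimate
```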
Both models underestimated the flow volume: the IRF method underestimated it by 9% to 28% and the RTK method by 4% to 26% compared with the monitored volume. In terms of flow peaks, the IRF method overestimated the peak flow rate for the May 13, May 27, and June 11 storms by 19%, 25%, and 9%, respectively, and underestimated the peak flow rate for the May 15 and June 16 storms by 15% and 8%, respectively. The RTK method consistently overestimated the peak flow rate, by 1% to 16%.
Residual plots of the IRF and the best RTK solutions for the calibration period and the validation period are presented in Figure 8. A residual is the difference between the observed value of the dependent variable and the predicted value. Each data point has one residual, defined by the following equation.
Residual = Observed value – Predicted value (18)
Residuals are plotted against the observed value on the x axis. There are clusters of points at low flow rates, which represent the tails of the hydrographs. In Figure 8(a), the IRF method underestimates the peaks, as most of the high-flow residuals are on the positive side. These points are from the storms on May 15, 2009 and May 27, 2009. This trend is also observed in the validation period, where the outliers are from the storms on June 11, 2009 and June 16, 2009 (Figure 8(b)). In the validation period, the RTK method also underestimated peaks, as most of the high-flow points are on the positive side. This means that the best RTK solution for the calibration period loses efficiency in the validation period. This explains the decrease in the Nash-Sutcliffe coefficient of the RTK method in the validation period, as presented in Table 1, and supports the view that the RTK method is more of a curve-fitting method with limited physical meaning.