Results
A genetic algorithm (GA) was used to optimize the three scaling factors
for the RDII impulse response functions (IRFs). The same method was used
to calibrate the total sewer flow simulated by the SWMM RTK method for
comparison. The efficiency of the two RDII estimation methods was
compared using the modified Nash-Sutcliffe coefficient:
\(E_{j}=1-\frac{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-Q_{m}^{t}\right)^{2}}{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-\overline{Q_{0}}\right)^{2}}\) (15)
where \(Q_{0}^{t}\) is the observed discharge at time t [L³/T],
\(Q_{m}^{t}\) is the modeled discharge at time t [L³/T], and
\(\overline{Q_{0}}\) is the average observed discharge [L³/T]. The
coefficient ranges from -∞ to 1, and E = 1 corresponds to a perfect
match between the observed and the modeled discharge. \(W_{t,j}\) is the
weighting factor applied at time t, where the index j denotes the flow
class: j = 1 applies to low flows, j = 2 to medium flows, and j = 3 to
peak flows. In the conventional Nash-Sutcliffe method, all three
weighting factors are identical (W1 = W2 = W3). In this study, the
weighting factors are adjusted to emphasize the larger RDII peaks,
because RDII occurs only during storm events.
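As an illustration of Eq. (15), the following Python sketch computes the weighted coefficient from three equal-length series. The function name `weighted_nse` and the representation of the weights as one value per time step are assumptions made for this example; the study does not describe its implementation.

```python
from typing import Sequence


def weighted_nse(observed: Sequence[float],
                 modeled: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted Nash-Sutcliffe efficiency in the form of Eq. (15).

    weights[t] is the weighting factor applied at time step t
    (hypothetical representation; equal weights give the conventional form).
    """
    if not observed:
        raise ValueError("observed series is empty")
    if not (len(observed) == len(modeled) == len(weights)):
        raise ValueError("all inputs must have the same length")

    mean_obs = sum(observed) / len(observed)
    # Numerator: weighted squared error between observed and modeled flow.
    num = sum(w * (qo - qm) ** 2
              for qo, qm, w in zip(observed, modeled, weights))
    # Denominator: weighted squared deviation of observed flow from its mean.
    den = sum(w * (qo - mean_obs) ** 2
              for qo, w in zip(observed, weights))
    if den == 0.0:
        raise ValueError("observed series has no weighted variance")
    return 1.0 - num / den
```

With equal weights the function reduces to the conventional Nash-Sutcliffe coefficient, so a model that always predicts the observed mean scores 0 and a perfect model scores 1.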
The calibration period was from May 9, 2009 to June 7, 2009 and the
validation period was from June 9, 2009 to July 8, 2009. The IRF method
has three parameters to calibrate: roof connection scaling factor (R),
sump pump connection scaling factor (S), and leaky lateral scaling
factor (L). The RTK method has nine parameters to calibrate: R1, R2, R3,
T1, T2, T3, K1, K2, and K3. In the RTK method, R is the ratio of the I&I
discharge volume to the rainfall volume, T is the time to peak of each
hydrograph (typically expressed in hours), and K is the ratio of the
recession time to the time to peak. The subscripts 1, 2, and 3 denote
the associated RDII components: 1 is the fast inflow component, 2 is the
medium infiltration component, and 3 is the slow infiltration component.
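The role of the three RTK parameters can be illustrated with a short Python sketch of a single triangular unit hydrograph. It assumes the usual triangular form, with a base of T(1 + K) and a peak of 2R / (T(1 + K)), so that the area under the triangle equals R. The function name, the time step, and the parameter values in the usage note are hypothetical and are not the calibrated values of this study.

```python
import math
from typing import List


def rtk_unit_hydrograph(r: float, t_peak: float, k: float, dt: float) -> List[float]:
    """Ordinates of one triangular unit hydrograph sampled every dt hours.

    r      : fraction of rainfall volume that enters the sewer
    t_peak : time to peak (hours)
    k      : ratio of recession time to time to peak
    Returns ordinates per unit rainfall depth at t = 0, dt, 2*dt, ...
    """
    if r < 0 or t_peak <= 0 or k <= 0 or dt <= 0:
        raise ValueError("r must be >= 0; t_peak, k, and dt must be > 0")

    base = t_peak * (1.0 + k)      # total duration of the triangle
    q_peak = 2.0 * r / base        # peak chosen so that the area equals r
    n_steps = math.ceil(base / dt)

    ordinates = []
    for i in range(n_steps + 1):
        t = i * dt
        if t <= t_peak:
            q = q_peak * t / t_peak                     # rising limb
        elif t < base:
            q = q_peak * (base - t) / (k * t_peak)      # recession limb
        else:
            q = 0.0
        ordinates.append(q)
    return ordinates
```

For example, `rtk_unit_hydrograph(0.05, 1.0, 2.0, 0.25)` describes a hydrograph that peaks after 1 hour and recedes over the following 2 hours. In the RTK method, three such hydrographs (fast, medium, and slow) are superposed, which is why nine parameters must be calibrated.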
For the GA optimization, the population size was set to 100 and the
maximum number of generations was set to 300 for both modeling
approaches. A crossover probability of 0.95 was selected for both the
IRF and the RTK calibration. The probability of mutation was set to
0.06.
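To make these settings concrete, the sketch below shows a minimal real-coded GA in Python that uses the same population size, generation limit, and crossover and mutation probabilities as defaults. The operators (tournament selection, arithmetic crossover, uniform mutation, and elitism), the function name `run_ga`, and the `seed` argument are assumptions chosen for illustration; the study does not state which operators or software were used.

```python
import random
from typing import Callable, List, Sequence, Tuple


def run_ga(objective: Callable[[List[float]], float],
           bounds: Sequence[Tuple[float, float]],
           pop_size: int = 100,
           generations: int = 300,
           p_crossover: float = 0.95,
           p_mutation: float = 0.06,
           seed: int = 0) -> Tuple[List[float], float]:
    """Maximize objective (e.g. a model efficiency) within box bounds."""
    rng = random.Random(seed)

    def random_individual() -> List[float]:
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(pop, scores) -> List[float]:
        # Binary tournament: the better of two random individuals wins.
        a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
        return pop[a] if scores[a] >= scores[b] else pop[b]

    pop = [random_individual() for _ in range(pop_size)]
    best, best_score = None, float("-inf")

    for _ in range(generations):
        scores = [objective(ind) for ind in pop]
        i_best = max(range(pop_size), key=scores.__getitem__)
        if scores[i_best] > best_score:
            best, best_score = list(pop[i_best]), scores[i_best]

        children = [list(best)]  # elitism: carry the best solution forward
        while len(children) < pop_size:
            p1, p2 = tournament(pop, scores), tournament(pop, scores)
            if rng.random() < p_crossover:
                a = rng.random()  # arithmetic (blend) crossover
                child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            else:
                child = list(p1)
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mutation:
                    child[g] = rng.uniform(lo, hi)  # uniform mutation
                child[g] = min(max(child[g], lo), hi)  # keep within bounds
            children.append(child)
        pop = children

    return best, best_score
```

In a calibration setting, `objective` would run the RDII model for a candidate parameter set and return its efficiency coefficient, and `bounds` would hold the user-specified range of each parameter.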
The calibrated parameter solutions for the IRF and RTK methods are
presented in Table 1. The conventional Nash-Sutcliffe model efficiency
coefficient of the IRF solution is 0.534 in the calibration period and
0.560 in the validation period. The modified Nash-Sutcliffe coefficients
for the IRF solution were 0.892 for the calibration period and 0.866 for
the validation period when the weighting factors were set to W1 = 3 for
Q above the 90th percentile, W2 = 2 for Q between the 80th and 90th
percentiles, and W3 = 1 for Q below the 80th percentile. Because the
modified Nash-Sutcliffe method de-emphasizes smaller runoff values and
emphasizes larger peaks, the model efficiency coefficient improved.
The Nash-Sutcliffe coefficient of the best RTK solution was 0.848 in the
calibration period and 0.795 in the validation period.
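The percentile-based weighting described above can be sketched in Python as follows. The function name, the use of `statistics.quantiles` with the inclusive method to estimate the 80th and 90th percentiles, and the treatment of values that fall exactly on a threshold are assumptions; the study does not specify how the percentiles were computed.

```python
import statistics
from typing import List, Sequence


def percentile_weights(observed: Sequence[float],
                       w_high: float = 3.0,
                       w_mid: float = 2.0,
                       w_low: float = 1.0) -> List[float]:
    """Assign a weight to each observed flow from its percentile class."""
    # Nine decile cut points; indices 7 and 8 are the 80th and 90th percentiles.
    deciles = statistics.quantiles(observed, n=10, method="inclusive")
    p80, p90 = deciles[7], deciles[8]

    weights = []
    for q in observed:
        if q > p90:
            weights.append(w_high)   # largest flows (peaks)
        elif q > p80:
            weights.append(w_mid)    # intermediate flows
        else:
            weights.append(w_low)    # remaining flows
    return weights
```

The resulting list has one weight per time step and can be passed to any weighted form of the Nash-Sutcliffe coefficient.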
Although the modified Nash-Sutcliffe method improved the model fitness,
the model efficiency of the RTK method was higher, since the RTK method
has three times as many parameters to adjust (nine instead of three).
However, in the validation period, the model efficiency increased for
the IRF solution while it decreased for the RTK solution. This may point
to a pitfall of the RTK method: it is not consistent and may not be very
robust.
The optimal solution of the IRF scaling factors found by the GA is R =
3,359 for roof, S = 22,653 for sump pump, and L = 19,985 for lateral.
These values can be interpreted as the RDII volume contribution of each
RDII source (Table 2). The contributing flow volume of each RDII source
is derived by multiplying the per-unit-area flow volume of its IRF by
the corresponding IRF scaling factor. The contributing RDII volumes from
the roof, sump pump, and lateral are then 9,710 m³, 22,653 m³, and
32,543 m³, respectively, which are 15%, 35%, and 50% of the total
estimated RDII flow volume. This simple calculation shows that the IRF
result can be interpreted as the volume contribution of the different
RDII sources, which identifies the most problematic RDII contributor in
the system volume-wise. These values need to be interpreted with
caution, as the IRF model application in this study is only one
realization of a real system and each sewershed is unique in terms of
the factors that contribute to RDII. However, the result can still
provide insight into the RDII behavior of the system by giving physical
meaning to the solutions.
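The percentage shares quoted above follow directly from the three contributing volumes, as the short Python check below shows. The volumes are taken from the text; the variable names are illustrative only.

```python
# Contributing RDII volumes reported in the text (m^3).
volumes_m3 = {"roof": 9710.0, "sump pump": 22653.0, "lateral": 32543.0}

total_m3 = sum(volumes_m3.values())

# Share of each source in the total estimated RDII volume (percent).
shares_pct = {source: 100.0 * volume / total_m3
              for source, volume in volumes_m3.items()}

for source, share in shares_pct.items():
    print(f"{source}: {share:.1f}% of {total_m3:.0f} m^3")
```

Rounded to whole percentages, the shares are 15%, 35%, and 50%, matching the values given for the roof, sump pump, and lateral sources.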
The IRF approach tends to be more robust because each IRF shape is
defined independently using physics-based models and the scaling
factors reflect the contribution from each of the three IRFs. The IRF
solutions are unique no matter how randomly the initial population is
selected. In contrast, the RTK method gives a different solution every
time the model runs. As an example, 30 sets of three RTK hydrograph
solutions display widely variable results, as presented in Figure 5.
Within the user-specified range for each hydrograph, the solution can be
vastly different for each run. The Nash-Sutcliffe coefficient of the
best case was 0.848 and that of the worst case was 0.681. The RTK method
has many local optimal solutions, which indicates that the nine
coefficients are not independent. Thus the starting points or
constraints of the parameters cause other parameters to adjust to reach
a local optimum that performs similarly well on the calibration data.
Box plots of the nine RTK parameters from the 30 model runs are
presented in Figure 6. Greater variability is observed in the RTK
parameters for the second and third triangular hydrographs, especially
the third. This is because the model adjusts these parameters within
the constraints that are set beforehand. Technically, different RTK
local solutions can result in the same model fitness. A change in one
hydrograph affects the other two hydrographs simply to achieve the best
fit. This points to a problem with the RTK method: physical processes
are not reflected in the modeling.
Figure 7 shows the prediction of the monitored flow hydrograph using the
IRF solution and the best case of the RTK solutions during the
calibration period (Figure 7(a)) and the validation period (Figure
7(b)). Overall, the RTK method tends to follow the monitored hydrograph
well, while the IRF method tends to underestimate the flow on the
falling limbs.
The volume and the peak flow values for the estimated DWF, observed
sewer flow, IRF model result, and RTK model result are summarized in
Table 3. A flow rate of 0.3 m³/s was selected as the cutoff value to
define the beginning and the end of each storm. The observed sewer flow
is compared to the estimated DWF using the following equation.
\(\text{Compare to DWF}=\frac{\text{Observed sewer flow}}{\text{Estimated DWF}}\times 100\) (16)
The observed sewer flow is three to four times the DWF in volume and
three to six times the DWF in peak flow during the storms. Considering
that the monitoring location is sanitary only, this indicates that a
great deal of RDII exists in the area.
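The storm delineation by a cutoff flow rate and the comparison of Eq. (16) can be sketched in Python as below. The function names, the choice to treat a flow equal to the cutoff as part of a storm, and the index-based event representation are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def delineate_storms(flow: Sequence[float],
                     cutoff: float = 0.3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of runs where flow >= cutoff (m^3/s)."""
    events = []
    start = None
    for i, q in enumerate(flow):
        if q >= cutoff and start is None:
            start = i                      # storm begins
        elif q < cutoff and start is not None:
            events.append((start, i - 1))  # storm ended at the previous step
            start = None
    if start is not None:
        events.append((start, len(flow) - 1))  # series ends during a storm
    return events


def percent_of_dwf(observed: float, estimated_dwf: float) -> float:
    """Observed sewer flow expressed as a percentage of DWF, as in Eq. (16)."""
    if estimated_dwf <= 0:
        raise ValueError("estimated DWF must be positive")
    return observed / estimated_dwf * 100.0
```

Applied to storm volumes or peak flows, a value of 300 from `percent_of_dwf` corresponds to an observed flow three times the DWF.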
The IRF result and RTK result are compared to the observed sewer flow
using the following equation.
\(\text{Compare to observed RDII}=\frac{\text{Predicted RDII}-\text{Observed RDII}}{\text{Observed RDII}}\times 100\) (17)
Both models underestimated the flow volume: the IRF method
underestimated the flow volume by 9% to 28% and the RTK method by 4% to
26% compared to the monitored volume. In terms of flow peaks, the IRF
method overestimated the peak flow rate for the May 13, May 27, and
June 11 storms by 19%, 25%, and 9%, respectively, and underestimated it
for the May 15 and June 16 storms by 15% and 8%, respectively. The RTK
method consistently overestimated the peak flow rate, by 1% to 16%.
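The sign convention of Eq. (17) can be illustrated with a small Python function: a positive result means the model overestimates the observed value and a negative result means it underestimates it. The function name and the sample numbers are hypothetical.

```python
def percent_difference(predicted: float, observed: float) -> float:
    """Relative difference of a prediction from the observation, in percent.

    Positive values indicate overestimation, negative values underestimation.
    """
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return 100.0 * (predicted - observed) / observed


# Hypothetical storm volumes or peak flows in consistent units.
print(percent_difference(119.0, 100.0))  # model above the observation
print(percent_difference(85.0, 100.0))   # model below the observation
```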
Residual plots of the IRF and the best RTK solutions for the calibration
period and the validation period are presented in Figure 8. Residuals
are the differences between the observed and the predicted values of the
dependent variable. Each data point has one residual, defined by the
following equation.
Residual = Observed value – Predicted value (18)
Residuals are plotted against the observed values on the x axis.
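A minimal Python sketch of Eq. (18) is given below, together with a helper that reports the fraction of positive residuals among the high-flow points, which is one way to quantify whether a model tends to underestimate peaks. The function names and the `high_flow` threshold are assumptions introduced for illustration.

```python
from typing import List, Sequence


def residuals(observed: Sequence[float],
              predicted: Sequence[float]) -> List[float]:
    """Residual = observed value - predicted value, as in Eq. (18)."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


def positive_fraction(observed: Sequence[float],
                      predicted: Sequence[float],
                      high_flow: float) -> float:
    """Fraction of positive residuals among points with observed >= high_flow.

    A value close to 1 means the model mostly underestimates high flows.
    """
    selected = [r for o, r in zip(observed, residuals(observed, predicted))
                if o >= high_flow]
    if not selected:
        raise ValueError("no observations at or above the high-flow threshold")
    return sum(1 for r in selected if r > 0) / len(selected)
```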
There are clusters of points at low flow rates, which represent the
tails of the hydrographs. In Figure 8(a), the IRF method underestimates
the peaks, as most of the residuals are on the positive side. These
points are from the storms on May 15, 2009 and May 27, 2009. This trend
is also observed in the validation period, where the outliers are from
the storms on June 11, 2009 and June 16, 2009 (Figure 8(b)). In the
validation period, the RTK method also underestimated the peaks, as most
of the high-flow points are on the positive side. This means that the
best RTK solution for the calibration period loses efficiency in the
validation period. This explains the decrease in the Nash-Sutcliffe
coefficient of the RTK method in the validation period presented in
Table 1 and supports the view that the RTK method is more of a
curve-fitting method with limited physical meaning.