2.2. Statistical Methods
Conventional hydrological studies statistically analyze rainfall depths
or runoff discharges over prescribed time durations.
However, the statistical analysis here is conducted for clearly
distinguishable flash flood events, treating the duration as an
additional variable. Namely, the duration, the total rainfall depth at
the observation system, the total rainfall depth at the auxiliary
raingauge, the total runoff volume, the maximum runoff discharge, and
the bulk runoff coefficient of each flash flood event are the six
variables to be analyzed. The duration of a flash flood event is defined
as the length of the uninterrupted period during which rainfall and
runoff are never both zero at the same time. A threshold of 100 m³ is
set for the total runoff volume, and flash flood events yielding less
than that threshold are discarded.
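The event definition and volume threshold above can be illustrated with a minimal Python sketch. It assumes equally spaced records of rainfall depth per time step and runoff discharge, integrates the discharge with a simple rectangle rule, and uses hypothetical names (`Event`, `extract_events`, `dt`, `min_volume`) that do not come from the original study. The auxiliary raingauge total and the bulk runoff coefficient are left out, since the latter needs a catchment area that is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Event:
    duration: float    # length of the uninterrupted period [s]
    rain_depth: float  # total rainfall depth over the event [mm]
    volume: float      # total runoff volume [m^3]
    peak: float        # maximum runoff discharge [m^3/s]


def extract_events(rain, discharge, dt, min_volume=100.0):
    """Split paired records into flash flood events.

    rain:       rainfall depth per time step [mm] (assumed layout)
    discharge:  runoff discharge per time step [m^3/s] (assumed layout)
    dt:         time step [s]
    min_volume: events with a smaller total runoff volume are discarded
    """
    events, current = [], []
    # A trailing (0, 0) sentinel closes an event that runs to the end of the record.
    for r, q in list(zip(rain, discharge)) + [(0.0, 0.0)]:
        if r > 0.0 or q > 0.0:
            # Rainfall and runoff are not both zero: the event continues.
            current.append((r, q))
        elif current:
            # Both are zero: the uninterrupted period has ended.
            volume = sum(q_i for _, q_i in current) * dt  # rectangle rule
            if volume >= min_volume:
                events.append(Event(
                    duration=len(current) * dt,
                    rain_depth=sum(r_i for r_i, _ in current),
                    volume=volume,
                    peak=max(q_i for _, q_i in current),
                ))
            current = []
    return events
```

A step with runoff but no rainfall still belongs to the event, which is why the test is `r > 0.0 or q > 0.0` rather than a test on rainfall alone.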
First, Pearson’s correlation coefficients and Spearman’s rank
correlation coefficients are calculated for all pairwise combinations of
the six variables. Then, probability distribution fitting is performed
for each variable using the EasyFit software, which supports 23 types of
probability distributions. However, we focus on the three-parameter
lognormal (LN3) and generalized extreme value (GEV) distributions, both
of which are well suited to positive-valued variables representing
extreme phenomena such as flash floods. The cumulative distribution
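function (CDF) of the LN3 is given by
$$F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{\ln(x-\gamma)-\mu}{\sigma\sqrt{2}}\right)\right]$$
with three parameters $\sigma$, $\mu$, and $\gamma$ for the generic random variable $x$, where
erf represents the Gauss error function, while that of the GEV is given
by
$$F(x) = \exp\left(-\left(1 + k\,\frac{x-\mu}{\sigma}\right)^{-1/k}\right)$$
with three parameters $k$, $\sigma$, and $\mu$ for the generic random variable $x$. The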
Kolmogorov-Smirnov (K-S) test is applied to examine whether each of
the six variables fits each of the two probability distributions. The
empirical cumulative distribution function (ECDF) for a data set
$E = \{x_1, \dots, x_n\}$ of the observed values of a generic variable,
sorted in ascending order, is given by
$$F_E(x) = \frac{1}{n}\sum_{i=1}^{n} I(x_i \le x)$$
where $n$ is the number of observations in the data set $E$, and $I$ is the
indicator function, which is equal to 1 if $x_i \le x$ and equal to 0
otherwise. The K-S statistic $D$ for the ECDF of a data set $E$ and a
given CDF $F$ is calculated as
$$D = \sup_x \left|F_E(x) - F(x)\right|,$$
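which must satisfy the inequality
$$\left(\sqrt{n} + 0.12 + \frac{0.11}{\sqrt{n}}\right) D > \lambda_\alpha$$
for a criterion $\lambda_\alpha$, to reject, at a significance level $\alpha$, the null
hypothesis that the observed values in the data set are drawn from the
given distribution. The criterion $\lambda_\alpha$ solves the equation
$$2\sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 \lambda_\alpha^2} = \alpha.$$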
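As an illustration of the two candidate distributions and of the rejection rule, the following Python sketch evaluates the LN3 and GEV CDFs and solves for the criterion by bisection. Several things here are assumptions rather than details confirmed by the text: the parameter names (`sigma`, `mu`, `gamma`, `k`), the use of the asymptotic Kolmogorov series for the tail probability, and the small-sample factor $\sqrt{n} + 0.12 + 0.11/\sqrt{n}$ as used in the Numerical Recipes routines. The function names are hypothetical.

```python
import math


def ln3_cdf(x, sigma, mu, gamma):
    """Three-parameter lognormal CDF; zero at or below the location gamma."""
    if x <= gamma:
        return 0.0
    z = (math.log(x - gamma) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def gev_cdf(x, k, sigma, mu):
    """GEV CDF with shape k, scale sigma, location mu (k = 0 is the Gumbel limit)."""
    if k == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + k * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (k > 0) or above the upper bound (k < 0).
        return 0.0 if k > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / k))


def q_ks(lam, terms=100):
    """Asymptotic tail probability 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lam^2)."""
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def ks_criterion(alpha, lo=0.2, hi=3.0):
    """Solve q_ks(lam) = alpha by bisection; q_ks decreases as lam increases."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if q_ks(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rejects(d, n, alpha):
    """True if the null hypothesis is rejected at level alpha (assumed small-sample factor)."""
    root_n = math.sqrt(n)
    return (root_n + 0.12 + 0.11 / root_n) * d > ks_criterion(alpha)
```

The bisection bracket `[0.2, 3.0]` covers significance levels between roughly 1e-7 and 1, which is wider than any level used in practice.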
Note that the EasyFit software contains a bug in its calculation of the
K-S statistic; therefore, a separate program was developed for that
purpose in C++ (Vetterling et al. 1999).
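The text does not describe the EasyFit bug or the internals of the replacement program, so the following is only a minimal sketch of how the K-S statistic can be computed independently. It is written in Python rather than C++, and `ks_statistic` is a hypothetical name. Because the ECDF is a step function, the supremum is attained at an observation, and the fitted CDF has to be compared with the ECDF value on both sides of each step.

```python
def ks_statistic(sample, cdf):
    """Largest distance between the ECDF of `sample` and the callable `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # At x the ECDF jumps from i/n (just below) to (i + 1)/n (at and above),
        # so both gaps are candidates for the supremum.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d
```

For example, `ks_statistic([0.1, 0.5, 0.9], lambda x: x)` compares three observations with the uniform distribution on [0, 1]. Checking only one side of each step would underestimate the statistic whenever the largest gap occurs on the other side.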