5.0 Filtering Out Data with Large Relative Error
Data cleaning has long played a critical role in ensuring data quality for
industrial applications. With the increasing prevalence of data-centric
approaches to business and scientific problems, in which data is treated
as a crucial asset, data cleaning has become even more important
(Abedjan et al. 2016).
As can be seen in the preceding Table 1 and Table 2, neither the LLS
regression method nor the 2D interpolative model achieved an accuracy
below 1% error; moreover, neither improved the standard deviation of the
relative errors. As reviewed in Section 2.2, an improvement in the mean
relative error can be attributed to a reduction of the systematic error
(or bias). The failure to reduce the standard deviation of the relative
error, however, is believed to stem from the inherent nature of random
errors.
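The distinction drawn above can be illustrated with a small sketch. The values below are hypothetical, not the paper's dataset: subtracting the estimated bias corrects the mean relative error, but the standard deviation (the random scatter) is left unchanged.

```python
import numpy as np

# Hypothetical measured vs. reference values (illustrative only;
# not the dataset used in Tables 1 and 2).
reference = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
measured = np.array([10.4, 20.9, 30.9, 41.5, 52.1])

# Relative error of each point, in percent.
rel_err = 100.0 * (measured - reference) / reference

# The mean relative error reflects systematic error (bias); the
# standard deviation reflects random scatter.
bias = rel_err.mean()
scatter = rel_err.std(ddof=1)

# Removing the estimated bias drives the mean toward zero, but a
# constant shift cannot change the standard deviation -- the
# behaviour described in the text.
corrected = rel_err - bias
print(abs(corrected.mean()))                  # ~0
print(scatter - corrected.std(ddof=1))        # ~0
```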
As depicted in Fig. 11, the standard deviation can increase drastically
in the presence of random errors.
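The filtering named in this section's title can be sketched as discarding the points whose relative error exceeds a cutoff. The error values and the 1% threshold below are assumptions for illustration only:

```python
import numpy as np

# Hypothetical relative errors (%) for a set of data points; the
# cutoff threshold is an assumption, not a value from the paper.
rel_err = np.array([0.3, -0.5, 0.8, 6.2, -0.4, 0.2, -7.1, 0.6])
threshold = 1.0  # keep only points with |relative error| < 1%

mask = np.abs(rel_err) < threshold
filtered = rel_err[mask]

# Filtering out the large-error points sharply reduces the scatter.
print(filtered.std(ddof=1) < rel_err.std(ddof=1))  # True
```

The boolean mask keeps the pairing between error values and the underlying data points, so the same mask can be reused to filter the raw measurements.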
Yazdanshenashad et al. (2018) held that such random errors cannot be
reduced by flow-meter recalibration; instead, they suggested reducing
them by averaging a large number of data points.
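The averaging strategy rests on the fact that the standard deviation of a mean of n independent readings shrinks roughly as 1/sqrt(n). A minimal simulation of that effect, with an assumed noise level and sample sizes not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated repeated readings with zero-mean random error.
# sigma, true_value, and the sample sizes are illustrative assumptions.
sigma = 2.0
true_value = 100.0

def scatter_of_mean(n, trials=2000):
    """Std. dev. of the mean of n noisy readings, over many trials."""
    readings = true_value + sigma * rng.standard_normal((trials, n))
    return readings.mean(axis=1).std(ddof=1)

s1 = scatter_of_mean(1)
s100 = scatter_of_mean(100)

# Averaging 100 readings shrinks the random scatter by roughly
# sqrt(100) = 10; the systematic error (bias) would be unaffected.
print(s1 / s100)  # roughly 10
```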