5.0 Filtering off Data with Large Relative Error
Data cleaning has long played a critical role in ensuring data quality for industrial applications. With the increasing prevalence of data-centric approaches to business and scientific problems, in which data is treated as a crucial asset, data cleaning has become even more important (Abedjan et al. 2016).
As can be seen in the preceding Table 1 and Table 2, neither the LLS regression method nor the 2D interpolative model achieved an accuracy below 1% error; moreover, neither improved the standard deviation of the relative errors. As reviewed in Section 2.2, an improvement in the mean relative error can be attributed to a reduction in systematic error (bias). The failure to reduce the standard deviation of the relative error, however, is believed to stem from the inherent nature of random errors.
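The distinction between the two statistics can be illustrated with a minimal sketch: the mean of the relative errors captures the systematic component (bias), while their standard deviation captures the random scatter. The arrays below are hypothetical values for illustration only, not data from this study.

```python
import numpy as np

# Hypothetical reference and measured values (illustrative only).
reference = np.array([100.0, 120.0, 140.0, 160.0, 180.0])
measured = np.array([101.5, 121.0, 142.1, 161.8, 182.4])

# Relative error of each measurement, in percent.
rel_err = (measured - reference) / reference * 100.0

# Mean relative error reflects systematic error (bias); the standard
# deviation reflects the spread caused by random errors.
bias = rel_err.mean()
spread = rel_err.std(ddof=1)

print(f"mean relative error (bias): {bias:.3f} %")
print(f"std of relative error (random): {spread:.3f} %")
```

A calibration correction can shift the whole error distribution and thus reduce the bias, but it leaves the spread of the distribution essentially unchanged.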
As depicted in Fig. 11, the standard deviation can increase drastically in the presence of random errors. Yazdanshenashad et al. (2018) argued that such random errors cannot be reduced by flow meter recalibration; instead, they suggested reducing them by averaging a large number of data points.
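The effect of averaging can be sketched with a small simulation: for independent random errors, the standard deviation of an N-sample average shrinks roughly as 1/sqrt(N). The true value and noise level below are assumptions chosen for illustration, not parameters from this study.

```python
import numpy as np

rng = np.random.default_rng(0)

true_value = 100.0   # assumed true reading (illustrative)
noise_sd = 2.0       # assumed random-error magnitude (illustrative)

# Simulate many independent experiments, each averaging n noisy
# readings, and compare the scatter of the averaged results.
for n in (1, 10, 100, 1000):
    readings = true_value + rng.normal(0.0, noise_sd, size=(5000, n))
    averaged = readings.mean(axis=1)
    print(f"n={n:4d}  std of averaged reading: {averaged.std(ddof=1):.3f}")
```

The printed scatter falls by roughly a factor of sqrt(10) at each step, consistent with the claim that averaging large numbers of readings suppresses random error even when recalibration cannot.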