4.5 Future Consideration
ERA5 reanalysis studies are often hindered by a common set of obstacles, such as complex terrain and a lack of in situ observations (Gleixner et al., 2020). The horizontal resolution of ERA5 itself, 0.25°, is considered too coarse for small-scale regional modelling and impact models, although its land-only counterpart, ERA5-Land, is often used instead to counteract this limitation (Gleixner et al., 2020). Nevertheless, ERA5 is widely agreed to be a substantial improvement upon its predecessor, ERA-Interim, for both precipitation and temperature (Gleixner et al., 2020). This is likely to prove essential when observational values are needed in conjunction with multiple climate variables, for example to model the natural variability of coupled systems (Trenberth et al., 2008). Whether the improvements in ERA5 prove significant will depend on the outcome of future studies, which often test such capabilities in regions whose climate is difficult to analyse. One example is East Africa (Gleixner et al., 2020), which combines complex terrain and frequently heavy cloud cover (Holmes et al., 2016) with sparse in situ measurements (Gleixner et al., 2020). The advantages of a reanalysis dataset therefore allow additional statistical experimentation, as in our study, in which sufficient data were available to incorporate random variation of surface temperature into our calculations. Because of this, we can state with more confidence that shifts in station location remain one of the most likely sources of error or bias in the data. Another method to consider is the one suggested by Almeida and Coelho (2023), which involves simulating different climatic conditions in a study area to eliminate further uncertainties.
In this study, such an approach might have helped to identify further potential sources of skew in the location correlation data.
5. Conclusion
Climate data are subject to errors of uncertain size, mainly owing to the limitations of radar detection technology, the influence of climate extremes, local terrain factors and systematic error in the ERA5 model (reference?). Error reduction is essential for reliable meteorological and climate data, since any systematic error can lead to a misinterpretation of climate change.
Therefore, to identify and correct existing errors and improve the accuracy of the climate data, we used the ERA5 reanalysis fields to verify the most likely location of each site in question and performed further logical location checks on all sites. This can significantly reduce the error caused by incorrect changes in site location and thereby provide more reliable and accurate climate data for climate change predictions and climate models.
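The location check described above can be illustrated with a minimal sketch. It assumes, since the exact procedure is not restated here, that a station's temperature series is compared with the reanalysis series of each candidate grid cell using the Pearson correlation, and that the best-correlated cell is taken as the most likely location. The function names, the toy series and the cell coordinates are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def most_likely_cell(station_series, grid_series):
    """Return the grid cell whose series correlates best with the station."""
    scores = {cell: pearson_r(station_series, series)
              for cell, series in grid_series.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def location_is_suspicious(recorded_cell, station_series, grid_series):
    """Flag a station whose recorded cell is not the best-correlated one."""
    best, _ = most_likely_cell(station_series, grid_series)
    return best != recorded_cell


# Toy 2 m temperatures in degrees Celsius; every value here is invented.
station = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
grid = {
    (49.75, 8.5): [10.5, 12.5, 15.5, 11.5, 9.5, 14.5],  # tracks the station
    (49.25, 8.0): [12.0, 11.0, 10.0, 13.0, 14.0, 9.0],  # moves against it
}

best_cell, r = most_likely_cell(station, grid)
```

In this toy case a station recorded at (49.25, 8.0) would be flagged, because its series follows the cell at (49.75, 8.5) instead. A real check would use the full hourly record and would need a tolerance for neighbouring cells with nearly equal correlations.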
We selected Frankfurt, Germany, as the sample area and divided it into 25 locations. The data are hourly values at a height of 2 m, collected from September 2013 to September 2014 between latitudes 49° and 50° N and longitudes 8° and 9° E. The terrain in this area is flat, which reduces the error caused by terrain factors to some extent.
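One way to obtain 25 locations in this area is sketched below. It assumes that the locations coincide with grid points spaced 0.25° apart (the ERA5 resolution), which gives a 5 × 5 grid over the one-degree box; the text does not state how the locations were actually chosen, and the function name is hypothetical.

```python
def grid_points(lat_min=49.0, lat_max=50.0, lon_min=8.0, lon_max=9.0, step=0.25):
    """List (lat, lon) grid points covering the box, edges included."""
    n_lat = round((lat_max - lat_min) / step) + 1
    n_lon = round((lon_max - lon_min) / step) + 1
    return [(lat_min + i * step, lon_min + j * step)
            for i in range(n_lat)
            for j in range(n_lon)]


# Assumed layout: 5 latitudes x 5 longitudes = 25 locations.
locations = grid_points()
```

Under this assumption the corner points are (49.0, 8.0) and (50.0, 9.0), and each location maps directly onto one reanalysis grid point.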
We used the Shapiro-Wilk test in RStudio to test the null hypothesis that the data are normally distributed. The p-value is a statistical measure used to judge whether an observed difference or effect is significant. When p < 0.05, the data are found to be non-normal: the null hypothesis is rejected and the alternative hypothesis (that the data are not normally distributed) is accepted, which was the case at each location. On the basis of the data analysis and figures for the Frankfurt sample area, we conclude with 95% confidence that the location of each meteorological station in the sample is accurate. Similarly, when the code and detection method are applied to other areas, p > 0.05 means that the observed result is not statistically significant and the null hypothesis cannot be rejected, in which case we treat the site location as suspicious.
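The decision rule at the 5% level can be illustrated with a short sketch. Note that this is not the Shapiro-Wilk test used in the study: it swaps in the Jarque-Bera test, chosen here only because its asymptotic p-value has a closed form and needs no external library. The asymptotic approximation is reasonable only for large samples, and the function names and example data are hypothetical.

```python
import math
from statistics import NormalDist


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a normality check."""
    n = len(sample)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    if m2 == 0:
        raise ValueError("sample has zero variance")
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under the null, JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def reject_normality(sample, alpha=0.05):
    """True when the null hypothesis of normality is rejected at level alpha."""
    _, p_value = jarque_bera(sample)
    return p_value < alpha


# Invented example data: evenly spaced normal quantiles and a skewed series.
n = 1000
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [math.exp(i / 100) for i in range(500)]
```

With these inputs the skewed series is rejected as non-normal and the quantile series is not. In R, the Shapiro-Wilk test itself is available as `shapiro.test`.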
In future studies, the method presented in this paper can be applied to other data sets and regions to perform logical location checks on all sites, thereby reducing errors due to site location and improving the accuracy of climate data. The method can also be used for other data types, such as satellite data or gridded data from models or reanalyses. Its advantage is that it can be applied to a variety of data sets regardless of their statistical distribution. In addition, it is straightforward to apply and does not require complex statistical transformations. However, this simplicity also brings shortcomings in some statistical details. In relatively complex terrain or under extreme weather conditions, the correlation-based detection cannot determine whether a suspicious location is associated with these events. The method is also best suited to regions of a limited size, and the detection period is long. Future work will therefore need more advanced statistical methods for dependence, such as multivariate statistics, together with more accurate climate models, to improve the accuracy and efficiency of detection.