Author(s): Karen Schulz; Andre Niemann; Thorsten Mietzel
Keywords: Data correction; Imbalanced regression; Low-cost; Machine learning; Precipitation; Rain gauge
Abstract: Classical rain gauge networks are too sparsely distributed to accurately capture the high spatial and temporal variability of precipitation. Inexpensive sensors that can be deployed at small scales (low-cost sensors) allow more detailed recording, at the expense of poorer data quality. We therefore propose a framework to correct precipitation data recorded by rain gauges in real time. Precipitation data are highly imbalanced and right-skewed, yet extreme events are no less important. We present analyses of common machine learning (ML) models with adjustments for imbalanced data and compare them to baseline and statistical models. We found that, in this type of setup, error correction for classical sensors is possible only in a rudimentary way, by correcting missing values. For the low-cost sensor data, the correction reduced errors by 32 % to 55 % compared with the baseline model results. We propose a hybrid approach that accounts for the imbalanced precipitation data by using either preprocessing or model training methods in combination with postprocessing of the model results.
DOI: https://doi.org/10.3850/iahr-hic2483430201-380
Year: 2024