Author(s): V. Gomez-Escalonilla; P. Martinez-Santos; S. Diaz-Alcaide; E. Montero Gonzalez; M. Martin-Loeches
Linked Author(s):
Keywords: Groundwater monitoring; Nitrate; Machine learning; Observation wells; Supervised classification; Spain
Abstract: Machine learning approaches are increasingly being explored as tools to support water management. Applications range from the prediction of groundwater levels to the improvement of classical numerical models. We present an approach to underpin the design of groundwater quality monitoring networks based on the application of multiple supervised classification algorithms. The method is illustrated through its application to a series of aquifer systems in central Spain. Classifiers were trained on a sample of borehole data to identify which spatially distributed variables explain the presence of selected contaminants in groundwater. Explanatory variables included slope, thickness of the unsaturated zone, lithology, and land use, among others. The best-performing algorithms, as judged by AUC, test score, precision, and recall, were then used to identify unmonitored locations where contamination is likely to occur. The approach thus identified the spatially distributed factors that appear to explain the presence of contamination in groundwater. It also allowed us to identify unmonitored areas of potential concern where new observation points would be needed. The method provides an alternative to expert-based criteria for siting new groundwater monitoring stations and can be readily transferred to other settings. Tree-based classifiers such as random forest and extra trees proved the most accurate predictors of groundwater contamination, yielding predictive scores and AUC values in excess of 0.8.
DOI: https://doi.org/10.3850/iahr-hic2483430201-3
Year: 2024
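
The abstract describes comparing several classifiers by AUC, test score, precision, and recall, then using the best one to flag unmonitored locations. The sketch below illustrates only that comparison-and-screening step in plain Python. It is not the authors' implementation: the classifier names, labels, scores, grid-cell identifiers, and the 0.5 and 0.6 thresholds are all hypothetical placeholders, and the numbers are invented toy values rather than results from the study.

```python
from typing import Dict, List, Sequence, Tuple


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC: chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) after thresholding the scores."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(y_true, pred) if y == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return correct / len(pred), precision, recall


def flag_candidates(probabilities: Dict[str, float], threshold: float) -> List[str]:
    """Cells whose predicted contamination probability meets the threshold, highest first."""
    hits = [(p, cell) for cell, p in probabilities.items() if p >= threshold]
    return [cell for p, cell in sorted(hits, reverse=True)]


# Hypothetical held-out labels (1 = contaminant present) and classifier scores.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
test_scores = {
    "classifier_a": [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3],
    "classifier_b": [0.6, 0.5, 0.4, 0.7, 0.3, 0.6, 0.5, 0.2],
}

# Rank the classifiers by AUC and keep the best one.
auc_by_model = {name: auc(y_test, s) for name, s in test_scores.items()}
best_model = max(auc_by_model, key=auc_by_model.get)
best_accuracy, best_precision, best_recall = threshold_metrics(
    y_test, test_scores[best_model]
)

# Hypothetical probabilities predicted by the best classifier for unmonitored cells.
unmonitored = {"cell_01": 0.82, "cell_02": 0.35, "cell_03": 0.64, "cell_04": 0.91}
candidate_sites = flag_candidates(unmonitored, threshold=0.6)

print(best_model, auc_by_model[best_model], best_accuracy, best_precision, best_recall)
print(candidate_sites)
```

In a real workflow the scores would come from trained classifiers applied to gridded explanatory variables such as slope, unsaturated zone thickness, lithology, and land use, and the screening threshold would need to be chosen for the study area rather than fixed in advance.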