Author(s): Si Yoon Kwon, Il Won Seo, Hyoseob Noh
Linked Author(s):
Keywords: Source identification; Solute transport; Transient storage; River system; Machine learning
Abstract: In this study, data-driven machine learning (ML) models were developed to identify pollution source characteristics in a river network system. To identify the unknown source characteristics, three supervised ML models were evaluated for multiclass classification and regression: Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). Multiclass classification was used to identify the location of the pollution source along the river, and regression was used to estimate the mass flux of the spilled pollutant. Because data on pollutant accidents in natural rivers are scarce, a river network model incorporating a transient storage zone model (TSM) was developed to generate pollution spill scenarios. The breakthrough curve (BTC) parameters obtained from these pre-simulated scenarios, together with the corresponding geographic information, served as the training data sets for the proposed models. As a spill location predictor, the RF classifier was more practical than the DNN with two hidden layers and more accurate than the SVM with a radial basis kernel. The spill mass flux was then estimated from the predicted spill location by a regression DNN, and the predictions showed small errors.
DOI: https://doi.org/10.3850/38WC092019-0439
Year: 2019
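
Illustrative sketch (not the authors' implementation): the two-stage workflow described in the abstract (simulate spill scenarios, extract BTC parameters, classify the spill location, then regress the spill magnitude) can be outlined with the Python standard library alone. Everything specific here is an assumption made for illustration: an analytic 1D advection-dispersion pulse stands in for the TSM river network model, a 1-nearest-neighbour rule stands in for the RF/SVM/DNN classifiers, a least-squares line through the origin stands in for the regression DNN, and an instantaneous spill mass is used in place of a mass flux. The constants `U`, `D`, `A`, `LOCATIONS` and all function names are hypothetical.

```python
import math
import random

# Assumed channel properties (hypothetical values, not from the paper).
U, D, A = 0.5, 10.0, 20.0              # velocity (m/s), dispersion (m^2/s), area (m^2)
LOCATIONS = [2000.0, 5000.0, 8000.0]   # candidate spill distances upstream of the sensor (m)
TIMES = [60.0 * k for k in range(1, 601)]  # one sample per minute for 10 hours (s)


def btc(distance, mass):
    """Breakthrough curve at the sensor for an instantaneous spill (stand-in for the TSM)."""
    return [
        mass / (A * math.sqrt(4 * math.pi * D * t))
        * math.exp(-((distance - U * t) ** 2) / (4 * D * t))
        for t in TIMES
    ]


def features(curve):
    """BTC parameters used as model inputs: (time to peak, peak concentration)."""
    peak = max(curve)
    return TIMES[curve.index(peak)], peak


def simulate_training_set(n_per_location, rng):
    """Pre-simulate spill scenarios; each row is (t_peak, c_peak, location label, mass)."""
    rows = []
    for label, distance in enumerate(LOCATIONS):
        for _ in range(n_per_location):
            mass = rng.uniform(100.0, 1000.0)  # spill mass in kg
            t_peak, c_peak = features(btc(distance, mass))
            rows.append((t_peak, c_peak, label, mass))
    return rows


def predict_location(train, t_peak):
    """Stage 1: classify the spill location (1-nearest-neighbour on time to peak)."""
    return min(train, key=lambda row: abs(row[0] - t_peak))[2]


def fit_mass_slope(train, label):
    """Stage 2: least-squares slope through the origin, mass ~ slope * peak concentration."""
    rows = [row for row in train if row[2] == label]
    return sum(r[1] * r[3] for r in rows) / sum(r[1] ** 2 for r in rows)


def identify_source(train, curve):
    """Run both stages on an observed curve; returns (location label, estimated mass)."""
    t_peak, c_peak = features(curve)
    label = predict_location(train, t_peak)
    return label, fit_mass_slope(train, label) * c_peak


rng = random.Random(0)
train = simulate_training_set(20, rng)
label, mass = identify_source(train, btc(5000.0, 400.0))
```

The regression is fitted per predicted location, mirroring the abstract's order of operations: the magnitude estimate is conditioned on the classifier's output, so a misclassified location would also bias the mass estimate.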