Balancing Between Generalization and Over-Fitting: ANN-based Modeling for Flow Forecasting

IAHR Document Library

Proceedings of the 2nd Symposium on Flood Defence (Beijing, ...


Author(s): X. H. Dong; C. B. Vreugdenhil

Keywords: Flood forecasting; Artificial neural networks; Generalization; Over-fitting

Abstract: When Artificial Neural Network (ANN) based models are applied to flood forecasting, a recurring problem is how to balance generalization against over-fitting. The model should be general enough to capture the information inherent in the data and so reveal the input-output relationship. To achieve this, the model structure should be sufficiently (but not excessively) complex, the training algorithm should be selected carefully, and training should be stopped at the right moment. This paper presents methods for building ANN-based models that avoid over-fitting while preserving generalization. A Multi-layer Feed-forward (MLFF) network and the Levenberg-Marquardt (LM) training algorithm were used because they have been reported to be, respectively, a preferred ANN architecture for non-linear prediction and among the quickest and most robust training methods. The data selection and preparation methods, which are essential for training, are described first, followed by the selection of the model structure, where cross-validation was used to determine an appropriate number of hidden neurons. A modified performance function was then introduced into the LM training algorithm; it keeps the model parameters small during training in order to obtain a smoother output. Even so, the network cannot be guaranteed to generalize well unless training is stopped at the right iteration. Finally, an early stopping method based on monitoring a validation subset of the data was introduced to determine the most appropriate training iteration. Network performance improved step by step as these techniques were applied, and quite satisfactory forecasting results that generalize well are presented at the end.
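
The three techniques named in the abstract (choosing the number of hidden neurons on held-out data, a performance function that penalizes large weights, and early stopping on a validation subset) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the penalty form `gamma * MSE + (1 - gamma) * mean squared weight` is one common choice, plain gradient descent stands in for the Levenberg-Marquardt algorithm, the data are synthetic, and all names (`train`, `performance`, `gamma`, `patience`) are hypothetical.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    # One hidden tanh layer, one linear output (a small MLFF network).
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])
    return H, (H @ p["W2"] + p["b2"]).ravel()

def mse(p, X, y):
    return float(np.mean((forward(p, X)[1] - y) ** 2))

def performance(p, X, y, gamma):
    # Assumed modified performance function:
    # gamma * MSE + (1 - gamma) * mean of squared parameters.
    w = np.concatenate([v.ravel() for v in p.values()])
    return gamma * mse(p, X, y) + (1.0 - gamma) * float(np.mean(w ** 2))

def gradients(p, X, y, gamma):
    # Analytic gradient of performance() with respect to every parameter.
    H, y_hat = forward(p, X)
    n_w = sum(v.size for v in p.values())
    d_out = (2.0 * gamma / len(y)) * (y_hat - y)[:, None]
    d_hid = (d_out @ p["W2"].T) * (1.0 - H ** 2)
    g = {
        "W1": X.T @ d_hid,
        "b1": d_hid.sum(axis=0),
        "W2": H.T @ d_out,
        "b2": d_out.sum(axis=0),
    }
    return {k: g[k] + (1.0 - gamma) * 2.0 * p[k] / n_w for k in g}

def train(X_tr, y_tr, X_val, y_val, n_hidden=5, gamma=0.9,
          lr=0.05, max_epochs=2000, patience=50, seed=0):
    # Gradient descent (a stand-in for LM) with early stopping:
    # keep the parameters from the epoch with the lowest validation MSE.
    rng = np.random.default_rng(seed)
    p = init_params(X_tr.shape[1], n_hidden, rng)
    best_val, best_p, best_epoch, wait = np.inf, None, 0, 0
    for epoch in range(max_epochs):
        g = gradients(p, X_tr, y_tr, gamma)
        p = {k: p[k] - lr * g[k] for k in p}
        val = mse(p, X_val, y_val)
        if val < best_val:
            best_val, best_epoch, wait = val, epoch, 0
            best_p = {k: v.copy() for k, v in p.items()}
        else:
            wait += 1
            if wait >= patience:
                break  # validation error stopped improving
    return best_p, best_epoch, best_val

# Synthetic data (not flow records): a smooth signal plus noise.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-1.0, 1.0, (200, 2))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] + data_rng.normal(0.0, 0.1, 200)
X_tr, y_tr, X_val, y_val = X[:140], y[:140], X[140:], y[140:]

# Pick the hidden-layer size with the lowest held-out error.
scores = {h: train(X_tr, y_tr, X_val, y_val, n_hidden=h)[2] for h in (2, 5, 10)}
best_h = min(scores, key=scores.get)
params, stop_epoch, val_mse = train(X_tr, y_tr, X_val, y_val, n_hidden=best_h)
print(best_h, stop_epoch, val_mse)
```

For brevity the sketch reuses a single hold-out subset both to stop training and to choose the hidden-layer size; a more careful setup would use separate subsets or k-fold cross-validation for the structure search.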

Year: 2002