In the second phase of the study, we used two data sets provided by the Irrigation Department of Sri Lanka.
Data set one - Kalu river water level with a sampling interval of one hour.
Data set two - Kalu river water level along with six sub rivers (Denawaka, Galathure, Kukule, Kuru, Niriella and Wey) with a sampling interval of one minute.
For the first data set, the predictions were done using regression analysis and time series analysis.
For the second data set, the predictions were done using regression analysis and neural networks.
Whichever method is used (regression analysis, time series analysis or neural networks), the error in the predicted water level relative to the actual water level in the river cannot be eliminated. Therefore, when forecasting floods based on these results, the error should be taken into account as well. This is vital when the water level is close to the flood level.
When issuing warnings with the error margin in mind, priority should be given to preventing false negative flood warnings, since human lives and property are at risk. However, an increase in false positive warnings may decrease trust in the warning system. Different levels of warning should therefore be proposed, considering both the confidence in the predicted water level and its error.
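One way to picture such tiered warnings is the minimal sketch below. It is an illustration only, not a scheme used in this study: the level names, the function `warning_level`, and the multiplier `k` applied to the error margin are all hypothetical. The idea is to raise a lower-level alert whenever the flood level falls inside the error band above the prediction, which favours avoiding false negatives.

```python
from enum import Enum


class WarningLevel(Enum):
    # Hypothetical level names, chosen for illustration only.
    NONE = 0
    WATCH = 1
    WARNING = 2


def warning_level(predicted_m: float, error_m: float,
                  flood_level_m: float, k: float = 2.0) -> WarningLevel:
    """Classify a forecast by comparing it, plus an error band, with the flood level.

    predicted_m   -- predicted water level (metres)
    error_m       -- typical prediction error, e.g. the model's RMSE (metres)
    flood_level_m -- level at which flooding occurs (metres)
    k             -- assumed width of the safety band, in multiples of error_m
    """
    if error_m < 0 or k < 0:
        raise ValueError("error_m and k must be non-negative")
    if predicted_m >= flood_level_m:
        # The prediction itself reaches the flood level.
        return WarningLevel.WARNING
    if predicted_m + k * error_m >= flood_level_m:
        # Flooding is within the error band: warn early rather than miss it.
        return WarningLevel.WATCH
    return WarningLevel.NONE
```

A larger `k` reduces the chance of a missed flood at the cost of more false positives, so in practice it would have to be tuned against the trade-off described above.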
The RMSE of the multivariable regression model on both data sets shows that, as expected, the error increases with the number of hours of prediction. In the time series analysis, different models gave different error values; of those, the ARIMA(2,1,2) model gave the best results. The neural network is very accurate for days that are close to the training data, but when a data set is distant from the training set, the prediction contains large errors. This is because the neural network does not generalize, owing to the lack of data in the training set.
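For reference, the RMSE used in such a comparison can be computed per forecast horizon as sketched below. The sketch does not reproduce the regression, ARIMA or neural network models of this study; as a stand-in it uses a naive persistence forecast (the last observed level is assumed to hold), and the function names are hypothetical. It only illustrates how the error is measured for each horizon.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_rmse_by_horizon(levels, horizons):
    """RMSE of a naive persistence forecast for each horizon.

    levels   -- observed water levels at a fixed sampling interval
    horizons -- forecast horizons, in number of samples ahead
    The forecast for time t + h is simply the level observed at time t.
    """
    result = {}
    for h in horizons:
        if h < 1 or h >= len(levels):
            raise ValueError("each horizon must be between 1 and len(levels) - 1")
        actual = levels[h:]      # what was observed h steps later
        predicted = levels[:-h]  # the value assumed to persist
        result[h] = rmse(actual, predicted)
    return result
```

Replacing the persistence forecast with the output of a fitted model gives the same per-horizon error table for that model, which makes the candidates directly comparable.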
As a future development, a web interface is to be created using the most accurate models, so that the predicted water level curve can be seen ahead of the current water level values. This is planned for a later stage, using the multivariable data set, which includes the readings of the sub rivers as well. Because this system was implemented recently (January 2021), there is not yet enough data to generalize across conditions to a greater extent. Especially for weather-related problems, as a rule of thumb, the data set should contain at least one year of data so that it covers all the seasonality.