Abstract:
Water flow forecasting is needed to mitigate water-related disasters in Sudan. It is essential in hydrology because it allows accurate evaluation of the water budget, floods, and erosion, and it supports local river navigation. This research forecasts the daily flow of the Blue Nile using real data sets of river flow and weather parameters from meteorological stations in Sudan. Five models for forecasting water flow in the Blue Nile are built using Artificial Neural Network, Support Vector Machine, and Markov Chain techniques. An ensemble model for forecasting water flow periodically is built and compared with the results of each model separately. Single models usually give predictions that do not consider all phenomena or events, whereas ensemble modeling generally gives better accuracy than single classifiers. In this research, real data were collected from the meteorological stations (Soba and Eldeim) and the Ministry of Irrigation for the years 2003 to 2015. The data include daily values of river flow, level, discharge, relative humidity, sunshine duration (SSD), rainfall, maximum temperature, minimum temperature, pressure, wind speed, and wind direction from ground measurements. These data were used to build an ensemble model predicting the flow of the Blue Nile using three different algorithms. These algorithms (Artificial Neural Network, Support Vector Regression, and Markov Chain) were trained and applied separately to predict flow. The results were compared, showing that the Markov Chain gives the best accuracy for predicting river flow in the Blue Nile. Although tested on the Blue Nile, the models should apply to other rivers, provided the parameters are derived from the statistics for those rivers.
Two ensemble techniques, voting and bagging, were implemented. The results showed that ensemble models with bagging and voting improved prediction accuracy, and the analysis indicated that bagging gives better accuracy than voting. The software used in this research is the R language and RapidMiner. It is concluded that it is difficult to determine in advance the best algorithm for a specific application; the only way to solve this problem is to try many algorithms and find which one is better. This research draws attention to the importance of selecting the right data or algorithm before using a particular modeling technique. Performance is compared using the correlation coefficient and accuracy.
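
To make the distinction between the two ensemble techniques concrete, the following is a minimal sketch in Python rather than the R and RapidMiner setup used in this research. It assumes synthetic rainfall and flow values generated inside the script (not the Soba and Eldeim data), and the helper names `fit_line`, `fit_mean`, `voting`, `bagging`, and `pearson` are hypothetical. Voting is shown as averaging the predictions of differently built models, bagging as averaging copies of one model fitted on bootstrap resamples, and the correlation coefficient is used as the performance measure.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Guard against a resample in which every x is identical.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def fit_mean(xs, ys):
    """Baseline model that always predicts the mean of the training target."""
    m = statistics.fmean(ys)
    return lambda x: m


def voting(models):
    """Voting for regression: average the predictions of the given models."""
    return lambda x: statistics.fmean(m(x) for m in models)


def bagging(fit, xs, ys, n_models=25, seed=0):
    """Bagging: fit one model type on bootstrap resamples, then average."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))
    return voting(models)


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5


# Synthetic stand-in data (an assumption, not measurements):
# "rainfall" values 0..29 and a noisy linear "flow" response.
_rng = random.Random(42)
rain = [float(i) for i in range(30)]
flow = [5.0 + 2.0 * r + _rng.gauss(0.0, 3.0) for r in rain]

voted = voting([fit_line(rain, flow), fit_mean(rain, flow)])
bagged = bagging(fit_line, rain, flow)

print("voting  r =", pearson([voted(r) for r in rain], flow))
print("bagging r =", pearson([bagged(r) for r in rain], flow))
```

The sketch scores the models on the data they were fitted to, which keeps it short but overstates accuracy; a real comparison would score them on held-out days.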