A rainfall-runoff model using artificial neural networks for the district of Bankura in a time of climate change

Objectives: The main objective of this research is to develop a hydrological (rainfall-runoff) model for the study area using Artificial Neural Networks (ANNs). Methods: The ANN model was applied to assess the relative impact of different climatic variables, namely rainfall, temperature, cloud cover, potential evapotranspiration, and relative humidity, for the district of Bankura, located on the Lower Gangetic Plain (Zone no-III) in India. The study also developed runoff hydrographs for various slopes, rainfall intensities, and roughness values over the catchment. A real-time data series of 116 years (1901-2016) was collected for the six meteorological stations of the district from the India Meteorological Department, Pune. Runoff values were estimated with the Kothyari and Garde equation, in which the most important factor, the Vegetal Cover Factor (Fv), was considered. To develop the ANN model, the available data were split into 70% for training, 15% for testing, and 15% for validation. Findings: The values predicted by the ANN model are more useful for water resources management than those of previous studies. The model efficiency (Nash-Sutcliffe Efficiency) was greater than 97%. Novelty: This research establishes, for the first time, a hydrological (rainfall-runoff) model using ANNs for the study area.


Introduction
Rainfall-runoff modeling plays a major role in hydrology and water resources. Artificial Neural Networks (ANNs) are data-driven, black-box models that serve as a useful tool in hydrology (1,2), and they have become one of the vital tools for modeling complex hydrological processes. ANNs establish the relationship among variables, particularly between input and output values (3)(4)(5).
In hydrology, research aims to develop models that predict future basin discharge, which provides a safe and economical tool for hydrological engineering design (6)(7)(8). Applications of ANNs in water resources management include the estimation of rainfall-runoff events, river discharge forecasting, climate change studies, river inflow modeling, and groundwater estimation (9)(10)(11). In this study, the ANNs were developed to model observed daily runoff as a function of daily rainfall, temperature, cloud cover, potential evapotranspiration, and relative humidity. The model also accurately predicts the basin response to rainfall. Physiographic catchment characteristics such as ground slope, vegetal cover factor (land use), and roughness are also considered in the model.
A review of previous research shows that several researchers have investigated different aspects of this area over the last century but did not arrive at clear signals. This study was therefore undertaken for the district of Bankura.

Site location
The study area is located on the Lower Gangetic Plain (Zone no-III, as per the Planning Commission of India) between 22°38'00" and 23°38'00" N latitude and 86°36'00" and 87°46'00" E longitude. The area of the district of Bankura is 688,200 hectares (6,882 km²). The district has varying groundwater potential and water table contours (shown in Figure 1). The river system of the district (shown in Figure 2) is divided into the following sub-basins (as per the Annual Flood Report, April 2017, Irrigation and Waterways Directorate, Government of West Bengal): i) Damodar sub-basin: The catchment area of this basin in West Bengal is 4325 km². Sali is one of the tributaries of the Damodar River located in the district of Bankura. ii) Darakeswar sub-basin: The catchment area of this basin is 4292 km². The tributaries of the Darakeswar River are Arkasha, Kansachor, Gangheswari, Berai, Khukra, and Shankari. iii) Shilabati sub-basin: The catchment area of this basin is 4088 km². The tributaries of the Shilabati River are Jaiponda, Puratan, Champyan, and Ketia. iv) Kangsabati sub-basin: The catchment area of this basin in West Bengal is 6324 km². The tributaries of the Kangsabati River are Jam, Jhinuk, and Kumari. The Damodar River originates in the Palamau hill range in Jharkhand State. The Darakeswar, Shilabati, and Kangsabati rivers originate in the uplands of the Chhota Nagpur Plateau in the district of Purulia, West Bengal.

Meteorological data
For this research, the data were collected from the India Meteorological Department, Pune, for six meteorological stations located in the district of Bankura: i) Bankura, ii) Bankura (Central Water Commission), iii) Joypur, iv) Kangsabati dam, v) Ranibandh, and vi) Indus. These data were used to predict runoff with the ANN model. The five input variables are rainfall, average temperature, cloud cover, potential evapotranspiration, and relative humidity. In the present investigation, three rainfall intensities were used: 30 mm/hr, 60 mm/hr, and 90 mm/hr.

Physiographical data
The physiographical data, viz. land use, slope, and roughness, were collected from the Chief Soil Survey Officer, Soil and Land Use Survey of India, Pusa, New Delhi, and the Natural Resources Data Centre, Bankura, W.B., as shown in Figure 3.

Land Use details
Of the district's land, 59.54% is under cultivation, 12.36% is cultivable wasteland, 8.48% is barren land, and 19.61% is covered with forest. The land utilization pattern of the district reveals that the Saltora, Mejia, Gangajalghati, Bankura, Bishnupur, and Patrasayer blocks all have more than 57% of their land under cultivation. The Joypur block has the highest cultivation (66%) whereas India has 83.01% highest cultivation. All these blocks are located in the central and eastern parts of the district. The western and south-western parts of the district have less cultivated area. Chhatna, Ranibandh, and Raipur have more than 30% barren land and another 30% covered by forest. The forest cover of the following blocks is: Barjora 23%, Ranibandh 32.87%, Taldangra 31.00%, and Bishnupur 35%. All land use data were collected from the Soil and Land Use Survey of India, Pusa, New Delhi.

Slope details
The land of the district consists of nine slope categories. The slopes of the district according to their area are as per Table 1. In the present investigation, three different overland slopes of 1%, 2%, and 3% were used.
In the Kothyari and Garde equation used to estimate runoff, $R_m$ is the annual mean runoff in cm, $P_m$ is the average annual rainfall in cm, $T_m$ is the average annual temperature in °C, and $F_V$ is the vegetal cover factor.
In the expression for the vegetal cover factor, $a_1$, $a_2$, $a_3$, and $a_4$ are the weighting factors, $F_F$ is the percentage area of forest, $F_G$ is the percentage area of grass and scrub land, $F_A$ is the percentage area of arable land, and $F_W$ is the percentage area of waste land only.

Artificial Neural Network (ANN) modelling
In hydrological research, the ANN is a recognized tool for constructing a relationship between multiple input variables and a specific output (12)(13)(14). This technique is well suited to forecasting and runoff analysis (15,16). In this investigation, the ANN consists of an input layer with five nodes, a single hidden layer, and an output layer. More hidden layers may be added for complex situations. First, the ANN is trained with a series of observed input data, viz. rainfall, average temperature, cloud cover, potential evapotranspiration, and relative humidity, denoted $X_1$, $X_2$, $X_3$, $X_4$, and $X_5$ respectively, and the output data, runoff, denoted $Y_i$. In the training process, the coefficients (denoted $W_k$ and $W_{kj}$) are obtained. Training searches for the optimum nonlinear relationship between the input variables and the output. Each hidden unit receives every input variable and applies the transfer function to a linear function of the inputs $X_j$, as shown in equation (iii). The final output of the network is obtained as a linear function of the hidden units plus a constant, as shown in equation (iv). In these equations, $X_1, X_2, \dots, X_m$ are the input variables; $\varphi$ denotes the hyperbolic transfer function; $W_{k1}, W_{k2}, \dots, W_{km}$ and $W_1, W_2, \dots, W_k$ are the coefficients (weights) of the network; $u_1, u_2, \dots, u_k$ denote the hidden units; $b_{kj}$ and $b_k$ are the constants of the linear functions; and $Y_i$ is the output signal. Before training, the data series are normalized to within ±1 as per equation (v).
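The network structure described above can be sketched as a forward pass. This is a minimal illustration rather than the authors' implementation: it assumes the hyperbolic transfer function is tanh, and the function name `forward`, the choice of two hidden units, and all weight and input values are made up for the example.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer network.

    x        : list of m normalized inputs (X_1..X_m)
    w_hidden : list of k rows, each holding m weights (W_k1..W_km)
    b_hidden : list of k constants, one per hidden unit
    w_out    : list of k output weights (W_1..W_k)
    b_out    : constant added to the output
    """
    # Hidden unit u_k: transfer function (assumed tanh) applied to a
    # linear function of all inputs.
    u = [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output Y: linear function of the hidden units plus a constant.
    return sum(w * uk for w, uk in zip(w_out, u)) + b_out

# Illustrative (made-up) values: five inputs, two hidden units.
x = [0.2, -0.5, 0.1, 0.7, -0.3]
w_hidden = [[0.4, -0.1, 0.3, 0.2, -0.6],
            [-0.2, 0.5, 0.1, -0.3, 0.4]]
b_hidden = [0.05, -0.02]
w_out = [0.8, -0.5]
b_out = 0.1
y = forward(x, w_hidden, b_hidden, w_out, b_out)
print(f"predicted (normalized) runoff: {y:.4f}")
```

Because tanh is bounded by ±1, the output of this sketch always lies within the sum of the absolute output weights of `b_out`, which is one reason the inputs and targets are normalized before training.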
In equation (v), $P_o$, $P_{nor}$, $P_{min}$, and $P_{max}$ denote the observed, normalized, minimum observed, and maximum observed data, respectively. During ANN modeling, the complete data series was randomly divided into training, testing, and validation sets in a 70% / 15% / 15% split. The network was first trained with the training data series, after which the testing data series was used to calibrate the behavior of the trained models. After calibration and testing, the validation data series was used to validate and finalize the ANN model. During training, the Gradient Descent (GD) algorithm was applied to reduce the network error through a function minimization routine and thereby improve the network output.
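The data preparation described above can be sketched as follows. Two assumptions are made: the normalization is taken to be the common min-max mapping onto [-1, 1], and the series is treated as 116 yearly records. The function names `normalize` and `split_indices` and the fixed random seed are hypothetical.

```python
import random

def normalize(series):
    """Min-max scale a series to [-1, 1] (assumed form of the normalization):
    P_nor = 2 * (P_o - P_min) / (P_max - P_min) - 1
    """
    p_min, p_max = min(series), max(series)
    if p_max == p_min:
        raise ValueError("a constant series cannot be scaled")
    return [2.0 * (p - p_min) / (p_max - p_min) - 1.0 for p in series]

def split_indices(n, train=0.70, test=0.15, seed=0):
    """Randomly split record indices into training, testing, and validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    n_train = round(train * n)
    n_test = round(test * n)
    # Whatever remains after training and testing goes to validation.
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]

# 116 records (1901-2016), assuming one record per year.
train_idx, test_idx, val_idx = split_indices(116)
print(len(train_idx), len(test_idx), len(val_idx))
```

With 116 records the rounding gives 81 training, 17 testing, and 18 validation records, which is the closest whole-number match to a 70% / 15% / 15% split.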
Overall error: $E_D = \sum (t_j - y_i)^2$ (vi)
where $t_j$ is the desired output and $y_i$ is the calculated output.

Nash-Sutcliffe Efficiency (NSE)
NSE ranges from $-\infty$ in the worst case to +1 for a perfect correlation. According to Shamseldin (1997), an NSE of 0.9 and above is very satisfactory, 0.8 to 0.9 represents a fairly good model, and a value below 0.8 is unsatisfactory (6).
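A minimal sketch of the NSE computation, using the standard Nash-Sutcliffe definition (one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean). The function names `nse` and `rate` and the flow values are illustrative assumptions. The rating bands follow the Shamseldin (1997) thresholds quoted above, with values below 0.8 treated as unsatisfactory.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def rate(value):
    """Qualitative rating using the thresholds quoted in the text."""
    if value >= 0.9:
        return "very satisfactory"
    if value >= 0.8:
        return "fairly good"
    return "unsatisfactory"

# Made-up flows for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]
score = nse(observed, predicted)
print(f"NSE = {score:.3f} ({rate(score)})")
```

A model that always predicts the mean of the observations scores exactly 0, so any positive NSE means the model does better than that baseline.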

Root mean squared error (RMSE)
In the RMSE expression, $Q_k$ is the observed flow, $\hat{Q}_k$ is the predicted flow, and $K$ is the total number of years considered (validation data). RMSE measures the accuracy of the model and should be as small as possible. RMSE and MAE values that are nearly the same indicate a satisfactory model (C. W. Dawson and R. L. Wilby, 2015).
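For reference, the standard definition of RMSE, written with the symbols of this section (the hat notation $\hat{Q}_k$ for the predicted flow is assumed here), is:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left( Q_k - \hat{Q}_k \right)^{2}}
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.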

Mean absolute error (MAE)
In the MAE expression, $Q_k$ is the observed flow, $\hat{Q}_k$ is the predicted flow, and $K$ is the total number of years considered (validation data). MAE measures the accuracy of the model and should be as small as possible.
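A minimal sketch comparing the two error measures on a validation series, using the standard definitions of MAE and RMSE. The function names and the flow values are illustrative assumptions, not data from this study.

```python
import math

def mae(observed, predicted):
    """Mean absolute error: average of |Q_k - Q_hat_k| over K records."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

# Made-up flows for illustration only.
observed = [10.0, 12.0, 9.0, 14.0]
predicted = [11.0, 11.0, 9.0, 16.0]
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

RMSE is never smaller than MAE, and the gap widens when a few errors are much larger than the rest, so similar values suggest the errors are fairly uniform in size.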

Error in runoff computation
The error in runoff for the present investigation was estimated from $Y_o$, the observed runoff, and $Y_K$, the predicted runoff. The error in runoff computation measures the accuracy of the model and should be as low as possible.

Correlation coefficient for linear regression
In the expression for the correlation coefficient $r$, $X_i$ is the observed flow, $\bar{X}$ is the mean of the observed flow, $Y_i$ is the predicted flow, and $\bar{Y}$ is the mean of the predicted flow. The value of $r$ indicates the association between the two series and ranges from -1 (perfectly negative correlation), through 0 (no correlation), to +1 (perfectly positive correlation).
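For reference, the standard (Pearson) form of the linear correlation coefficient, written with the symbols of this section and assuming this is the definition intended, is:

```latex
r = \frac{\sum_{i} \left( X_i - \bar{X} \right)\left( Y_i - \bar{Y} \right)}
         {\sqrt{\sum_{i} \left( X_i - \bar{X} \right)^{2} \, \sum_{i} \left( Y_i - \bar{Y} \right)^{2}}}
```

Note that $r$ measures only linear association: a prediction with a constant bias can still give $r$ close to +1, which is why it is reported alongside NSE, RMSE, and MAE.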

Result and Discussion
The ANN (rainfall-runoff) model is a machine learning procedure for flow (discharge) evaluation, carried out here using Microsoft Excel. The real-time data series (1901-2016) for the district of Bankura was collected from the India Meteorological Department (IMD), Pune. The data come from six IMD stations distributed evenly across the district, viz. i) Bankura, ii) Bankura (Central Water Commission), iii) Joypur, iv) Kangsabati dam, v) Ranibandh, and vi) Indus. The data were taken at various slopes and various intensities. Runoff was predicted from five major climatic variables: rainfall, average temperature, cloud cover, potential evapotranspiration, and relative humidity. The database used for ANN model development is given in Table 2. For the ANN model, the observed data were divided into 70% for training, 15% for testing, and 15% for validation. The behavior of the ANN during training, testing, and validation is shown in Figure 9. A correlation coefficient "r" above 0.97 indicates that the observed and predicted flows are strongly positively correlated. The training process of the network was stopped when the error of the data series was minimal. The objective of the training process is to reduce the error as measured by the model performance criteria (NSE, RMSE, and MAE). The validation set is used to obtain the best result of the model network. The values of "r" for training (0.9961), testing (0.9783), and validation (0.9989) show good agreement between observed and predicted flows, as shown in Figure 9. The data pairs lie very close to the diagonal line, which represents excellent prediction. The hydrograph obtained with the ANN model, shown in Figure 10, reveals that the runoff data series is efficient for the rising limb, equilibrium discharge, and recession limb.
For the events of models "C" and "D", the Nash-Sutcliffe Efficiency (NSE) is greater than 95% (shown in Table 3), which indicates excellent results. The predicted results are useful for decision-making in water resources management and planning (4). The modeling can also help rural and urban planners to take the necessary measures.