Recap
This research continues work carried out during the 7th semester. In that semester, the water level of the Kalu Ganga was predicted using a data set of Kalu Ganga water levels and rainfall values from 7 rainfall gauges. (Later, we contacted the Irrigation Department and, in collaboration with Eng. Aruna, were able to obtain different data sets; more on that later in the portfolio.)
We carried out further analysis on the same data set used in the previous semester. For reference, the diagram below shows a sample of that data set. Rainfall values are 24-hour totals in mm, and water level values are in meters.
Cross-correlating the Kalu Ganga water level with the rainfall data showed that lag-1 rainfall (the previous day's rainfall) is the most strongly correlated with the current water level.
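The lag analysis can be illustrated with a minimal sketch. It assumes daily series of equal length and uses the Pearson correlation between rain on day t - lag and water level on day t. The function names `lagged_correlation` and `best_lag` and the synthetic data are hypothetical; they are not taken from the original analysis.

```python
import numpy as np

def lagged_correlation(rain, level, lag):
    """Pearson correlation between rain[t - lag] and level[t]."""
    rain = np.asarray(rain, dtype=float)
    level = np.asarray(level, dtype=float)
    if lag > 0:
        # Shift the series so that rain from `lag` days earlier lines up with level.
        rain, level = rain[:-lag], level[lag:]
    return float(np.corrcoef(rain, level)[0, 1])

def best_lag(rain, level, max_lag=5):
    """Return the lag (in days) with the highest correlation, plus all scores."""
    scores = {lag: lagged_correlation(rain, level, lag) for lag in range(max_lag + 1)}
    return max(scores, key=scores.get), scores

# Synthetic illustration only: the level responds to rain one day later.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=2.0, scale=10.0, size=200)   # made-up 24-hour rainfall, mm
level = np.empty_like(rain)
level[0] = 1.0
level[1:] = 1.0 + 0.05 * rain[:-1]                  # made-up water level, m
level += rng.normal(0.0, 0.1, size=200)             # measurement noise

lag, scores = best_lag(rain, level)
print(lag, scores)
```

With the real data set, the same function would be run once per rain gauge, and the lag with the highest score for each gauge would be compared.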
In essence, we tried to predict tomorrow's water level from today's rain gauge readings. We started with a linear regression model. The model weights are as follows. Here S1(k) to S6(k) are the rainfall sensors and W(k) represents the water level of the Kalu Ganga.
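A minimal sketch of such a model, assuming an ordinary least-squares fit of tomorrow's water level on today's gauge readings plus an intercept. Whether the original model included an intercept or other predictors is not stated, so that part is an assumption, as are the function names and the synthetic data (including the made-up weights).

```python
import numpy as np

def fit_next_day_model(rain, level):
    """Least-squares fit of level[k + 1] on rain[k, :] plus an intercept."""
    rain = np.asarray(rain, dtype=float)
    level = np.asarray(level, dtype=float)
    X = np.column_stack([np.ones(len(rain) - 1), rain[:-1]])  # today's gauges
    y = level[1:]                                             # tomorrow's level
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the gauge weights

def predict_next_day(coef, rain_today):
    """Predict tomorrow's level from one day's gauge readings."""
    return float(coef[0] + np.dot(coef[1:], rain_today))

# Synthetic illustration only: one column per gauge, noise-free response.
rng = np.random.default_rng(0)
true_weights = np.array([0.05, 0.02, 0.0, 0.03, 0.0, 0.04, 0.01])  # made up
rain = rng.gamma(2.0, 10.0, size=(150, 7))
level = np.empty(150)
level[0] = 1.0
level[1:] = 1.0 + rain[:-1] @ true_weights

coef = fit_next_day_model(rain, level)
print(np.round(coef, 3))
print(predict_next_day(coef, rain[-1]))
```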
At first, the model was built using all the data, with the values of all 7 rainfall gauges as independent variables. The following optimizations were considered:
What is the minimum number of data points required for a successful model with an acceptable error margin?
What is the effect of removing less correlated rainfall sensors on the performance of the water level prediction?
A training set of 110 days of data was selected as the optimal number of data points. The weights of the model built from those 110 days of data are shown below.
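One way to explore the training-size question is to fit the model on the first n days for several values of n and compare the error on a fixed set of later days. The sketch below assumes that setup. It is not necessarily how the 110-day figure was obtained, and the function names, candidate sizes and synthetic data are all hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def rmse_by_training_size(X, y, sizes, n_test):
    """Fit on the first n rows for each n in sizes; score on the last n_test rows."""
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    if max(sizes) > len(X) - n_test:
        raise ValueError("training rows would overlap the test rows")
    X_test, y_test = X[-n_test:], y[-n_test:]
    results = {}
    for n in sizes:
        coef, *_ = np.linalg.lstsq(X[:n], y[:n], rcond=None)
        results[n] = rmse(y_test, X_test @ coef)
    return results

# Synthetic illustration only: rows of X are already aligned with the
# next-day level in y (the lag of one day has been applied beforehand).
rng = np.random.default_rng(0)
weights = np.array([0.05, 0.02, 0.0, 0.03, 0.0, 0.04, 0.01])  # made up
X = rng.gamma(2.0, 10.0, size=(300, 7))
y = 1.0 + X @ weights + rng.normal(0.0, 0.5, size=300)

results = rmse_by_training_size(X, y, sizes=(10, 30, 110, 200), n_test=100)
for n, err in results.items():
    print(n, round(err, 4))
```

The smallest n after which the held-out error stops improving noticeably is a reasonable candidate for the minimum number of data points.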
Some of the findings from removing sensors are as follows:
Performance with all the sensors
Root mean square error is 26.9271
Performance without sensor three
Root mean square error is 25.1046
Performance without sensor three and five
Root mean square error is 25.0947
Performance without sensor three, five and one
Root mean square error is 26.9169
Performance without sensor three, five and two
Root mean square error is 25.2016
This study implies that even if we don't get any measurements from sensors three and five for any reason, the result is not affected much: the root mean square error is 25.0947 without them, compared with 26.9271 with all the sensors.
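The sensor-removal comparison can be sketched as follows, assuming the model is refitted without the dropped gauges and scored on days not used for fitting. The train/test split behind the figures above is not stated, so the 110-day split here is an assumption, and the function name `holdout_rmse` and the synthetic data are hypothetical. In the synthetic data, sensors three and five are given zero weight on purpose, so removing them should cost little.

```python
import numpy as np

def holdout_rmse(X, y, drop_sensors=(), n_train=110):
    """Fit on the first n_train rows without the dropped sensors; RMSE on the rest.

    Sensors are numbered from 1, matching the text.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = [j for j in range(X.shape[1]) if (j + 1) not in drop_sensors]
    A = np.column_stack([np.ones(len(X)), X[:, keep]])
    coef, *_ = np.linalg.lstsq(A[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - A[n_train:] @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

# Synthetic illustration only: sensors 3 and 5 carry no signal here.
rng = np.random.default_rng(0)
weights = np.array([0.05, 0.02, 0.0, 0.03, 0.0, 0.04, 0.01])  # made up
X = rng.gamma(2.0, 10.0, size=(300, 7))
y = 1.0 + X @ weights + rng.normal(0.0, 0.1, size=300)

for drop in [(), (3,), (3, 5), (3, 5, 1), (3, 5, 2)]:
    print(drop, round(holdout_rmse(X, y, drop_sensors=drop), 4))
```

A design note: if the error were measured on the same days used for fitting, removing a predictor from a least-squares model could never lower it, so a drop in error after removing a sensor suggests the score was computed on separate data.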