In this activity, we were asked to model a dataset with a continuous random variable and verify the fit. For this task, I have selected the all-India rainfall data from 1871 to 2016.
The above spreadsheet has three pages. On the first one, I have shared the full dataset, i.e., year-wise monthly, seasonal and annual rainfall data.
On the second page, I have done the calculations needed to fit a probability distribution. In this case, I have chosen the Normal distribution. To understand the calculation, please see the document below, where I have explained it in detail.
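For readers who prefer code to a spreadsheet, here is a minimal sketch of how such a fit could be done. It assumes the Normal parameters are estimated by the sample mean and the sample standard deviation; the spreadsheet may use the population formula instead. The function name `fit_normal` and the five sample values are placeholders for illustration, not the actual rainfall figures.

```python
from statistics import NormalDist, mean, stdev


def fit_normal(values):
    """Estimate a Normal distribution from a sample.

    Assumption: uses the sample mean and the sample (n - 1) standard deviation.
    """
    return NormalDist(mu=mean(values), sigma=stdev(values))


# Placeholder values for illustration only, NOT the real rainfall data.
sample = [9800.0, 10400.0, 10900.0, 11300.0, 12100.0]

fitted = fit_normal(sample)
print(f"mean = {fitted.mean:.1f}, standard deviation = {fitted.stdev:.1f}")
```

With the real annual totals in place of `sample`, the same two numbers would be the parameters of the fitted distribution.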
Conclusion and Alternative model:
On the last page of the spreadsheet, I have evaluated whether the estimated probabilities match the calculated ones. The two sets agree closely.
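A comparison of this kind can be sketched as follows: for each interval, count the share of years that fall inside it and set it beside the probability the Normal model assigns to the same interval. This is only an illustration under assumptions. The helper `compare_bins`, the bin edges, and the data are all hypothetical; in particular, the data here is randomly generated from the model itself as a stand-in for the 146 annual totals, so it is not the real rainfall series.

```python
from statistics import NormalDist


def compare_bins(values, dist, edges):
    """Return (low, high, observed, expected) for each bin [low, high)."""
    n = len(values)
    rows = []
    for low, high in zip(edges, edges[1:]):
        observed = sum(low <= v < high for v in values) / n  # share of data in bin
        expected = dist.cdf(high) - dist.cdf(low)            # model probability
        rows.append((low, high, observed, expected))
    return rows


# Model parameters as quoted in the conclusion of this post.
model = NormalDist(mu=10859, sigma=1010.2)

# Synthetic stand-in for the 146 annual totals; NOT the real rainfall data.
synthetic = model.samples(146, seed=1)

# Hypothetical bin edges, one standard deviation apart around the mean.
edges = [model.mean + k * model.stdev for k in range(-3, 4)]

for low, high, observed, expected in compare_bins(synthetic, model, edges):
    print(f"{low:9.1f} to {high:9.1f}: observed {observed:.3f}, expected {expected:.3f}")
```

If the observed and expected columns stay close across all bins, the model describes the data well.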
I have also checked whether the dataset fits another continuous probability distribution. For this, I have chosen the Exponential distribution.
The comparison shows that while the dataset fits the Normal distribution well, it does not fit the Exponential distribution at all.
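One way to see why the Exponential model fails is to compare the probability each model gives to the interval within one standard deviation of the mean. The sketch below assumes the Exponential distribution is fitted by matching its mean to the data mean (rate = 1 / mean); the spreadsheet may have fitted it differently. The helper `exponential_cdf` is a name I made up for this example.

```python
import math
from statistics import NormalDist

MEAN, SD = 10859, 1010.2  # values quoted in this post


def exponential_cdf(x, mean):
    """CDF of an Exponential distribution with the given mean (rate = 1 / mean)."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0


low, high = MEAN - SD, MEAN + SD

normal = NormalDist(MEAN, SD)
p_normal = normal.cdf(high) - normal.cdf(low)
p_exponential = exponential_cdf(high, MEAN) - exponential_cdf(low, MEAN)

print(f"Normal:      P({low:.1f} <= X <= {high:.1f}) = {p_normal:.3f}")
print(f"Exponential: P({low:.1f} <= X <= {high:.1f}) = {p_exponential:.3f}")
```

An Exponential distribution always has a standard deviation equal to its mean, whereas here the standard deviation is only about 9% of the mean. So the Exponential model spreads its probability far too widely to describe values that cluster this tightly around the mean.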
So, for this time span, it is reasonable to model the annual rainfall as following a Normal distribution with mean 10859 and standard deviation 1010.2 (in the units of the dataset).