The best-performing model was tested on a scenario in which the alert level was triggered. This event occurred two weeks after the last day of the data initially used to train and test the model. The results showed that, although the model captured the trends during the training period, it was unable to generalize to this event.
We therefore reduced the model complexity to address this overfitting. We found that a model with 4 hidden layers and 8 neurons provided the best result in forecasting this event.
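A search over simpler architectures of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used: the helper names `rmse` and `sweep_widths`, the `fit_predict` callable, and the reading of "8 neurons" as 8 neurons in each of the 4 hidden layers are all assumptions made for the example.

```python
import math
from typing import Callable, Dict, Iterable, Sequence, Tuple


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sweep_widths(
    widths: Iterable[int],
    fit_predict: Callable[[Tuple[int, ...]], Sequence[float]],
    event_actual: Sequence[float],
    n_hidden_layers: int = 4,
) -> Tuple[int, Dict[int, float]]:
    """Score each candidate width on the event window; return the best width.

    fit_predict is a hypothetical, user-supplied function: it receives the
    hidden-layer sizes, e.g. (8, 8, 8, 8), trains a network on the original
    training period, and returns its forecast for the event window.
    Assumption: every hidden layer has the same number of neurons.
    """
    scores: Dict[int, float] = {}
    for width in widths:
        layers = (width,) * n_hidden_layers
        scores[width] = rmse(event_actual, fit_predict(layers))
    best = min(scores, key=scores.get)  # width with the lowest RMSE
    return best, scores
```

With a real training routine supplied as `fit_predict`, a call such as `sweep_widths([2, 4, 8, 16, 32], fit_predict, event_actual)` (the candidate widths here are illustrative) returns the width with the lowest RMSE together with the per-width scores, which are the values an RMSE-versus-neurons plot would display.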
The following diagrams show how the RMSE varies with the number of neurons in the network, for neural networks built with 4 hidden layers.
The following diagram shows the result obtained for the event using the simpler model.