Because the data are imbalanced, the class labels with lower counts were oversampled and used to train a separate model.
Oversampling resulted in fewer correct predictions for class label 5, but the percentage of correct predictions for class 4 improved significantly.
Confusion Matrix with Oversampled Data
Confusion Matrix without Oversampled Data
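The oversampling step described above can be sketched in plain Python. This is only an illustration under assumptions: the source does not say which library or resampling method was used, so random oversampling with replacement is assumed here, and the function names (random_oversample, per_class_recall) and the toy data are hypothetical. Class labels 4 and 5 are borrowed from the text; the counts are made up.

```python
import random
from collections import Counter, defaultdict


def random_oversample(features, labels, seed=0):
    """Hypothetical helper: resample each minority class with replacement
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing rows at random from the existing rows of this class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def per_class_recall(y_true, y_pred):
    """Fraction of each true class predicted correctly, i.e. the
    row-normalised diagonal of a confusion matrix."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / total[c] for c in total}


# Toy data (made up): class 5 dominates, class 4 is rare.
features = [[i] for i in range(10)]
labels = [5] * 8 + [4] * 2

bal_x, bal_y = random_oversample(features, labels)
print(Counter(bal_y))
```

Comparing per_class_recall for a model trained on the original data against one trained on the balanced data gives the kind of per-class percentage comparison that the two confusion matrices show. Oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation set.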