1 Layer CNN vs VGG-16 CNN
As you can see above, VGG-16 reached an accuracy of 94.050% while the 1-layer CNN reached only 72.80%. We will now discuss why these accuracies differed so greatly.
We used 25 epochs for VGG-16 and only 20 for the 1-layer CNN. An epoch is one full pass of the training data through the model. More passes generally let the model fit the data better, which tends to raise accuracy (up to the point where the model begins to overfit).
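As a rough illustration, here is a minimal Keras-style sketch (the data and model below are placeholders, not our actual code) showing that the epoch count is simply the `epochs` argument passed to training:

```python
import tensorflow as tf

# Minimal sketch with placeholder data and a placeholder model: an epoch is
# one full pass over the training set, controlled by `epochs=` in fit().
x_train = tf.random.uniform((100, 64, 64, 3))                   # dummy images
y_train = tf.cast(tf.random.uniform((100,)) > 0.5, tf.float32)  # dummy cat/dog labels

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(x_train, y_train, epochs=20)  # 20 passes, as for the 1-layer CNN
```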
While both models followed the VGG architecture, the VGG-16 model was pre-trained on the ImageNet dataset and the 1-layer CNN had no previous training. ImageNet is far larger than our dataset and contains 1,000 different classes, including cats and dogs.
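For reference, this is roughly how such pre-trained weights are loaded in Keras (a sketch; the input shape is illustrative and not necessarily what we used):

```python
from tensorflow.keras.applications import VGG16

# Sketch: load VGG-16 with its ImageNet-pre-trained convolutional weights.
# `include_top=False` drops the original 1,000-class head so that a binary
# cat-vs-dog classifier can be attached instead.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pre-trained weights fixed while a new head trains
```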
VGG-16 has 15 more layers than the 1-layer CNN. Adding layers increases the number of weights in the model, so VGG-16 had far more trainable weights that could be fine-tuned to better distinguish dogs from cats.
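To make the scale concrete, a hedged sketch comparing weight counts (the small architecture below is illustrative, not our exact 1-layer model):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Sketch: depth drives the number of trainable weights.
one_layer = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])
print(one_layer.count_params())            # on the order of a thousand weights
print(VGG16(weights=None).count_params())  # roughly 138 million weights
```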
Within a convolutional layer, the neurons of each feature map share the same weights and biases, so each filter becomes very good at detecting one feature. VGG-16's 15 extra layers therefore let it detect many more (and more abstract) "features" that distinguish cats from dogs than the 1-layer CNN could.
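A quick sketch of this weight sharing (the shapes are illustrative):

```python
from tensorflow.keras import layers, models

# Sketch of weight sharing: a single 3x3 filter over an RGB image has
# 3*3*3 weights + 1 bias = 28 parameters, regardless of image size,
# because every spatial position reuses the same filter.
m = models.Sequential([layers.Conv2D(1, 3, input_shape=(224, 224, 3))])
print(m.count_params())  # 28
```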
CNN vs Decision Tree & KNN
As you can see, both CNNs achieved significantly better results than the KNN and Decision Tree models, both with and without edge extraction. We will now explain why this is the case.
CNNs learn to extract features from the raw images, while the KNN/Decision Tree models work directly on raw pixel data. This lets CNNs outperform those models: KNN in particular must store every training image and compare a test image directly against them. If a KNN/Decision Tree model has never seen a certain angle of a dog in its training set, it is more likely to mispredict that dog, whereas a CNN is more likely to classify it correctly because the features it extracts can still match dogs best.
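A minimal sketch of that raw-pixel setup, assuming a scikit-learn KNN and dummy data (flattened grayscale images are an assumption, not necessarily our exact preprocessing):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Sketch with dummy data: KNN "trains" by storing every raw pixel vector,
# then classifies a test image by its distance to the stored images. An
# unfamiliar pose that is far (in pixel space) from all stored dogs is
# therefore easy to mislabel.
X_train = np.random.rand(200, 64 * 64)   # 200 flattened grayscale images (placeholder)
y_train = np.random.randint(0, 2, 200)   # 0 = cat, 1 = dog (placeholder labels)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)                       # fit == memorize the data
print(knn.predict(np.random.rand(1, 64 * 64)))  # distance vote over stored images
```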
CNNs can have an arbitrary number of layers, and therefore an arbitrary number of trainable weights. This lets CNNs build more complex models when needed, but it also requires much more training data and training time than KNN/Decision Tree models.
CNNs keep improving as they train, updating their weights on every pass through the data, while KNN/Decision Tree models do not. KNN is a lazy, instance-based learner with no weights to update, and a Decision Tree is built greedily by divide-and-conquer over non-overlapping subproblems, with no backtracking once a split is chosen. Both are non-adaptive compared to a CNN: once a KNN/Decision Tree model makes a "decision" it cannot change it, while a CNN can revise its weights on the next pass.
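To illustrate the contrast, a hedged training-loop sketch (dummy model and data, not our actual setup):

```python
import tensorflow as tf

# Sketch with dummy data: every epoch revises every trainable weight via
# gradient descent. A fitted Decision Tree, by contrast, never revisits
# its splits once they are chosen.
x = tf.random.uniform((32, 64, 64, 3))
y = tf.cast(tf.random.uniform((32, 1)) > 0.5, tf.float32)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.BinaryCrossentropy()

for epoch in range(5):  # each pass can change any "decision" made earlier
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```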
KNN with edge extraction did not perform as well as the CNNs (which also extract features) because our edge detection method used a pre-determined, medium kernel size, so we were unable to detect the edges of cats/dogs in certain instances (examples are shown on our "Feature Extraction" page).
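A sketch of that fixed-kernel pipeline, assuming OpenCV (the 5x5 size and Canny thresholds are illustrative, not necessarily the exact values we used):

```python
import cv2
import numpy as np

# Sketch: a fixed, medium smoothing kernel before edge detection. Edges at a
# different scale than the kernel can be blurred away or missed entirely.
img = (np.random.rand(128, 128) * 255).astype(np.uint8)  # placeholder image
blurred = cv2.GaussianBlur(img, (5, 5), 0)               # pre-determined kernel size
edges = cv2.Canny(blurred, 100, 200)                     # fixed thresholds
```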
Decision Tree & KNN With vs Without Edges
The KNN with edge detection performed 6% better than the KNN without edge detection, while the Decision Tree model showed no improvement with edge detection. Altogether, this suggests that KNN's distance-based comparison of feature vectors is much better at grouping similar edge features than the Decision Tree's greedy splits are (please visit the KNN & Decision Trees page for further explanation).
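For completeness, a sketch of feeding the same edge features to both models (dummy data; the hyperparameters are placeholders, not our tuned values):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Sketch with dummy edge features: both models receive identical edge maps,
# yet only KNN's distance-based comparison improved in our experiments.
X = np.random.rand(200, 64 * 64)   # flattened edge maps (placeholder)
y = np.random.randint(0, 2, 200)   # 0 = cat, 1 = dog (placeholder labels)

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)    # distance over edge vectors
tree = DecisionTreeClassifier(max_depth=10).fit(X, y)  # greedy per-pixel splits
```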