Classifying Dogs vs Cats - Gray Pixel Feature Extraction

Grayscale Pixel Values as Features

The first type of feature extraction we explored used grayscale pixel values as features. We used the Python cv2 (OpenCV) library to convert each image to grayscale and then used PCA and LDA to determine whether this feature shows a significant distinction between cats and dogs.
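
As a rough illustration of this step, here is a minimal sketch that turns one color image into a grayscale feature vector. To stay dependency-free it uses NumPy with the standard luma weights (0.299, 0.587, 0.114), which is the weighting that `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)` applies. The function names, the 64 x 64 size, and the random stand-in image are assumptions for illustration and are not taken from our notebook.

```python
import numpy as np

def bgr_to_gray(img_bgr):
    """Convert an H x W x 3 BGR uint8 image to an H x W grayscale uint8 image."""
    weights = np.array([0.114, 0.587, 0.299])  # B, G, R order, as cv2.imread returns
    gray = img_bgr.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def gray_pixel_features(img_bgr):
    """Flatten the grayscale image into a 1-D vector with one feature per pixel."""
    return bgr_to_gray(img_bgr).reshape(-1).astype(np.float64)

# Random stand-in for an image loaded with cv2.imread (assumed size: 64 x 64).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

features = gray_pixel_features(fake_image)
print(features.shape)  # (4096,)
```

Each image contributes one such vector, so every pixel position becomes one feature.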

Principal Component Analysis (PCA)

PCA reduces the dimensionality of the dataset by projecting it onto the directions (principal components) along which the data varies the most. Directions with low variance are discarded and only the highest-variance directions are kept. High variance does not by itself guarantee usefulness for classification, because PCA never looks at the labels. To run PCA, we first vectorize each image and stack the vectors into a single matrix, with the cat images in the first half of the rows and the dog images in the second half. We then perform the PCA fit transform, which projects the dataset onto a new basis, to see whether there are significant differences between the classes.
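
The stacking and projection steps described above can be sketched as follows. This is a NumPy-only illustration of PCA via the singular value decomposition, not our actual notebook code; `pca_fit_transform`, the image size, and the random stand-in data are assumptions. With scikit-learn installed, `sklearn.decomposition.PCA(n_components=2).fit_transform(X)` should give the same projection up to the sign of each component.

```python
import numpy as np

def pca_fit_transform(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Random stand-ins for vectorized grayscale images
# (assumed: 50 images per class, 32 x 32 pixels each).
rng = np.random.default_rng(0)
cats = rng.integers(0, 256, size=(50, 32 * 32)).astype(np.float64)
dogs = rng.integers(0, 256, size=(50, 32 * 32)).astype(np.float64)

X = np.vstack([cats, dogs])                 # first half cats, second half dogs
labels = np.array([0] * 50 + [1] * 50)      # only for coloring a plot; PCA ignores it

Z = pca_fit_transform(X, n_components=2)
print(Z.shape)  # (100, 2)
```

Plotting `Z[labels == 0]` and `Z[labels == 1]` in two colors gives the kind of scatter plot discussed in this section.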

Here we performed a 2-component PCA on cat and dog data sets of 5000 images each, using grayscale pixel values as features. In the graph, the dog data set lies directly on top of the cat data set. This suggests that the gray pixel features are not significant in distinguishing the classes.

Here is a link to our code:

PCA-GrayPixel

That being said, a shortcoming of PCA is that it is unsupervised: it doesn't consider the label of a data point when reducing the dimensionality to find maximum variation. Therefore, it can project the data onto a new basis that doesn't distinguish by label (for example, cat or dog). We therefore also need to perform LDA to check whether these results are valid. LDA is supervised and focuses on finding a subspace that maximizes the separation between the different classes relative to the spread within each class. This is illustrated in the graph on the left.

Source for Image: Chandan Durgia & Prasun Biswas, Towards Data Science

Linear Discriminant Analysis (LDA)

LDA reduces the dimensionality of the dataset using the class labels. Rather than keeping the directions of greatest overall variance, it keeps the directions along which the class means are farthest apart relative to the variation within each class, since these contribute the most towards classification decisions. As with PCA, we first vectorize each image and stack the vectors into a single matrix, with the cat images in the first half of the rows and the dog images in the second half. We then perform the LDA fit transform, which projects the dataset onto a new basis, to see whether there are significant differences between the classes.
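
For two classes, this projection reduces to Fisher's linear discriminant, which can be sketched in NumPy as below. This is an illustration under stated assumptions rather than our notebook code: the function name, the small ridge term `reg` (added so the within-class scatter stays invertible when pixels outnumber images), and the synthetic, deliberately separable data are all hypothetical. With scikit-learn, `sklearn.discriminant_analysis.LinearDiscriminantAnalysis(n_components=1)` plays the same role.

```python
import numpy as np

def lda_fit_transform_two_class(X, y, reg=1e-3):
    """Project the rows of X onto the single Fisher discriminant for labels 0/1."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, summed over both classes.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Ridge term (assumption) keeps Sw invertible for high-dimensional pixel data.
    Sw += reg * np.trace(Sw) / Sw.shape[0] * np.eye(Sw.shape[0])
    # Discriminant direction: Sw^{-1} (mu1 - mu0), normalized to unit length.
    w = np.linalg.solve(Sw, mu1 - mu0)
    w /= np.linalg.norm(w)
    return (X - X.mean(axis=0)) @ w

# Synthetic stand-ins for vectorized images, shifted apart on purpose
# (assumed: 40 images per class, 16 pixels each).
rng = np.random.default_rng(0)
n_per_class, n_pixels = 40, 16
cats = rng.normal(100.0, 20.0, size=(n_per_class, n_pixels))
dogs = rng.normal(110.0, 20.0, size=(n_per_class, n_pixels))

X = np.vstack([cats, dogs])                          # first half cats, second half dogs
y = np.array([0] * n_per_class + [1] * n_per_class)  # labels are used by LDA

z = lda_fit_transform_two_class(X, y)
print(z.shape)  # (80,)
```

The output is one number per image, so the two classes can only be compared along a line, for example as two overlaid histograms of `z[y == 0]` and `z[y == 1]`.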

Below are graphs produced by performing LDA on data sets of different sizes.

Here is a link to our code:

LDA-GrayPixel

As seen in the figures above, as the data set size increases, using grayscale pixels as features becomes less of a distinguishing factor between cats and dogs. There are several reasons why this might occur.

Since we only have 2 classes, LDA can produce at most 1 component (one fewer than the number of classes). As a result, we can only plot the classes on a line, which can make it harder to see the differences between very large data sets.
As the data sets become large, big discrepancies between classes can become less obvious. For example, in smaller data sets, not all images have a background with a lot of noise (other objects, multiple animals, humans, grass, furniture, etc.). Therefore, LDA can treat those differing pixels as outliers and still find differences between cats and dogs. However, as the data set grows, noisy backgrounds become a large part of the data analyzed and are no longer treated as outliers. In other words, both the cat and dog data sets contain roughly the same amount of background noise, so when each pixel is considered individually, it is too difficult to differentiate between the types of animal. Thus, for larger data sets, the grayscale pixel behaves similarly to the RGB pixel and cannot be considered a significant factor in distinguishing between cats and dogs.
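
The one-component limit in the first reason follows from the between-class scatter matrix that LDA uses. A short derivation, with notation that is our own choice here ($n_c$ images in class $c$, class mean $\mu_c$, overall mean $\mu$, $n$ images in total):

```latex
S_B = \sum_{c=1}^{C} n_c \,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
\sum_{c=1}^{C} n_c \,(\mu_c - \mu) = 0
\;\Longrightarrow\;
\operatorname{rank}(S_B) \le C - 1 .

% Two classes (cats = 0, dogs = 1), with n = n_0 + n_1:
S_B = \frac{n_0 n_1}{n}\,(\mu_1 - \mu_0)(\mu_1 - \mu_0)^{\top},
\qquad
\operatorname{rank}(S_B) = 1 \quad \text{when } \mu_0 \neq \mu_1 .
```

With a rank-1 between-class scatter there is only one direction that separates the class means, which is why the LDA plots are one-dimensional.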

Altogether, both of these reasons play some role in why grayscale pixels become less distinguishing between cats and dogs as the data set grows. Perhaps, if we were able to perform object extraction and remove the background completely, we could see differences in grayscale pixels between the data sets. However, in that case, we might as well use the object itself as a feature to save memory, compute, and time. Object extraction is beyond the scope of this project but can be considered in future work.

Images from dataset: https://www.kaggle.com/c/dogs-vs-cats/data

Above are some images from our dataset with noisy backgrounds. It is important to note that most people do not take "headshots" of their cats and dogs against solid backgrounds. Therefore, it is difficult to find a big enough data set with just those types of images. Furthermore, that type of dataset wouldn't be realistic.

Note: The maximum number of images we could pass into PCA was 5000 each for the cat and dog data sets before running into RAM issues on Google Colab. The maximum number of images we could pass into LDA was 1200 each for the cat and dog data sets.
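
To give a feel for where the memory goes, the sketch below estimates the size of the stacked data matrix alone. The 128 x 128 image size and the float64 storage are assumptions chosen for illustration, since the actual resized dimensions are not stated here, and `data_matrix_gib` is a hypothetical helper.

```python
def data_matrix_gib(n_images, height, width, bytes_per_value=8):
    """Size in GiB of an (n_images x height*width) matrix of float64 values."""
    return n_images * height * width * bytes_per_value / 2**30

# Assumed image size: 128 x 128 pixels after resizing.
pca_matrix = data_matrix_gib(10_000, 128, 128)  # 5000 cats + 5000 dogs
lda_matrix = data_matrix_gib(2_400, 128, 128)   # 1200 cats + 1200 dogs

print(round(pca_matrix, 2))  # 1.22
print(round(lda_matrix, 2))  # 0.29
```

Fitting routines often need additional working copies and intermediate matrices on top of the data matrix, so peak usage can be several times these figures.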
