For our model to be accurate, a large amount of data is needed. We got the data from a website called Food-101. Our dataset contains around 12,000 labeled images from 101 categories. The dataset was separated into a training set and a test set.
Training Set: A set of examples used to train the model.
Test Set: A set of examples used to evaluate how well the model does with data outside the training set.
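A split like this can be sketched in a few lines of plain Python. This is only an illustration: the 80/20 ratio, the fixed seed, and the toy sample names are assumptions, not the values used in this project.

```python
import random

def train_test_split(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle paired samples/labels and split them into train and test parts."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    # A seeded shuffle makes the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

# Toy stand-ins for images and labels (hypothetical data, not Food-101).
samples = [f"image_{i}" for i in range(10)]
labels = [i % 3 for i in range(10)]
(train_x, train_y), (test_x, test_y) = train_test_split(samples, labels)
```

Shuffling indices rather than the data itself keeps every sample paired with its own label, and no sample ends up in both sets.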
We first extracted the images and labels from a CSV file and organized them into arrays called "im_data" and "im_label". Then, we used a for loop to iterate through both arrays at the same time. Since the images and labels are aligned, we were then able to display each image with its label on a 5 by 5 grid.
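The loop over the two aligned arrays might look roughly like the sketch below. The array contents are hypothetical placeholders, and the grid is a plain nested list; in a plotting library each cell would instead be an image with its label as the title.

```python
# Hypothetical stand-ins: the real arrays hold image pixels and food labels.
im_data = [f"pixels_{i}" for i in range(25)]
im_label = [f"label_{i}" for i in range(25)]

GRID_SIZE = 5
grid = [[None] * GRID_SIZE for _ in range(GRID_SIZE)]

# zip() walks both arrays in lockstep, so each image stays with its label.
for index, (image, label) in enumerate(zip(im_data, im_label)):
    if index >= GRID_SIZE * GRID_SIZE:
        break  # the grid only has room for 25 entries
    # divmod turns a flat index into a (row, column) position on the grid.
    row, col = divmod(index, GRID_SIZE)
    grid[row][col] = (image, label)
```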
The machine learning library we used features various classification, regression, and clustering algorithms.
Accuracy: The number of correct predictions divided by the total number of predictions. The higher the accuracy score, the better the model performs.
Precision: Of all the items the model predicted as belonging to a class, the fraction that actually belong to that class.
Recall: Of all the items that actually belong to a class, the fraction that the model correctly identified.
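These metrics can be computed by hand as in the sketch below. It also computes the F1 score reported in the results, which is the harmonic mean of precision and recall. The function name and the toy labels are made up for illustration, and precision, recall, and F1 are computed for a single chosen class; how the per-class values were averaged for the results in this write-up is not stated, so no averaging is shown.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy over all classes, plus precision/recall/F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # True positives, false positives, false negatives for the chosen class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Toy labels (hypothetical, not taken from the project's results).
y_true = ["pizza", "pizza", "sushi", "ramen", "pizza", "sushi"]
y_pred = ["pizza", "sushi", "sushi", "pizza", "pizza", "ramen"]
accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred, "pizza")
```

In this toy example three of six predictions are correct, so accuracy is 0.5. For "pizza" there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.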
Logistic Regression
The first of the various classifiers we experimented with was logistic regression. Logistic regression is a statistical model for situations in which the observed outcome has only two possible values, "0" and "1". For a problem with more than two categories, such as ours, it has to be extended, for example by training one two-class model per category.
Precision: 0.02
Recall: 0.02
F1 score: 0.02
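The two-class model described above can be sketched as follows. The weights, bias, and feature values are hypothetical and chosen by hand; a real model learns them from the training set.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1) without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, features):
    """Probability that the outcome is 1 under a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict(weights, bias, features, threshold=0.5):
    """Return 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical hand-picked parameters, for illustration only.
weights, bias = [1.0, -1.0], 0.0
label_a = predict(weights, bias, [2.0, 0.5])
label_b = predict(weights, bias, [0.5, 2.0])
```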
K Nearest Neighbors
K nearest neighbors can be used to find the training images most similar to the test photo, which is then given the most common label among them. The parameter passed to the function is the number of nearest neighbors the model looks for. In this case, we didn't pass a parameter, so the default of 5 was used.
Precision: 0.01
Recall: 0.02
F1 score: 0.01
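The idea behind k nearest neighbors fits in a short sketch. This is not the library implementation: it assumes Euclidean distance and a simple majority vote, and the two-number "feature vectors" are toy values standing in for real image data.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=5):
    """Label a query point by majority vote among its k nearest neighbors."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in neighbors[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy training data (hypothetical): two well-separated clusters.
train_points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_labels = ["salad"] * 4 + ["steak"] * 4

near_salad = knn_predict(train_points, train_labels, (0.5, 0.5))
near_steak = knn_predict(train_points, train_labels, (5.5, 5.5))
```

With k left at 5, the query at (0.5, 0.5) sees four "salad" neighbors and one "steak" neighbor, so the vote goes to "salad".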
Naive Bayes
Naive Bayes classifiers are a family of classifiers based on applying Bayes' theorem. They are probabilistic classifiers that assume strong independence between features.
Precision: 0.04
Recall: 0.06
F1 score: 0.04
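To show what the independence assumption buys, here is a minimal sketch of one member of the family, Gaussian naive Bayes. Which variant was used in this project is not stated, so the Gaussian choice, the function names, and the toy data are all assumptions. Because features are treated as independent, the score for a class is just the log prior plus a sum of per-feature log likelihoods.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for sample, label in zip(samples, labels):
        grouped[label].append(sample)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps the variance from being exactly zero.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(samples), means, variances)
    return model

def predict_gaussian_nb(model, sample):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    def log_score(prior, means, variances):
        score = math.log(prior)
        for x, m, v in zip(sample, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_score(*model[label]))

# Toy two-feature data (hypothetical), one cluster per class.
samples = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (5.0, 6.0), (5.2, 5.8), (4.8, 6.2)]
labels = ["soup"] * 3 + ["cake"] * 3
model = fit_gaussian_nb(samples, labels)
guess_soup = predict_gaussian_nb(model, (1.1, 2.1))
guess_cake = predict_gaussian_nb(model, (5.1, 5.9))
```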
Random Forest
A random forest operates by constructing a multitude of decision trees. It combines the decision trees to get a more stable and accurate result.
Precision: 0.01
Recall: 0.01
F1 score: 0.01
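The combining step can be illustrated with a majority vote, which is one common way to merge the trees' answers. The sketch below skips the hard part: a real random forest learns each tree from a random sample of the training data, whereas the three "trees" here are hand-written threshold rules with made-up feature names.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def forest_predict(trees, features):
    """Ask every tree for a label, then combine the answers by voting."""
    return majority_vote([tree(features) for tree in trees])

# Hypothetical stand-ins for learned decision trees.
trees = [
    lambda f: "pizza" if f["redness"] > 0.5 else "salad",
    lambda f: "pizza" if f["roundness"] > 0.6 else "salad",
    lambda f: "salad" if f["greenness"] > 0.4 else "pizza",
]

features = {"redness": 0.8, "roundness": 0.4, "greenness": 0.1}
forest_label = forest_predict(trees, features)
```

Here the second tree votes "salad" but is outvoted by the other two, which is the sense in which the combined result is more stable than any single tree.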
A 64x64 image is 64 pixels high and 64 pixels wide.
If the image is 64x64x1, it is greyscale, with one value per pixel; if the image is 64x64x3, it is coloured, with three values per pixel for red, green, and blue (RGB).
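The difference between the two shapes is easy to see by counting values. In this sketch images are plain nested lists, and the helper names are hypothetical. The flattening step is an assumption about preprocessing: classifiers that expect one feature vector per sample need each image unrolled into a single list.

```python
def make_image(height, width, channels, fill=0):
    """Build a blank image as nested lists: rows, then pixels, then channels."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]

def shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

def flatten(image):
    """Unroll an image into one flat list of numbers."""
    return [value for row in image for pixel in row for value in pixel]

grey = make_image(64, 64, 1)    # one brightness value per pixel
colour = make_image(64, 64, 3)  # red, green, and blue values per pixel

grey_features = len(flatten(grey))      # 64 * 64 * 1 = 4096
colour_features = len(flatten(colour))  # 64 * 64 * 3 = 12288
```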
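A convolutional neural network (CNN) is a class of artificial neural networks that is mostly used for analyzing images. It slides a 3x3 filter across the image grid, combining each pixel with its neighbors. Because the same small filter is reused at every position, the number of parameters is reduced.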
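The 3x3 filtering step can be sketched in plain Python. This version applies no padding and moves one pixel at a time, which are assumptions; the input grid and the all-ones filter are toy values, whereas a trained network learns its filter weights.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over a 2D grid and sum the weighted neighbors."""
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    output = []
    for r in range(out_height):
        row = []
        for c in range(out_width):
            # Combine the k x k neighborhood whose top-left corner is (r, c).
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        output.append(row)
    return output

# Toy 4x4 input and a 3x3 filter of ones (hypothetical values).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
feature_map = convolve2d(image, kernel)  # [[54, 63], [90, 99]]
```

The filter has only 9 weights no matter how large the image is, which is where the saving in parameters comes from.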
Max pooling splits the grid into 2x2 blocks and keeps only the most important value from each block, which is the highest value in that block.
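The 2x2 pooling step can be sketched as below. The function name and input grid are made up for illustration, and the sketch assumes non-overlapping blocks, dropping any leftover row or column when a dimension is odd.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 block."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            block = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            row.append(max(block))
        pooled.append(row)
    return pooled

# Toy 4x4 grid (hypothetical values).
grid_in = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool_2x2(grid_in)  # [[6, 8], [3, 4]]
```

Each 2x2 block collapses to one number, so a 4x4 grid becomes 2x2 and later layers have a quarter as many values to process.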
For access to code for this project, please visit here.