After extracting and reducing features, we used them to train classification models. We identified the best combination of features and classifier by comparing the accuracy of each model's predictions.
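As a minimal sketch of that evaluation step (assuming scikit-learn, with a synthetic dataset standing in for our extracted features):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression

# Synthetic data standing in for the extracted/reduced features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train one candidate model and score its predictions on held-out data;
# the same loop is repeated for each feature/classifier combination.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```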
Logistic regression transforms its input into a probability value using the logistic sigmoid function, which maps any real number to a value between 0 and 1. The input values (X1, X2, …, Xn) are the features. The features are combined linearly using weights/coefficients (β1, β2, …, βn) plus a bias or intercept term β0 to obtain the formula:
t = β0 + β1X1 + β2X2 + … + βnXn
t is then used to calculate the probability of an element being classified as 1 given the observed features, using:
σ(t) = 1 / (1 + e^(-t))
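These two formulas can be computed directly; the following is a minimal sketch with NumPy, using purely illustrative feature values and coefficients:

```python
import numpy as np

# Illustrative values only: three features and hypothetical coefficients.
X = np.array([1.5, -0.3, 2.0])      # features X1..X3
beta = np.array([0.4, -1.2, 0.8])   # weights beta1..beta3
beta0 = 0.5                         # bias / intercept term

t = beta0 + np.dot(beta, X)         # t = beta0 + beta1*X1 + ... + betan*Xn
prob = 1.0 / (1.0 + np.exp(-t))     # sigma(t) = 1 / (1 + e^(-t))
print(prob)                         # probability of belonging to class 1
```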
KNN relies on the assumption that similar examples should receive the same classification. Examples are represented as points in a feature space, and similarity is measured by the distance between those points. Put simply, it gives a new example the same classification as its "closest" training sample(s).
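A minimal sketch of this idea, assuming scikit-learn's KNeighborsClassifier and a synthetic dataset for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# With k=1 the classifier copies the label of the single closest training
# sample; a larger k takes a majority vote among the k nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X, y)
print(knn.predict(X[:3]))
```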
Random Forest is built from many decision trees (each of which classifies via a sequence of yes/no questions), trained on different subsets of the features and/or the training set. The prediction of each decision tree is recorded, and the final prediction is the class chosen by the majority of the trees.
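A minimal scikit-learn sketch of this ensemble (the dataset and tree count here are illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# 100 decision trees, each fit on a bootstrap sample of the data with
# random feature subsets; the forest predicts by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)
print(forest.predict(X[:3]))
```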
The SVM tries to find the hyperplane that best separates the data points of different classes. It can also classify non-linearly separable data by using kernels, which implicitly map the data into a higher-dimensional space. In our code, it is called a Support Vector Classifier (SVC).
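A minimal SVC sketch, assuming scikit-learn and synthetic data; the RBF kernel choice here is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# The RBF kernel lets the SVC separate classes that are not linearly
# separable in the original feature space.
svc = SVC(kernel="rbf")
svc.fit(X, y)
print(svc.predict(X[:3]))
```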
A neural network is composed of a large number of interconnected neurons that work in unison to solve a specific problem. A CNN is a neural network whose inputs are images, which allows certain properties to be encoded directly into the architecture so that it can recognize specific elements within those images. CNNs process images as tensors, i.e., multi-dimensional arrays of numbers.
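As a rough sketch of such an architecture (the framework here, Keras, and all layer sizes and the input shape are our own illustrative choices, not necessarily those used in the project):

```python
from tensorflow.keras import layers, models

# Input: 64x64 RGB images as tensors of shape (height, width, channels).
model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # learn local image features
    layers.MaxPooling2D((2, 2)),                   # downsample feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),         # binary classification
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```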