FACE RECOGNITION is an exciting technology because it helps secure and identify things such as smartphones, labs, and apps.

A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. Because it compares new faces with a database of known faces to find a match, facial recognition has all kinds of commercial applications; it can be used for everything from surveillance to marketing.

Facial recognition is built on artificial intelligence (AI) and machine learning (ML) algorithms.

But have you ever wondered how a facial recognition system recognises the exact face (or sometimes a very similar one)?

I am sure you have wondered about it, so let me walk you through it.

Such a system uses image analytics techniques. To demonstrate how, I have prepared a model using Orange, a data mining tool.


Orange is a visual programming software package widely used for machine learning, data mining, and data analysis. Orange tools (called widgets) cover simple data visualization and pre-processing, empirical evaluation of learning algorithms, and predictive modelling. Visual programming is implemented through an interface in which workflows are designed by linking predefined or user-designed widgets.

At the same time, proficient users can use Orange as a Python library to manipulate data and alter widgets.

This first approach is known as a clustering model.

Firstly, images of different animals were downloaded and saved in a folder.


After downloading, the Import Images widget was used to load the images from the system.

Import Images Tool

After importing, the Image Viewer widget was used, as it can display images from a data set stored locally or on the internet. It can be used to compare images while looking for similarities or discrepancies in the selected data.

Image Viewer Tool

Next, the Image Embedding widget was used:

Image Embedding Tool

It reads the images and either uploads them to a remote server or evaluates them locally. Deep learning models are used to calculate a feature vector for each image, and the widget returns an enhanced data table with additional columns.


Images: List of images.


Embeddings: Images represented with a vector of numbers.

Skipped Images: List of images where embeddings were not calculated.
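The embedding step can be illustrated with a toy sketch. The function below is a hypothetical stand-in, not Orange's actual embedder: where the Image Embedding widget runs a pretrained deep network, this sketch simply average-pools pixel blocks into a fixed-length vector, so each image ends up represented as one row of numbers.

```python
import numpy as np

def embed_image(img, grid=4):
    """Toy stand-in for a deep embedding: average-pool the image over a
    grid x grid layout and flatten the result into one feature vector."""
    h, w = img.shape[:2]
    cells = []
    for i in range(grid):
        for j in range(grid):
            block = img[i * h // grid:(i + 1) * h // grid,
                        j * w // grid:(j + 1) * w // grid]
            cells.append(block.mean())
    return np.array(cells)

# Two fake 64x64 grayscale "animal photos"
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(2)]

# One row of 16 numbers per image, like the widget's Embeddings output
embeddings = np.stack([embed_image(im) for im in images])
```

The real widget's feature vectors are much longer and capture semantic content, but the shape of the output (one numeric row per image) is the same.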

After image embedding, the Distances widget, taken from the unsupervised widgets, was used:

Distances Tool

It computes distances between rows or columns in a dataset. By default, the data is normalized to ensure equal treatment of individual features; normalization is always done column-wise.


Data: input dataset


Distances: distance matrix
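As a rough sketch of what the widget computes (not its actual implementation), the snippet below normalizes each feature column and then builds a Euclidean distance matrix with SciPy; the embeddings here are random placeholders.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
embeddings = rng.random((5, 16))  # 5 images, 16 features each

# Column-wise normalization, mirroring the widget's default behaviour
normed = (embeddings - embeddings.mean(axis=0)) / embeddings.std(axis=0)

# Symmetric 5x5 matrix of pairwise Euclidean distances between rows
dist_matrix = squareform(pdist(normed, metric="euclidean"))
```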

After the Distances widget, the Hierarchical Clustering widget was used, also taken from the unsupervised widgets.

Hierarchical Clustering Tool

The widget computes hierarchical clustering of arbitrary types of objects from a matrix of distances and shows a corresponding dendrogram.


Distances: distance matrix


Selected Data: instances selected from the plot

Data: data with an additional column showing whether an instance is selected.
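A minimal SciPy sketch of the same idea, assuming the embeddings are already computed (here random vectors stand in for two animal classes):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(2)
# Two well-separated groups of points standing in for two animal classes
a = rng.normal(0.0, 0.1, size=(5, 8))
b = rng.normal(5.0, 0.1, size=(5, 8))
features = np.vstack([a, b])

condensed = pdist(features)               # condensed distance matrix
tree = linkage(condensed, method="ward")  # the dendrogram structure
labels = fcluster(tree, t=2, criterion="maxclust")  # cut into 2 clusters
```

Cutting the dendrogram at two clusters recovers the two groups, which is exactly what the widget's interactive cut line does.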

Hierarchical clustering has two advantages:

1) It helps in differentiating between the various classes of objects.

2) In case of any modelling error, the user can manually rectify it since no model is 100% correct.

In this hierarchical clustering view, it can be seen that the horse and the zebra were placed together; such an error can be rectified manually.

Given below is the complete Clustering model view.

Full Clustering Model View

There can be one more case, in which the pictures are organised into separate folders; for example, four or five images of dogs sit in a single folder named Dog Images.

For this case, image analytics techniques will be used to check the error rate so that errors can be rectified.

This second approach is known as a classification model.

In this model, each category of images has its own folder: for example, all the images of dogs go in one folder, and all the images of cats in another.

Classification Folder

Firstly, images of different animals were downloaded and saved in different folders.

As shown above, the whole process was repeated up to hierarchical clustering.

Hierarchical clustering for classification model

After hierarchical clustering, the data was sent from Image Embedding to MDS.

Multidimensional scaling (MDS) is a technique that finds a low-dimensional projection of points, in which it tries to fit the distances between points as well as possible. A perfect fit is typically impossible to obtain, since the data is high-dimensional or the distances are not Euclidean.

MDS Tool

In the input, the widget needs either a dataset or a matrix of distances.


Data: input dataset

Distances: distance matrix

Data Subset: a subset of instances


Selected Data: instances selected from the plot

Data: dataset with MDS coordinates
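A minimal scikit-learn sketch of the MDS step, using random vectors as stand-ins for the image embeddings:

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(3)
embeddings = rng.random((10, 16))  # 10 images, 16-dimensional embeddings

# Project the 16-dimensional points down to 2-D while preserving
# pairwise Euclidean distances as well as possible
mds = MDS(n_components=2, dissimilarity="euclidean", random_state=0)
coords = mds.fit_transform(embeddings)
```

The resulting 2-D coordinates are what the widget plots, so similar images land near each other on screen.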

The Image Grid widget was used to view the images.

Several different models were used, and their accuracy was ascertained with the Test and Score evaluator.

Different models were used to ascertain the accuracy

Different Models are:

– Support Vector Machine (SVM) model was used because

SVM has a regularization feature, so it has good generalization capabilities that prevent it from over-fitting, and it can be used to solve both classification and regression problems. A small change to the data does not significantly affect the SVM, so the model is stable.
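A hedged scikit-learn sketch of this point, on synthetic data rather than the image embeddings: the C parameter controls the regularization that gives the SVM its generalization ability.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the embedded image data
X, y = make_classification(n_samples=200, n_features=16, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Smaller C means stronger regularization (a wider, softer margin)
clf = SVC(C=1.0, kernel="rbf").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```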

– Artificial Neural Network (ANN) model was used because

An ANN is loosely modelled on our brain, in which millions and billions of cells, called neurons, process information in the form of electric signals. Similarly, an ANN's network structure has an input layer, one or more hidden layers, and an output layer. It is also called a Multi-Layer Perceptron because it has multiple layers. A hidden layer acts as a "distillation layer" that distils the critical patterns from the data and passes them on to the next layer. This makes the network quicker and more efficient, as it identifies the important information in the inputs and leaves out the redundant data.

  • It captures a non-linear relationship between the inputs.

  • It helps in converting the information/data into more useful insight.
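As an illustrative sketch (scikit-learn rather than the Orange widget), a small multi-layer perceptron can capture the non-linear "two moons" decision boundary that a linear model cannot:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Non-linearly separable toy data: two interleaving half-circles
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# One hidden layer of 16 neurons is enough for this boundary
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X, y)
acc = mlp.score(X, y)
```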

– Random Forest Model was used because:

Random Forest is a tree-based learning algorithm with the power to form accurate decisions because it combines many decision trees. As its name says, it is a forest of trees; hence, Random Forest takes more training time than a single decision tree. Each branch and leaf within a decision tree works on a random subset of features to predict the output. The algorithm then combines the predictions of the individual decision trees to generate the final prediction, and it can also deal with missing values.
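An illustrative scikit-learn sketch on synthetic data: a forest of 100 trees whose votes are combined, scored with cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the embedded image data
X, y = make_classification(n_samples=300, n_features=16, random_state=0)

# 100 trees, each trained on a bootstrap sample with random features;
# the forest averages their votes into one prediction
forest = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(forest, X, y, cv=5)
```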

– Decision Tree was used because:

A decision tree is used to comprehend and predict both numerical and categorical value problems. Its drawback is that it generally results in overfitting of the data. Yet we can avoid overfitting by utilizing a pre-pruning approach, for instance creating a tree with fewer leaves and branches.
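A small scikit-learn sketch of that pre-pruning idea, on synthetic data: capping the number of leaves yields a much smaller tree than growing it fully.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=16, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fully grown tree: tends to overfit the training data
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Pre-pruned tree: at most 8 leaves, so fewer branches to overfit with
pruned = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0).fit(X_tr, y_tr)
```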

– k-nearest neighbours (KNN) was used because:

The kNN widget uses the kNN algorithm, which searches for the k closest training examples in feature space and uses their average as a prediction.
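A minimal scikit-learn sketch with a toy one-dimensional training set:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Tiny 1-D training set: class 0 clustered near 0, class 1 near 10
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Each prediction is decided by the 3 nearest training examples
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
pred = knn.predict([[1.5], [10.5]])
```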

– Stochastic Gradient Descent (SGD) was used because:

Stochastic Gradient Descent minimizes a chosen loss function with a linear function. The algorithm approximates the true gradient by considering one sample at a time, updating the model on the gradient of the loss for that sample. For regression, it returns predictors as minimizers of the sum, i.e. M-estimators, and it is especially useful for large-scale and sparse datasets.
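An illustrative scikit-learn sketch (hinge loss, i.e. a linear SVM fitted one sample at a time); standardizing first matters because SGD is sensitive to feature scales:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=16, random_state=0)

# SGD updates the linear model from one sample at a time, so the
# features are standardized first to keep the updates well-behaved
model = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
model.fit(X, y)
acc = model.score(X, y)
```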

– Constant Model was used because:

This learner produces a model that always predicts the majority class for classification tasks and the mean value for regression tasks.

For classification, when predicting the class value with Predictions, the widget will return relative frequencies of the classes in the training set.

For regression, it learns the mean of the class variable and returns a predictor with the same mean value.
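Scikit-learn's dummy estimators behave the same way and can serve as a sketch of the Constant model:

```python
import numpy as np
from sklearn.dummy import DummyClassifier, DummyRegressor

X = np.zeros((6, 1))                  # features are ignored entirely
y_cls = np.array([0, 0, 0, 0, 1, 1])  # majority class is 0
y_reg = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# "prior" predicts the majority class and reports class frequencies
clf = DummyClassifier(strategy="prior").fit(X, y_cls)
# "mean" always predicts the mean of the training targets
reg = DummyRegressor(strategy="mean").fit(X, y_reg)

cls_pred = clf.predict(X)       # always the majority class
cls_proba = clf.predict_proba(X)  # relative class frequencies
reg_pred = reg.predict(X)       # always the training mean
```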

In the end, it can be said that the SGD model is the best model for this prediction task, as it has the highest Area Under the Curve (AUC).

AUC is scale-invariant: it measures how well the predictions are ranked, rather than their absolute values.

AUC is also classification-threshold-invariant: it measures the quality of the model's predictions irrespective of which classification threshold is chosen.
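Both properties can be checked numerically with scikit-learn (the scores below are made-up values, not the model's real outputs): rescaling all scores by a constant leaves their ranking, and hence the AUC, unchanged.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

auc = roc_auc_score(y_true, scores)

# Scale invariance: multiplying scores by 100 preserves their ranking,
# so the AUC stays exactly the same
auc_scaled = roc_auc_score(y_true, scores * 100)
```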

AUC values

Full Classification Model View

This is how image analytics works.


In case you have any questions or any suggestions on what my next article should be about, please leave a comment below or mail me at

If you want to keep updated with my latest articles and projects, keep visiting the website ^_^.

Connect with me via: