We used the Extended Cohn-Kanade (CK+) dataset to train our Emotion Detector. The dataset consists of short image sequences (scenes) of different subjects displaying different emotions.
The OpenFace toolkit was used on the CK+ dataset to extract the Facial Action Units from every frame. OpenFace is a lightweight facial feature extraction tool with real-time capabilities, which is why it was chosen for DeepTrack.
These Action Units represent particular configurations of facial features that can be identified consistently across different images. For example, Action Unit 1 (AU1) corresponds to raising of the inner brow. The extracted Action Units are then used as features to train the required Machine Learning models.
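As a rough illustration of how the per-frame AU features can be assembled, the sketch below assumes the CSV output of OpenFace's FeatureExtraction tool; the file name and the exact column handling are illustrative assumptions, not the project's actual preprocessing script.

```python
import pandas as pd

# Load the per-frame CSV produced by OpenFace's FeatureExtraction tool.
# "ck_sequence_001.csv" is a hypothetical file name; the AU intensity columns
# (AU01_r, AU02_r, ...) follow OpenFace's usual naming, and strip() copes with
# the leading spaces OpenFace sometimes writes into its headers.
df = pd.read_csv("ck_sequence_001.csv")
df.columns = [c.strip() for c in df.columns]

# Keep only the Action Unit intensity columns as the per-frame feature vector.
au_columns = [c for c in df.columns if c.startswith("AU") and c.endswith("_r")]
features = df[au_columns].to_numpy()
print(features.shape)  # (num_frames, num_action_units)
```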
Each frame was treated as an individual data point in a large dataset; a minimal illustrative setup for this frame-wise approach is sketched below, followed by the results obtained with different Machine Learning models.
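The sketch below only illustrates the frame-wise formulation; the synthetic data and the SVM classifier are placeholders and not necessarily among the models actually compared.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Placeholder data standing in for the AU feature matrix and per-frame labels:
# 1000 frames, 17 AU intensity features, 7 emotion classes (values are random).
X = np.random.rand(1000, 17)
y = np.random.randint(0, 7, size=1000)

# Each frame is an independent sample; train any off-the-shelf classifier on it.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)
print("Frame-wise accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```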
Entire scenes (complete image sequences) were then taken as independent data points, and sequential Machine Learning models were used.
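One way to prepare such sequence-level data is sketched below; the sequence lengths, feature dimension, and padding length are illustrative assumptions, since each CK+ scene has a different number of frames but carries a single emotion label.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Each scene yields a variable-length (num_frames, num_AUs) array of AU features
# with one emotion label for the whole sequence (sizes here are random placeholders).
sequences = [np.random.rand(np.random.randint(10, 60), 17) for _ in range(100)]
labels = np.random.randint(0, 7, size=100)

# Pad (or truncate) every scene to a common length so a sequential model
# can consume a fixed-shape batch of (num_sequences, max_len, num_AUs).
X = pad_sequences(sequences, maxlen=60, dtype="float32", padding="post")
print(X.shape)  # (100, 60, 17)
```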
Here a dilated TCN [2][3] architecture is used with 8 filters, 3 stacks/blocks, and a dropout rate of 0.4.
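The exact architecture follows the cited TCN papers [2][3]; the sketch below is only a minimal Keras approximation of a dilated TCN with the stated hyperparameters (8 filters, 3 stacks, dropout 0.4). The dilation schedule, the absence of residual connections, and the classification head are assumptions made for brevity.

```python
from tensorflow.keras import layers, models

NUM_AUS = 17      # AU features per frame (illustrative)
MAX_LEN = 60      # padded sequence length (illustrative)
NUM_CLASSES = 7   # emotion classes

def build_dilated_tcn(nb_filters=8, nb_stacks=3, dropout_rate=0.4):
    """Minimal dilated TCN: nb_stacks blocks of causal, dilated 1-D convolutions."""
    inputs = layers.Input(shape=(MAX_LEN, NUM_AUS))
    x = inputs
    for _ in range(nb_stacks):
        for dilation in (1, 2, 4, 8):
            x = layers.Conv1D(nb_filters, kernel_size=3, dilation_rate=dilation,
                              padding="causal", activation="relu")(x)
            x = layers.SpatialDropout1D(dropout_rate)(x)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_dilated_tcn()
model.summary()
```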
Below are the validation and training accuracy and loss plots.
Testing Accuracy:
Combining emotion detection data with tracking is a fairly new application. With respect to tracking, we find that the DeepCC algorithm enables person detection and tracking at reliable speed and accuracy without incurring a large resource cost.
The next stage is to correlate the trajectory data thus obtained with the emotion detection data for each person identified in the cameras, with the aim of drawing insights that can help supermarkets build smart product-placement strategies.
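A possible form of this correlation step is sketched below; the file names, column schemas, and the coarse zone grid are hypothetical, since this stage has not yet been implemented.

```python
import pandas as pd

# Hypothetical inputs, both keyed by (person_id, timestamp):
# trajectory.csv: person_id, timestamp, x, y   (from the DeepCC tracker)
# emotions.csv:   person_id, timestamp, emotion (from the AU-based detector)
trajectories = pd.read_csv("trajectory.csv")
emotions = pd.read_csv("emotions.csv")

# Align the two streams per person and per timestamp, then count which emotions
# dominate in each region of the store (a coarse 1 m grid, purely illustrative).
merged = trajectories.merge(emotions, on=["person_id", "timestamp"], how="inner")
merged["zone"] = list(zip(merged["x"].round(0), merged["y"].round(0)))
summary = merged.groupby(["zone", "emotion"]).size().unstack(fill_value=0)
print(summary.head())
```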