Action recognition

Fight Recognition in prisons, Baptiste Hanssens.

The purpose of this project is to detect fights in prisons. It was commissioned by the Correctional Service of Canada (CSC). The objective is to deploy the system in one of the CSC penitentiary establishments and evaluate its performance in a real setting. The action category to be detected will be "People fighting", although other actions could also be selected. The prototype system will be developed and tested at the VIVA lab of the University of Ottawa and deployed in the field during the final phase of the project. The system's performance will be evaluated on three counts: the number of actions correctly detected, the number of actions missed, and the number of false alarms generated.
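As an illustration of how these three counts could be tallied and summarized, here is a minimal C++ sketch. It assumes one ground-truth label and one predicted label per video sequence; the names DetectionCounts, count_outcomes, recall and precision are hypothetical, and summarizing the counts as recall and precision is one common choice rather than the evaluation protocol specified for the project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three counts named in the text.
struct DetectionCounts {
    std::size_t detected;      // fights that occurred and were detected
    std::size_t missed;        // fights that occurred but were not detected
    std::size_t false_alarms;  // detections raised when no fight occurred
};

// Tally the counts from per-sequence labels (true = fight).
DetectionCounts count_outcomes(const std::vector<bool>& truth,
                               const std::vector<bool>& predicted) {
    assert(truth.size() == predicted.size());
    DetectionCounts c{0, 0, 0};
    for (std::size_t i = 0; i < truth.size(); ++i) {
        if (truth[i] && predicted[i]) {
            ++c.detected;
        } else if (truth[i]) {
            ++c.missed;
        } else if (predicted[i]) {
            ++c.false_alarms;
        }
    }
    return c;
}

// Fraction of real fights that were detected (0 if no fight occurred).
double recall(const DetectionCounts& c) {
    const std::size_t total = c.detected + c.missed;
    return total == 0 ? 0.0
                      : static_cast<double>(c.detected) / static_cast<double>(total);
}

// Fraction of raised detections that were real fights (0 if none was raised).
double precision(const DetectionCounts& c) {
    const std::size_t total = c.detected + c.false_alarms;
    return total == 0 ? 0.0
                      : static_cast<double>(c.detected) / static_cast<double>(total);
}
```

A high recall means few fights are missed, while a high precision means few false alarms; both matter in a deployment where staff must react to every alert.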

The project is divided into two parts. The first is to build a classifier that learns the difference between fight and non-fight video sequences. The second is to predict whether a new video sequence shows a fight. This kind of problem belongs to the field of machine learning.

For the classifier to learn to distinguish fight from non-fight actions, we first need two datasets: one showing fight sequences and another showing normal activity. I therefore searched the web for suitable datasets:

Hollywood 2: Various fight actions taken from movies.
http://www.di.ens.fr/~laptev/actions/hollywood2/
UT-Interaction: A dataset of 20 videos in which several people kick, punch, and push each other.
http://cvrc.ece.utexas.edu/SDHA2010/Human_Interaction.html
UCF50: Includes a “Punch” category with 160 boxing videos.
http://crcv.ucf.edu/data/UCF50.php
Hockey Fights: 500 fight videos and 500 non-fight videos. Each sequence lasts 1 second. All sequences take place during hockey games.
http://visilab.etsii.uclm.es/personas/oscar/FightDetection/index.html
- Paper: E. Bermejo, O. Deniz, G. Bueno, R. Sukthankar. Violence Detection in Video using Computer Vision Techniques. Proceedings of Computer Analysis of Images and Patterns, 2011.
INO: A single video in which several people fight in a parking lot.
http://www.ino.ca/en/examples/video-analytics-dataset/

The next step of this project is to extract features from these sequences and find relevant action information that will allow us to detect the same patterns in new videos. To do so, we can use features such as DoG (difference of Gaussians), HOG (histograms of oriented gradients), HOF (histograms of optical flow), etc. These features are combined into what we call a feature vector, which characterizes the action in the sequence.
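To give a rough idea of what such a descriptor looks like, here is a much simplified, HOG-style sketch in C++. It is not the extraction code used in the project: it assumes a single grayscale frame stored row-major in a std::vector, builds one orientation histogram over the whole frame (a real HOG works on cells and blocks), and the names orientation_histogram and concatenate are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gradient-orientation histogram of one grayscale frame (row-major storage).
// Each interior pixel votes for its unsigned gradient orientation in [0, pi),
// weighted by gradient magnitude; the histogram is then L2-normalized.
std::vector<double> orientation_histogram(const std::vector<double>& img,
                                          std::size_t width, std::size_t height,
                                          std::size_t bins = 8) {
    assert(img.size() == width * height && bins > 0);
    const double pi = std::acos(-1.0);
    std::vector<double> hist(bins, 0.0);
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            // Central differences approximate the image gradient.
            const double gx = img[y * width + x + 1] - img[y * width + x - 1];
            const double gy = img[(y + 1) * width + x] - img[(y - 1) * width + x];
            const double mag = std::sqrt(gx * gx + gy * gy);
            if (mag == 0.0) continue;
            double angle = std::atan2(gy, gx);  // in [-pi, pi]
            if (angle < 0.0) angle += pi;       // fold opposite directions together
            if (angle >= pi) angle -= pi;       // keep the angle inside [0, pi)
            std::size_t b = static_cast<std::size_t>(angle / pi * static_cast<double>(bins));
            if (b >= bins) b = bins - 1;
            hist[b] += mag;
        }
    }
    double norm = 0.0;
    for (double h : hist) norm += h * h;
    norm = std::sqrt(norm);
    if (norm > 0.0) {
        for (double& h : hist) h /= norm;
    }
    return hist;
}

// Join several descriptors (e.g. HOG and HOF) into a single feature vector.
std::vector<double> concatenate(const std::vector<std::vector<double>>& parts) {
    std::vector<double> out;
    for (const std::vector<double>& p : parts) {
        out.insert(out.end(), p.begin(), p.end());
    }
    return out;
}
```

An HOF descriptor follows the same binning idea, but votes with optical-flow vectors between consecutive frames instead of intensity gradients.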

To extract these features and obtain the feature vectors, I use code written by Feng Shi, a PhD researcher at the VIVA lab.

http://www.site.uottawa.ca/~fshi098/

The code produces one file per video sequence, containing that sequence's feature vector. Together, these feature vectors allow the classifier to learn the difference between fight and non-fight videos. I implemented the classifier in C++, using an SVM with an RBF kernel.
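For readers unfamiliar with this kind of classifier, the sketch below shows how a trained SVM with an RBF kernel scores a feature vector. It is an illustration only, not the project's implementation: the SvmModel structure and the function names are hypothetical, the support vectors, coefficients, bias and gamma are assumed to come from a training step that is not shown, and treating a positive score as "fight" is an assumed labeling convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the parameters of an already trained SVM.
struct SvmModel {
    std::vector<std::vector<double>> support_vectors;
    std::vector<double> coeffs;  // alpha_i * y_i, one per support vector
    double bias;
    double gamma;                // RBF kernel width parameter
};

// RBF kernel: K(a, b) = exp(-gamma * ||a - b||^2).
double rbf_kernel(const std::vector<double>& a, const std::vector<double>& b,
                  double gamma) {
    assert(a.size() == b.size());
    double sq_dist = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sq_dist += d * d;
    }
    return std::exp(-gamma * sq_dist);
}

// Decision value: f(x) = sum_i coeffs[i] * K(sv_i, x) + bias.
double decision_value(const SvmModel& model, const std::vector<double>& x) {
    assert(model.support_vectors.size() == model.coeffs.size());
    double sum = model.bias;
    for (std::size_t i = 0; i < model.support_vectors.size(); ++i) {
        sum += model.coeffs[i] * rbf_kernel(model.support_vectors[i], x, model.gamma);
    }
    return sum;
}

// Assumed convention: a positive decision value means "fight".
bool is_fight(const SvmModel& model, const std::vector<double>& x) {
    return decision_value(model, x) > 0.0;
}
```

The RBF kernel measures similarity through distance in feature space, so a new sequence is classified according to the labeled support vectors its feature vector lies closest to.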

The second part of this project is to feed new videos to the classifier and predict whether or not they contain a fight.

The main challenge, and the ultimate goal, of this project is to detect fights in real time.