Anomaly Detection and Localization in Low Quality Surveillance Videos
Gulshan Saleem is pursuing a Ph.D. in Computer Science (2019 to date) at COMSATS University Islamabad (Lahore Campus), Pakistan, and her research specialization is Artificial Intelligence. She completed her MS in Software Engineering at the National University of Science & Technology, CEME College, Rawalpindi, Pakistan. She received her BSE in Software Engineering from Fatima Jinnah Women University, Rawalpindi, Pakistan.
She has worked in multiple domains, including computer networks, natural language processing, text analysis, web services, image encryption, and image identification. She later began her Ph.D. and chose video analytics as her area of interest. She now works on object detection, tracking, human action recognition, and anomaly recognition and detection. She has published several research articles on these topics in well-reputed journals.
Publications and Ongoing Projects
Towards Human Activity Recognition: A Survey - Published in Neural Computing and Applications (NCAA), Springer Nature and under minor revision
Efficient Anomaly Recognition Using Surveillance Videos - Published in PeerJ Computer Science Journal and under minor revision
SurveillanceNet: Spatio-Temporal Anomaly Identification in Surveillance Videos using Two-Stream CNN and LSTM - Submitted to Multimedia Tools and Applications, Springer
Anomaly Identification Using Surveillance Videos - In submission process
A Robust Deep Networks based Multi-Object Multi-Camera Tracking System for City Scale Traffic - Published in Multimedia Tools and Applications, Springer
Highlights From Ongoing Projects
Introduction
Surveillance is the monitoring of behavior, activities, or information for the purpose of influencing, managing, or directing. Activity assessment is a complex process that cannot be done in a single step, but breaking it down into smaller subtasks makes it manageable. Some researchers unintentionally conflate the term activity with action, gesture, motion, and interaction; these can be parts of an activity, but they are not equivalent to it.
Anomaly detection aims to identify unusual patterns, anomalies, or data points that do not conform to the expected distribution. Applications of anomaly detection include security surveillance, fraud detection in financial transactions, fault detection in manufacturing, intrusion detection in a computer network, monitoring sensor readings in an aircraft, spotting potential risk or medical problems in health data, and predictive maintenance. Today, thanks to cheap technology, a large number of videos is available, in particular surveillance videos, which are the main source for capturing abnormal activities in real time. Detecting different anomalies automatically, however, requires a system that takes such surveillance videos as raw input and produces a useful output.
Challenges
Here, the output is the specific anomaly that needs the attention of the public and the authorities, so that such events can be minimized or at least dealt with in time. By applying deep learning to surveillance videos, we may be able to help humans detect anomalies on the spot. The objectives of this work are:
Full Automation of the Surveillance System
Low-quality video problem
Representative Dataset
Reduced false positive rate
Anomaly Analysis
In data analysis, anomaly detection is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data. Typically the anomalous items will translate to some kind of problem such as bank fraud, a structural defect, medical problems, or errors in a text. Anomalies are also referred to as outliers, novelties, noise, deviations and exceptions.
In particular, in the context of abuse and network intrusion detection, the interesting objects are often not rare objects, but unexpected bursts in activity. This pattern does not adhere to the common statistical definition of an outlier as a rare object, and many outlier detection methods (in particular unsupervised methods) will fail on such data unless it has been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the microclusters formed by these patterns.
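The aggregation idea in the paragraph above can be illustrated with a minimal sketch. It assumes events arrive as timestamps in seconds and counts them per fixed-length window, then flags windows whose count is far above the typical count. The function name `burst_windows`, the 60-second window, and the factor of 3 are hypothetical choices made for illustration, not values taken from this project.

```python
from collections import Counter
from statistics import median


def burst_windows(timestamps, window=60.0, factor=3.0):
    """Return start times of windows whose event count exceeds
    `factor` times the median per-window count.

    `window` and `factor` are illustrative assumptions.
    """
    # Aggregate: map every event to the index of the window it falls into.
    counts = Counter(int(t // window) for t in timestamps)
    if not counts:
        return []

    first, last = min(counts), max(counts)
    # Include empty windows so the baseline reflects quiet periods too.
    per_window = [counts.get(i, 0) for i in range(first, last + 1)]

    # Use the median as a robust baseline; floor it at 1 to avoid a zero threshold.
    baseline = max(median(per_window), 1)

    return [
        (first + i) * window
        for i, count in enumerate(per_window)
        if count > factor * baseline
    ]


# One event per minute for ten minutes, plus a burst of 20 events in minute 5.
events = [60 * m for m in range(10)] + [300 + s for s in range(20)]
bursts = burst_windows(events)
```

Individually, none of the burst events is rare, so a per-event outlier test would likely miss them; after aggregation, the window starting at 300 s stands out.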
Three broad categories of anomaly detection techniques exist. Unsupervised anomaly detection techniques detect anomalies in an unlabeled test data set under the assumption that the majority of the instances in the data set are normal, by looking for the instances that fit the remainder of the data set least well. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involve training a classifier (the key difference from many other statistical classification problems is the inherently unbalanced nature of outlier detection). Semi-supervised anomaly detection techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood of a test instance being generated by the learned model.
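The semi-supervised category described above can be sketched in a few lines. This is a minimal illustration only, assuming each instance is summarized by a single scalar feature (for example, a motion magnitude) and that normal behavior is roughly Gaussian; the class name `NormalModel` and the threshold of 3 standard deviations are hypothetical and are not part of the project's code.

```python
from statistics import mean, stdev


class NormalModel:
    """Model of normal behavior fitted on normal-only training data."""

    def __init__(self, normal_values):
        # Fit the model using only instances known to be normal.
        self.mu = mean(normal_values)
        self.sigma = stdev(normal_values)
        if self.sigma == 0:
            raise ValueError("training data has no variation")

    def score(self, x):
        # Distance from the normal mean, in standard deviations.
        return abs(x - self.mu) / self.sigma

    def is_anomaly(self, x, threshold=3.0):
        # A test instance that is unlikely under the normal model is flagged.
        return self.score(x) > threshold


model = NormalModel([10, 11, 9, 10, 12, 8, 10, 11, 9, 10])
typical = model.is_anomaly(10.5)   # close to the normal mean
unusual = model.is_anomaly(20)     # far from anything seen in training
```

No abnormal examples are needed for training, which matters in surveillance, where anomalous footage is scarce compared with normal footage.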
Applications
Indoor Surveillance
Outdoor Surveillance
Office Monitoring
Public Safety & Security
Intelligence Departments
Action Datasets
Although several benchmark datasets are available for testing an anomaly detector, the choice should depend on how appropriate the data is and on whether it is recent enough to reflect the characteristics of today's network traffic. Useful datasets for activity recognition are:
UCF-101 (Download)
Sports 1M (Download)
UCF-Crime (Download)
Weizmann Dataset (Download)
KTH Dataset (Download)
ASLAN (Download)
HMDB-51 (Download)
Charades (Download)
ActivityNet (Download)
Kinetics (Download)
The 20BN-something-something (Download)
HACS (Download)
VLOGs (Download)
Moments in Time (Download)
FCVID (Download)
Hollywood 2 (Download)
Methodology
Code for the methodology is available at: Anomaly Recognition
Plots
Temporal Segment
Results
We addressed the high resource requirements of anomaly recognition methods and proposed a lightweight, resource-efficient, real-time streaming temporal based anomaly recognizer (TAR) framework that can run on a simple machine with only a central processing unit (CPU).
We proposed temporal learning via a partial shift operation to improve spatiotemporal feature learning. It enables each frame to share learned features with adjacent frames, which reduces processing cost. Moreover, it helps build feature maps based on high-activity areas that support the classification task of anomaly recognition.
Our framework performs online anomaly recognition and supports six simultaneous streams on a CPU-based system while using few parameters (2.2M), few FLOPs (0.564 GFLOPs), a small model size (0.6 Mb), and low latency overhead, which makes it a resource-efficient approach.
Our model improves on state-of-the-art accuracy by 7.87% and 2.47% with ResNet-50 and MobileNetV2, respectively, on the UCF-Crime dataset.
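The partial shift operation mentioned in the results above can be illustrated with a small sketch. The page does not give the exact layout of the shift, so everything below is an assumption: features are stored as a list of T frames with C channel values each, one fraction of the channels is taken from the previous frame, an equal fraction from the next frame, the rest stay in place, and missing neighbors are zero-padded. The function name `partial_temporal_shift` and the parameter `fold_div` are hypothetical. A streaming system would more plausibly shift only from past frames, since future frames are not yet available.

```python
def partial_temporal_shift(features, fold_div=8):
    """Shift a fraction of channels along the time axis.

    features: list of T frames, each a list of C channel values.
    fold_div: 1/fold_div of the channels shift in each direction
              (illustrative assumption, not taken from the project).
    """
    t_len = len(features)
    if t_len == 0:
        return []

    c_len = len(features[0])
    fold = c_len // fold_div
    # Start from a copy so unshifted channels keep their own values.
    out = [list(frame) for frame in features]

    for t in range(t_len):
        # First `fold` channels: receive values from the previous frame.
        for c in range(fold):
            out[t][c] = features[t - 1][c] if t > 0 else 0.0
        # Next `fold` channels: receive values from the next frame.
        for c in range(fold, 2 * fold):
            out[t][c] = features[t + 1][c] if t < t_len - 1 else 0.0
    return out


frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
shifted = partial_temporal_shift(frames, fold_div=4)
```

The shift itself adds no parameters and no multiplications: a following 2D convolution sees channels from neighboring frames and can mix temporal information at essentially the cost of a spatial-only layer.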
Video representation in terms of multidimensional array
Video input is forwarded to the temporal feature extractor, which performs temporal and spatial modeling with the help of MobileNetV2, whereas fully connected layers perform anomaly recognition via spatiotemporal modeling.
Proposed temporal based anomaly recognizer (TAR) framework with 2D MobileNetV2 baseline architecture
Accuracy of proposed framework
Loss Curve of proposed framework
Multiclass confusion matrix of temporal based anomaly recognizer (TAR)
Discussion
Automated surveillance is a popular area that continues to improve over time. Such systems must process huge amounts of data, which demands substantial resources. In practical scenarios, both time and cost are critical, yet a system typically needs a lot of computation time and resources to serve its purpose. This research therefore aimed to address these problems by providing a resource-efficient, high-performing system for anomaly recognition. The growing number of CCTV cameras generates vast amounts of unlabelled video data, and labeling it is a difficult task. Moreover, processing such large amounts of data requires substantial computational resources. This study provides an approach to anomaly recognition that is lightweight and cost-effective in terms of memory consumption, processing parameters, and computation time, which is essential for low-resource systems such as CPU-based systems.
The proposed framework is based on a 2D convolutional architecture (2D CNN) with a spatiotemporal feature extractor that functions as a partial shift, learning temporal information and distributing it among neighboring frames. The MobileNetV2 baseline performs spatial feature extraction, which is then combined with temporal learning to perform anomaly recognition. The framework works with a low latency of 12.01 ms, which makes it effective for online video recognition and lets it handle up to six streams at once. On the UCF Crime dataset, it achieves an accuracy of 88% on the binary anomaly recognition problem with a time complexity of 0.198 s. On the UCF Crime2Local dataset, it achieves an accuracy of 52.7% on a multi-class problem. The model outperforms previous models in terms of computational cost, requiring 2.2 M parameters and 0.564 GFLOPs with MobileNetV2 as the baseline architecture.
Moreover, the proposed framework improves accuracy by 2.47% on the UCF Crime dataset while reducing computational requirements. Overall, it performs well in many respects and can be used for real-time recognition, but it can be improved further. Some limitations of this study are highlighted in the section below for consideration in future work.
Limitations and Future Directions
We have proposed a resource-efficient anomaly recognition system that performs recognition tasks effectively, but it has not been evaluated for object-level detection and tracking. Object detection and tracking can significantly improve security surveillance, and we believe that with minor modifications the current model could be useful for these tasks as well. Our model performs recognition with adequate speed, but its early-response efficiency can be investigated further. This requires validating the model's ability to recognize an anomaly from a small sample of incoming frames as early as possible, which would improve the system's response time. Surveillance systems are designed to perform anomaly recognition as precisely as possible, but the false positive rate poses an additional issue that can compromise system reliability. To increase the usefulness of our model, we will strive to reduce the false positive rate as much as possible.
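For reference, the false positive rate discussed above is the share of normal samples that are wrongly flagged as anomalous, FP / (FP + TN). The sketch below computes it for binary labels, assuming 0 denotes normal and 1 denotes anomaly; the function name and the sample labels are hypothetical and are not results from this project.

```python
def false_positive_rate(y_true, y_pred):
    """Fraction of normal samples (label 0) predicted as anomalies (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")

    # Normal samples wrongly flagged as anomalous.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Normal samples correctly left unflagged.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    if fp + tn == 0:
        raise ValueError("no normal samples in y_true")
    return fp / (fp + tn)


# Made-up labels for six clips: four normal, two anomalous.
truth = [0, 0, 0, 0, 1, 1]
preds = [0, 1, 0, 0, 1, 0]
fpr = false_positive_rate(truth, preds)
```

With these made-up labels, one of the four normal clips is flagged, giving a false positive rate of 0.25. The missed anomaly in the last clip does not affect this number; it counts towards the false negative rate instead.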
Related Projects
Surveillance cameras have become widespread. Many videos are recorded for monitoring purposes in order to increase security.
The increase in the number of videos and their lengths has a troublesome aspect. The amount of information contained in videos has rapidly grown, making it increasingly difficult for people to find what they should pay attention to. Ricoh has developed a technology to extract unusual things and behaviors from videos.
Here, weakly labeled anomaly videos are used for training.
Useful Reads
Gulshan Saleem, Usama Ijaz Bajwa, Rana Hammad Raza, Fayez Hussain Alqahtani, Amr Tolba, Feng Xia (2022). "Efficient anomaly recognition using surveillance videos". PeerJ Computer Science, 8:e1117 http://doi.org/10.7717/peerj-cs.1117.
Gulshan Saleem, Usama Ijaz Bajwa, Rana Hammad Raza, "Towards Human Activity Recognition: A Survey", in Neural Computing and Applications, Oct 2022.
Maqsood, R., Bajwa, U. I., Saleem, G., Raza, R. H., & Anwar, M. W. (2021). Anomaly recognition from surveillance videos using 3D convolution neural network. Multimedia Tools and Applications, 80(12), 18693-18716.
Mabrouk, A. B., & Zagrouba, E. (2018). Abnormal behavior recognition for intelligent video surveillance systems: A review. Expert Systems with Applications, 91, 480-491.
Tripathi, R. K., Jalal, A. S., & Agrawal, S. C. (2018). Suspicious human activity recognition: a review. Artificial Intelligence Review, 50(2), 283-339.
Sreenu, G. and M. A. Saleem Durai (2019). "Intelligent video surveillance: a review through deep learning techniques for crowd analysis." Journal of Big Data 6(1): 48.
Collaborators
Feng Xia
Associate Professor, Data Science,
Federation University, Australia