Sentinel

2023-2024

ABSTRACT

This project addresses the need for efficient, real-time anomaly detection in surveillance systems, given the widespread deployment of surveillance cameras in diverse environments. Traditional server-based approaches pose challenges related to cost, network strain, and responsiveness. Leveraging edge computing and deep learning, our project aims to develop a cost-effective and robust solution. Its key features include state-of-the-art accuracy, real-time inference and streaming, and seamless integration with existing networks. Our goal is to explore existing solutions and propose an architecture that optimizes model performance on edge devices while remaining compatible with both GPU and CPU environments. Project code is available at https://github.com/Sentinel-FYP

INTRODUCTION

Surveillance cameras have proliferated in today's world, appearing in homes and businesses, schools and universities, small shops and sprawling malls, and public places and restricted areas. They serve indispensable functions, such as enhancing home security, improving public safety, safeguarding educational environments, and preventing theft in commercial establishments [1]. However, the challenge lies in monitoring these camera feeds effectively, as it is nearly impossible for humans to maintain constant vigilance. The risk of missing critical events is high because surveillance footage often contains long periods of inactivity and routine events.

Deep learning has made significant progress in recognizing actions in videos in real time, and it no longer requires the expensive, powerful servers it once did. Thanks to improvements in hardware and software, these deep learning models can now run on everyday devices such as mobile phones [2]. This development has opened exciting possibilities for enhancing surveillance and security, which our project aims to leverage fully.

Some companies provide automated anomaly detection for live camera feeds, but most of these solutions depend on running model inference on servers. This approach raises three significant concerns. First, using these servers can be costly for many users, even as a subscription service. Second, it requires a continuous stream of camera feed to the server, placing a substantial burden on the user's network and presenting scalability challenges as more cameras are added. Third, it may not be practical in scenarios where real-time results are essential and network connectivity is unreliable.

Objectives

The primary goal of this project is to develop an advanced surveillance system that leverages deep learning and edge computing for real-time anomaly detection. The key objectives include:

1. Achieving state-of-the-art accuracy in anomaly detection on benchmark datasets.

2. Ensuring real-time model inference while minimizing power consumption.

3. Integrating seamlessly with existing IP camera networks.

4. Providing push notifications for anomaly alerts on users' mobile devices.

5. Enabling remote, on-demand live streaming of camera feeds for user accessibility and convenience.

These objectives collectively aim to deliver an efficient, cost-effective, and user-friendly solution for enhanced surveillance and security across diverse settings.

Dataset Used

For anomaly detection, we used the UCF Crime dataset [3], a large-scale video dataset containing 1900 videos split evenly between anomalous and normal activities (50% each). The anomalous activities include events such as fighting, arson, burglary, arrest, and abuse. The videos in the dataset are long and untrimmed. The dataset is weakly labeled: a video labeled as anomalous does not contain anomalous activity throughout. Rather, the anomalous part occurs only in a small section of the video. To enable fully supervised training, we incorporated frame-level annotations from this work.
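As a minimal sketch of how frame-level annotations can turn a weakly labeled video into supervised training examples, the function below splits a video into fixed-length clips and labels each clip by its overlap with the annotated anomalous spans. The annotation format (inclusive start and end frame pairs that do not overlap each other), the clip length, and the overlap threshold are all assumptions made for illustration; they are not taken from the project's preprocessing code.

```python
from typing import List, Tuple


def label_clips(num_frames: int,
                anomalous_spans: List[Tuple[int, int]],
                clip_len: int = 64,
                min_overlap: float = 0.5) -> List[Tuple[int, int, int]]:
    """Split a video into non-overlapping clips and label each one.

    A clip is labeled 1 (anomalous) when at least `min_overlap` of its
    frames fall inside an annotated span, otherwise 0 (normal).
    Spans are inclusive (start_frame, end_frame) pairs and are assumed
    not to overlap each other. Trailing frames that do not fill a whole
    clip are dropped.
    """
    clips = []
    for start in range(0, num_frames - clip_len + 1, clip_len):
        end = start + clip_len - 1  # inclusive last frame of the clip
        overlap = 0
        for span_start, span_end in anomalous_spans:
            lo = max(start, span_start)
            hi = min(end, span_end)
            if hi >= lo:
                overlap += hi - lo + 1
        label = 1 if overlap / clip_len >= min_overlap else 0
        clips.append((start, end, label))
    return clips


# Hypothetical video: 256 frames with one anomalous span at frames 100-180.
for clip in label_clips(256, [(100, 180)]):
    print(clip)
```

With this scheme, only the clip that is mostly covered by the annotated span receives the anomalous label, so the long normal stretches of an anomalous video are not mislabeled.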

Model Architecture

We fine-tuned the MoViNet-A0 stream model on the UCF Crime dataset.

MoViNet stream variants consume significantly less memory without sacrificing much accuracy. Figure 1 compares accuracy against floating-point operations (FLOPs) for MoViNet models and other state-of-the-art models on the Kinetics-600 dataset. Figure 2 shows the same comparison for memory.

Figure 1. FLOPs vs accuracy comparison of MoViNet with state-of-the-art models [10]

Figure 2. Memory vs accuracy comparison of MoViNets

MoViNets [4] build on the MobileNet architecture, which is optimized for mobile devices, and extend it to video analysis by adding temporal convolutions and other techniques that let them process video data efficiently and accurately.

MoViNets also use depthwise separable convolutions, bottleneck layers, and global average pooling, which reduce the number of parameters and the computational requirements of the models.
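To make the parameter savings concrete, the sketch below counts the weights of a standard 3D convolution and of a depthwise separable one (a per-channel depthwise convolution followed by a 1x1x1 pointwise convolution). The channel counts and kernel size are illustrative assumptions rather than the sizes of any actual MoViNet layer, and biases are ignored.

```python
from typing import Tuple

Kernel = Tuple[int, int, int]  # (time, height, width)


def conv3d_params(c_in: int, c_out: int, kernel: Kernel) -> int:
    """Weights in a standard 3D convolution (bias ignored)."""
    kt, kh, kw = kernel
    return kt * kh * kw * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, kernel: Kernel) -> int:
    """Weights in a depthwise conv plus a 1x1x1 pointwise conv (bias ignored)."""
    kt, kh, kw = kernel
    depthwise = kt * kh * kw * c_in  # one spatio-temporal filter per input channel
    pointwise = c_in * c_out         # mixes channels at each position
    return depthwise + pointwise


# Illustrative layer: 64 input channels, 128 output channels, 3x3x3 kernel.
standard = conv3d_params(64, 128, (3, 3, 3))
separable = depthwise_separable_params(64, 128, (3, 3, 3))
print(standard, separable)  # 221184 9920
```

For this example layer the separable form needs roughly 22 times fewer weights, which is the kind of reduction that makes such models practical on edge hardware.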

Evaluation Results

Figure 3 shows the confusion matrix of the model on the test dataset. The model performs well, limiting false positives to 17 and false negatives to 12.
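The usual summary metrics follow directly from the four cells of a confusion matrix, as the sketch below shows. Only the false positive and false negative counts (17 and 12) come from the text; the true positive and true negative counts are not stated here, so the values passed in the example call are hypothetical placeholders and the printed numbers are not the project's results.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Derive common metrics from binary confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),            # flagged clips that were truly anomalous
        "recall": tp / (tp + fn),               # anomalous clips that were flagged
        "false_positive_rate": fp / (fp + tn),  # normal clips wrongly flagged
    }


# fp and fn are taken from the text; tp and tn are HYPOTHETICAL placeholders.
metrics = classification_metrics(tp=100, tn=100, fp=17, fn=12)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```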

Figure 4 shows the ROC plot of the model on the test dataset. The AUC is 89%, which is comparable to the highest reported AUC on the UCF Crime dataset, 88% [5].
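For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one. The sketch below computes it that way with a simple pairwise comparison; it is an illustrative implementation on toy data, not the evaluation code used in the project.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for a small test set; a rank-based formulation would be preferable for large ones.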

Figure 3. Confusion matrix of the model

Figure 4. ROC plot of the model

System Architecture

Figure 5 presents an overview of the complete system architecture, described below. The project comprises the following key components:

i) Edge Deployment

For edge deployment, NVIDIA's Jetson Nano will be used. The edge device will process multiple camera streams to monitor for anomalies, log any anomalies found, record camera footage if enabled, and live stream camera feeds on demand.
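One way the edge device could turn per-clip model scores from several cameras into logged anomaly events is sketched below. It smooths each camera's scores with a short moving average and emits an event only when the smoothed score crosses a threshold, so a sustained anomaly produces one log entry rather than one per clip. The class name, threshold, window size, and event names are hypothetical; the sketch assumes the model has already produced a score between 0 and 1 for each clip.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Iterable, List, Optional, Tuple


@dataclass
class CameraMonitor:
    """Tracks smoothed anomaly scores for one camera (hypothetical helper)."""
    camera_id: str
    threshold: float = 0.5
    window: int = 3
    _scores: Deque[float] = field(default_factory=deque, init=False)
    _active: bool = field(default=False, init=False)

    def update(self, score: float) -> Optional[str]:
        """Add a score and return an event name if the anomaly state changed."""
        self._scores.append(score)
        if len(self._scores) > self.window:
            self._scores.popleft()
        # The average is taken over however many scores are available so far.
        mean = sum(self._scores) / len(self._scores)
        if mean >= self.threshold and not self._active:
            self._active = True
            return "anomaly_started"
        if mean < self.threshold and self._active:
            self._active = False
            return "anomaly_ended"
        return None


def process_scores(stream: Iterable[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Route (camera_id, score) pairs to per-camera monitors and collect events."""
    monitors: Dict[str, CameraMonitor] = {}
    events: List[Tuple[str, str]] = []
    for camera_id, score in stream:
        if camera_id not in monitors:
            monitors[camera_id] = CameraMonitor(camera_id)
        event = monitors[camera_id].update(score)
        if event is not None:
            events.append((camera_id, event))
    return events


# Interleaved scores from two hypothetical cameras.
scores = [("cam1", 0.1), ("cam2", 0.9), ("cam1", 0.2), ("cam2", 0.9),
          ("cam1", 0.9), ("cam2", 0.1), ("cam2", 0.1)]
print(process_scores(scores))
```

In the example, a single high score on cam1 is absorbed by the moving average, while the sustained high scores on cam2 produce one start event and one end event.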

ii) Frontend

The mobile app will let users create accounts and register edge devices on the server using unique identifiers (IDs). Initial setup will require the mobile device to be on the same network as the edge device. After setup, users can remotely access a live stream from the server through the app. The application will also deliver push notifications and alerts to users whenever the server is notified of an anomaly.

iii) Backend

On the backend server, an alert will be generated when the edge device detects an anomaly and notifies the server. This notification will include details such as the date, time, and corresponding footage. The server will then send a push notification to the user's mobile app, allowing them to review the flagged footage in real time. To maintain a record of anomalies, the server will securely store logs with specific anomaly information. For remote live streaming, the server will act as an intermediary between the edge device and the user's mobile device.
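The notification sent from the edge device to the server might look like the payload built below, carrying the date, time, and a reference to the corresponding footage. All field names and the example identifiers and path are hypothetical, and Python is used only for illustration; the project's actual message format and backend implementation may differ.

```python
import json
from datetime import datetime, timezone
from typing import Dict


def build_alert(device_id: str, camera_id: str,
                detected_at: datetime, clip_path: str) -> Dict[str, str]:
    """Assemble an anomaly alert payload (field names are hypothetical)."""
    return {
        "deviceId": device_id,
        "cameraId": camera_id,
        "date": detected_at.strftime("%Y-%m-%d"),
        "time": detected_at.strftime("%H:%M:%S"),
        "footage": clip_path,
    }


# Example alert with made-up identifiers and a made-up clip path.
alert = build_alert(
    device_id="edge-001",
    camera_id="cam2",
    detected_at=datetime(2024, 1, 15, 13, 45, 30, tzinfo=timezone.utc),
    clip_path="clips/cam2/example.mp4",
)
print(json.dumps(alert, indent=2))
```

Keeping the payload to plain strings makes it straightforward to serialize as JSON, store in an anomaly log, and forward in a push notification.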

Figure 5. System Architecture

Tools and Technologies

Google Colab

Python

JavaScript

TensorRT

TensorFlow

React Native

MongoDB

Node.js

Project Resources

Sentinel FYP Presentation.pptx

Project Slides

FA23CS02_Sentinel_Smart_Surveillance_System_with_Automated_Anomaly_Detection.pdf

Project Thesis

Poster PDF.pdf

Poster

IMG_6237.MOV

Project Demo

The Team

Project Supervisor

Dr. Usama Ijaz Bajwa

Co-PI, Video Analytics Lab, National Centre in Big Data and Cloud Computing

HEC Approved PhD Supervisor

Tenured Associate Professor, Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Pakistan

www.usamaijaz.com

www.fit.edu.pk

Job Profile

Google Scholar Profile

LinkedIn Profile

Hammad Ali

Email: hammad.a22002@gmail.com

BSCS

(Computer Science, COMSATS Lahore)

LinkedIn Profile

GitHub Profile

Muhammad Omar Sarfraz

Email: omar786089@gmail.com

BSCS

(Computer Science, COMSATS Lahore)

LinkedIn Profile

GitHub Profile

Muhammad Ibrahim

Email: mibrahim37612@gmail.com

BSCS

(Computer Science, COMSATS Lahore)

LinkedIn Profile

GitHub Profile

References

[1] E. L. Piza, B. C. Welsh, D. P. Farrington and A. L. Thomas, “CCTV surveillance for crime prevention: A 40-year systematic review with meta-analysis,” Criminol. Public Policy, vol. 18, no. 1, pp. 135-159, 2019.

[2] A. Goel, C. Tung, Y.-H. Lu and G. K. Thiruvathukal, “A survey of methods for low-power deep learning and computer vision,” in IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, 2020.

[3] W. Sultani, C. Chen and M. Shah, “Real-world anomaly detection in surveillance videos,” in IEEE conference on computer vision and pattern recognition, 2018.

[4] D. Kondratyuk, L. Yuan, Y. Li, L. Zhang, M. Tan, M. Brown and B. Gong, “MoViNets: Mobile Video Networks for Efficient Video Recognition,” arXiv preprint arXiv:2103.11511, 2021.

[5] https://paperswithcode.com/sota/anomaly-detection-in-surveillance-videos-on
