Vision-based detection of road accidents from traffic surveillance video is a highly desirable but challenging task. The lack of a benchmark dataset of real traffic videos is a major bottleneck for research in this area. We collected a dataset of real accident videos from the CCTV surveillance network of Hyderabad City in India. The video clips from the city surveillance network are captured at 30 frames per second. Each clip starts a few minutes before the accident and continues for several minutes after it. The first few minutes of each clip, which show normal traffic, can be used for training the model, and the remainder for testing.
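The train/test split described above can be sketched as frame-index arithmetic. This is a minimal illustration, not the authors' procedure: the function name `split_frame_ranges`, the 3-minute normal segment, and the 18000-frame clip length are all assumptions chosen for the example. Only the 30 frames-per-second rate comes from the dataset description, and the actual length of the normal segment has to be determined per clip.

```python
FPS = 30  # frame rate stated for the dataset clips


def split_frame_ranges(total_frames, normal_minutes, fps=FPS):
    """Split a clip into (train, test) half-open frame index ranges.

    normal_minutes is the assumed length of the accident-free opening
    segment; it is a hypothetical parameter, not a value from the dataset.
    """
    if total_frames < 0 or normal_minutes < 0:
        raise ValueError("total_frames and normal_minutes must be non-negative")
    # Convert minutes to a frame index and clamp it to the clip length.
    boundary = min(int(round(normal_minutes * 60 * fps)), total_frames)
    return range(0, boundary), range(boundary, total_frames)


# Hypothetical 10-minute clip (18000 frames) with 3 normal minutes.
train, test = split_frame_ranges(total_frames=18000, normal_minutes=3)
print(len(train), len(test))  # 5400 12600
```

The ranges hold frame indices only, so they can be used with whichever video reader is preferred to select the frames for each split.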
Below are a few video clips from the dataset, along with frames that show the accident.
You can download the dataset from the following link:
https://www.iith.ac.in/vigil/resources.html
Or write to cs14resch1100 [at] iith [dot] ac [dot] in
Please cite the paper below if you use this dataset.
Dinesh Singh and C. Krishna Mohan, "Deep Spatio-Temporal Representation for Detection of Road Accident using Stacked Autoencoder," IEEE Transactions on Intelligent Transportation Systems (T-ITS), May 2018. doi: 10.1109/TITS.2018.2835308.
BibTeX
@Article{Singh2018tits,
Title = {{Deep Spatio-Temporal Representation for Detection of Road Accident using Stacked Autoencoder}},
Author = {Dinesh Singh and C. Krishna Mohan},
Journal = {{IEEE Transactions on Intelligent Transportation Systems (T-ITS)}},
Year = {2018},
Number = {},
Pages = {},
Volume = {},
DOI = {10.1109/TITS.2018.2835308}
}