A Large-Scale Analysis on Robustness in Action Recognition
In this work, we perform a large-scale robustness analysis of existing CNN- and transformer-based video models for action recognition. We focus on simulating real-world perturbations, rather than adversarial attacks, using four benchmark datasets: HMDB-51, UCF-101, Kinetics-400, and Something-Something v2. We evaluate six state-of-the-art action recognition models against a total of 90 visual perturbations.
We hope this study will serve as a benchmark and guide future research in robust action-recognition learning.
Performance versus robustness of action recognition models on UCF-101P. The y-axis shows relative robustness γr (higher is better); the x-axis shows accuracy on clean videos. Model names appended with P indicate a pre-trained version of the model, and circle size indicates FLOPs.
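A minimal sketch of how a relative robustness score like γr can be computed, assuming it is defined as one minus the relative accuracy drop between clean and perturbed videos (the exact definition used in the paper may differ):

```python
def relative_robustness(clean_acc: float, perturbed_acc: float) -> float:
    """Relative robustness: 1 - (A_clean - A_perturbed) / A_clean.

    Equals 1.0 when perturbations cause no accuracy drop and
    decreases as the drop grows (assumed formulation).
    """
    return 1.0 - (clean_acc - perturbed_acc) / clean_acc


# Example: a model at 80% clean accuracy that falls to 40% under
# perturbation would score 0.5 on this measure.
score = relative_robustness(0.80, 0.40)
```

Under this formulation, a higher score means the model retains more of its clean-video accuracy when perturbed.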
We split visual perturbations into five categories: Noise, Camera Motion, Digital, Temporal, and Blur. Each perturbation is applied at severity levels ranging from 1 to 5.
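To illustrate how a severity-controlled perturbation from the Noise category might be applied to a video frame, here is a hedged sketch of Gaussian noise with an assumed, hypothetical sigma schedule; the benchmark's actual perturbation implementations and severity parameters may differ:

```python
import numpy as np

def gaussian_noise(frame: np.ndarray, severity: int = 1) -> np.ndarray:
    """Apply Gaussian noise to an HxWxC uint8 frame at severity 1-5.

    The sigma schedule below is illustrative only, not the paper's.
    """
    sigmas = [0.04, 0.08, 0.12, 0.18, 0.26]  # hypothetical severity schedule
    x = frame.astype(np.float32) / 255.0
    noisy = x + np.random.normal(0.0, sigmas[severity - 1], size=x.shape)
    # Clip back to valid pixel range and restore uint8 dtype.
    return (np.clip(noisy, 0.0, 1.0) * 255.0).astype(np.uint8)
```

For a video, the same perturbation would typically be applied frame by frame (or, for Temporal perturbations, across the frame ordering itself).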
Mean action recognition accuracy across all models for each perturbed dataset.
Madeline Chantry Schiappa¹*, Naman Biyani²*, Prudvi Kamtam¹, Shruti Vyas¹, Hamid Palangi³, Vibhav Vineet³, Yogesh S. Rawat¹
CRCV, UCF¹, IIT Kanpur², Microsoft Research³
@inproceedings{robustness2022large,
title={Large-scale Robustness Analysis of Video Action Recognition Models},
author={Schiappa, Madeline C and Biyani, Naman and Kamtam, Prudvi and Vyas, Shruti and Palangi, Hamid and Vineet, Vibhav and Rawat, Yogesh},
booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023}
}