Practical Adversarial Robustness in Deep Learning: Problems and Solutions
By Pin-Yu Chen (IBM Research) & Sayak Paul (Carted)
CVPR 2021 | June 20, 2021 | GitHub Repository | Slides
Abstract
Deep learning has brought tremendous achievements in fields such as computer vision and natural language processing. In spite of this remarkable success, modern deep learning systems are still vulnerable to adversaries. Consider computer vision as an example. Take an image of a bagel (X). A deep learning-based image classifier successfully recognizes X as a bagel. Now consider another instance of the same image that is only slightly perturbed. To the human eye it is still a bagel, yet the same classifier may confidently label it a grand piano.
The real-world implications of such failures in mission-critical systems, including transportation, autonomous vehicles, and healthcare, can be severe.
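To make the bagel example concrete, the snippet below is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest ways to craft such a perturbation. It assumes a Keras classifier `model` with softmax outputs, inputs scaled to [0, 1], and an illustrative budget `epsilon`; none of these names come from the tutorial code.

```python
import tensorflow as tf

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Craft a slightly perturbed copy of `image` that can flip the prediction.

    Assumes `model` is a Keras classifier with softmax outputs and that
    pixel values lie in [0, 1]; `epsilon` bounds the l-infinity size of
    the perturbation. Illustrative sketch, not the tutorial's code.
    """
    image = tf.convert_to_tensor(image, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    # Move each pixel in the direction that increases the loss the most.
    gradient = tape.gradient(loss, image)
    adversarial = image + epsilon * tf.sign(gradient)
    # Clip back to the valid pixel range so the result is still a plausible image.
    return tf.clip_by_value(adversarial, 0.0, 1.0)
```

Even a single-step perturbation like this is often enough to change the prediction of an undefended model while remaining imperceptible to humans.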
The focus of this tutorial is not just to survey different attack types, but also to show how to mount them in practice, to test the susceptibility of various state-of-the-art pre-trained models and optimizers to these attacks, and then to apply the latest techniques from adversarial learning to defend against them.

This tutorial also aims to provide a holistic and complementary overview of how the same adversarial technique can be used in very different ways, for good and (unintentionally) for bad, so that AI researchers and developers can gain a fresh perspective and reflect on the induced impacts and responsibilities. To give concrete examples: generative adversarial networks (GANs) are capable of generating photo-realistic synthetic images, but the same techniques can be repurposed into troublesome tools such as deepfakes. Conversely, adversarial attacks that cause prediction evasion are often associated with troublemakers or security breaches, but the same techniques are also used to improve model robustness and to enable novel applications such as adversarial training, privacy-enhanced training, data augmentation, watermarking, and integrity testing, to name a few.
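As a taste of the defense side mentioned above, the sketch below shows how adversarial training can be expressed as an adversarial regularization term added to the standard training loss, using Neural Structured Learning (one of the libraries listed under Tutorial Materials). The architecture, the key names (`image`, `label`), and the hyperparameters (`multiplier`, `adv_step_size`) are illustrative assumptions rather than the tutorial's exact recipe.

```python
import neural_structured_learning as nsl
import tensorflow as tf

# A plain Keras classifier (illustrative architecture, not the tutorial's model).
inputs = tf.keras.Input(shape=(28, 28, 1), name="image")
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(128, activation="relu")(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
base_model = tf.keras.Model(inputs, outputs)

# Adversarial regularization: an extra loss term computed on perturbed inputs.
adv_config = nsl.configs.make_adv_reg_config(multiplier=0.2, adv_step_size=0.05)
adv_model = nsl.keras.AdversarialRegularization(
    base_model, label_keys=["label"], adv_config=adv_config)

adv_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# NSL expects features and labels packed into a single dictionary, e.g.
# adv_model.fit({"image": x_train, "label": y_train}, batch_size=32, epochs=5)
# where x_train and y_train are assumed training arrays.
```

The wrapper generates adversarially perturbed copies of each batch on the fly and adds their loss, weighted by `multiplier`, to the standard loss, which is the "adversarial regularization loss with empirical risk minimization" item in the outline below.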
Outline
This tutorial is structured in the following way:
The upside of the AI coin
Recent advances in AI technology
A brief introduction to deep learning and notable applications in computer vision
AI lifecycle and industrial use cases
The downside of the AI coin
Examples of misuse of AI
Examples of using AI for malicious purposes
AI ethics and the induced costs
Adversarial AI
A holistic view of vulnerabilities in AI/ML systems
Risks including privacy, model theft, integrity, security, and robustness
The fundamental premise of adversarial perturbations and the importance of dealing with them
Types of adversarial examples and perturbations
Natural
Synthetic
We will mainly cover ℓp attacks (see the Foolbox sketch after this outline).
Susceptibility of different optimizers to adversarial perturbations
Adversarial training and defenses
Fundamentals of adversarial training
Adversarial regularization loss with empirical risk minimization
Adversarial training as a way to combat overfitting
Byproducts of adversarial training
Defense with certified robustness
Interpreting adversarial examples
Visualizations with GradCAM
Feature denoising
Recipes that show promise for adversarial training
For natural adversarial examples
Self-attention
Model capacity
For synthetically generated adversarial examples
Noisy student training
Smooth adversarial training
Repurposing adversarial AI for good
Watermarking, neural fingerprinting, and integrity testing
Privacy-enhanced machine learning
Conclusion and Q&A
Concluding remarks
Available resources for research in adversarial AI
Research challenges and open-ended questions
Rethinking and reinforcing ethics in AI research
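Referencing the ℓp-attacks item above, the following is a small sketch of what evaluating robust accuracy under an ℓ∞-bounded PGD attack can look like with Foolbox. The trained Keras classifier `model`, the batched `images` and `labels`, the input bounds, and the epsilon grid are all assumptions for illustration, not the settings used in the tutorial notebooks.

```python
import foolbox as fb
import numpy as np

# Wrap a trained Keras classifier so Foolbox knows the valid input range.
# `model`, `images`, and `labels` are assumed to exist (batched tensors).
fmodel = fb.TensorFlowModel(model, bounds=(0.0, 1.0))

# Projected gradient descent, bounded in the l-infinity norm.
attack = fb.attacks.LinfPGD()

# Try several perturbation budgets at once.
epsilons = [0.001, 0.01, 0.03, 0.1]
raw_advs, clipped_advs, success = attack(fmodel, images, labels, epsilons=epsilons)

# success[i, j] is True when the attack with budget epsilons[i] fooled the
# model on example j; robust accuracy is one minus the success rate.
success_np = np.asarray(success)
robust_accuracy = 1.0 - success_np.mean(axis=-1)
for eps, acc in zip(epsilons, robust_accuracy):
    print(f"l-inf eps = {eps}: robust accuracy = {acc:.3f}")
```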
Tutorial Materials
Demonstration code and the slides are available in our GitHub repository. We use the following libraries in the code:
TensorFlow
Keras
Neural Structured Learning
Foolbox