Attack

Team: Shriya Dale, Anastasia Gukasova, Nadia Jason, Mohnish Sonsare and Hailey Tien

Faculty: Soheil Feizi and Mazda Moayeri

AI4ALL Facilitators: Priscilla Wu and Maryann Vazhapilly

Project Overview:

Deep learning has been an area of constant research, especially over the past ten years. It is now used in self-driving vehicles and various other high-risk domains. However, even a trained model with extremely high accuracy can be vulnerable to adversarial inputs: inputs that have been slightly modified by carefully chosen noise. This poses serious risks for society, as a car could misread a stop sign with just a sticker placed on it. Studying adversarial examples and how to train models to resist them was the primary focus of this project.
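To make "slightly modified by carefully chosen noise" concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), a standard way to craft such inputs. This is an illustrative example, not the project's exact code: it uses a tiny linear classifier on random CIFAR10-sized tensors rather than a trained network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """One-step attack: nudge each pixel by eps in the direction
    that increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Move along the sign of the input gradient, then keep pixels valid.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Toy demo: random CIFAR10-sized inputs and a tiny linear classifier.
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm_attack(model, x, y, eps=8 / 255)
```

Even though each pixel changes by at most 8/255 (imperceptible to a human), the loss on the perturbed batch is at least as high as on the clean batch.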


Project Question:

How can neural networks be trained to withstand adversarial attacks effectively? What are the advantages of an adversarially robust network in terms of robust accuracy and perceptually aligned gradients?

Action Steps:

The team first built a classifier for CIFAR10, an image dataset. This neural network was trained alongside a pre-existing architecture, the ResNet18 model in PyTorch. After the model was trained and tested with high accuracy, an adversarial attack was implemented, which reduced the model's accuracy drastically. The same model was then adversarially trained by adding adversarial images to the training dataset. The accuracy of the adversarially trained model was then evaluated using the robustness package developed by the MIT Madry Lab.
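The adversarial training step described above can be sketched as follows. This is a minimal FGSM-based sketch under assumed hyperparameters, not the team's exact code; real training would use the CIFAR10 data loader and the ResNet18 model rather than the toy classifier and random data shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def adv_train_step(model, opt, x, y, eps=8 / 255):
    """One adversarial training step: craft adversarial examples from
    the current batch, then update the model on the adversarial loss."""
    # 1. Generate FGSM adversarial examples against the current model.
    x_req = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_req), y).backward()
    x_adv = (x_req + eps * x_req.grad.sign()).clamp(0, 1).detach()
    # 2. Train on the adversarial batch instead of the clean one.
    opt.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    opt.step()
    return loss.item()

# Toy demo: a few steps on random data with a tiny classifier.
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.rand(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
losses = [adv_train_step(model, opt, x, y) for _ in range(20)]
```

Because each step regenerates the attack against the current weights, the model learns to classify inputs correctly even under perturbation, which is the core idea behind adversarial training.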

Deliverables:

Students built a convolutional neural network for the CIFAR10 dataset, implemented an adversarial attack, and performed adversarial training. They also created novel images using the adversarially trained network and the adversarial attack.

Results:

As seen in the graph, the robust model holds up much better under attack. Robust accuracy declines sharply for the standard model, making it a poor choice when adversarial inputs are a concern.

Novel images were created using targeted adversarial attacks: each image was transformed toward its target class.
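Transforming an image toward a target class amounts to a targeted attack: iteratively nudging the input so the model's loss with respect to a chosen class decreases. Below is a minimal projected-gradient sketch with assumed hyperparameters and a toy model; the project itself used the Madry Lab robustness package, and a robust model is what makes the resulting images visually meaningful.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def targeted_attack(model, x, target, eps=0.5, steps=20, step_size=0.05):
    """Iteratively move x toward `target` class, staying within an
    L-infinity ball of radius eps around the original image."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Descend the loss toward the target class, then project back
        # into the eps-ball and the valid pixel range.
        x_adv = (x_adv - step_size * grad.sign()).detach()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

# Toy demo: push two random images toward (hypothetical) classes 3 and 7.
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(2, 3, 32, 32)
target = torch.tensor([3, 7])
x_adv = targeted_attack(model, x, target)
```

The only difference from the attack used to fool the model is the sign of the update: here the loss toward the target class is minimized rather than maximized.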

Attack on Titans

Thank you for going through our presentation! Please fill out our feedback form:
