Adversarial Machine Learning for Good

Tutorial Slides: click here to download

Presenter: Pin-Yu Chen (IBM Research)

Conference: AAAI 2022

Date and Time: February 23rd, 2022 (12:00 PM – 2:00 PM PST)

Description of the Tutorial

Adversarial machine learning (AdvML) is one of the most rapidly growing research fields in machine learning (ML) and artificial intelligence (AI). It studies the adversarial robustness of state-of-the-art ML models such as neural networks, and spans attacks that identify limitations of current ML systems, defenses that strengthen model performance against various adversarial threats, and verification tools that quantify the level of robustness for different applications.

Beyond the recent advances in AdvML, this tutorial aims to provide fresh perspectives on “what’s next in AdvML”, i.e., adversarial machine learning for good. The phrase “for good” has a twofold meaning: novel innovations and sustainability. First, this tutorial will introduce emerging applications that leverage the lessons from AdvML to benefit mainstream ML tasks, which differ from the original objective of evaluating and improving adversarial robustness. Examples include (i) generating contrastive explanations and counterfactual examples; (ii) model reprogramming for data-efficient transfer learning; (iii) model watermarking and fingerprinting for AI governance and ownership regulation; and (iv) data cloaking for enhanced privacy. Second, with the number of submissions related to adversarial robustness growing explosively every year, this tutorial aims to discuss the sustainability of this young research field towards continuous and organic growth, in terms of research norms and ethics, current trends, open challenges, and future directions. The target audience is ML/AI researchers who are familiar with AdvML, as well as researchers who are interested in entering this field. The speaker will also share his thoughts from industrial practice.

Tutorial Outline

Unlike conventional tutorials on adversarial machine learning (AdvML) that focus on adversarial attacks, defenses, or verification methods, this tutorial aims to provide a fresh overview of how the same techniques can be used in entirely different ways to benefit mainstream machine learning tasks and to facilitate sustainable growth in this research field. This tutorial will start by reviewing the recent advances in AdvML and then delve into novel innovations in other domains (beyond adversarial robustness) inspired by AdvML. In particular, we will cover several noteworthy innovations proposed in recent years and relate their success to AdvML, including (i) generation of contrastive explanations and counterfactual examples; (ii) model reprogramming for data-efficient transfer learning; (iii) model watermarking and fingerprinting for AI governance and ownership regulation; (iv) data cloaking for enhanced privacy; and (v) data augmentation for improving model generalization. Finally, this tutorial will discuss the sustainability of this research field towards continuous and organic growth, in terms of research norms and ethics, current trends, open challenges, and future directions.

The outline of the tutorial is as follows.

Current Trend: Introduction to recent advances in adversarial machine learning (attacks, defenses, verification)

Novel Innovations inspired by AdvML

  • Generation of contrastive explanations and counterfactual examples (Dhurandhar et al. 2018; Luss et al. 2021)

  • Model reprogramming for data-efficient transfer learning (Tsai, Chen, and Ho 2020; Yang, Tsai, and Chen 2021; Vinod, Chen, and Das 2020)

  • Model watermarking and fingerprinting for AI governance and ownership regulation (Aramoon, Chen, and Qu 2021; Wang et al. 2021)

  • Data cloaking for enhanced privacy and data security (Shan et al. 2020; Sablayrolles et al. 2020)

  • Data augmentation for improving model generalization (Hsu et al. 2022)

  • Semantic shift detection in word embeddings (Gruppi, Adali, and Chen 2021)

  • Robust Text CAPTCHAs (Shao et al. 2021)

  • Molecule optimization (Hoffman et al. 2022)

Sustainability of research in AdvML

Concluding Remarks and Q&A

Presenter's Bio: <link>

Dr. Pin-Yu Chen is currently a research staff member at IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. He is also the chief scientist of RPI-IBM AI Research Collaboration and PI of ongoing MIT-IBM Watson AI Lab projects. Dr. Chen received his Ph.D. degree in electrical engineering and computer science and M.A. degree in Statistics from the University of Michigan, Ann Arbor, USA, in 2016. He received his M.S. degree in communication engineering from National Taiwan University, Taiwan, in 2011 and B.S. degree in electrical engineering and computer science (undergraduate honors program) from National Chiao Tung University, Taiwan, in 2009.

Dr. Chen’s recent research focuses on adversarial machine learning and robustness of neural networks. His long-term research vision is building trustworthy machine learning systems. He has published more than 40 papers related to trustworthy machine learning at major AI and machine learning conferences, given tutorials at AAAI’22, IJCAI’21, CVPR(’20,’21), ECCV’20, ICASSP’20, KDD’19, and Big Data’18, and organized several workshops on adversarial machine learning. His research interests also include graph and network data analytics and their applications to data mining, machine learning, signal processing, and cyber security. He was the recipient of the Chia-Lun Lo Fellowship from the University of Michigan, Ann Arbor. He received a NeurIPS 2017 Best Reviewer Award and was also the recipient of the IEEE GLOBECOM 2010 GOLD Best Paper Award. Dr. Chen is currently on the editorial board of PLOS ONE.

At IBM Research, Dr. Chen has co-invented more than 30 U.S. patents and received the honor of IBM Master Inventor. In 2021, he received an IBM Corporate Technical Award. In 2020, he received an IBM Research special division award for research related to COVID-19. In 2019, he received two Outstanding Research Accomplishment awards for research in adversarial robustness and trusted AI. He is also the recipient of several Research Accomplishment awards at IBM Research.

References


  • Aramoon, O.; Chen, P.-Y.; and Qu, G. 2021. Don’t Forget to Sign the Gradients! Proceedings of Machine Learning and Systems, 3.

  • Dhurandhar, A.; Chen, P.-Y.; Luss, R.; Tu, C.-C.; Ting, P.; Shanmugam, K.; and Das, P. 2018. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. Neural Information Processing Systems.

  • Gruppi, M.; Adalı, S.; and Chen, P.-Y. 2021. Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks. Proceedings of the AAAI Conference on Artificial Intelligence.

  • Hsu, C.-Y.; Chen, P.-Y.; Lu, S.; Liu, S.; and Yu, C.-M. 2022. Adversarial Examples can be Effective Data Augmentation for Unsupervised Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence.

  • Luss, R.; Chen, P.-Y.; Dhurandhar, A.; Sattigeri, P.; Zhang, Y.; Shanmugam, K.; and Tu, C.-C. 2021. Leveraging Latent Features for Local Explanations. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1139–1149.

  • Sablayrolles, A.; Douze, M.; Schmid, C.; and Jégou, H. 2020. Radioactive data: tracing through training. In International Conference on Machine Learning, 8326–8335. PMLR.

  • Shan, S.; Wenger, E.; Zhang, J.; Li, H.; Zheng, H.; and Zhao, B. Y. 2020. Fawkes: Protecting privacy against unauthorized deep learning models. In 29th USENIX Security Symposium (USENIX Security 20), 1589–1604.

  • Shao, R.; Shi, Z.; Yi, J.; Chen, P.-Y.; and Hsieh, C.-J. 2021. Robust Text CAPTCHAs Using Adversarial Examples. arXiv preprint arXiv:2101.02483.

  • Tsai, Y.-Y.; Chen, P.-Y.; and Ho, T.-Y. 2020. Transfer learning without knowing: Reprogramming black-box machine learning models with scarce data and limited resources. In International Conference on Machine Learning, 9614–9624.

  • Vinod, R.; Chen, P.-Y.; and Das, P. 2020. Reprogramming Language Models for Molecular Representation Learning. arXiv preprint arXiv:2012.03460.

  • Wang, S.; Wang, X.; Chen, P.-Y.; Zhao, P.; and Lin, X. 2021. Characteristic Examples: High-Robustness, Low-Transferability Fingerprinting of Neural Networks. IJCAI.

  • Yang, C.-H. H.; Tsai, Y.-Y.; and Chen, P.-Y. 2021. Voice2Series: Reprogramming Acoustic Models for Time Series Classification. In International Conference on Machine Learning.

  • Hoffman, S. C.; Chenthamarakshan, V.; Wadhawan, K.; Chen, P.-Y.; and Das, P. 2022. Optimizing molecules using efficient queries from property evaluations. Nature Machine Intelligence, 4(1): 21–31.