Zeroth Order Optimization: Theory and Applications to Deep Learning

Conference: CVPR 2020

Date and Time: June 15th, 2020 1-4 pm (PST)

Tutorial Video Link: To be announced. Stay tuned!


Pin-Yu Chen

(IBM Research)

Sijia Liu

(IBM Research)

Description of the Tutorial

Zeroth-order (ZO) optimization is increasingly embraced for solving big data and machine learning (ML) problems when explicit expressions of the gradients are difficult to compute or infeasible to obtain. It achieves gradient-free optimization by approximating the full gradient with efficient estimators built only from function evaluations. Some recent important applications include: (1) generation of prediction-evasive, black-box adversarial attacks on deep neural networks, (2) generation of model-agnostic explanations from machine learning systems, (3) design of gradient or curvature regularized robust ML systems in a computationally efficient manner, (4) automated ML and meta learning, (5) online network management with limited computation capacity, (6) parameter inference of black-box/complex systems, and (7) bandit optimization in which a player receives partial feedback in terms of loss function values revealed by her adversary.
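As a rough illustration of the gradient-estimator idea, the Python sketch below builds a random-direction finite-difference estimate from function values only and plugs it into plain gradient descent. It is a minimal example under stated assumptions, not the specific algorithms covered in the tutorial: the names `zo_gradient`, `zo_gradient_descent`, and `quadratic`, the smoothing parameter `mu`, the step size, and the toy objective are all hypothetical choices made for illustration.

```python
import math
import random


def zo_gradient(f, x, mu=1e-4, num_queries=20, rng=None):
    """Averaged random-direction finite-difference estimate of grad f(x).

    Only function values are used: f(x) is queried once, and each random
    direction costs one additional query.
    """
    if rng is None:
        rng = random.Random(0)
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_queries):
        # Draw a direction uniformly from the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        u = [ui / norm for ui in u]
        # Forward difference of f along u.
        x_shift = [xi + mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_shift) - fx) / mu
        # The factor d compensates for E[u u^T] = I / d on the unit sphere.
        for i in range(d):
            grad[i] += d * slope * u[i] / num_queries
    return grad


def zo_gradient_descent(f, x0, step=0.05, iters=300, seed=0):
    """Gradient descent with the ZO estimate in place of the true gradient."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = zo_gradient(f, x, rng=rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy black-box objective (assumed for illustration); minimizer (1, -2, 3)."""
    target = (1.0, -2.0, 3.0)
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


x_final = zo_gradient_descent(quadratic, [0.0, 0.0, 0.0])
print(x_final)
```

Each iteration here spends `num_queries + 1` function evaluations, which is one way to see why ZO methods are usually compared by query complexity as well as iteration complexity.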

This tutorial aims to provide a comprehensive introduction to recent advances in ZO optimization methods in both theory and applications. On the theory side, we will cover convergence rate and iteration complexity analysis of ZO algorithms and make comparisons to their first-order counterparts. On the application side, we will highlight one appealing application of ZO optimization to study the robustness of deep neural networks and computer vision tasks (e.g. image classification, object detection, and image captioning) -- practical and efficient adversarial attacks that generate adversarial examples from a black-box ML model, and design of robust ML systems by leveraging ZO optimization. We will also summarize potential research directions intersecting ZO optimization and AI research, big data challenges, and some open-ended data mining and ML problems.
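To make the black-box attack setting concrete, here is a toy Python sketch of a score-based attack: the attacker can only query a model's output confidence and uses a sign-based ZO update to lower it within an L-infinity budget. Everything here is an assumption made for illustration, including the hidden logistic "model", the oracle `query_score`, the function `zo_sign_attack`, and all parameter values; it is not a reproduction of any particular attack from the tutorial.

```python
import math
import random

# Hidden parameters of a toy logistic classifier (hypothetical).
# The attack below never reads these directly; it only calls query_score.
_HIDDEN_W = [1.5, -2.0, 0.5, 1.0]
_HIDDEN_B = 0.1


def query_score(x):
    """Black-box oracle: returns only the model's confidence for class 1."""
    z = sum(w * xi for w, xi in zip(_HIDDEN_W, x)) + _HIDDEN_B
    return 1.0 / (1.0 + math.exp(-z))


def zo_sign_attack(x0, eps=0.5, step=0.05, iters=100, mu=1e-3,
                   num_queries=20, seed=0):
    """Lower the class-1 confidence using signed ZO gradient estimates."""
    rng = random.Random(seed)
    d = len(x0)
    x = list(x0)
    for _ in range(iters):
        base = query_score(x)
        grad = [0.0] * d
        for _ in range(num_queries):
            # Gaussian random direction and forward difference of the score.
            u = [rng.gauss(0.0, 1.0) for _ in range(d)]
            x_shift = [xi + mu * ui for xi, ui in zip(x, u)]
            slope = (query_score(x_shift) - base) / mu
            for i in range(d):
                grad[i] += slope * u[i]
        for i in range(d):
            # Signed descent step, then projection onto the L-infinity
            # ball of radius eps around the clean input x0.
            sign = (grad[i] > 0) - (grad[i] < 0)
            moved = x[i] - step * sign
            x[i] = min(max(moved, x0[i] - eps), x0[i] + eps)
    return x


x_clean = [0.4, -0.3, 0.2, 0.1]
x_adv = zo_sign_attack(x_clean)
print(query_score(x_clean), query_score(x_adv))
```

Only the sign of the estimated gradient is used, so the noisy magnitude of the estimate matters less; the projection step keeps the perturbation within the stated budget.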

Our tutorial covers both theoretical aspects and practical applications, which are of interest to researchers, students, developers, and practitioners.

Tutorial Outline

This tutorial will be divided into two parts, each taking 90-120 minutes.

I. First part: an introduction to zeroth-order (gradient-free) optimization: overview of recent advances in zeroth-order optimization.

    1. ZO algorithms: iteration complexity versus query complexity for both convex and nonconvex optimization

      • ZO-GD, ZO-SGD, and ZO-signSGD

      • Variance reduced ZO algorithms

      • ZO operator splitting method for smooth + nonsmooth composite optimization

      • ZO adaptive momentum methods

      • ZO min-max (robust) learning

      • ZO distributed optimization

    2. Applications in machine learning, data mining and signal processing

      • Generation of model-agnostic explanations

      • Automated ML and meta-learning

      • Recommendation system

      • Network resource management

      • Big data analysis in bioinformatics

II. Second part: ZO optimization for adversarial robustness in deep learning

    1. Brief introduction to adversarial machine learning and robustness

      • What is an adversarial example?

      • White-box vs black-box adversarial attacks and defenses

    2. ZO optimization and black-box attacks on deep neural networks

      • Connecting ZO algorithm to black-box adversarial attacks

      • Score-based black-box attacks

      • Decision-based black-box attacks

      • Performance comparisons of different ZO attack methods

      • Universal perturbation attacks

      • Data poisoning attacks

III. Concluding Remarks, Open Challenges, and Discussion

Presenters' Bios

Dr. Pin-Yu Chen is currently a research staff member at IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. He is also the chief scientist of RPI-IBM AI Research Collaboration and PI of ongoing MIT-IBM Watson AI Lab projects. Dr. Chen received his Ph.D. degree in electrical engineering and computer science and M.A. degree in Statistics from the University of Michigan, Ann Arbor, USA, in 2016. He received his M.S. degree in communication engineering from National Taiwan University, Taiwan, in 2011 and B.S. degree in electrical engineering and computer science (undergraduate honors program) from National Chiao Tung University, Taiwan, in 2009.

Dr. Chen’s recent research is on adversarial machine learning and robustness of neural networks. His long-term research vision is building trustworthy machine learning systems. He has published more than 25 papers on trustworthy machine learning at major AI and machine learning conferences and has co-organized workshops on adversarial learning for machine learning and data mining at venues such as KDD’19. His research interests also include graph and network data analytics and their applications to data mining, machine learning, signal processing, and cyber security. He was the recipient of the Chia-Lun Lo Fellowship from the University of Michigan, Ann Arbor. He received the NIPS 2017 Best Reviewer Award, and was also the recipient of the IEEE GLOBECOM 2010 GOLD Best Paper Award. Dr. Chen is currently on the editorial board of PLOS ONE.

At IBM Research, Dr. Chen has co-invented more than 20 U.S. patents. In 2019, he received two Outstanding Research Accomplishments on research in adversarial robustness and trusted AI, and one Research Accomplishment on research in graph learning and analysis.

Sijia Liu is a Research Staff Member at MIT-IBM Watson AI Lab, IBM Research. He received his Ph.D. degree (with All University Doctoral Prize) in Electrical and Computer Engineering from Syracuse University, Syracuse, NY, USA, in 2016. He was a Postdoctoral Research Fellow at the University of Michigan before joining IBM Research. His research interests include optimization theory, adversarial machine learning, deep learning, computer vision, network data analysis, and computational biology. He received the Best Student Paper Award at ICASSP'17. In 2019, he received the IBM Outstanding Innovation Award (OIA)/Outstanding Technical Achievement Award (OTAA) for contributions to Towards Automating AI Lifecycle with AutoAI. He also received IBM Outstanding Research Accomplishments on Trustworthy AI and Graph Learning and Analysis. He has co-chaired several workshops on adversarial machine learning and AI safety at GlobalSIP'18, KDD'19, and IBM AI Research Week'19.