Yaoqing Yang
Assistant Professor
Department of CS
Dartmouth College
15 Thayer Drive
Hanover, NH 03755-4404
Yaoqing.Yang AT dartmouth.edu
I am interested in making ML more reliable and transparent. For example, I use theory and large-scale empirical analyses to study effective and robust "generalization metrics" that can dissect ML models and make them more transparent and analyzable. As another example, I study reliability issues in distributed systems for ML and design coding-theoretic approaches to address them. I also apply these studies to practical data analysis problems, such as those involving 3D point clouds and graphs.
Feel free to drop me an email if you are interested in working with me. Please apply to our PhD program using the link below.
More information about me.
Postdoc, RISE Lab, EECS, UC Berkeley.
PhD, ECE, CMU.
BS, EE, Tsinghua.
Google Scholar | CV | LinkedIn
News
I will serve as a PC member @ IJCAI 2023.
I am teaching CS078/278 Deep Learning this term at Dartmouth.
One paper "two sides of the same coin: heterophily and oversmoothing in graph convolutional neural networks" is accepted by ICDM 2022.
One paper "neurotoxin: durable backdoors in federated learning" is accepted by ICML 2022.
Our paper "self-supervised spatial reasoning on multi-view line drawings" is accepted by CVPR 2022.
Our paper "evaluating natural language processing models with generalization metrics that do not need access to any training or testing data" is online.
Our paper on "augmentations in graph contrastive learning: current methodological flaws & towards better practices" is accepted by WWW 2022.
Our paper on "taxonomizing local versus global structure in neural network loss landscapes" is accepted by NeurIPS 2021. Welcome to check our video and code.
Our paper on "improving semi-supervised federated learning by reducing the gradient diversity of models" is accepted by IEEE BigData 2021.
Dominic has earned his master's degree from UC Berkeley. Congratulations! His thesis focuses on boundary thickness and boundary tilting, and how they can help reveal "backdoors" in a neural network.
Our paper on "boundary thickness and robustness in learning models" is accepted by NeurIPS 2020.
Our paper on "serverless straggler mitigation using local error-correcting codes" is selected in the best paper finalists in ICDCS 2020!
Our paper on applying coded computing techniques to non-von Neumann computing architectures is published in the Proceedings of IEEE.
Our work FoldingNet is selected as a spotlight talk in CVPR! Welcome to check our video.
Selected publications
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney
Preprint
Summary: We provide the first large-scale correlational study of generalization measures for natural language processing models. The paper focuses on measures derived from heavy-tailed self-regularization (HT-SR) theory, which can be computed without access to any training or testing data. We also show that these measures can perform uniformly better than existing norm-based measures when the goal is to predict test-time performance rather than the "generalization gap", i.e., the difference between training and test accuracies. We use the WeightWatcher toolbox to compute the HT-SR measures.
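As a rough illustration (not the paper's exact pipeline), HT-SR metrics such as the fitted power-law exponent "alpha" of each layer's weight spectrum can be computed from trained weights alone with the open-source WeightWatcher package. The Python sketch below uses a Hugging Face model purely for concreteness; option names may differ across WeightWatcher versions.

# Minimal sketch: data-free HT-SR metrics from trained weights only.
# The model choice ("distilbert-base-uncased") is illustrative.
import weightwatcher as ww
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased")
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()             # per-layer spectral metrics, e.g., the power-law exponent alpha
summary = watcher.get_summary(details)  # model-level averages of those metrics
print(summary)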
Two sides of the same coin: Heterophily and oversmoothing in graph convolutional neural networks
Yujun Yan, Milad Hashemi, Kevin Swersky, Yaoqing Yang, Danai Koutra
ICDM 2022
Summary: Graph convolutional neural networks may perform worse as the number of layers increases (the oversmoothing problem) and when they are fed heterophilous graphs (the heterophily problem). In this work, we show theoretically and empirically that these two seemingly unrelated problems are closely related.
Neurotoxin: Durable backdoors in federated learning
Zhengming Zhang*, Ashwinee Panda*, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Prateek Mittal, Kannan Ramchandran, Joseph E. Gonzalez
ICML 2022
Summary: We propose Neurotoxin, a simple one-line modification to existing backdoor attacks in federated learning. Our attack can double the durability of state-of-the-art backdoors.
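For intuition only, the core idea is to keep the attacker's update away from the coordinates that benign clients update most heavily. The NumPy sketch below is a hedged illustration of such a projection, with illustrative parameter names; it is not the paper's released code.

import numpy as np

def project_away_from_benign(malicious_grad, benign_grad, top_frac=0.1):
    # Hedged sketch of a Neurotoxin-style projection: zero out the attacker's
    # update on the top fraction of coordinates (by magnitude) of an estimated
    # benign gradient, so the backdoor lives in rarely updated weights.
    k = max(1, int(top_frac * benign_grad.size))
    top_idx = np.argpartition(np.abs(benign_grad).ravel(), -k)[-k:]
    masked = malicious_grad.ravel().copy()
    masked[top_idx] = 0.0
    return masked.reshape(malicious_grad.shape)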
Self-supervised spatial reasoning on multi-view line drawings
Siyuan Xiang*, Anbang Yang*, Yanfei Xue, Yaoqing Yang, Chen Feng
CVPR 2022
Summary: This paper studies self-supervised learning algorithms that can perform "spatial reasoning" tasks from multi-view images of line drawings. Our algorithms significantly exceed state-of-the-art performance on the recently proposed SPARE3D dataset.
Full paper | Website | Code
Taxonomizing local versus global structure in neural network loss landscapes
Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney
NeurIPS 2021
Summary: This paper provides experimental evidence for the long-standing conjecture that "local properties" of a loss landscape alone cannot dictate generalization. The study taxonomizes learning problems into "phases" by analyzing various generalization metrics obtained from the loss landscapes of neural networks, and it provides a principled way to divide and conquer the typical failure modes of learning in the different phases.
Full paper | Code | Video
Improving semi-supervised federated learning by reducing the gradient diversity of models
Zhengming Zhang*, Yaoqing Yang*, Zhewei Yao*, Yujun Yan, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney
IEEE BigData 2021
Summary: Cell phone users who participate in federated learning often do not have the time to label their private data, making semi-supervised learning a practical alternative. This paper shows that the large dissimilarity between model gradients from different users can arise from this partially labeled data and become an obstacle to semi-supervised federated learning.
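For reference, one standard notion of gradient diversity (in the sense of Yin et al.) is the ratio between the sum of squared per-user gradient norms and the squared norm of their sum; the Python sketch below uses that definition as a stand-in, and the paper's exact measure may differ.

import numpy as np

def gradient_diversity(user_grads):
    # Hedged sketch: sum_i ||g_i||^2 / ||sum_i g_i||^2.
    # Larger values mean the per-user gradients point in more dissimilar directions.
    grads = [np.ravel(g) for g in user_grads]
    total = np.sum(grads, axis=0)
    return sum(g @ g for g in grads) / (total @ total)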
A Dataset-dispersion Perspective on Reconstruction versus Recognition in Single-view 3D Reconstruction Networks
Yefan Zhou, Yiru Shen, Yujun Yan, Chen Feng, Yaoqing Yang
3DV 2021
Summary: A single-view reconstruction (SVR) model can be biased toward recognition (classification-based) or reconstruction, depending on how dispersed the training data is. In this paper, we propose the "dispersion score", a data-driven metric that measures the tendency of SVR models to perform recognition or reconstruction. It can also be used to diagnose problems in the training data and to guide the design of data augmentation schemes.
Full paper | Code | Video
Effect of Model Size on Worst-Group Generalization
Alan Pham*, Eunice Chan*, Vikranth Srivatsa*, Dhruba Ghosh*, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez*, Jacob Steinhardt*
Preliminary version accepted by NeurIPS DistShift Workshop 2021
Summary: Prior work has suggested that overparameterization can hurt test accuracy on rare subgroups. Motivated by the fact that subgroup information is often unknown, we investigate the effect of model size on worst-group generalization under empirical risk minimization (ERM). Our systematic evaluation reveals that increasing model size does not hurt, and may even improve, worst-group test performance under ERM.
Boundary thickness and robustness in learning models
Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney
NeurIPS 2020
Summary: This paper introduces the notion of "boundary thickness" and shows that thin decision boundaries lead to overfitting (measured, e.g., by the robust generalization gap between training and testing) and to lower robustness. Also, see Dominic's thesis for how we use boundary thickness to reveal "backdoors" hidden in a neural network.
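As a simplified illustration of the measure, the PyTorch sketch below gives a Monte Carlo estimate for a single pair of inputs (e.g., a clean example and an adversarial counterpart); the interval endpoints and other parameter choices are illustrative, not the paper's exact settings.

import torch
import torch.nn.functional as F

def boundary_thickness_pair(model, x_r, x_s, cls_i, cls_j,
                            alpha=0.0, beta=0.75, num_points=128):
    # Hedged sketch: measure how much of the segment from x_r to x_s lies in the
    # region where the softmax gap between classes cls_i and cls_j falls in
    # (alpha, beta), scaled by the segment length.
    ts = torch.linspace(0.0, 1.0, num_points)
    xs = torch.stack([(1 - t) * x_r + t * x_s for t in ts])
    with torch.no_grad():
        probs = F.softmax(model(xs), dim=1)
    gap = probs[:, cls_i] - probs[:, cls_j]
    inside = ((gap > alpha) & (gap < beta)).float().mean()
    return (x_r - x_s).flatten().norm() * inside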
Foldingnet: Point cloud auto-encoder via deep grid deformation
Yaoqing Yang, Chen Feng, Yiru Shen, Dong Tian
CVPR 2018
Summary: In this work, a novel auto-encoder is proposed to address the challenge of unsupervised learning on point clouds. A novel folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud. The proposed decoder is proven, in theory, to be a generic architecture that can reconstruct an arbitrary point cloud from a 2D grid.
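As a rough sketch of the folding idea: the decoder concatenates the encoder's codeword with fixed 2D grid coordinates and applies two successive shared MLPs, "folding" the grid onto the object surface. Layer widths and grid size below are illustrative, not the paper's exact architecture.

import torch
import torch.nn as nn

class FoldingDecoder(nn.Module):
    # Hedged sketch of a folding-based decoder: a fixed 2D grid, conditioned on
    # the codeword, is deformed by two successive MLP "folds" into a 3D point cloud.
    def __init__(self, code_dim=512, grid_size=45):
        super().__init__()
        xs = torch.linspace(-0.3, 0.3, grid_size)
        self.register_buffer("grid", torch.cartesian_prod(xs, xs))  # (grid_size^2, 2)
        self.fold1 = nn.Sequential(nn.Linear(code_dim + 2, 512), nn.ReLU(),
                                   nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 3))
        self.fold2 = nn.Sequential(nn.Linear(code_dim + 3, 512), nn.ReLU(),
                                   nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 3))

    def forward(self, codeword):                          # codeword: (B, code_dim)
        B, M = codeword.shape[0], self.grid.shape[0]
        code = codeword.unsqueeze(1).expand(B, M, -1)
        grid = self.grid.unsqueeze(0).expand(B, M, -1)
        pts = self.fold1(torch.cat([code, grid], dim=-1))  # first fold: grid -> 3D
        pts = self.fold2(torch.cat([code, pts], dim=-1))   # second fold: refine the surface
        return pts                                         # (B, grid_size^2, 3)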
Mining point cloud local structures by kernel correlation and graph pooling
Yiru Shen*, Chen Feng*, Yaoqing Yang, Dong Tian
CVPR 2018
Summary: Existing ML models on point clouds do not take full advantage of a point’s local neighborhood, which contains fine-grained structural information. In this paper, we present novel operations that exploit local structures in a point cloud.
Serverless straggler mitigation using local error-correcting codes
Vipul Gupta*, Dominic Carrano*, Yaoqing Yang, Vaishaal Shankar, Thomas Courtade, Kannan Ramchandran
ICDCS 2020
Best Paper Finalist
Summary: Inexpensive cloud services, such as serverless computing, are often vulnerable to straggling nodes that increase end-to-end latency. We propose and implement simple yet principled coding approaches for straggler mitigation.
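As a generic illustration of the coded-computing idea behind straggler mitigation (a toy single-parity code, not the paper's local error-correcting codes): encode the row-blocks of A so that a matrix-vector product survives a straggling worker.

import numpy as np

# Toy sketch: three workers compute block products of A @ x; one extra
# "parity" block lets us decode the full result if any single worker straggles.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
x = rng.standard_normal(6)

A1, A2 = A[:2], A[2:]
blocks = [A1, A2, A1 + A2]             # third worker holds the parity block
results = [B @ x for B in blocks]      # per-worker computations

straggler = 1                          # suppose worker 1 never returns
if straggler == 0:
    y = np.concatenate([results[2] - results[1], results[1]])
elif straggler == 1:
    y = np.concatenate([results[0], results[2] - results[0]])
else:
    y = np.concatenate([results[0], results[1]])

assert np.allclose(y, A @ x)           # decoded result equals the uncoded product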
Coded elastic computing
Yaoqing Yang, Matteo Interlandi, Pulkit Grover, Soummya Kar, Saeed Amizadeh, Markus Weimer
ISIT 2019
Summary: Cloud providers have recently introduced new offerings whereby spare computing resources are accessible at a discount compared to on-demand computing. Exploiting this opportunity is challenging because these resources are accessed with low priority and can elastically leave (through preemption) and join the computation at any time. This paper designs a new technique, coded elastic computing, that enables distributed computations over such elastic resources.
Coded iterative computing using substitute decoding
Yaoqing Yang, Malhar Chaudhari, Pulkit Grover, Soummya Kar
ISIT 2018
Summary: Applying conventional linear codes to large-scale matrix operations can make sparse matrices dense, so codes with low-density generator matrices (LDGM) are often preferred. In this paper, we propose a novel way of using LDGM codes called "substitute decoding". Applications of this new coding scheme include power iterations, truncated singular value decompositions, and gradient descent in the distributed setting.
Coded distributed computing for inverse problems
Yaoqing Yang, Pulkit Grover, Soummya Kar
NeurIPS 2017
Summary: In this paper, we utilize the emerging idea of "coded computation" to design a novel technique for solving linear inverse problems under specific iterative methods in a parallelized implementation affected by stragglers. The applications studied in this paper include personalized PageRank and sampling on graphs.
Computing linear transformations with unreliable components
Yaoqing Yang, Pulkit Grover, Soummya Kar
IEEE Transactions on Information Theory 2017
Summary: The work provides the first coding strategies that provably require fewer gates, in a scaling sense, than replication for computing finite-field linear transforms when all computational nodes are error-prone. The main insight is that allowing all nodes to be error-prone necessitates repeated error suppression through the embedding of decoders inside the computation, resulting in a "coded computation" setup.
Rate distortion for lossy in-network linear function computation and consensus: Distortion accumulation and sequential reverse water-filling
Yaoqing Yang, Pulkit Grover, Soummya Kar
IEEE Transactions on Information Theory 2017
Summary: The work provides fundamental limits, as well as achievable strategies, for "distortion accumulation" in distributed linear computing problems. By characterizing the overall distortion-rate function with accumulated distortion in the high-rate regime, we tighten earlier cut-set bounds by a factor that can be arbitrarily large, even in simple line networks.
Talks and seminars
Invited talk at the Bebop meeting at UC Berkeley, December 7, 2022.
Invited online talk at Princeton University, October 28, 2022.
Invited online talk at Carnegie Mellon University, October 12, 2022.
Internal talk at Lawrence Berkeley National Laboratory, October 6, 2022.
Seminar talk at Tsinghua University, AIR Discover, September 25, 2022.
Seminar talk at the University of Arizona, April 12, 2022.
Seminar talk at Department of Mathematics, Nanjing University, April 11, 2022.
Seminar talk at the University of Florida, March 24, 2022.
Seminar talk at the Chinese University of Hong Kong, March 22, 2022.
Seminar talk at Washington University in St. Louis, March 10, 2022.
Invited online talk at AI-TIME, March 9, 2022.
Invited online talk, ELLIS reading group on Mathematics of Deep Learning, March 8, 2022.
Seminar talk at Dartmouth College, March 2, 2022.
Seminar talk at the Hong Kong University of Science and Technology, February 23, 2022.
Invited online talk, EIS Seminar, Carnegie Mellon University, February 21, 2022.
ICSI C3PI Seminar, International Computer Science Institute, October 13, 2021.
Utah Data Science Club Seminar, University of Utah, March 12, 2021.
ECE Energy and Information Systems Seminar, Carnegie Mellon University, October 21, 2020.
Talk at BDD Workshop, UC Berkeley, May 15, 2020.
Talk at RISE Lab Winter Retreat, January 17, 2020.
Invited Seminar, RISE Lab, March 12, 2019.
ITA Workshop's Graduation Day Talk, UC San Diego, February 13, 2019.
GAMES: Graphics And Mixed Environment Seminar, January 31, 2019.
Invited talk, University of Washington, August 9, 2018.
ITA Workshop's Graduation Day Poster Presentation, UC San Diego, February 13, 2018.