Yaoqing Yang

Postdoctoral Researcher


Department of EECS

UC Berkeley

Soda Hall 465

Berkeley, CA 94720

yqyang AT berkeley.edu

I am broadly interested in reliable and transparent ML. For example, I use theory and large-scale empirical analyses to study effective and robust "generalization metrics" that can dissect ML models and make these models more transparent and analyzable. As another example, I study reliability issues in distributed systems for ML and design coding-theoretic approaches to address these issues. I am also interested in structured data, such as 3D point clouds and graphs. I am on the academic job market this year!

Postdoc, RISE Lab, EECS, UC Berkeley.


BS, EE, Tsinghua.

Google Scholar | LinkedIn | CV


  • Our paper on "augmentations in graph contrastive learning: current methodological flaws & towards better practices" has been accepted to WWW 2022.

  • Our paper on "taxonomizing local versus global structure in neural network loss landscapes" has been accepted to NeurIPS 2021. Feel free to check out our video and code.

  • Our paper on "improving semi-supervised federated learning by reducing the gradient diversity of models" has been accepted to IEEE BigData 2021.

  • Dominic has earned his master's degree from UC Berkeley. Congratulations! His thesis focuses on boundary thickness and boundary tilting and how they can help reveal "backdoors" in a neural network.

  • Our paper on "boundary thickness and robustness in learning models" has been accepted to NeurIPS 2020.

  • Our paper on "serverless straggler mitigation using local error-correcting codes" was selected as a best paper finalist at ICDCS 2020!

  • Our paper on applying coded computing techniques to non-von Neumann computing architectures has been published in the Proceedings of the IEEE.

  • Our work FoldingNet was selected for a spotlight talk at CVPR! You can see a video here.

Selected publications

Taxonomizing local versus global structure in neural network loss landscapes

Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

NeurIPS 2021

Summary: This paper experimentally demonstrates the long-standing conjecture that "local properties" of a loss landscape cannot dictate generalization. The study taxonomizes learning problems into "phases" by analyzing various generalization metrics obtained from the loss landscapes of neural networks, and it provides a formal way to divide and conquer typical failure modes of learning in the different phases.

Full paper | Code | Video

Improving semi-supervised federated learning by reducing the gradient diversity of models

Zhengming Zhang*, Yaoqing Yang*, Zhewei Yao*, Yujun Yan, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

IEEE BigData 2021

Summary: Cell phone users who participate in federated learning often do not have the time to label their private data, making semi-supervised learning a practical alternative. This paper shows that a large dissimilarity between model gradients from different users can arise from the semi-labeled data and become an obstacle to semi-supervised federated learning.

Full paper | Code

Two sides of the same coin: Heterophily and oversmoothing in graph convolutional neural networks

Yujun Yan, Milad Hashemi, Kevin Swersky, Yaoqing Yang, Danai Koutra


Summary: Graph convolutional neural networks may perform worse when we increase the number of layers (the oversmoothing problem) and when we feed in heterophilous graphs (the heterophily problem). In this work, we show theoretically and empirically that these two seemingly unrelated problems are closely related.

Full paper

A Dataset-dispersion Perspective on Reconstruction versus Recognition in Single-view 3D Reconstruction Networks

Yefan Zhou, Yiru Shen, Yujun Yan, Chen Feng, Yaoqing Yang

3DV 2021

Summary: A single-view 3D reconstruction (SVR) model can be disposed towards recognition (classification-based) or reconstruction depending on how dispersed the training data becomes. In this paper, we propose the "dispersion score", a data-driven metric that measures the tendency of SVR models to perform recognition or reconstruction. It can also be used to diagnose problems in the training data and guide the design of data augmentation schemes.

Full paper | Code | Video

Effect of Model Size on Worst-Group Generalization

Alan Pham*, Eunice Chan*, Vikranth Srivatsa*, Dhruba Ghosh*, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez*, Jacob Steinhardt*

Preliminary version accepted by NeurIPS DistShift Workshop 2021

Summary: Prior work has suggested that overparameterization can hurt test accuracy on rare subgroups. Motivated by the fact that subgroup information is often unknown, we investigate the effect of model size on worst-group generalization under empirical risk minimization (ERM). Our systematic evaluation reveals that increasing model size does not hurt, and may help, worst-group test error under ERM.

Full paper

Serverless straggler mitigation using local error-correcting codes

Vipul Gupta*, Dominic Carrano*, Yaoqing Yang, Vaishaal Shankar, Thomas Courtade, Kannan Ramchandran

ICDCS 2020

Best Paper Finalist

Summary: Inexpensive cloud services, such as serverless computing, are often vulnerable to straggling nodes that increase end-to-end latency. We propose and implement simple yet principled coding approaches for straggler mitigation.

Full paper | Code

Boundary thickness and robustness in learning models

Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

NeurIPS 2020

Summary: This paper introduces the notion of "boundary thickness" and shows that thin decision boundaries lead to overfitting (e.g., measured by the robust generalization gap between training and testing) and lower robustness. See also Dominic's thesis for how we use boundary thickness to reveal "backdoors" hidden in a neural network.

Full paper | Code

Coded elastic computing

Yaoqing Yang, Matteo Interlandi, Pulkit Grover, Soummya Kar, Saeed Amizadeh, Markus Weimer

ISIT 2019

Summary: Cloud providers have recently introduced new offerings whereby spare computing resources are accessible at discounts compared to on-demand computing. Exploiting such an opportunity is challenging since such resources are accessed with low priority and can elastically leave (through preemption) and join the computation at any time. This paper designs a new technique called coded elastic computing, enabling distributed computations over these elastic resources.

Full paper

Coded iterative computing using substitute decoding

Yaoqing Yang, Malhar Chaudhari, Pulkit Grover, Soummya Kar

ISIT 2018

Summary: Applying conventional linear codes to large-scale matrix operations can make sparse matrices dense, and codes with low-density generator matrices (LDGM) are often preferred. In this paper, we show a novel way of using LDGM codes called "substitute decoding". Applications of this new coding scheme include power iterations, truncated singular value decompositions, and gradient descent in the distributed setting.

Conference paper | Full paper

FoldingNet: Point cloud auto-encoder via deep grid deformation

Yaoqing Yang, Chen Feng, Yiru Shen, Dong Tian

CVPR 2018

Summary: In this work, a novel auto-encoder is proposed to address the challenge of unsupervised learning on point clouds. A novel folding-based decoder is used to deform a canonical 2D grid onto a point cloud's underlying 3D object surface. The proposed decoder structure is proved, in theory, to be a generic architecture that can reconstruct an arbitrary point cloud from a 2D grid.

Paper | Code | Video

Mining point cloud local structures by kernel correlation and graph pooling

Yiru Shen*, Chen Feng*, Yaoqing Yang, Dong Tian

CVPR 2018

Summary: Existing ML models on point clouds do not take full advantage of a point’s local neighborhood that contains fine-grained structural information. In this paper, we present novel operations to exploit local structures in a point cloud.

Paper | Code

Coded distributed computing for inverse problems

Yaoqing Yang, Pulkit Grover, Soummya Kar

NeurIPS 2017

Summary: In this paper, we utilize the emerging idea of "coded computation" to design a novel technique for solving linear inverse problems under specific iterative methods in a parallelized implementation affected by stragglers. The applications studied in this paper include personalized PageRank and sampling on graphs.

Paper | arXiv version

Computing linear transformations with unreliable components

Yaoqing Yang, Pulkit Grover, Soummya Kar

IEEE Transactions on Information Theory, 2017

Summary: The work provides the first coding strategies that provably require fewer gates, in a scaling sense, than replication for computing finite-field linear transforms with all computational nodes being error-prone. The main insight is that allowing all nodes to be error-prone necessitates repeated error suppression through the embedding of decoders inside the computation, resulting in a "coded computation" setup.

Full paper | Code

Rate distortion for lossy in-network linear function computation and consensus: Distortion accumulation and sequential reverse water-filling

Yaoqing Yang, Pulkit Grover, Soummya Kar

IEEE Transactions on Information Theory, 2017

Summary: The work provides fundamental limits as well as achievable strategies on "distortion accumulation" in distributed linear computing problems. By successfully characterizing the overall distortion-rate function with accumulated distortion in a high-rate regime, we tighten earlier cut-set bounds by a factor that can be arbitrarily large even in simple line networks.

Full paper

Talks and seminars

Taxonomizing local versus global structure in neural network loss landscapes, ICSI C3PI Seminar, International Computer Science Institute, Oct 13, 2021.

Boundary thickness and robustness in learning models, Utah Data Science Club Seminar, University of Utah, Mar 12, 2021.

Boundary thickness and robustness in learning models, ECE Energy and Information Systems Seminar, Carnegie Mellon University, Oct 21, 2020.

Systematic study of neural network robustness against adversarial attacks, BDD Workshop, UC Berkeley, May 15, 2020.

Rethinking adversarial examples and non-robust features, RISE Lab Winter Retreat, Jan 17, 2020.

Coding methods for elastic and iterative computing, RISE Lab seminar, Mar 12, 2019.

Coded elastic computing: theoretical framework, and experimental results on Apache REEF, ITA Workshop's Graduation Day Talk, UC San Diego, Feb 13, 2019.

FoldingNet point cloud auto-encoder, GAMES: Graphics And Mixed Environment Seminar, Jan 31, 2019.

Coding for speeding up distributed computing, ITA Workshop's Graduation Day Poster Presentation, UC San Diego, Feb 13, 2018.