Lechao Xiao （萧乐超）

I am a research scientist at Google DeepMind (legacy Google Brain) in NYC, working on AI. Before that, I was a Hans Rademacher Instructor of Mathematics at the University of Pennsylvania. I received my PhD from the University of Illinois at Urbana-Champaign and my BA from Zhejiang University, Hangzhou, China.

Email: XIAO dot HARMONIC at gmail dot com

Research Interests

My current focus is Scaling-centric Machine Learning.

I am also interested in the theory of deep learning, generalization, optimization, training dynamics, kernels, Gaussian processes, etc.

In my prior life, I worked on harmonic analysis: multilinear operators, oscillatory integrals, singular Radon-like transforms, time-frequency analysis, and resolution of singularities.

Publications in Machine Learning

Rethinking Conventional Wisdom in Machine Learning: from Generalization to Scaling, Lechao Xiao

Scaling Exponents Across Parameterizations and Optimizers, Katie Everett, Lechao Xiao, Mitchell Wortsman, Alexander A. Alemi, Roman Novak, Peter J. Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington

4+3 Phases of Compute-Optimal Neural Scaling Laws, Elliot Paquette, Courtney Paquette, Lechao Xiao*, Jeffrey Pennington*

Small-scale proxies for large-scale Transformer training instabilities, Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith, ICLR 2024 (Oral)

Synergy and symmetry in deep learning: Interactions between the data, model, and inference algorithm, Lechao Xiao, Jeffrey Pennington, ICML 2022.

Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression, Lechao Xiao*, Hong Hu*, Theodor Misiakiewicz*, Yue Lu, Jeffrey Pennington, NeurIPS 2022

Eigenspace Restructuring: a Principle of Space and Frequency in Neural Networks, Lechao Xiao, COLT 2022

Fast Neural Kernel Embeddings for General Activations, Insu Han, Amir Zandieh, Jaehoon Lee, Roman Novak, Lechao Xiao, Amin Karbasi, NeurIPS 2022

Dataset Distillation with Infinitely Wide Convolutional Networks, Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee, NeurIPS 2021

Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit, Ben Adlam*, Jaehoon Lee*, Lechao Xiao*, Jeffrey Pennington, Jasper Snoek, ICLR 2021

Finite Versus Infinite Neural Networks: an Empirical Study, Jaehoon Lee, Samuel Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein, NeurIPS 2020

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks, Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington, NeurIPS 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks, Wei Hu, Lechao Xiao, Jeffrey Pennington, ICLR 2020

Disentangling Trainability and Generalization in Deep Neural Networks, Lechao Xiao, Jeffrey Pennington, Samuel Schoenholz, ICML 2020

Neural Tangents: Fast and Easy Infinite Neural Networks in Python, Roman Novak*, Lechao Xiao*, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz*, ICLR 2020.

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent, Jaehoon Lee*, Lechao Xiao*, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington. NeurIPS 2019.

Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes, Roman Novak*, Lechao Xiao*, Jaehoon Lee, Yasaman Bahri, Greg Yang, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein. ICLR 2019.

Dynamical isometry and a mean field theory of CNNs: How to train 10,000-layer vanilla convolutional neural networks, Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington. ICML 2018.

Publications in Mathematics

Oscillatory Loomis-Whitney and Projections of Sublevel Sets, with Maxim Gilula, Kevin O’Neill, Journal d'Analyse Mathématique 2021

Higher decay inequalities for multilinear oscillatory integrals, with M. Gilula and P.T. Gressman, Mathematical Research Letters, 25(3), 819-842, 2018

Endpoint estimates for one-dimensional oscillatory integral operators, Adv. in Math., 316 (2017), 255-291.

Sharp Estimates for Trilinear Oscillatory Integrals and an Algorithm of Two-dimensional Resolution of Singularities, Rev. Mat. Ibero., 33, No.1 (2017), 67-116.

Maximal Decay Inequalities for Trilinear Oscillatory Integrals of Convolution Type, with P.T. Gressman, J. Func. Anal., 271, No. 12 (2016), 3695-3726.

Bilinear Hilbert Transforms Associated with Plane Curves, with J. Guo, J. of Geom. Anal., 26 (2016), no. 2, 967-995.

Uniform Estimates for Bilinear Hilbert Transforms and Bilinear Maximal Functions Associated to Polynomials, with X. Li, Amer. J. Math., 138, No. 4 (2016), 907-962.

Last Update: 10/12/2024
