Lechao Xiao (萧乐超)
I am a research scientist at Google DeepMind in NYC, working on AI. Before that, I was a Hans Rademacher Instructor of Mathematics at the University of Pennsylvania. I received my PhD from the University of Illinois at Urbana-Champaign and my BA from Zhejiang University, Hangzhou, China. Here is my CV.
Email: XIAO dot HARMONIC at gmail dot com
Research Interests
My current focus is scaling. In particular, I am interested in improving the training dynamics of large-scale neural networks. I am also interested in the theory of deep learning, generalization, optimization, kernels, Gaussian processes, etc.
Before moving to AI research, I worked on harmonic analysis: multilinear operators, oscillatory integrals, singular Radon-like transforms, time-frequency analysis, and resolution of singularities.
Publications in Machine Learning
Small-scale proxies for large-scale Transformer training instabilities, Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith, submitted
Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression, Lechao Xiao*, Hong Hu*, Theodor Misiakiewicz*, Yue Lu, Jeffrey Pennington, NeurIPS 2022
Eigenspace Restructuring: a Principle of Space and Frequency in Neural Networks, Lechao Xiao, COLT 2022
Fast Neural Kernel Embeddings for General Activations, Insu Han, Amir Zandieh, Jaehoon Lee, Roman Novak, Lechao Xiao, Amin Karbasi, NeurIPS 2022
Dataset Distillation with Infinitely Wide Convolutional Networks, Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee, NeurIPS 2021
Finite Versus Infinite Neural Networks: an Empirical Study, Jaehoon Lee, Samuel Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein, NeurIPS 2020
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks, Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington, NeurIPS 2020
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks, Wei Hu, Lechao Xiao, Jeffrey Pennington, ICLR 2020
Disentangling Trainability and Generalization in Deep Neural Networks, Lechao Xiao, Jeffrey Pennington, Samuel Schoenholz, ICML 2020
Neural Tangents: Fast and Easy Infinite Neural Networks in Python, Roman Novak*, Lechao Xiao*, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz*, ICLR 2020
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent, Jaehoon Lee*, Lechao Xiao*, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington, NeurIPS 2019
Dynamical isometry and a mean field theory of CNNs: How to train 10,000-layer vanilla convolutional neural networks, Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington, ICML 2018
Publications in Mathematics
Oscillatory Loomis-Whitney and Projections of Sublevel Sets, with Maxim Gilula, Kevin O’Neill, Journal d'Analyse Mathématique 2021
Higher decay inequalities for multilinear oscillatory integrals, with M. Gilula and P.T. Gressman, Mathematical Research Letters, 25(3), 819-842, 2018
Endpoint estimates for one-dimensional oscillatory integral operators, Adv. in Math., 316 (2017), 255-291.
Sharp Estimates for Trilinear Oscillatory Integrals and an Algorithm of Two-dimensional Resolution of Singularities, Rev. Mat. Ibero., 33, No.1 (2017), 67-116.
Maximal Decay Inequalities for Trilinear Oscillatory Integrals of Convolution Type, with P.T. Gressman, J. Func. Anal., 271, No. 12 (2016), 3695-3726.
Bilinear Hilbert Transforms Associated with Plane Curves, with J. Guo, J. of Geom. Anal., 26 (2016), no. 2, 967-995.
Uniform Estimates for Bilinear Hilbert Transforms and Bilinear Maximal Functions Associated to Polynomials, with X. Li, Amer. J. Math., 138, No. 4 (2016), 907-962.
Teaching
Penn
UIUC
Last Update: 12/25/2023