# Lechao Xiao （萧乐超）

I am a research scientist at Google DeepMind in NYC, working on AI. Before that, I was a Hans Rademacher Instructor of Mathematics at the University of Pennsylvania. I received my PhD from the University of Illinois at Urbana-Champaign and my BA from Zhejiang University, Hangzhou, China. Here is my CV.

Email: XIAO dot HARMONIC at gmail dot com

# Research Interests

My current focus is scaling. In particular, I am interested in improving the training dynamics of large-scale neural networks. I am also interested in the theory of deep learning, generalization, optimization, kernels, Gaussian processes, etc.

Before working on AI, I worked on harmonic analysis: multilinear operators, oscillatory integrals, singular Radon-like transforms, time-frequency analysis, and resolution of singularities.

## Publications in Machine Learning

Small-scale proxies for large-scale Transformer training instabilities, Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith, submitted

Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression, Lechao Xiao*, Hong Hu*, Theodor Misiakiewicz*, Yue Lu, Jeffrey Pennington, NeurIPS 2022

Eigenspace Restructuring: a Principle of Space and Frequency in Neural Networks, Lechao Xiao, COLT 2022

Fast Neural Kernel Embeddings for General Activations, Insu Han, Amir Zandieh, Jaehoon Lee, Roman Novak, Lechao Xiao, Amin Karbasi, NeurIPS 2022

Dataset Distillation with Infinitely Wide Convolutional Networks, Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee, NeurIPS 2021

Finite Versus Infinite Neural Networks: an Empirical Study, Jaehoon Lee, Samuel Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein, NeurIPS 2020

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks, Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington, NeurIPS 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks, Wei Hu, Lechao Xiao, Jeffrey Pennington, ICLR 2020

Disentangling Trainability and Generalization in Deep Neural Networks, Lechao Xiao, Jeffrey Pennington, Samuel Schoenholz, ICML 2020

Neural Tangents: Fast and Easy Infinite Neural Networks in Python, Roman Novak*, Lechao Xiao*, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz*, ICLR 2020.

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent, Jaehoon Lee*, Lechao Xiao*, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington. NeurIPS 2019.

Dynamical isometry and a mean field theory of CNNs: How to train 10,000-layer vanilla convolutional neural networks, Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington. ICML 2018.

## Publications in Mathematics

Oscillatory Loomis-Whitney and Projections of Sublevel Sets, with Maxim Gilula, Kevin O’Neill, Journal d'Analyse Mathématique 2021

Higher decay inequalities for multilinear oscillatory integrals, with M. Gilula and P.T. Gressman, Mathematical Research Letters, 25(3), 819-842, 2018

Endpoint estimates for one-dimensional oscillatory integral operators, Adv. in Math., 316 (2017), 255-291.

Sharp Estimates for Trilinear Oscillatory Integrals and an Algorithm of Two-dimensional Resolution of Singularities, Rev. Mat. Ibero., 33, No.1 (2017), 67-116.

Maximal Decay Inequalities for Trilinear Oscillatory Integrals of Convolution Type, with P.T. Gressman, J. Func. Anal., 271, No. 12 (2016), 3695-3726.

Bilinear Hilbert Transforms Associated with Plane Curves, with J. Guo, J. of Geom. Anal., 26 (2016), no. 2, 967-995.

Uniform Estimates for Bilinear Hilbert Transforms and Bilinear Maximal Functions Associated to Polynomials, with X. Li, Amer. J. Math., 138, No. 4 (2016), 907-962.

# Teaching

## Penn

## UIUC

Last Update: 12/25/2023