Yihong Gu

About Me

I'm a fourth-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University. I'm fortunate to be advised by Prof. Jianqing Fan. Previously, I received my bachelor's degree from the Department of Computer Science and Technology at Tsinghua University in 2019. 

My research interests span statistics and machine learning, focusing mainly on nonparametric estimation, variable selection, neural networks, and causal inference. 

Email: yihongg [at] princeton [dot] edu


Recent Papers

[d] Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning. [arxiv]

with Cong Fang, Peter Bühlmann, Jianqing Fan


A universally applicable, purely data-driven methodological framework that pursues causality without requiring any prior knowledge of the underlying causal structure.
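
For readers curious what adversarial invariance learning can look like in code, below is a minimal PyTorch sketch of the general idea (my illustration under simplifying assumptions, not the paper's exact objective or architecture): a shared regressor is trained against environment-specific adversaries that try to detect remaining predictable signal in its residuals.

```python
import torch
import torch.nn as nn

# Minimal sketch of adversarial invariance learning (illustrative only): a
# shared regressor f is trained so that its residuals contain no component
# that environment-specific adversaries g_e can predict from the covariates.

def make_mlp(d_in, width=32):
    return nn.Sequential(nn.Linear(d_in, width), nn.ReLU(), nn.Linear(width, 1))

# Two synthetic environments; the first coordinate carries the invariant signal.
d = 5
envs = []
for shift in (0.0, 1.5):
    X = torch.randn(256, d) + shift
    y = X[:, :1] + 0.5 * torch.randn(256, 1)
    envs.append((X, y))

f = make_mlp(d)                        # shared regressor
gs = [make_mlp(d) for _ in envs]       # one adversary per environment
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam([p for g in gs for p in g.parameters()], lr=1e-3)
gamma = 5.0                            # strength of the invariance penalty

for step in range(300):
    # Adversary ascent: g_e hunts for a predictable component in env-e residuals.
    pen = sum((2 * (y - f(X).detach()) * g(X) - g(X) ** 2).mean()
              for (X, y), g in zip(envs, gs))
    opt_g.zero_grad()
    (-pen).backward()
    opt_g.step()

    # Regressor descent: pooled squared loss plus the adversarial penalty.
    pen = sum((2 * (y - f(X)) * g(X).detach()).mean()
              for (X, y), g in zip(envs, gs))
    mse = sum(((y - f(X)) ** 2).mean() for X, y in envs)
    opt_f.zero_grad()
    (mse + gamma * pen).backward()
    opt_f.step()
```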

[c] Environment Invariant Linear Least Squares. [arxiv] [slides] [poster]

with Jianqing Fan, Cong Fang, Tong Zhang


The first paper to achieve statistically efficient multi-environment invariant learning in general linear models.
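
Schematically, the objective combines a pooled least-squares loss with a focused penalty that requires the residual to be uncorrelated with every selected covariate in every environment. The display below is my paraphrase of the objective, not a verbatim quote from the paper; the environment weights w_e and the notation are simplifications.

```latex
% Schematic EILLS-type objective (paraphrased): pooled least squares plus a
% focused penalty forcing the residual to be uncorrelated with every selected
% covariate in every environment e; w_e are environment weights and gamma > 0
% tunes the strength of the invariance regularization.
\[
  Q^{\gamma}(\beta)
  = \sum_{e \in \mathcal{E}} w_e\,
    \mathbb{E}^{(e)}\!\left[\bigl(Y - X^{\top}\beta\bigr)^{2}\right]
  + \gamma \sum_{e \in \mathcal{E}} \sum_{j \in \operatorname{supp}(\beta)}
    \left(\mathbb{E}^{(e)}\!\left[\bigl(Y - X^{\top}\beta\bigr)\, X_{j}\right]\right)^{2}.
\]
```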

[b] Factor Augmented Sparse Throughput Neural Networks for High Dimensional Regression. [arxiv] [code] [JASA]

with Jianqing Fan. Journal of the American Statistical Association, Forthcoming.


Keywords: Factor Model, High-dimensional Regression

We introduce the Factor Augmented Sparse Throughput (FAST) model, a versatile high-dimensional nonparametric regression model. We propose the FAST-NN estimator, which leverages the scalability of neural networks, and show that it adapts to the low-dimensional structure of the FAST model in a minimax optimal way.
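
As a rough illustration of the architecture (a sketch with hypothetical dimensions and a random placeholder projection, not the reference implementation in the linked code), a FAST-style network can be read as two parallel input channels: latent-factor estimates obtained by projecting the high-dimensional input, and a trainable variable-selection layer that passes a handful of raw covariates through.

```python
import torch
import torch.nn as nn

# Sketch of a FAST-style architecture (illustrative): estimated latent factors
# from a fixed projection of the high-dimensional input run in parallel with a
# trainable variable-selection layer over the raw covariates.

class FASTNN(nn.Module):
    def __init__(self, p, r_bar, n_select=10, width=64):
        super().__init__()
        # Fixed projection matrix; in practice it would come from a pilot
        # estimate rather than the random draw used here.
        self.register_buffer("W", torch.randn(p, r_bar))
        self.select = nn.Linear(p, n_select, bias=False)  # variable-selection layer
        self.net = nn.Sequential(
            nn.Linear(r_bar + n_select, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, 1),
        )

    def forward(self, x):
        factors = x @ self.W / self.W.shape[0]  # crude latent-factor estimates
        idio = self.select(x)                   # pass-through of raw covariates
        return self.net(torch.cat([factors, idio], dim=-1))

model = FASTNN(p=1000, r_bar=5)
y_hat = model(torch.randn(32, 1000))
# Training would add an l1-type penalty on model.select.weight, e.g.
#   loss = mse + lam * model.select.weight.abs().sum()
# so that only a few raw covariates survive the selection layer.
```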


[a] How do noise tails impact on deep ReLU neural networks? [arxiv]

with Jianqing Fan, Wen-Xin Zhou


Keywords: Heavy-tailed Noise, Huber Loss, Lower Bounds, Robustness

We unveil another side of neural networks' powerful approximation ability: it makes them vulnerable to heavy-tailed noise. We establish matching upper and lower bounds for nonparametric regression under heavy-tailed noise, showing that the least squares estimator incurs a cost governed by the noise moment index, while the adaptive Huber estimator attains a faster rate.
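
To make the contrast concrete, here is a toy PyTorch sketch (illustrative only, not the paper's experiment): the same ReLU network is fit on data with Student-t noise using squared loss versus Huber loss. The fixed `delta` below stands in for the adaptive, sample-size-dependent robustification parameter analyzed in the paper.

```python
import math
import torch
import torch.nn as nn

# Toy illustration: fit a small ReLU network on data with Student-t noise
# (df just above 2, so higher moments barely exist) using squared loss
# versus Huber loss.

torch.manual_seed(0)
n = 512
x = torch.rand(n, 1)
noise = torch.distributions.StudentT(df=2.1).sample((n, 1))  # heavy tails
y = torch.sin(2 * math.pi * x) + noise

def fit(loss_fn):
    net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
    return net

ls_net = fit(nn.MSELoss())                # least squares: dragged by outliers
huber_net = fit(nn.HuberLoss(delta=1.0))  # Huber: caps each residual's influence
```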