Yihong Gu

About Me

I'm a fourth-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University. I'm fortunate to be advised by Prof. Jianqing Fan. Previously, I received my bachelor's degree from the Department of Computer Science and Technology at Tsinghua University in 2019. 


My research spans algorithmic statistical learning, variable selection, robust statistics, and causal inference. I'm interested in understanding fundamental limits and developing provably sample-efficient estimation methods that learn complicated associations and causal relationships from data with little or no supervision on functional forms or cause-effect knowledge. Here are a few topics I am working on:


Email: yihongg [at] princeton [dot] edu


Recent Papers

[d] Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning. [arxiv] [code]

with Cong Fang, Peter Bühlmann, Jianqing Fan


A universally applicable, purely data-driven methodological framework for pursuing causality without relying on any prior knowledge.

[c] Environment Invariant Linear Least Squares. [arxiv] [code] [slides] [poster]

with Jianqing Fan, Cong Fang, Tong Zhang. Annals of Statistics, to appear.


The first paper to realize statistically efficient multi-environment invariant learning in the general linear model.

[b] Factor Augmented Sparse Throughput Neural Networks for High Dimensional Regression. [arxiv] [code] [JASA]

with Jianqing Fan. Journal of the American Statistical Association, to appear.


Keywords: Factor model, High-dimensional Regression

We introduce the Factor Augmented Sparse Throughput (FAST) model, a versatile high-dimensional nonparametric regression model. We propose the FAST-NN estimator, which leverages the scalability of neural networks, and show that it adapts to the low-dimensional structure of the FAST model in a minimax optimal way.


[a] How do Noise Tails Impact on Deep ReLU Networks? [arxiv]

with Jianqing Fan, Wen-Xin Zhou. Annals of Statistics, to appear.


Keywords: Heavy-tailed Noise, Huber loss, Lower Bounds, Robustness

We unveil another side of neural networks' powerful approximation ability: it makes them vulnerable to heavy-tailed noise. We establish matching lower and upper bounds for nonparametric regression under heavy-tailed noise, showing that the least squares estimator incurs a cost associated with the noise moment index, while the adaptive Huber estimator attains a faster rate.

Awards