Yihong Gu
About Me
I'm a fourth-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University. I'm fortunate to be advised by Prof. Jianqing Fan. Previously, I received my bachelor's degree from the Department of Computer Science and Technology at Tsinghua University in 2019.
My research spans algorithmic statistical learning, variable selection, robust statistics, and causal inference. I'm interested in understanding fundamental limits and in developing provably sample-efficient estimation methods that learn complex associations and causal relationships from data with little or no prior knowledge of functional forms or cause-effect structure. Here are a few topics I am working on:
Pursuing causality from heterogeneous environments: linear regression [c], nonparametric regression [d].
Towards efficient structure-agnostic estimation: regression under heavy-tailed errors [a], high-dimensional regression [b], data-driven causal predictors [d].
Email: yihongg [at] princeton [dot] edu
Recent Papers
[c] Environment Invariant Linear Least Squares. [arxiv] [code] [slides] [poster]
with Jianqing Fan, Cong Fang, Tong Zhang. Annals of Statistics, to appear.
ASA Best Student Paper Award at the Business and Economic Statistics Session, 2024
IMS Hannan Graduate Student Travel Award, 2024
This is the first paper to achieve statistically efficient invariant learning from multiple environments in general linear models.
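For intuition, here is a schematic form of the EILLS objective (my paraphrase; the environment weights w^e and the exact normalization are placeholders rather than the paper's notation):

$$
Q^{\gamma}(\beta) = \sum_{e \in \mathcal{E}} w^{e}\, \mathbb{E}^{(e)}\!\big[(Y - X^{\top}\beta)^{2}\big] \;+\; \gamma \sum_{e \in \mathcal{E}} \sum_{j:\, \beta_j \neq 0} \Big(\mathbb{E}^{(e)}\big[X_j\,(Y - X^{\top}\beta)\big]\Big)^{2}
$$

The first term is pooled least squares across environments; the second penalizes any selected variable whose residual correlation varies with the environment, so the minimizer retains only variables whose relationship to Y is invariant.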
[b] Factor Augmented Sparse Throughput Neural Networks for High Dimensional Regression. [arxiv] [code] [JASA]
with Jianqing Fan. Journal of the American Statistical Association, to appear.
Keywords: Factor Models, High-dimensional Regression
We introduce the Factor Augmented Sparse Throughput (FAST) model, a versatile high-dimensional nonparametric regression model. We propose the FAST-NN estimator, which exploits the scalability of neural networks, and show that it adapts to the low-dimensional structure of the FAST model at a minimax optimal rate.
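Schematically, and in my own shorthand rather than the paper's exact notation, the FAST model posits a factor structure on the covariates together with a sparse nonparametric link:

$$
x = B f + u, \qquad y = m^{*}(f, u_{\mathcal{J}}) + \varepsilon, \qquad |\mathcal{J}| \ll p,
$$

so the response depends only on a few latent factors f and a small subset u_J of the idiosyncratic components; this is the low-dimensional structure the FAST-NN estimator adapts to.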
[a] How do Noise Tails Impact on Deep ReLU Networks? [arxiv]
with Jianqing Fan, Wen-Xin Zhou. Annals of Statistics, to appear.
Keywords: Heavy-tailed Noise, Huber Loss, Lower Bounds, Robustness
We unveil another side of neural networks' powerful approximation ability -- it makes them vulnerable to heavy-tailed noise -- by establishing matching lower and upper bounds for nonparametric regression under heavy-tailed noise: the least squares estimator incurs a cost governed by the noise moment index, whereas the adaptive Huber estimator attains a faster rate.
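As a concrete illustration of the adaptive Huber idea, a minimal sketch (not the paper's code; the network size, the Student-t noise, and the scaling of the truncation level tau below are my placeholder assumptions -- the paper derives the optimal truncation rate from the noise moment index):

import math
import torch
import torch.nn as nn

# Heavy-tailed regression data: quadratic signal plus Student-t noise with
# low degrees of freedom, so higher moments of the noise do not exist.
n, d = 1000, 5
X = torch.randn(n, d)
y = (X[:, 0] ** 2).unsqueeze(1) + torch.distributions.StudentT(2.5).sample((n, 1))

def adaptive_tau(n, c=1.0):
    # Placeholder scaling: tau grows with n so the robustification bias
    # vanishes; the exponent here is illustrative only.
    return c * (n / math.log(n)) ** 0.25

# Small ReLU network trained with Huber loss in place of least squares.
net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))
loss_fn = nn.HuberLoss(delta=adaptive_tau(n))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss_fn(net(X), y).backward()
    opt.step()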
Awards
Charlotte Elizabeth Procter Fellowship, Princeton University, 2024 [News]
IMS Hannan Graduate Student Travel Award, Institute of Mathematical Statistics, 2024
ASA Best Student Paper Award, ASA Business and Economic Statistics Session, 2024
School of Engineering and Applied Science Award for Excellence, Princeton University, 2023 [News]