Yufan Li
yufan_li (at) g (dot) harvard (dot) edu
Department of Statistics, Harvard University
1 Oxford St, Cambridge, MA, 02138
Education
Harvard University, Department of Statistics, Cambridge, MA
Ph.D. Candidate in Statistics (currently fourth year), Aug. 2020 - June 2025 (expected)
Advisors: Subhabrata Sen, Pragya Sur
Harvard University, SEAS, Cambridge, MA
M.E. in Computational Sciences & Engineering, Aug. 2018 - June 2020
University of Toronto, Applied Science & Engineering, Toronto, ON
B.A.Sc. in Engineering Science (High Honors), Sep. 2013 - May 2018
Research Interests
High Dimensional Statistics, Machine Learning Theory, Approximate Message Passing (AMP), Statistical Physics (Mean-Field Spin Glasses), Online Learning & Bandits
High Dimensional Statistics & Probability
Spectrum-Aware Adjustment: A New Debiasing Framework with Application to Principal Component Regression, with Pragya Sur [in submission]
Investigated how to debias regularized estimators (e.g., LASSO, Elastic Net) using a "one-step estimator". The key insight is to apply a scalar adjustment coefficient to the step size that appropriately accounts for high dimensionality and the spectral properties of the design matrix;
Leveraged this adjustment insight to debias Principal Component Regression for high-dimensional inference; the method performs well on design matrices with complex global dependence (e.g., time series, fat tails, latent low-rank structure, linear networks, asymmetric designs), as well as on various real datasets across data science and statistics (e.g., genetics, audio & image, financial returns, socio-economics, demand-forecasting indicators)
Random Linear Estimation with Rotationally-Invariant Designs: Asymptotics at High Temperature, with Zhou Fan, Subhabrata Sen & Yihong Wu [accepted at IEEE Transactions on Information Theory]
Studied information-theoretic properties of the Bayes-optimal estimator in high-dimensional Bayesian linear regression; verified "single-letter" formulas that characterize the mutual information and MMSE under a high-temperature assumption on the signal-to-noise ratio;
Technically, we used vector AMP iterates to track the Bayes-optimal estimator and computed moments of the log-partition function using large-deviation analysis techniques
TAP Equations for Orthogonally Invariant Spin Glasses at High Temperature, with Zhou Fan & Subhabrata Sen [in submission]
Proved TAP equations at high temperature for mean-field spin glasses exhibiting global correlations in spin interactions. TAP equations describe marginals of the Gibbs measure and are fundamental to AMP algorithms and the nascent TAP variational inference methods. Our proof provides the first confirmation of the TAP equations for orthogonally invariant ensembles since the Parisi-Potters conjecture of the 1990s;
Technically, we exploited a connection between the TAP equations and the high-dimensional geometry of the Gibbs measure as a "spherical band"
Machine Learning
Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis, with Ben Adlam & Subhabrata Sen [in submission]
Investigated the transfer representation-learning paradigm in a simple, exactly solvable model where the feature layer is pretrained on upstream data and transferred to an ensemble of downstream tasks;
Studied the structure of the optimally pretrained kernel and how it corresponds to a fine-grained bias-variance tradeoff
"Solvable" Batched Bandits: Balance Risk and Reward in Phased Release Problem, with Iavor Bojinov & Jialiang Mao [accepted at NeurIPS 2023]
Designed a batched bandit algorithm for a novel online decision-making problem: determining the optimal release schedule of new product updates under a budget constraint on adverse treatment effects (i.e., not releasing bad updates to too many users);
Our approach recursively decomposes the "risk of ruin" (the probability of budget depletion) and solves for the optimal choice of arms analytically via a sequence of simple quadratic equations. Using only sample means and variances of the online outcomes, our method bypasses challenging rare-event simulations and is highly efficient and parallelizable.
Conferences & Summer Schools
Machine Learning Summer School at Princeton University, June 2023
IEEE International Symposium on Information Theory (ISIT), Taipei, June 2023
Advances of Probabilistic Algorithms, 35th New England Statistics Symposium (NESS), May 2022
Deep Learning Theory Summer School at Princeton University, July 2021
Consulting, Reading Group & Teaching
Statistical Consultant, Harvard Statistics Consulting Service, Dec. 2021-Present
Hold weekly 2-hr consultation sessions for researchers across disciplines (e.g., healthcare, chemistry, social sciences); consult clients on ML, Bayesian analysis, and high-dimensional methods. Followed a biomedical case seeking to recover the sleep conditions of schizophrenia patients from biometric data; proposed a Hidden Markov Model (HMM) solution to the client and delivered satisfactory results.
Probability & Math. Physics Reading Group, HT Yau's group, Dec. 2021-Present
Deliver technical presentations on advanced topics: quantum ergodicity (notes), spectral graph theory (notes), spin glass Hamiltonian optimization (notes); participate in weekly group meetings
Teaching Fellow, Department of Statistics, Harvard University, 2021-Present
Probability II (PhD), Inference I (PhD), Data Science I (UG/Masters), Statistics and Data Science for Networks (UG)