Research
Research Funding
Our proposal has been funded by NCCR Automation under the Swiss National Science Foundation (Swiss NSF). The funding supports one PhD student for four years.
Research Directions
Solving Nonconvex Optimization to Global Optimality and Applications in Supply Chain and Revenue Management
Many nonconvex problems in network revenue management, inventory control, and reinforcement learning admit a special structure known as hidden convexity, i.e., there exists a convex reformulation via a variable transformation. However, the transformation from the nonconvex problem to the convex reformulation may be unknown or may involve unknown distributions, so it remains hard to solve the convex counterpart directly to global optimality. In this direction, we investigate how to design easy-to-implement, globally convergent algorithms that solve the nonconvex problem directly. Various applications fall into this problem category, including booking limit control in quantity-based network revenue management in the airline industry, pricing-based network revenue management, inventory systems, and convex reinforcement learning.
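As a rough illustration of hidden convexity, one common way to write it down is sketched below. The symbols F, H, c, X, and U are notation assumed here for illustration and may differ from the notation used in the papers listed under this direction.

```latex
% Assumed notation: F is the nonconvex objective, c an invertible
% variable transformation, H the hidden convex function.
\min_{x \in \mathcal{X}} \; F(x) := H\bigl(c(x)\bigr),
\qquad c : \mathcal{X} \to \mathcal{U} \ \text{invertible},
\qquad H \ \text{convex on the convex set } \mathcal{U}.

% Equivalent convex reformulation in the transformed variable u = c(x):
\min_{u \in \mathcal{U}} \; H(u).
```

Under these assumptions, every global minimizer of H over U maps back through the inverse of c to a global minimizer of F. The difficulty described above is that c (or the distribution defining it) may be unknown, so the algorithm has to work with x directly.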
(Alphabetical) Xin Chen, Niao He, Yifan Hu, and Zikun Ye. Efficient Algorithms for Minimizing Compositions of Convex Functions and Random Functions and Its Applications in Network Revenue Management. 2022. Minor revision at Operations Research. [ArXiv]
(Alphabetical) Ilyas Fatkhullin, Niao He, Yifan Hu. Stochastic Optimization under Hidden Convexity. Preliminary version in NeurIPS Optimization for Machine Learning Workshop 2023. Journal version submitted. [ArXiv]
Stochastic Optimization and Machine Learning with Biased Oracles
Stochastic gradient descent has become the engine of modern machine learning and artificial intelligence. However, a wide range of problems do not have easily accessible unbiased gradient estimators, especially when one also cares about personalization, robustness, and privacy. Examples include distributionally robust optimization, policy gradient methods in reinforcement learning, generative adversarial networks, end-to-end learning, causal optimal transport, personalized learning, meta-learning, and many others. These problems share a common feature: one can construct gradient estimators with small bias only by using a large number of samples or incurring high computational costs. A natural question arises: is it really necessary to pay high costs to reduce the bias in a learning system? In this project, we study the tradeoff between bias, variance, and cost for stochastic optimization and machine learning with biased oracles, aiming to reduce the total sampling and computational costs.
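The bias-cost tradeoff can be seen in a toy example. The sketch below is an illustration only, not a method from the papers listed here: it assumes the objective F(x) = (E[eta])^3 with eta ~ N(x, 1), so the true gradient is 3x^2, and uses a plug-in estimator built from m inner samples. The function names `biased_gradient` and `estimate_bias` are hypothetical. For this toy problem the bias of the plug-in gradient is 3/m, so reducing the bias requires more samples per gradient.

```python
import random


def biased_gradient(x, m, rng):
    """Plug-in gradient estimate for the toy objective F(x) = (E[eta])**3.

    eta ~ N(x, 1) is simulated as x + z with z ~ N(0, 1). The inner
    expectation is replaced by an average of m samples, so the estimate
    3 * mean**2 has expectation 3 * (x**2 + 1/m), i.e. bias 3/m.
    """
    inner_mean = sum(x + rng.gauss(0.0, 1.0) for _ in range(m)) / m
    return 3.0 * inner_mean ** 2


def estimate_bias(x, m, trials, seed=0):
    """Monte Carlo estimate of the bias of biased_gradient at x."""
    rng = random.Random(seed)
    avg = sum(biased_gradient(x, m, rng) for _ in range(trials)) / trials
    true_gradient = 3.0 * x ** 2  # gradient of x**3
    return avg - true_gradient


if __name__ == "__main__":
    # Each gradient costs m inner samples; the bias shrinks roughly like 3/m.
    for m in (1, 10, 100):
        print(m, estimate_bias(1.0, m, trials=5000))
```

The cost per gradient grows linearly in m while the bias decays like 1/m, which is the kind of tradeoff this research direction studies in much more general settings.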
Yifan Hu, Jie Wang, Yao Xie, Andreas Krause, Daniel Kuhn. Contextual Stochastic Bilevel Optimization. NeurIPS 2023. [Link]
Yifan Hu, Xin Chen, and Niao He. On the Bias-Variance-Cost Tradeoff of Biased Stochastic Optimization. NeurIPS 2021. [Link]
Yifan Hu*, Siqi Zhang*, Xin Chen, and Niao He. Biased Stochastic First-order Methods for Conditional Stochastic Optimization and Its Applications in Meta Learning. NeurIPS 2020. [Link]
Yifan Hu, Xin Chen, and Niao He. Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization. SIAM Journal on Optimization 2020. [Link]
Siqi Zhang*, Yifan Hu*, Liang Zhang, Niao He. Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization. AISTATS 2024. [Link]
Vinzenz Thoma, Barna Pasztor, Andreas Krause, Giorgia Ramponi, Yifan Hu. Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes. [ArXiv]
Decision Making with Side Information/Contextual Optimization
In this direction, we move beyond the classical stochastic bilevel optimization model to consider settings in which the lower-level problem minimizes a conditional expectation objective. In particular, this covers two scenarios that the classical model does not: 1) multiple followers, as in meta-learning, personalization, platform operations, and transportation; and 2) a follower that best responds not only to the leader's decision but also to some global uncertainty, as in end-to-end learning, optimization with side information, and causal optimal transport. We aim to solve all of these problems in one unified framework called contextual stochastic bilevel optimization. The challenge is that nearly all existing single-loop stochastic bilevel optimization methods are not applicable, and all double-loop methods have far-from-optimal complexity. Our work focuses on designing algorithms for such problems with optimal theoretical guarantees and efficient implementations.
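One way to write the problem class described above is sketched below. The notation is assumed here for illustration (x is the leader's decision, y the follower's decision, xi the context or side information, eta the remaining randomness) and may not match the papers exactly.

```latex
% Upper level: the leader minimizes an expected loss that depends on the
% follower's context-dependent best response y^*(x; \xi).
\min_{x} \; F(x) := \mathbb{E}_{\xi,\, \eta \mid \xi}
  \bigl[ f\bigl(x,\, y^{*}(x;\xi);\, \eta,\, \xi\bigr) \bigr]

% Lower level: for each context \xi, the follower minimizes a
% conditional expectation objective.
\text{where} \quad
y^{*}(x;\xi) \in \operatorname*{arg\,min}_{y} \;
  \mathbb{E}_{\eta \mid \xi} \bigl[ g\bigl(x,\, y;\, \eta,\, \xi\bigr) \bigr].
```

In this sketch, the classical stochastic bilevel model corresponds to the case where the lower-level solution does not depend on xi. With finitely many contexts, each value of xi can be read as a separate follower, which matches the multiple-follower scenario.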
Yifan Hu, Jie Wang, Yao Xie, Andreas Krause, Daniel Kuhn. Contextual Stochastic Bilevel Optimization. NeurIPS 2023. [Link]
Vinzenz Thoma, Barna Pasztor, Andreas Krause, Giorgia Ramponi, Yifan Hu. Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes. [ArXiv]
Optimization for Causal Inference
In this line of research, I collaborate with statisticians to address hard open computational problems arising in statistics, such as instrumental variable regression and causal inference.
Xuxing Chen, Abhishek Roy, Yifan Hu, Krishna Balasubramanian. Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data. [ArXiv]
Robust and Safe Reinforcement Learning and LLMs
Online learning, including both (contextual) bandits and reinforcement learning, has revolutionized various applications, such as autonomous driving, large language model generation, protein folding, and clinical trial assignment. As these artificial intelligence systems become more and more involved in our daily lives, we inevitably need to build more robust, more reliable, safer, and more sustainable artificial intelligence systems. To achieve these goals, I focus on safety and robustness in online learning. Robustness aims to ensure that the system behaves well even in unseen scenarios, while safety aims to ensure that the system satisfies both explicit and implicit safety constraints.
Shyam Sundhar Ramesh, Pier Giuseppe Sessa, Yifan Hu, Andreas Krause, Ilija Bogunovic. Distributionally Robust Model-based Reinforcement Learning with Large State Spaces. AISTATS 2024. [Link]
Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, Ilija Bogunovic. Group Robust Preference Optimization in Reward-free Reinforcement Learning with Human Feedback. [ArXiv]