Chi Jin - Publications

Preprints

(α-β order) denotes alphabetical ordering, * denotes equal contribution.

Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction [arXiv]

Yong Lin, Shange Tang, Bohan Lyu, Ziran Yang, Jui-Hui Chung, Haoyu Zhao, Lai Jiang, Yihan Geng, Jiawei Ge, Jingruo Sun, Jiayun Wu, Jiri Gesi, Ximing Lu, David Acuna, Kaiyu Yang, Hongzhou Lin, Yejin Choi, Danqi Chen, Sanjeev Arora, Chi Jin
ArXiv Preprint

LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra [arXiv]

Seth Karten, Wenzhe Li, Zihan Ding, Samuel Kleiner, Yu Bai, Chi Jin
ArXiv Preprint

Frontier LLMs Still Struggle with Simple Reasoning Tasks [arXiv]

Alan Malek, Jiawei Ge, Nevena Lazic, Chi Jin, András György, Csaba Szepesvári
ArXiv Preprint

Principled Out-of-Distribution Generalization via Simplicity [arXiv]

Jiawei Ge, Amanda Wang, Shange Tang, Chi Jin
ArXiv Preprint

Is Elo Rating Reliable? A Study Under Model Misspecification [arXiv]

Shange Tang, Yuanhao Wang, Chi Jin
ArXiv Preprint

Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial? [arXiv]

Wenzhe Li*, Yong Lin*, Mengzhou Xia, Chi Jin
ArXiv Preprint

Generative Diffusion Modeling: A Practical Handbook [arXiv]

Zihan Ding, Chi Jin
ArXiv Preprint

On Limitation of Transformer for Learning HMMs [arXiv]

Jiachen Hu, Qinghua Liu, Chi Jin
ArXiv Preprint

Learning a Universal Human Prior for Dexterous Manipulation from Human Preference [arXiv]

Zihan Ding, Yuanpei Chen, Allen Z Ren, Shixiang Shane Gu, Hao Dong, Chi Jin
ArXiv Preprint

Thinking Fast and Slow: Data-Driven Adaptive DeFi Borrow-Lending Protocol [arXiv]

Mahsa Bastankhah, Viraj Nadkarni, Xuechao Wang, Chi Jin, Sanjeev Kulkarni, Pramod Viswanath
ArXiv Preprint

Publications

(α-β order) denotes alphabetical ordering; * and + denote equal contribution.

Understanding outer learning rates in Local SGD [arXiv]

Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer
Neural Information Processing Systems (NeurIPS) 2025

Learning World Models for Interactive Video Generation [arXiv]

Taiye Chen*, Xun Hu*, Zihan Ding*, Chi Jin
Neural Information Processing Systems (NeurIPS) 2025

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities [arXiv]

Haoyu Zhao, Yihan Geng, Shange Tang, Yong Lin, Bohan Lyu, Hongzhou Lin, Chi Jin, Sanjeev Arora
Neural Information Processing Systems (NeurIPS) 2025 Track on Datasets and Benchmarks

Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving [arXiv]

Yong Lin*, Shange Tang*, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin
Conference on Language Modeling (COLM) 2025

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization [arXiv]

Zihan Ding, Chi Jin, Difan Liu, Haitian Zheng, Krishna Kumar Singh, Qiang Zhang, Yan Kang, Zhe Lin, Yuchen Liu
International Conference on Computer Vision (ICCV) 2025

PokéChamp: an Expert-level Minimax Language Agent for Competitive Pokémon [arXiv]

Seth Karten, Andy Luu Nguyen, Chi Jin
International Conference on Machine Learning (ICML) 2025

Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games [arXiv]

Jiawei Ge*, Yuanhao Wang*, Wenzhe Li, Chi Jin
International Conference on Machine Learning (ICML) 2025

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations [arXiv]

Kaixuan Huang, Jiacheng Guo, Zihao Li, Xiang Ji, Jiawei Ge, Wenzhe Li, Yingqing Guo, Tianle Cai, Hui Yuan, Runzhe Wang, Yue Wu, Ming Yin, Shange Tang, Yangsibo Huang, Chi Jin, Xinyun Chen, Chiyuan Zhang, Mengdi Wang
International Conference on Machine Learning (ICML) 2025

Benign Overfitting in Out-of-Distribution Generalization of Linear Models [arXiv]

Shange Tang*, Jiayun Wu*, Jianqing Fan, Chi Jin
International Conference on Learning Representations (ICLR) 2025

Building Math Agents with Multi-Turn Iterative Preference Learning [arXiv]

Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu
International Conference on Learning Representations (ICLR) 2025

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning [arXiv]

Wenzhe Li, Zihan Ding, Seth Karten, Chi Jin
International Conference on Machine Learning (ICML) 2024

Tuning-Free Stochastic Optimization [arXiv]

Ahmed Khaled, Chi Jin
International Conference on Machine Learning (ICML) 2024

Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift [arXiv]

Jiawei Ge*, Shange Tang*, Jianqing Fan, Cong Ma, Chi Jin
International Conference on Learning Representations (ICLR) 2024

Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning [arXiv]

Zihan Ding, Chi Jin
International Conference on Learning Representations (ICLR) 2024

On the Provable Advantage of Unsupervised Pretraining [arXiv]

Jiawei Ge*, Shange Tang*, Jianqing Fan, Chi Jin
International Conference on Learning Representations (ICLR) 2024

ZeroSwap: Data-driven Optimal Market Making in DeFi [arXiv]

Viraj Nadkarni, Jiachen Hu, Ranvir Rana, Chi Jin, Sanjeev Kulkarni, Pramod Viswanath
Financial Cryptography and Data Security 2024

V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL [arXiv]

(α-β order) Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu
Mathematics of Operations Research (MOR) 2023
Best Paper in ICLR 2022 workshop “Gamification and Multiagent Solutions”

Is RLHF More Difficult than Standard RL? [arXiv]

Yuanhao Wang, Qinghua Liu, Chi Jin
Neural Information Processing Systems (NeurIPS) 2023

DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method [arXiv]

Ahmed Khaled, Konstantin Mishchenko, Chi Jin
Neural Information Processing Systems (NeurIPS) 2023

Context-lumpable Stochastic Bandits [arXiv]

Chung-Wei Lee, Qinghua Liu, Yasin Abbasi-Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvári
Neural Information Processing Systems (NeurIPS) 2023

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL [arXiv]

Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári
Neural Information Processing Systems (NeurIPS) 2023

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [arXiv]

Yuanhao Wang*, Qinghua Liu*, Yu Bai+, Chi Jin+
Conference on Learning Theory (COLT) 2023

Efficient displacement convex optimization with particle gradient descent [arXiv]

Hadi Daneshmand, Jason D. Lee, Chi Jin
International Conference on Machine Learning (ICML) 2023

Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making [arXiv]

Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin
Symposium on Theory of Computing (STOC) 2023

Learning Rationalizable Equilibria in Multiplayer Games [arXiv]

Yuanhao Wang*, Dingwen Kong*, Yu Bai, Chi Jin
International Conference on Learning Representations (ICLR) 2023

Faster Federated Optimization under Second-order Similarity [arXiv]

Ahmed Khaled, Chi Jin
International Conference on Learning Representations (ICLR) 2023

Representation Learning for Low-rank General-sum Markov Games [arXiv]

Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Zihan Ding, Chi Jin, Mengdi Wang
International Conference on Learning Representations (ICLR) 2023.

Provable Sim-to-real Transfer in Continuous Domain with Partial Observations [arXiv]

Jiachen Hu, Han Zhong, Chi Jin, Liwei Wang
International Conference on Learning Representations (ICLR) 2023.

Sample-Efficient Reinforcement Learning of Partially Observable Markov Games [arXiv]

Qinghua Liu, Csaba Szepesvári, Chi Jin
Neural Information Processing Systems (NeurIPS) 2022.

Efficient Φ-Regret Minimization in Extensive-Form Games via Online Mirror Descent [arXiv]

(α-β order) Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu
Neural Information Processing Systems (NeurIPS) 2022.

When Is Partially Observable Reinforcement Learning Not Scary? [arXiv]

Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin
Conference on Learning Theory (COLT) 2022.

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits [arXiv]

Qinghua Liu, Yuanhao Wang, Chi Jin
International Conference on Machine Learning (ICML) 2022.

Near-Optimal Learning of Extensive-Form Games with Imperfect Information [arXiv]

(α-β order) Yu Bai, Chi Jin, Song Mei, Tiancheng Yu
International Conference on Machine Learning (ICML) 2022.

Provable Reinforcement Learning with a Short-Term Memory [arXiv]

(α-β order) Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi
International Conference on Machine Learning (ICML) 2022.

A Simple Reward-free Approach to Constrained Reinforcement Learning [arXiv]

Sobhan Miryoosefi, Chi Jin
International Conference on Machine Learning (ICML) 2022.

The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces [arXiv]

(α-β order) Chi Jin, Qinghua Liu, Tiancheng Yu
International Conference on Machine Learning (ICML) 2022.

Understanding Domain Randomization for Sim-to-real Transfer [arXiv]

Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, Liwei Wang
International Conference on Learning Representations (ICLR) 2022

Minimax Optimization with Smooth Algorithmic Adversaries [arXiv]

(α-β order) Tanner Fiez, Chi Jin, Praneeth Netrapalli, Lillian J. Ratliff
International Conference on Learning Representations (ICLR) 2022

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms [arXiv]

(α-β order) Chi Jin, Qinghua Liu, Sobhan Miryoosefi
Neural Information Processing Systems (NeurIPS) 2021.

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games [arXiv]

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong
Neural Information Processing Systems (NeurIPS) 2021.

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning [arXiv]

(α-β order) Yaqi Duan, Chi Jin, Zhiyuan Li
International Conference on Machine Learning (ICML) 2021.

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network [arXiv]

Mo Zhou, Rong Ge, Chi Jin
Conference on Learning Theory (COLT) 2021.

Near-optimal Representation Learning for Linear Bandits and Linear RL [arXiv]

Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang
International Conference on Machine Learning (ICML) 2021.

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play [arXiv]

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin
International Conference on Machine Learning (ICML) 2021.

Provable Meta-Learning of Linear Representations [arXiv]

Nilesh Tripuraneni, Chi Jin, Michael I. Jordan
International Conference on Machine Learning (ICML) 2021.

On Nonconvex Optimization for Machine Learning: Gradients, Stochasticity, and Saddle Points [arXiv]

Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael I. Jordan
Journal of the ACM, 2021.

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs [arXiv]

(α-β order) Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu
Neural Information Processing Systems (NeurIPS) 2020.

Near-Optimal Reinforcement Learning with Self-Play [arXiv]

(α-β order) Yu Bai, Chi Jin, Tiancheng Yu
Neural Information Processing Systems (NeurIPS) 2020.

On the Theory of Transfer Learning: The Importance of Task Diversity [arXiv]

Nilesh Tripuraneni, Michael I. Jordan, Chi Jin
Neural Information Processing Systems (NeurIPS) 2020.

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [arXiv]

Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
Neural Information Processing Systems (NeurIPS) 2020.

Provable Self-Play Algorithms for Competitive Reinforcement Learning [arXiv]

(α-β order) Yu Bai, Chi Jin
International Conference on Machine Learning (ICML) 2020.

Reward-Free Exploration for Reinforcement Learning [arXiv]

(α-β order) Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu
International Conference on Machine Learning (ICML) 2020.

Near-Optimal Algorithms for Minimax Optimization [arXiv]

Tianyi Lin, Chi Jin, Michael I. Jordan
Conference on Learning Theory (COLT) 2020

Learning Adversarial MDPs with Bandit Feedback and Unknown Transition [arXiv]

(α-β order) Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu
International Conference on Machine Learning (ICML) 2020.

Provably Efficient Exploration in Policy Optimization [arXiv]

Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang
International Conference on Machine Learning (ICML) 2020.

Provably Efficient Reinforcement Learning with Linear Function Approximation [arXiv]

Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan
Conference on Learning Theory (COLT) 2020

What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? [arXiv]

Chi Jin, Praneeth Netrapalli, Michael I. Jordan
International Conference on Machine Learning (ICML) 2020.

On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems [arXiv]

Tianyi Lin, Chi Jin, Michael I. Jordan
International Conference on Machine Learning (ICML) 2020.

Sampling Can Be Faster Than Optimization [arXiv]

Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, Michael I. Jordan
Proceedings of the National Academy of Sciences (PNAS) 2019.

Is Q-learning Provably Efficient? [arXiv]

Chi Jin*, Zeyuan Allen-Zhu*, Sébastien Bubeck, Michael I. Jordan
Neural Information Processing Systems (NeurIPS) 2018. Best Paper in ICML 2018 workshop "Exploration in RL"

On the Local Minima of the Empirical Risk [arXiv]

Chi Jin*, Lydia T. Liu*, Rong Ge, Michael I. Jordan
(Spotlight) Neural Information Processing Systems (NeurIPS) 2018.

Stochastic Cubic Regularization for Fast Nonconvex Optimization [arXiv]

Nilesh Tripuraneni*, Mitchell Stern*, Chi Jin, Jeffrey Regier, Michael I. Jordan
(Oral) Neural Information Processing Systems (NeurIPS) 2018.

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent [arXiv]

Chi Jin, Praneeth Netrapalli, Michael I. Jordan
Conference on Learning Theory (COLT) 2018

Gradient Descent Can Take Exponential Time to Escape Saddle Points [arXiv]

Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabas Poczos, Aarti Singh
Neural Information Processing Systems (NIPS) 2017.

How to Escape Saddle Points Efficiently [arXiv] [blog]

Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan
International Conference on Machine Learning (ICML) 2017.

No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis [arXiv]

(α-β order) Rong Ge, Chi Jin, Yi Zheng
International Conference on Machine Learning (ICML) 2017.

Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot [arXiv]

(α-β order) Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli
Artificial Intelligence and Statistics Conference (AISTATS) 2017.

Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences [arXiv]

Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael I. Jordan
Neural Information Processing Systems (NIPS) 2016.

Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent [arXiv]

(α-β order) Chi Jin, Sham M. Kakade, Praneeth Netrapalli
Neural Information Processing Systems (NIPS) 2016.

Streaming PCA: Matching Matrix Bernstein and Near-Optimal Finite Sample Guarantees for Oja's Algorithm [arXiv]

(α-β order) Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, Aaron Sidford
Conference on Learning Theory (COLT) 2016.

Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis [arXiv]

(α-β order) Rong Ge, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, Aaron Sidford
International Conference on Machine Learning (ICML) 2016.

Faster Eigenvector Computation via Shift-and-Invert Preconditioning [arXiv]

(α-β order) Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford
International Conference on Machine Learning (ICML) 2016.

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition [arXiv]

(α-β order) Rong Ge, Furong Huang, Chi Jin, Yang Yuan
Conference on Learning Theory (COLT) 2015.

Differentially Private Data Releasing for Smooth Queries [paper]

Ziteng Wang, Chi Jin, Kai Fan, Jiaqi Zhang, Junliang Huang, Yiqiao Zhong, Liwei Wang
Journal of Machine Learning Research (JMLR) 2015.

Dimensionality Dependent PAC-Bayes Margin Bound [paper]

Chi Jin, Liwei Wang
Neural Information Processing Systems (NIPS) 2012.

Technical Reports

A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games [arXiv]

Zihan Ding, Dijia Su, Qinghua Liu, Chi Jin
ArXiv Preprint.

Stability and Convergence Trade-off of Iterative Optimization Algorithms [arXiv]

Yuansi Chen, Chi Jin, Bin Yu
ArXiv Preprint.

A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm [arXiv]

Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael I. Jordan
ArXiv Preprint.