(α-β order) denotes alphabetical ordering; * denotes equal contribution.
Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction [arXiv]
Yong Lin, Shange Tang, Bohan Lyu, Ziran Yang, Jui-Hui Chung, Haoyu Zhao, Lai Jiang, Yihan Geng, Jiawei Ge, Jingruo Sun, Jiayun Wu, Jiri Gesi, Ximing Lu, David Acuna, Kaiyu Yang, Hongzhou Lin, Yejin Choi, Danqi Chen, Sanjeev Arora, Chi Jin
ArXiv Preprint
LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra [arXiv]
Seth Karten, Wenzhe Li, Zihan Ding, Samuel Kleiner, Yu Bai, Chi Jin
ArXiv Preprint
Learning World Models for Interactive Video Generation [arXiv]
Taiye Chen*, Xun Hu*, Zihan Ding*, Chi Jin
ArXiv Preprint
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities [arXiv]
Haoyu Zhao, Yihan Geng, Shange Tang, Yong Lin, Bohan Lyu, Hongzhou Lin, Chi Jin, Sanjeev Arora
ArXiv Preprint
Frontier LLMs Still Struggle with Simple Reasoning Tasks [arXiv]
Alan Malek, Jiawei Ge, Nevena Lazic, Chi Jin, András György, Csaba Szepesvári
ArXiv Preprint
Generative Diffusion Modeling: A Practical Handbook [arXiv]
Zihan Ding, Chi Jin
ArXiv Preprint
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial? [arXiv]
Wenzhe Li*, Yong Lin*, Mengzhou Xia, Chi Jin
ArXiv Preprint
On Limitation of Transformer for Learning HMMs [arXiv]
Jiachen Hu, Qinghua Liu, Chi Jin
ArXiv Preprint
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving [arXiv]
Yong Lin*, Shange Tang*, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin
Conference on Language Modeling (COLM) 2025
DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization [arXiv]
Zihan Ding, Chi Jin, Difan Liu, Haitian Zheng, Krishna Kumar Singh, Qiang Zhang, Yan Kang, Zhe Lin, Yuchen Liu
International Conference on Computer Vision (ICCV) 2025
PokéChamp: an Expert-level Minimax Language Agent for Competitive Pokémon [arXiv]
Seth Karten, Andy Luu Nguyen, Chi Jin
International Conference on Machine Learning (ICML) 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations [arXiv]
Kaixuan Huang, Jiacheng Guo, Zihao Li, Xiang Ji, Jiawei Ge, Wenzhe Li, Yingqing Guo, Tianle Cai, Hui Yuan, Runzhe Wang, Yue Wu, Ming Yin, Shange Tang, Yangsibo Huang, Chi Jin, Xinyun Chen, Chiyuan Zhang, Mengdi Wang
International Conference on Machine Learning (ICML) 2025
Building Math Agents with Multi-Turn Iterative Preference Learning [arXiv]
Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu
International Conference on Learning Representations (ICLR) 2025
Is RLHF More Difficult than Standard RL? [arXiv]
Yuanhao Wang, Qinghua Liu, Chi Jin
Neural Information Processing Systems (NeurIPS) 2023