From set convergence to pointwise convergence: Finite-time guarantees for average-reward Q-learning with adaptive stepsizes
Zaiwei Chen and Phalguni Nanda
Under review at Mathematics of Operations Research | 2026
Achieving ϵ⁻² dependence for average-reward Q-learning with a new contraction principle
Zijun Chen, Zaiwei Chen, Nian Si, and Shengbo Wang
Under review | 2026
A non-asymptotic theory of seminorm Lyapunov stability: From deterministic to stochastic iterative algorithms
Zaiwei Chen, Sheng Zhang, Zhe Zhang, Shaan Ul Haque, and Siva Theja Maguluri
Minor revision at Mathematics of Operations Research | 2026
A minimal-assumption analysis of Q-learning with time-varying policies (Best Paper Award Finalist)
Phalguni Nanda and Zaiwei Chen
SIGMETRICS | 2026
Concentration of contractive stochastic approximation: Additive and multiplicative noise
Zaiwei Chen, Siva Theja Maguluri, and Martin Zubeldia
The Annals of Applied Probability | 2025
An approximate policy iteration viewpoint of actor–critic algorithms
Zaiwei Chen and Siva Theja Maguluri
Automatica | 2025
A Lyapunov theory for finite-sample guarantees of Markovian stochastic approximation
Zaiwei Chen, Siva Theja Maguluri, Karthikeyan Shanmugam, and Sanjay Shakkottai
Operations Research | 2023
Target network and truncation overcome the deadly triad in Q-learning
Zaiwei Chen, John-Paul Clarke, and Siva Theja Maguluri
SIAM Journal on Mathematics of Data Science | 2023
Global convergence of localized policy iteration in networked multi-agent reinforcement learning
Yizhou Zhang, Guannan Qu, Pan Xu, Yiheng Lin, Zaiwei Chen, and Adam Wierman
Proceedings of the ACM on Measurement and Analysis of Computing Systems | 2023
Stationary behavior of constant stepsize SGD-type algorithms: An asymptotic characterization
Zaiwei Chen*, Shancong Mou*, and Siva Theja Maguluri
Proceedings of the ACM on Measurement and Analysis of Computing Systems | 2022
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
Zaiwei Chen, Sheng Zhang, Thinh T. Doan, John-Paul Clarke, and Siva Theja Maguluri
Automatica | 2022
Finite-sample analysis of off-policy natural actor–critic with linear function approximation
Zaiwei Chen*, Sajad Khodadadian*, and Siva Theja Maguluri
IEEE Control Systems Letters | 2022
Nested vehicle routing problem: Optimizing drone-truck surveillance operations
Fanruiqi Zeng, Zaiwei Chen, John-Paul Clarke, and David Goldsman
Transportation Research Part C | 2022
Natural hypergradient descent: Algorithm design, convergence analysis, and parallel implementation
Deyi Kong, Zaiwei Chen, Shuzhong Zhang, and Shancong Mou
ICML | 2026
Bridging the gap between average and discounted TD-learning
Haoxing Tian, Zaiwei Chen, Ioannis Paschalidis, and Alex Olshevsky
ICML | 2026
Non-asymptotic guarantees for average-reward Q-learning with adaptive stepsizes
Zaiwei Chen
NeurIPS | 2025
Maximizing the value of predictions in control: Accuracy is not enough
Yiheng Lin, Christopher Yeh, Zaiwei Chen, and Adam Wierman
NeurIPS | 2025
Reinforcement learning with imperfect transition predictions: A Bellman-Jensen approach (spotlight)
Chenbei Lu, Zaiwei Chen, Tongxin Li, Chenye Wu, and Adam Wierman
NeurIPS | 2025
Overcoming the curse of dimensionality in reinforcement learning through approximate factorization
Chenbei Lu, Laixi Shi, Zaiwei Chen, Chenye Wu, and Adam Wierman
ICML | 2025
Approximate global convergence of independent learning in multi-agent systems
Ruiyang Jin, Zaiwei Chen, Yiheng Lin, Jie Song, and Adam Wierman
AISTATS | 2025
Last-iterate convergence for generalized Frank-Wolfe in monotone variational inequalities
Zaiwei Chen and Eric Mazumdar
NeurIPS | 2024
Two-timescale Q-learning with function approximation in zero-sum stochastic games
Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, and Adam Wierman
The ACM Conference on Economics and Computation | 2024
Convergence rates for localized actor-critic in networked Markov potential games
Zhaoyi Zhou, Zaiwei Chen, Yiheng Lin, and Adam Wierman
UAI | 2023
A finite-sample analysis of payoff-based independent learning in zero-sum stochastic games
Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, and Adam Wierman
NeurIPS | 2023
Sample complexity of policy-based methods under off-policy sampling and linear function approximation
Zaiwei Chen and Siva Theja Maguluri
AISTATS | 2022
Finite-sample analysis of off-policy TD-learning via generalized Bellman operators
Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam
NeurIPS | 2021
Finite-sample analysis of off-policy natural actor-critic algorithm
Sajad Khodadadian*, Zaiwei Chen*, and Siva Theja Maguluri
ICML | 2021
Finite-sample analysis of contractive stochastic approximation using smooth convex envelopes
Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam
NeurIPS | 2020