Zaiwei Chen

Working Papers

From set convergence to pointwise convergence: Finite-time guarantees for average-reward Q-learning with adaptive stepsizes

Zaiwei Chen and Phalguni Nanda

Under review at Mathematics of Operations Research | 2026

Arxiv

Achieving ϵ⁻² dependence for average-reward Q-learning with a new contraction principle

Zijun Chen, Zaiwei Chen, Nian Si, and Shengbo Wang

Under review | 2026

Arxiv

A non-asymptotic theory of seminorm Lyapunov stability: From deterministic to stochastic iterative algorithms

Zaiwei Chen, Sheng Zhang, Zhe Zhang, Shaan Ul Haque, and Siva Theja Maguluri

Minor revision at Mathematics of Operations Research | 2026

Arxiv

Journal Publications

A minimal-assumption analysis of Q-learning with time-varying policies (Best Paper Award Finalist)

Phalguni Nanda and Zaiwei Chen

SIGMETRICS | 2026

Arxiv

Concentration of contractive stochastic approximation: Additive and multiplicative noise

Zaiwei Chen, Siva Theja Maguluri, and Martin Zubeldia

The Annals of Applied Probability | 2025

Paper

An approximate policy iteration viewpoint of actor–critic algorithms

Zaiwei Chen and Siva Theja Maguluri

Automatica | 2025

Paper

A Lyapunov theory for finite-sample guarantees of Markovian stochastic approximation

Zaiwei Chen, Siva Theja Maguluri, Karthikeyan Shanmugam, and Sanjay Shakkottai

Operations Research | 2023

Paper

Target network and truncation overcome the deadly triad in Q-learning

Zaiwei Chen, John-Paul Clarke, and Siva Theja Maguluri

SIAM Journal on Mathematics of Data Science | 2023

Paper

Global convergence of localized policy iteration in networked multi-agent reinforcement learning

Yizhou Zhang, Guannan Qu, Pan Xu, Yiheng Lin, Zaiwei Chen, and Adam Wierman

Proceedings of the ACM on Measurement and Analysis of Computing Systems | 2023

Paper

Stationary behavior of constant stepsize SGD-type algorithms: An asymptotic characterization

Zaiwei Chen*, Shancong Mou*, and Siva Theja Maguluri

Proceedings of the ACM on Measurement and Analysis of Computing Systems | 2022

Paper

Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning

Zaiwei Chen, Sheng Zhang, Thinh T. Doan, John-Paul Clarke, and Siva Theja Maguluri

Automatica | 2022

Paper

Finite-sample analysis of off-policy natural actor–critic with linear function approximation

Zaiwei Chen*, Sajad Khodadadian*, and Siva Theja Maguluri

IEEE Control Systems Letters | 2022

Paper

Nested vehicle routing problem: Optimizing drone-truck surveillance operations

Fanruiqi Zeng, Zaiwei Chen, John-Paul Clarke, and David Goldsman

Transportation Research Part C | 2022

Paper

Conference Proceedings

Natural hypergradient descent: Algorithm design, convergence analysis, and parallel implementation

Deyi Kong, Zaiwei Chen, Shuzhong Zhang, and Shancong Mou

ICML | 2026

Paper

Bridging the gap between average and discounted TD-learning

Haoxing Tian, Zaiwei Chen, Ioannis Paschalidis, and Alex Olshevsky

ICML | 2026

Paper

Non-asymptotic guarantees for average-reward Q-learning with adaptive stepsizes

Zaiwei Chen

NeurIPS | 2025

Paper

Maximizing the value of predictions in control: Accuracy is not enough

Yiheng Lin, Christopher Yeh, Zaiwei Chen, and Adam Wierman

NeurIPS | 2025

Paper

Reinforcement learning with imperfect transition predictions: A Bellman-Jensen approach (spotlight)

Chenbei Lu, Zaiwei Chen, Tongxin Li, Chenye Wu, and Adam Wierman

NeurIPS | 2025

Paper

Overcoming the curse of dimensionality in reinforcement learning through approximate factorization

Chenbei Lu, Laixi Shi, Zaiwei Chen, Chenye Wu, and Adam Wierman

ICML | 2025

Paper

Approximate global convergence of independent learning in multi-agent systems

Ruiyang Jin, Zaiwei Chen, Yiheng Lin, Jie Song, and Adam Wierman

AISTATS | 2025

Paper

Last-iterate convergence for generalized Frank-Wolfe in monotone variational inequalities

Zaiwei Chen and Eric Mazumdar

NeurIPS | 2024

Paper

Two-timescale Q-learning with function approximation in zero-sum stochastic games

Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, and Adam Wierman

The ACM Conference on Economics and Computation | 2024

Paper

Convergence rates for localized actor-critic in networked Markov potential games

Zhaoyi Zhou, Zaiwei Chen, Yiheng Lin, and Adam Wierman

UAI | 2023

Paper

A finite-sample analysis of payoff-based independent learning in zero-sum stochastic games

Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, and Adam Wierman

NeurIPS | 2023

Paper

Sample complexity of policy-based methods under off-policy sampling and linear function approximation

Zaiwei Chen and Siva Theja Maguluri

AISTATS | 2022

Paper

Finite-sample analysis of off-policy TD-learning via generalized Bellman operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam

NeurIPS | 2021

Paper

Finite-sample analysis of off-policy natural actor-critic algorithm

Sajad Khodadadian*, Zaiwei Chen*, and Siva Theja Maguluri

ICML | 2021

Paper

Finite-sample analysis of contractive stochastic approximation using smooth convex envelopes

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, and Karthikeyan Shanmugam

NeurIPS | 2020

Paper
