Selected Manuscripts
Off-policy Evaluation in Doubly Inhomogeneous Environments. With Bian, Z., Shi, C. and Wang, L.
Sequential knockoffs for variable selection in reinforcement learning. With Ma, T., Cai, H., Shi, C. and Laber, E.
STEEL: Singularity-aware Reinforcement Learning. With Chen, X. and Wan, R.
Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments. With Wang, J. and Shi, C.
Change Point Detection for High-dimensional Linear Models: A General Tail-adaptive Approach. With Liu, B., Zhang, X., and Liu, Y.
Offline Personalized Pricing with Censored Demand. With Tang, J., Fang, X., and Shi, C.
Robust Batch Policy Learning in Markov Decision Processes. With Liao, P.
Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information. With Fu, Z., Yang, Z., Wang, Z., and Wang, L.
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes. With Fu, Z., Wang, Z., Yang, Z., Xu, Y., and Kosorok, M. R.
Personalized Pricing with Invalid Instrumental Variables: Identification, Estimation, and Policy Learning. With Miao, R., Shi, C. and Lin, L.
Publications
Wang, J., Qi, Z., Wong, R. K. W. A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models. International Conference on Machine Learning (ICML) 2024.
Hong, M.#, Qi, Z., Xu, Y. Model-based Reinforcement Learning for Confounded POMDPs. International Conference on Machine Learning (ICML) 2024.
Zhu, J., Wan, R., Qi, Z., Luo, S., Shi, C. Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards. AISTATS 2024.
Hong, M.#, Qi, Z., Xu, Y. A Policy Gradient Method for Confounded POMDPs. International Conference on Learning Representations (ICLR) 2024. [JSM Student Paper Award from the ASA Section on Nonparametric Statistics]
Shi, C.*, Qi, Z.*, Wang, J., Zhou, F. Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization. Journal of the American Statistical Association. To appear. [code]
Wang, J.#, Qi, Z., Wong, R. K. W. Projected State-Action Balancing Weights for Offline Reinforcement Learning. Annals of Statistics. To appear.
Yang, H.#, Qi, Z., Cui, Y., Chen, P. (2023) Pessimistic model selection for deep reinforcement learning. Conference on Uncertainty in Artificial Intelligence.
Dong, J.#, Mo, W., Qi, Z., Shi, C., Fang, X., and Tarokh, V. PASTA: Pessimistic Assortment Optimization. International Conference on Machine Learning (ICML 2023), Accepted.
Zhou, Y.#, Qi, Z., Shi, C. and Li, L. (2023). Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach, AISTATS 2023. [code]
Qi, Z.*, Miao, R.*#, Zhang, X. Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding. Journal of the American Statistical Association. To appear. [code]
Liao, P.*, Qi, Z.*, Wan, R., Klasnja, P., and Murphy, S. Batch Policy Learning in Average Reward Markov Decision Processes. Accepted at Annals of Statistics.
Miao, R.#, Qi, Z., Zhang, X. Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models. Accepted at Advances in Neural Information Processing Systems (NeurIPS), 2022 [code]
Tan, X.#, Qi, Z., Seymour, C., Tang, L. RISE: Robust Individualized Decision Learning with Sensitive Variables. Accepted at Advances in Neural Information Processing Systems (NeurIPS), 2022 [ENAR 2023 Distinguished Student Paper Award] [code]
Chen, X., Qi, Z. On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation. International Conference on Machine Learning (ICML 2022), Baltimore, MD.
Qi, Z., Pang, J.-S., Liu, Y. On Robustness of Individualized Decision Rules. Journal of the American Statistical Association. To appear.
Qi, Z., Cui, Y., Liu, Y., Pang, J.-S. Asymptotic Properties of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization. Accepted at Mathematics of Operations Research.
Mo, W.*, Qi, Z.*, Liu, Y. Rejoinder to "Learning Optimal Distributionally Robust Individualized Treatment Rules". Journal of the American Statistical Association. To appear.
Mo, W.*, Qi, Z.*, Liu, Y. Learning Optimal Distributionally Robust Individualized Treatment Rules (with discussion). Journal of the American Statistical Association. To appear.
Qi, Z., Cui, Y., Liu, Y., Pang, J.-S. Estimation of Individualized Decision Rules Based on An Optimized Covariate-dependent Equivalent of Random Outcomes. To appear at SIAM Journal on Optimization.
Zheng, J., Qi, Z., Tan, Y., and Dou, Y. How Mega is the Mega? Measuring the Spillover Effect of WeChat Using Graphical Models. To appear at Information Systems Research.
Qi, Z., Liu, D., Fu, H., Liu, Y. (2018+). Multi-armed Angle-based Direct learning for Estimating Optimal Individualized Treatment Rules with Various Outcomes. Journal of the American Statistical Association.
Qi, Z., Liu, Y. (2018+). D-learning to Estimate Optimal Individualized Treatment Rules. Electronic Journal of Statistics.
Qi, Z., Liu, Y. (2018+). Convex Bidirectional Large Margin Classifier. Technometrics.
Liang, S., Qi, Z., Qu, S., Zhu, J., Chiu, A. S., Jia, X., and Xu, M. (2016). Scaling of global input-output networks. Physica A: Statistical Mechanics and its Applications, 452, 311-319.
* These authors contributed equally to the manuscript.
# Ph.D. students at the time of submission.