Kuang, Q., Wang, J., Zhou, F., & Qi, Z.^ (2025). Breaking the Order Barrier: Off-Policy Evaluation for Confounded POMDPs. Advances in Neural Information Processing Systems (NeurIPS).
Hong, S., Wang, J., Qi, Z., & Wong, R. K. W. (2025). A Principled Path to Fitted Distributional Evaluation. Advances in Neural Information Processing Systems (NeurIPS). (Spotlight)
Bian, Z., Shi, C., Qi, Z., & Wang, L. (2025). Off-policy Evaluation in Doubly Inhomogeneous Environments. Journal of the American Statistical Association, 120(550), 1102–1114.
Tang, J., Qi, Z., Fang, E., & Shi, C. (2025). Offline Feature-Based Pricing under Censored Demand: A Causal Inference Approach. Manufacturing & Service Operations Management, 27(2), 535–553.
Hong, S., Qi, Z., & Wong, R. K. W. (2025). Distributional Off-policy Evaluation with Bellman Residual Minimization. International Conference on Artificial Intelligence and Statistics (AISTATS).
Qi, Z., Bai, C., Wang, Z., & Wang, L. (2025). Distributional Off-policy Evaluation in Reinforcement Learning. Journal of the American Statistical Association. (Articles in Advance).
Fu, Z., Qi, Z.^, Yang, Z., Wang, Z., & Wang, L. (2025). Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information. Management Science. (Articles in Advance).
Qi, Z.*^, Miao, R.#*, & Zhang, X. (2024). Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding. Journal of the American Statistical Association, 119(546), 915–928.
Shi, C.*, Qi, Z.*^, Wang, J., & Zhou, F.^ (2024). Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization. Journal of the American Statistical Association, 119(546), 1147–1160.
Yu, S., Fang, S., Peng, R., Qi, Z., Zhou, F., & Shi, C. (2024). Two-way Deconfounder for Off-policy Evaluation under Unmeasured Confounding. Advances in Neural Information Processing Systems (NeurIPS).
Liu, B., Qi, Z., Zhang, X., & Liu, Y. (2024). Change Point Detection for High-dimensional Linear Models: A General Tail-adaptive Approach. Statistica Sinica. (Accepted).
Wang, J., Qi, Z.^, & Wong, R. K. W. (2024). A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models. International Conference on Machine Learning (ICML).
Hong, M.#, Qi, Z.^, & Xu, Y. (2024). Model-based Reinforcement Learning for Confounded POMDPs. International Conference on Machine Learning (ICML).
Hong, M.#, Qi, Z., & Xu, Y. (2024). A Policy Gradient Method for Confounded POMDPs. International Conference on Learning Representations (ICLR).
Zhu, J., Wan, R., Qi, Z., Luo, S., & Shi, C. (2024). Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards. International Conference on Artificial Intelligence and Statistics (AISTATS).
Wang, J.#, Qi, Z.^, & Wong, R. K. W. (2023). Projected State-Action Balancing Weights for Offline Reinforcement Learning. The Annals of Statistics, 51(4), 1639–1665.
Qi, Z., Pang, J.-S., & Liu, Y. (2023). On Robustness of Individualized Decision Rules. Journal of the American Statistical Association, 118(543), 2143–2157.
Yang, H.#, Qi, Z.^, Cui, Y., & Chen, P. (2023). Pessimistic Model Selection for Deep Reinforcement Learning. Conference on Uncertainty in Artificial Intelligence (UAI).
Dong, J.#, Mo, W., Qi, Z.^, Shi, C., Fang, X., & Tarokh, V. (2023). PASTA: Pessimistic Assortment Optimization. International Conference on Machine Learning (ICML).
Zhou, Y.#, Qi, Z., Shi, C., & Li, L. (2023). Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach. International Conference on Artificial Intelligence and Statistics (AISTATS).
Liao, P.*, Qi, Z.*^, Wan, R., Klasnja, P., & Murphy, S. (2022). Batch Policy Learning in Average Reward Markov Decision Processes. The Annals of Statistics, 50(6), 3364–3387.
Qi, Z., Cui, Y., Liu, Y., & Pang, J.-S. (2022). Asymptotic Properties of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization. Mathematics of Operations Research, 47(3), 2034–2064.
Miao, R.#, Qi, Z.^, & Zhang, X. (2022). Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models. Advances in Neural Information Processing Systems (NeurIPS).
Tan, X.#, Qi, Z., Seymour, C., & Tang, L. (2022). RISE: Robust Individualized Decision Learning with Sensitive Variables. Advances in Neural Information Processing Systems (NeurIPS).
Chen, X., & Qi, Z.^ (2022). On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation. International Conference on Machine Learning (ICML).
Qi, Z., Cui, Y., Liu, Y., & Pang, J.-S. (2021). Estimation of Individualized Decision Rules Based on an Optimized Covariate-dependent Equivalent of Random Outcomes. SIAM Journal on Optimization, 31(4), 3119–3148.
Mo, W., Qi, Z., & Liu, Y. (2021). Learning Optimal Distributionally Robust Individualized Treatment Rules. Journal of the American Statistical Association, 116(534), 659–674.
Mo, W., Qi, Z., & Liu, Y. (2021). Rejoinder to "Learning Optimal Distributionally Robust Individualized Treatment Rules". Journal of the American Statistical Association, 116(534), 685–689.
Qi, Z., Liu, D., Fu, H., & Liu, Y. (2020). Multi-armed Angle-based Direct Learning for Estimating Optimal Individualized Treatment Rules with Various Outcomes. Journal of the American Statistical Association, 115(530), 678–691.
Zheng, J., Qi, Z., Tan, Y., & Dou, Y. (2019). How Mega is the Mega? Measuring the Spillover Effect of WeChat Using Graphical Models. Information Systems Research, 30(4), 1343–1362.
Qi, Z., & Liu, Y. (2019). Convex Bidirectional Large Margin Classifier. Technometrics, 61(2), 176–186.
Qi, Z., & Liu, Y. (2018). D-learning to Estimate Optimal Individualized Treatment Rules. Electronic Journal of Statistics, 12(2), 3601–3638.
Liang, S., Qi, Z., Qu, S., Zhu, J., Chiu, A. S., Jia, X., & Xu, M. (2016). Scaling of Global Input-output Networks. Physica A: Statistical Mechanics and its Applications, 452, 311–319.