Zhengling Qi

Selected Manuscripts

  • Robust Batch Policy Learning in Markov Decision Processes with Liao, P.

  • Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes with Fu, Z., Wang, Z., Yang, Z., Xu, Y., and Kosorok, M. R.

  • Personalized Pricing with Invalid Instrumental Variables: Identification, Estimation, and Policy Learning with Rui Miao, Cong Shi and Lin Lin

  • A New Estimator for Encouragement Design in Field Experiments When the Exclusion Restriction Is Violated with Guangying Chen, Cheng Lu, Tat Y. Chan, Dennis J. Zhang and Industry Collaborators.

  • A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing with Zeyu Bian, Cong Shi and Lan Wang

  • Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand with Korel Gundem

  • Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning with Korel Gundem, Juncheng Dong, Dennis Zhang and Vahid Tarokh.

  • InSPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization with Yu Li and Tian Lan.

  • Beyond Demand Estimation: Consumer Surplus Evaluation via Cumulative Propensity Weights with Zeyu Bian, Max Biggs and Ruijiang Gao.




Publications

Accepted

  • Sequential knockoffs for variable selection in reinforcement learning with Ma, T., Cai, H., Shi, C. and Laber, E. Journal of the American Statistical Association.

  • Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments with Wang, J. and Shi, C. Journal of the American Statistical Association.

  • Reinforcement Learning with Continuous Actions Under Unmeasured Confounding with Li, Y., Han, E., Hu, Y., Zhou, W., Cui, Y. and Zhu, R. Journal of the American Statistical Association.

2025

  • Kuang, Q., Wang, J., Zhou, F., & Qi, Z.^ (2025). Breaking the Order Barrier: Off-Policy Evaluation for Confounded POMDPs. Advances in Neural Information Processing Systems (NeurIPS).

  • Hong, S., Wang, J., Qi, Z., & Wong, R. K. W. (2025). A Principled Path to Fitted Distributional Evaluation. Advances in Neural Information Processing Systems (NeurIPS). (Spotlight)

  • Bian, Z., Shi, C., Qi, Z., & Wang, L. (2025). Off-policy Evaluation in Doubly Inhomogeneous Environments. Journal of the American Statistical Association, 120(550), 1102–1114.

  • Tang, J., Qi, Z., Fang, E., & Shi, C. (2025). Offline Feature-Based Pricing under Censored Demand: A Causal Inference Approach. Manufacturing & Service Operations Management, 27(2), 535–553.

  • Hong, S., Qi, Z., & Wong, R. K. W. (2025). Distributional Off-policy Evaluation with Bellman Residual Minimization. International Conference on Artificial Intelligence and Statistics (AISTATS).

  • Qi, Z., Bai, C., Wang, Z., & Wang, L. (2025). Distributional Off-policy Evaluation in Reinforcement Learning. Journal of the American Statistical Association. (Articles in Advance).

  • Fu, Z., Qi, Z.^, Yang, Z., Wang, Z., & Wang, L. (2025). Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information. Management Science. (Articles in Advance).

2024

  • Qi, Z.*^, Miao, R.#*, & Zhang, X. (2024). Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding. Journal of the American Statistical Association, 119(546), 915–928.

  • Shi, C.*, Qi, Z.*^, Wang, J., & Zhou, F.^ (2024). Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization. Journal of the American Statistical Association, 119(546), 1147–1160.

  • Yu, S., Fang, S., Peng, R., Qi, Z., Zhou, F., & Shi, C. (2024). Two-way Deconfounder for Off-policy Evaluation under Unmeasured Confounding. Advances in Neural Information Processing Systems (NeurIPS).

  • Liu, B., Qi, Z., Zhang, X., & Liu, Y. (2024). Change Point Detection for High-dimensional Linear Models: A General Tail-adaptive Approach. Statistica Sinica. (Accepted).

  • Wang, J., Qi, Z.^, & Wong, R. K. W. (2024). A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models. International Conference on Machine Learning (ICML).

  • Hong, M.#, Qi, Z.^, & Xu, Y. (2024). Model-based Reinforcement Learning for Confounded POMDPs. International Conference on Machine Learning (ICML).

  • Hong, M.#, Qi, Z., & Xu, Y. (2024). A Policy Gradient Method for Confounded POMDPs. International Conference on Learning Representations (ICLR).

  • Zhu, J., Wan, R., Qi, Z., Luo, S., & Shi, C. (2024). Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards. International Conference on Artificial Intelligence and Statistics (AISTATS).

2023

  • Wang, J.#, Qi, Z.^, & Wong, R. K. W. (2023). Projected State-Action Balancing Weights for Offline Reinforcement Learning. The Annals of Statistics, 51(4), 1639–1665.

  • Qi, Z., Pang, J.-S., & Liu, Y. (2023). On Robustness of Individualized Decision Rules. Journal of the American Statistical Association, 118(543), 2143–2157.

  • Yang, H.#, Qi, Z.^, Cui, Y., & Chen, P. (2023). Pessimistic Model Selection for Deep Reinforcement Learning. Conference on Uncertainty in Artificial Intelligence (UAI).

  • Dong, J.#, Mo, W., Qi, Z.^, Shi, C., Fang, X., & Tarokh, V. (2023). PASTA: Pessimistic Assortment Optimization. International Conference on Machine Learning (ICML).

  • Zhou, Y.#, Qi, Z., Shi, C., & Li, L. (2023). Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach. International Conference on Artificial Intelligence and Statistics (AISTATS).

2022

  • Liao, P.*, Qi, Z.*^, Wan, R., Klasnja, P., & Murphy, S. (2022). Batch Policy Learning in Average Reward Markov Decision Processes. The Annals of Statistics, 50(6), 3364–3387.

  • Qi, Z., Cui, Y., Liu, Y., & Pang, J.-S. (2022). Asymptotic Properties of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization. Mathematics of Operations Research, 47(3), 2034–2064.

  • Miao, R.#, Qi, Z.^, & Zhang, X. (2022). Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models. Advances in Neural Information Processing Systems (NeurIPS).

  • Tan, X.#, Qi, Z., Seymour, C., & Tang, L. (2022). RISE: Robust Individualized Decision Learning with Sensitive Variables. Advances in Neural Information Processing Systems (NeurIPS).

  • Chen, X., & Qi, Z.^ (2022). On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation. International Conference on Machine Learning (ICML).

2021 and Prior

  • Qi, Z., Cui, Y., Liu, Y., & Pang, J.-S. (2021). Estimation of Individualized Decision Rules Based on An Optimized Covariate-dependent Equivalent of Random Outcomes. SIAM Journal on Optimization, 31(4), 3119–3148.

  • Mo, W., Qi, Z., & Liu, Y. (2021). Learning Optimal Distributionally Robust Individualized Treatment Rules. Journal of the American Statistical Association, 116(534), 659–674.

  • Mo, W., Qi, Z., & Liu, Y. (2021). Rejoinder to "Learning Optimal Distributionally Robust Individualized Treatment Rules". Journal of the American Statistical Association, 116(534), 685–689.

  • Qi, Z., Liu, D., Fu, H., & Liu, Y. (2020). Multi-armed Angle-based Direct Learning for Estimating Optimal Individualized Treatment Rules with Various Outcomes. Journal of the American Statistical Association, 115(530), 678–691.

  • Zheng, J., Qi, Z., Tan, Y., & Dou, Y. (2019). How Mega is the Mega? Measuring the Spillover Effect of WeChat Using Graphical Models. Information Systems Research, 30(4), 1343–1362.

  • Qi, Z., & Liu, Y. (2019). Convex Bidirectional Large Margin Classifier. Technometrics, 61(2), 176–186.

  • Qi, Z., & Liu, Y. (2018). D-learning to Estimate Optimal Individualized Treatment Rules. Electronic Journal of Statistics, 12(2), 3601–3638.

  • Liang, S., Qi, Z., Qu, S., Zhu, J., Chiu, A. S., Jia, X., & Xu, M. (2016). Scaling of Global Input-output Networks. Physica A: Statistical Mechanics and its Applications, 452, 311–319.

* These authors contributed equally to the manuscript.

# Ph.D. student at the time of submission.

^ Corresponding author.

