Publications
Luo, A., Du, J., Tian, F., Xian, X., Specht, R., Wang, G., Bi, X., Fleming, C., Srinivasa, J., Kundu, A., Hong, M. and Ding, J. (2025). Can agentic AI match the performance of human data scientists? IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), invited paper.
Luo, A., Xian, X., Du, J., Tian, F., Wang, G., Zhong, M., Zhao, S., Bi, X., Liu, Z., Zhou, J., Srinivasa, J., Kundu, A., Fleming, C., Hong, M. and Ding, J. (2025). AssistedDS: Benchmarking how external domain knowledge assists LLMs in automated data science. Conference on Empirical Methods in Natural Language Processing (EMNLP).
Qiu, L., Tang, C., Bi, X., Burtch, G., Chen, Y. and Zhang, H. (2025). Physician use of large language models: Quantitative study based on large-scale query-level data. Journal of Medical Internet Research, forthcoming.
Xian, X., Wang, G., Bi, X., Zhang, R., Srinivasa, J., Kundu, A., Fleming, C., Hong, M. and Ding, J. (2025). On the vulnerability of applying retrieval-augmented generation within knowledge-intensive application domains. International Conference on Machine Learning (ICML).
Xian, X., Wang, G., Bi, X., Srinivasa, J., Kundu, A., Hong, M. and Ding, J. (2024). RAW: A robust and agile plug-and-play watermark framework for AI-generated images with provable guarantees. Conference on Neural Information Processing Systems (NeurIPS).
Yang, M. and Bi, X. (2024). Cost-aware calibration of classifiers. INFORMS Journal on Data Science, 4, 101-113.
Wang, G., Xian, X., Kundu, A., Srinivasa, J., Bi, X., Hong, M. and Ding, J. (2024). Demystifying poisoning backdoor attacks from a statistical perspective. International Conference on Learning Representations (ICLR).
Chen, J., He, L., Liu, H., Yang, Y. and Bi, X. (2024). Background music recommendation on short video sharing platforms. Information Systems Research, 35, 1890-1908.
Travadi, Y., Peng, L., Bi, X., Sun, J. and Yang, M. (2024). Welfare and fairness dynamics in federated learning: A client selection perspective. Statistics and Its Interface, 17, 383-395.
Xian, X., Wang, G., Srinivasa, J., Kundu, A., Bi, X., Hong, M. and Ding, J. (2023). A unified framework for inference-stage backdoor defenses. Conference on Neural Information Processing Systems (NeurIPS).
Bi, X., Gupta, A. and Yang, M. (2023). Understanding partnership formation and repeated contributions in federated learning: An analytical investigation. Management Science, 70, 4974-4994.
Xian, X., Wang, G., Srinivasa, J., Kundu, A., Bi, X., Hong, M. and Ding, J. (2023). Understanding backdoor attacks through the adaptability hypothesis. International Conference on Machine Learning (ICML).
Bi, X., Yang, M. and Adomavicius, G. (2023). Consumer acquisition for recommender systems: A theoretical framework and empirical validations. Information Systems Research, 35, 339-362.
Bi, X. and Shen, X. (2023). Distribution-invariant differential privacy. Journal of Econometrics, 235, 444-453. [Python code]
Shen, X., Bi, X. and Shen, R. (2022). Data flush. Harvard Data Science Review, 4(2).
Bi, X.*, Feng, L.*, Li, C.* and Zhang, H. (2022). Modeling pregnancy outcomes through sequentially nested regression models. Journal of the American Statistical Association, 117, 602-616.
Bi, X., Adomavicius, G., Li, W. and Qu, A. (2022). A temporal recommendation engine for product sales forecasting. INFORMS Journal on Computing, 34, 1644-1660.
Zhang, Y., Bi, X., Tang, N.-S. and Qu, A. (2021). Dynamic tensor recommender systems. Journal of Machine Learning Research, 22, 1-35.
Feng, L.*, Bi, X.* and Zhang, H. (2021). Brain regions identified as being associated with verbal reasoning through the use of imaging regression via internal variation. Journal of the American Statistical Association, 116, 144-158.
Media Coverage: HealthITAnalytics, University of Minnesota News and Events
Bi, X.*, Tang, X.*, Yuan, Y.*, Zhang, Y.* and Qu, A.* (2021). Tensors in statistics. Annual Review of Statistics and Its Application, 8, 345-368.
Tang, X., Bi, X. and Qu, A. (2020). Individualized multilayer tensor learning with an application in imaging analysis. Journal of the American Statistical Association, 115, 836-851.
Winner of Student Paper Award (ASA SLDS Section, 2017)
Wang, Y., Bi, X. and Qu, A. (2020). A logistic factorization model for recommender systems with multinomial responses. Journal of Computational and Graphical Statistics, 29, 396-404.
Bi, X., Feng, L., Wang, S., Lin, Z., Li, T., Zhao, B., Zhu, H. and Zhang, H. (2019). Common genetic variants have associations with human cortical brain regions and risk of schizophrenia. Genetic Epidemiology, 43, 548-558.
Bi, X. and Qu, A. (2018). A mixed-effects estimating equation approach to nonignorable missing longitudinal data with refreshment samples. Statistica Sinica, 28, 1653-1675. [pdf] [supp]
Bi, X., Qu, A. and Shen, X. (2017). Multilayer tensor factorization with applications to recommender systems. Annals of Statistics, 46, 3308-3333. [pdf] [Matlab code]
Zhang, H., Liu, D., Zhao, J. and Bi, X. (2018). Modeling hybrid traits for comorbidity and genetic studies of alcohol and nicotine co-dependence. Annals of Applied Statistics, 12, 2359-2378.
Bi, X., Qu, A., Wang, J. and Shen, X. (2017). A group-specific recommender system. Journal of the American Statistical Association, 112, 1344-1353. [pdf] [supp] [R package] [TensorFlow code] [Matlab code]
Winner of Student Paper Award (ASA SLDS Section, 2016)
Bi, X., Yang, L., Li, T., Wang, B., Zhu, H. and Zhang, H. (2017). Genome-wide mediation analysis of psychiatric and cognitive traits through imaging phenotypes. Human Brain Mapping, 38, 4088-4097. [pdf]
Clark, S. E., Purcell, J. E., Bi, X. and Fortman, J. D. (2017). Cross-foster rederivation compared with antibiotic administration in the drinking water to eradicate Bordetella pseudohinzii. Journal of the American Association for Laboratory Animal Science, 56, 1-5. [link]
Liechty, J. M., Bi, X. and Qu, A. (2016). Feasibility and validity of a statistical adjustment to reduce self-reported bias of height and weight in wave 1 of the Add Health Study. BMC Medical Research Methodology, 16(1): 124. [pdf]
Bi, X. and Qu, A. (2015). Sufficient dimension reduction for longitudinal data. Statistica Sinica, 25, 787-807. [pdf] [supp]
*Equal contribution
Patents
Bi, X. and Shen, X. Distribution-invariant data protection mechanism. US Patent 12,248,613, filed March 31, 2022, and issued March 11, 2025.