Selected Publications
* indicates co-first author; + indicates corresponding author
Data Integration
Gu T, Han Y, Duan R+. Robust angle-based transfer learning in high dimensions. [arXiv]
Gu T, Lee PH, Duan R+. COMMUTE: Communication-efficient transfer learning for multi-site risk prediction. Journal of Biomedical Informatics. 2022 [paper]
Li S, Cai T, Duan R+. Targeting Underrepresented Populations in Precision Medicine: A Federated Transfer Learning Approach. [arXiv]
Yan Z, Zachrison KS, Schwamm LH, Estrada JJ, Duan R+. Fed-GLMM: A Privacy-Preserving and Computation-Efficient Federated Algorithm for Generalized Linear Mixed Models to Analyze Correlated Electronic Health Records Data. 2022 [medRxiv]
Han L, Hou J, Cho K, Duan R+, Cai T+. Federated Adaptive Causal Estimation (FACE) of Target Treatment Effects. [arXiv]
Duan R, Ning Y, Chen Y. Heterogeneity-aware and communication efficient distributed statistical inference. Biometrika (2022). [arXiv][paper]
Luo C, Islam M, Sheils NE, Buresh J, Reps J, Schuemie MJ, Ryan PB, Edmondson M, Duan R, Tong J, Marks-Anglin A. DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models. Nature Communications (2022) [paper]
Tong J, Luo C, Islam MN, Sheils NE, Buresh J, Edmondson M, Merkel PA, Lautenbach E, Duan R, Chen Y. Distributed learning for heterogeneous clinical data with application to integrating COVID-19 data across 230 sites. NPJ digital medicine. 2022 [paper]
Liu X, Duan R, Luo C, Ogdie A, Moore JH, Kranzler HR, Bian J, Chen Y. Multisite learning of high-dimensional heterogeneous data with applications to opioid use disorder study of 15,000 patients across 5 clinical sites. Scientific Reports. 2022 [paper]
Li R*, Duan R*, Zhang X*, Lumley T, Pendergrass S, Bauer C, Hakonarson H, Carrell D, Smoller J, Wei W, Carroll R, Edwards D, Wiesner G, Sleiman P, Denny J, Mosley J, Ritchie M, Chen Y, Moore JH. Lossless integration of multiple electronic health records for identifying pleiotropy using summary statistics. Nature Communications 12, 168 (2021). [paper]
Duan R, Boland MR, Liu Z, Liu Y, Chang HH, Xu H, Chu H, Schmid CH, Forrest CB, Holmes JH, Schuemie MJ. Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm. Journal of the American Medical Informatics Association. 2020 Mar;27(3):376-85. [paper]
Duan R, Luo C, Schuemie MJ, Tong J, Liang CJ, Chang HH, Boland MR, Bian J, Xu H, Holmes JH, Forrest CB, Morton SC, Berlin JA, Moore JH, Mahoney KB, Chen Y. Learning from local to global: An efficient distributed algorithm for modeling time-to-event data. Journal of the American Medical Informatics Association. 2020 Jul;27(7):1028-1036. [paper]
Duan R, Boland MR, Moore JH, Chen Y. ODAL: A one-shot distributed algorithm to perform logistic regressions on electronic health records data from multiple clinical sites. Pacific Symposium on Biocomputing 2019, 24, 30–41. [paper]
Duan R, Chen Z, Tong J, Luo C, Lyu T, Tao C, Maraganore D, Bian J, Chen Y. Leverage real-world longitudinal data in large clinical research networks for Alzheimer’s disease and related dementia. Annual Symposium proceedings 2020 (in press). [medRxiv]
Genetic Association Test, Risk Prediction and Pleiotropy
Li R*, Duan R*, Zhang X*, Lumley T, Pendergrass S, Bauer C, Hakonarson H, Carrell D, Smoller J, Wei W, Carroll R, Edwards D, Wiesner G, Sleiman P, Denny J, Mosley J, Ritchie M, Chen Y, Moore JH. Lossless integration of multiple electronic health records for identifying pleiotropy using summary statistics. Nature Communications 12, 168 (2021). [paper]
Duan R, Ning Y, Wang S, Lindsay BG, Carroll RJ, Chen Y. A fast score test for generalized mixture models. Biometrics, 76(3), 811–820. [paper]
Li R*, Duan R*, Kember RL, Rader DJ, Damrauer SM, Moore JH, Chen Y. A regression framework to uncover pleiotropy in large-scale electronic health record data. Journal of the American Medical Informatics Association 26(10), 1083–1090. [paper]
Li R, Tong J, Duan R, Chen Y, Moore JH. Evaluation of phenotyping errors on polygenic risk score predictions. Bioinformatics (in press). [BioRxiv]
Measurement Errors and Missing Data in EHR
Duan R, Cao M, Wu Y, Huang J, Denny JC, Xu H, Chen Y. An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies. Annual Symposium proceedings. AMIA Symposium, 2016, 1764–1773. [paper]
Huang J*, Duan R*, Hubbard RA, Wu Y, Moore JH, Xu H, Chen Y. PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data. Journal of the American Medical Informatics Association, 25(3), 345–352. [paper]
Hubbard RA, Tong J, Duan R, Chen Y. Reducing Bias Due to Outcome Misclassification for Epidemiologic Studies Using EHR-derived Probabilistic Phenotypes. Epidemiology, 31(4), 542–550. [paper]
Duan R, Liang CJ, Shaw P, Tang CY, Chen Y. Missing at Random or Not: A Semiparametric Testing Approach. [arXiv]
Pharmacovigilance and Drug Safety
Du J, Huang J, Duan R, Chen Y, Tao C. Comparing the Human Papillomavirus Vaccination Opinions Trends from Different Twitter User Groups with a Machine Learning Based System and Semiparametric Nonlinear Regression. Studies in health technology and informatics, 245, 1218. [paper]
Huang J, Zhang X, Du J, Duan R, Yang L, Moore JH, Chen Y, Tao C. Comparing Different Adverse Effects Among Multiple Drugs Using FAERS Data. Studies in health technology and informatics, 245, 1268. [paper]
Duan R, Zhang X, Du J, Huang J, Tao C, Chen Y. Post-marketing Drug Safety Evaluation using Data Mining Based on FAERS. Data Mining and Big Data: Second International Conference, DMBD 2017. [paper]
Huang J, Du J, Duan R, Zhang X, Tao C, Chen Y. Characterization of the Differential Adverse Event Rates by Race/Ethnicity Groups for HPV Vaccine by Integrating Data From Different Sources. Frontiers in pharmacology, 9, 539. [paper]
Zhang X*, Duan R*, Du J, Huang J, Chen Y, Tao C. Comparing Pharmacovigilance Outcomes Between FAERS and EMR Data for Acute Mania Patients. IEEE International Conference on Healthcare Informatics Workshop (ICHI-W) 2018 Jun 4 (pp. 57-59). [paper]
Huang J, Zhang X, Tong J, Du J, Duan R, Yang L, Moore JH, Chen Y, Tao C. Comparing adverse effects of Hepatitis C drugs using FAERS data. IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2018 Dec 3 (pp. 1653-1656). [paper]
Huang J, Zhang X, Tong J, Du J, Duan R, Yang L, Moore JH, Tao C, Chen Y. Comparing drug safety of hepatitis C therapies using post-market data. BMC medical informatics and decision making, 19(Suppl 4), 147. [paper]
Duan R, Zhang X, Du J, Huang J, Tao C, Chen Y. On the evidence consistency of pharmacovigilance outcomes between Food and Drug Administration Adverse Event Reporting System and electronic medical record data for acute mania patients. Health informatics journal, 26(2), 753–764. [paper]
Meta-analysis
Lake ET, Sanders J, Duan R, Riman KA, Schoenauer KM, Chen Y. A Meta-Analysis of the Associations Between the Nurse Work Environment in Hospitals and 4 Sets of Outcomes. Medical care, 57(5), 353–361. [paper]
Wang L, Rouse B, Marks-Anglin A, Duan R, Shi Q, Quach K, Chen Y, Cameron C, Schmid CH, Li T. Rapid network meta-analysis using data from Food and Drug Administration approval packages is feasible but with limitations. Journal of clinical epidemiology, 114, 84–94. [paper]
Hong C, Duan R, Zeng L, Hubbard RA, Lumley T, Riley RD, Chu H, Kimmel SE, Chen Y. The Galaxy Plot: A New Visualization Tool for Bivariate Meta-Analysis Studies. American journal of epidemiology, 189(8), 861–869. [paper]
Luo C, Marks-Anglin A, Duan R, Lin L, Hong C, Chu H, Chen Y. Accounting for small-study effects using a bivariate trim and fill meta-analysis procedure. Statistics in Medicine. [medRxiv]
Duan R, Piao J, Marks-Anglin A, Tong J, Lin L, Chu H, Ning J, Chen Y. Testing for publication bias in meta-analysis under Copas selection model. [arXiv]
Duan R, Tong J, Lin L, Levine LD, Sammel MD, Stoddard J, Li T, Schmid CH, Chu H, Chen Y. PALM: Patient-centered Treatment Ranking via Large-scale Multivariate Network Meta-analysis. [medRxiv]