Selected Peer-Reviewed Publications (see my Google profile for a complete list)
The most representative papers, for which I and/or my students were the primary contributors, are marked in red. Mentees under my supervision are underscored. * denotes co-first authors and † denotes corresponding author(s).
Statistical Theory and Methodology
Chen G, Sullivan PF, Kosorok MR: Biclustering with heterogeneous variance. Proceedings of the National Academy of Sciences, 2013, 110(30): 12253--12258. (direct submission) [code]
Chen G, Liu Y, Shen D, Kosorok MR: Composite large margin classifiers with latent subclasses for heterogeneous biomedical data. Statistical Analysis and Data Mining, 2016, 9(2): 75--88. (an earlier version won ENAR 2015 student paper award).
Chen G, Zeng D, Kosorok MR: Personalized dose finding using outcome weighted learning (with discussion). Journal of the American Statistical Association, Theory and Methods, 2016, 111(516): 1509--1521. [code]
Chen G, Zeng D, Kosorok MR: Rejoinder "Personalized dose finding using outcome weighted learning". Journal of the American Statistical Association, Theory and Methods, 2016, 111(516): 1543--1547.
Tang ZZ*†, Chen G*†, Alekseyenko A: PERMANOVA-S: Association test for microbial community composition accommodating confounders and multiple distances. Bioinformatics, 2016, 32(17): 2618--2625. [code]
Zhu R*, Zhao YQ*, Chen G*, Ma S, Zhao H: Greedy tree learning of optimal personalized treatment rules. Biometrics, 2017, 73(2): 391--400.
Tang ZZ, Chen G, Alekseyenko A, Li H: A general framework for association analysis of microbial community on a taxonomic tree. Bioinformatics, 2017, 33(9): 1278--1285.
Tang ZZ, Chen G: Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis. Biostatistics, 2019, 20(4): 698--713.
Helgeson E, Liu Q, Chen G, Kosorok MR, Bair E: Biclustering via sparse clustering. Biometrics, 2020, 76(1): 348--358.
Zhao YQ, Zhu R, Chen G, Zheng Y: Constructing dynamic treatment regimes with shared parameters for censored data. Statistics in Medicine, 2020, 39(9): 1250--1263.
Giganti M, Shaw PA, Chen G, Bebawy S, Turner M, Sterling TR, Shepherd BE: Accounting for dependent errors in predictors and time-to-event outcomes using electronic health records, validation samples, and multiple imputation. Annals of Applied Statistics, 2020, 14(2): 1045--1061.
Tang ZZ, Sliwoski GR, Chen G, Jin B, Bush WS, Li B, Capra TA: Scan tests guided by protein structures discover rare-variant associations and signal regions. Genome Biology, 2020, 21:217.
Tang ZZ, Chen G: Robust and powerful differential composition tests for clustered microbiome data. Statistics in Biosciences, 2021, 13: 200--216.
Huling JD, Smith MA, Chen G†: A two-part framework for estimating individualized treatment rules from semi-continuous outcomes. Journal of the American Statistical Association, 2021, 116(533): 210--223. [code]
Chen R, Chen G†, Yu M: A generalizability score for aggregate causal effects. Biostatistics, 2023, 24(2): 309--326. [code]
Chen G†, Li X, Yu M: Policy learning for optimal individualized dose intervals. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, 2022, 151:1671--1693. [code]
Cheng J, Huling JD, Chen G†: Meta-analysis of individualized treatment rules via sign-coherency. Proceedings of the 2nd Machine Learning for Health symposium, PMLR, 2022, 193:171--198. [code]
Chen R, Chen G†, Yu M: Entropy balancing for causal generalization with target sample summary information. Biometrics, 2023, 79(4): 3179--3190. [code]
Hong Q, Chen G, Tang ZZ: PhyloMed: a phylogeny-based test of mediation effect in microbiome. Genome Biology, 2023, 24(1):72.
Maronge JM, Huling JD, Chen G†: A reluctant additive model framework for interpretable nonlinear individualized treatment rules. Annals of Applied Statistics, 2023, 17(4): 3384--3402. (Preprint available at arXiv:2311.01538) [code]
Huling JD†, Greifer N, Chen G†: Independence weights for causal inference with continuous treatments. Journal of the American Statistical Association, Theory and Methods, 2024, 119(546): 1657--1670. [code], also implemented in Noah Greifer's WeightIt package.
Chen R, Huling JD, Chen G†, Yu M: Robust sample weighting to facilitate individualized treatment rule learning for a target population. Biometrika, 2024, 111(1): 309--329. (Preprint available at: arXiv:2105.00581) [code]
Park C, Chen G, Yu M, Kang H: Minimum Resource Threshold Policy Under Partial Interference. Journal of the American Statistical Association, Theory and Methods, 2024, 119(548): 2881--2894. (Preprint available at: arXiv:2111.09932)
Lee J, Huling JD, Chen G†: An effective framework for estimating individualized treatment rules. NeurIPS, 2024. [code]
Chen G†, Wang X, Sun Q, Tang ZZ: Multidimensional scaling improves distance-based clustering for microbiome data. Bioinformatics, 2025, in press. [code]
Wei Z, Chen G, Tang ZZ: Melody: Meta-analysis of Microbiome Association Studies for Discovering Generalizable Microbial Signatures. Genome Biology, 2025, in press.
Wei Z, Hong Q, Chen G, Hartert T, Rosas-Salazar C, Das S, Shilts M, Levin A, Tang ZZ: Fast and reliable association discovery in large-scale microbiome studies and meta-analyses using PALM. Genome Biology, 2025, in press.
Chen Y, Chen G, Yu M: Confidence Interval Construction for Causally Generalized Estimates with Target Sample Summary Information. Statistics in Medicine, 2025, in press.
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME: Temporal calibration of regression and machine learning models for acute kidney injury. Journal of the American Medical Informatics Association, 2017, 24(6): 1052--1061.
Davis SE, Lasko TA, Chen G, Matheny ME: Calibration drift among regression and machine learning models for hospital mortality. American Medical Informatics Association (AMIA) Annual Symposium Proceedings, 2017: 625--634. ("Best of Student Papers in Knowledge Discovery and Data Mining" Awards)
Tang ZZ*, Chen G*, Hong Q, Huang S, Smith HM, Scholz M, Reilly PM, Ferguson JF: Multi-omic analysis of the microbiome and metabolome in healthy subjects reveals microbiome-dependent relationships between diet and metabolites. Frontiers in Genetics, 2019, 10:454.
Dennis J, Sealock JM, Straub P, Hucks D, Actkins KE, Faucon A, Goleva S, Nirachou M, Singh K, Morley T, Ruderfer D, Mosley JD, Chen G, Davis LK: Clinical laboratory test-wide association scan of polygenic risk scores identifies biomarkers of complex disease. Genome Medicine, 2021, 13:6.
Gao J, Mar P, Chen G†: More generalizable models for sepsis detection under covariate shift. American Medical Informatics Association (AMIA) Informatics Summit Proceedings, 2021, 220--228.
Yan Y, Schaffter T, Bergquist T, Yu T, Prosser J, Aydin Z, Jabeer A, Brugere I, Gao J, Chen G, Causey J, Yao Y, Bryson K, COVID-19 Prediction DREAM Challenge Consortium, Mooney SD, Guinney J: Continuously benchmarked, crowdsourced challenge for rapid development, evaluation of models to predict COVID-19 diagnosis, hospitalization. JAMA Network Open, 2021, 4(10):e2124946. (Our group was one of the top-performing teams in the COVID-19 Prediction DREAM Challenge)
Gao J, He S, Hu J, Chen G†: A Hybrid System to Understand the Relations between Assessments and Plans in Progress Notes. Journal of Biomedical Informatics, 2023, 141:104363. (A preliminary version of the proposed method was among the top-performing solutions for Task 3 of the 2022 N2C2 Challenge)
Bergquist T, Wax M, ..., Gao J, Chen G, ..., Pediatric COVID-19 Data Challenge Consortium, Patel S: A Framework for Future National Pediatric Pandemic Respiratory Disease Severity Triage: The HHS Pediatric COVID-19 Data Challenge. Journal of Clinical and Translational Science, 2023, 7(1):e175. (Our group won Task 1 of the Pediatric COVID-19 Data Challenge)
Bergquist T, Schaffter T, Yan Y, Yu T, Prosser J, Gao J, Chen G, Charzewski L, Nawalany Z, Brugere I, Retkute R, Prusokas A, Prusokas A, Choi Y, Kang J, Kim S, Choe J, Lee I, Lee S, Patient Mortality Prediction DREAM Challenge Consortium, Mooney SD, Guinney J: Evaluation of crowd sourced mortality prediction models as a framework for assessing AI in medicine. Journal of the American Medical Informatics Association, 2024, 31(1): 35--44. (Our group won the Patient Mortality Prediction DREAM Challenge)
Golob JL, Oskotsky TT, ..., Gao J, Chen G, ..., The Preterm Birth DREAM Community, Costello JC, Sirota M: Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research. Cell Reports Medicine, 2024, 5(1): 101350. (Our group won Task 1 of the Microbiome Preterm Birth DREAM Challenge)
Gao J, Chen G, O’Rourke AP, Caskey J, Carey K, Oguss M, Stey A, Dligach D, Miller T, Mayampurath A, Churpek MM, Afshar M: Automated stratification of trauma injury severity across multiple body regions using multi-modal, multi-class machine learning models. Journal of the American Medical Informatics Association, 2024, 31(6): 1291--1302. (Preprint available at: medRxiv. 2024 Jan 22:2024-01.)
Gao J, Mar P, Tang ZZ, Chen G†: Fair Prediction of Two-year Stroke Risk in Patients with Atrial Fibrillation using All of Us Data. Journal of the American Medical Informatics Association, 2024, 31(12): 2820--2828.
Kruse M, Afshar M, Khatwani S, Mayampurath A, Chen G, Gao Y: An Information-Theoretic Perspective on Multi-LLM Uncertainty Estimation. EMNLP, 2025.
Croxford E, Gao Y, First E, Pellegrino N, Schnier M, Caskey J, Oguss M, Wills G, Chen G, Dligach D, Churpek MM: Evaluating clinical AI summaries with large language models as judges. npj Digital Medicine, 2025, 8(1):640.
Gao J, Rahman M, Caskey J, Oguss M, O’Rourke A, Brown R, Mayampurath A, Churpek MM, Chen G†, Afshar M†: Mixture-of-Multimodal-Agents (MoMA) Architecture for Enhancing Clinical Prediction Modeling. npj Digital Medicine, 2025, in press.
Wright FA, Sullivan PF, … , Chen G, … , Boomsma DI (42 total authors): Heritability and genomics of gene expression in peripheral blood. Nature Genetics, 2014, 46(5): 430--437.
Dill-McFarland KA, Tang ZZ, Kreznar JH, Kerby RL, Chen G, Palloni A, Sorenson T, Rey FE, Herd P: Close social relationships correlate with human gut microbiome composition. Scientific Reports, 2019, 9(1): 703.
Gossa M, Temte JL, Barlow S, Temte E, Bell C, Birstler J, Chen G: An assessment of parental knowledge, attitudes, and beliefs regarding Influenza vaccination. Vaccine, 2020, 38(6): 1565--1571.
Mar PL, John A, Kumar S, Barry N, Chen G, Longserre S, Kabra R, Atkins D, Koerber S, Hussein A, Bhakta D, Lakkireddy D, Gopinathannair R: Management and long-term outcomes associated with recalled implantable cardioverter-defibrillator leads: A Multicenter Experience. Heart Rhythm, 2020, 17(11):1909--1916.
Duffy S, Norton D, Kelly M, Chavez A, Tun R, Ramírez M, Chen G, Wise P, Svenson J: Using community health workers and a smartphone application to improve diabetes care in rural Guatemala. Global Health: Science and Practice, 2020, 8(4): 699--720.
Azari DP, Miller BL, Le BV, Greenberg JA, Bruskewitz R, Long KL, Shada AL, Chen G, Radwin RG: A Comparison of expert ratings and marker-less hand tracking along OSATS-derived motion scales. IEEE Transactions on Human-Machine Systems, 2021, 51(1): 22--31.
Sealock JM, Lee H, Moscati A, Venkatesh S, Voloudakis G, Straub P, Singh K, Roussos P, Smoller J, Chen G, Davis LK: Use of the PsycheMERGE network to investigate the association between depression polygenic scores and white blood cell count. JAMA Psychiatry, 2021, 78(12):1365--1374.
Sealock JM, Ziogas IA, Zhao A, Ye F, Alexopoulos SP, Matsuoka L, Chen G†, Davis LK†: Proposing a Sex-Adjusted Sodium-Adjusted MELD Score for Liver Transplant Allocation. JAMA Surgery, 2022, 157(7):618--626.
Xu R, Chen G, Connor M, Murphy J: Novel use of patient-specific covariates from oncology studies in the era of biomedical data science: a review of latest methodologies. Journal of Clinical Oncology, 2022, 40(30): 3546--3553.
Eickhoff J, Zaborek J, Chen G, Sahasrabuddhe V, Ford L, Szabo E, Kim K: A systematic review and pooled analysis of hypothesized versus observed effect sizes in early phase cancer prevention clinical trials. Cancer Prevention Research, 2023, 16(8): 471--478.
Colnet B, Mayer I, Chen G, Dieng A, Li R, Varoquaux G, Vert JP, Josse J, Yang S: Causal inference methods for combining randomized trials and observational studies: a review. Statistical Science, 2024, 39(1): 165--191.