Preprints
# represents corresponding/senior author, * represents mentee (student or postdoc) under Dr. Liu's supervision
Yao, M.*, Miller, G.W., Vardarajan, B. N., Baccarelli, A. A., Guo, Z.#, and Liu, Z.# (2024). Robust Mendelian randomization analysis by automatically selecting valid genetic instruments with applications to identify plasma protein biomarkers for Alzheimer's disease. medRxiv.
"All valid instruments are alike; each invalid instrument is invalid in its own way"
Xihao Li, Han Chen, Margaret Sunitha Selvaraj, Eric Van Buren, Hufeng Zhou, Yuxuan Wang, Ryan Sun, Zachary R. McCaw, Zhi Yu, Donna K. Arnett, Joshua C. Bis, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, April P. Carson, Jenna C. Carlson, Nathalie Chami, Yii-Der Ida Chen, Joanne E. Curran, Paul S. de Vries, Myriam Fornage, Nora Franceschini, Barry I. Freedman, Charles Gu, Nancy L. Heard-Costa, Jiang He, Lifang Hou, Yi-Jen Hung, Marguerite R. Irvin, Robert C. Kaplan, Sharon L.R. Kardia, Tanika Kelly, Iain Konigsberg, Charles Kooperberg, Brian G. Kral, Changwei Li, Ruth J.F. Loos, Michael C. Mahaney, Lisa W. Martin, Rasika A. Mathias, Ryan L. Minster, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Laura M. Raffield, Susan Redline, Alexander P. Reiner, Stephen S. Rich, Colleen M. Sitlani, Jennifer A. Smith, Kent D. Taylor, Hemant Tiwari, Ramachandran S. Vasan, Zhe Wang, Lisa R. Yanek, Bing Yu, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Kenneth M. Rice, Jerome I. Rotter, Gina M. Peloso, Pradeep Natarajan, Zilin Li#, Zhonghua Liu#, Xihong Lin# (2024+). A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies. bioRxiv 2023.10.30.564764; doi: https://doi.org/10.1101/2023.10.30.564764
Jinglan Dai, Yixin Zhang, Zaiming Li, Hongru Li, Sha Du, Dongfang You, Ruyang Zhang, Yang Zhao, Zhonghua Liu, David C. Christiani, Feng Chen, Sipeng Shen. (2024+) Boosting the power of rare variant association studies by imputation using large-scale sequencing population. medRxiv 2023.10.28.23297722; doi: https://doi.org/10.1101/2023.10.28.23297722
Huang T.-J., Liu, Z., McKeague I. W. (2024+). Post-selection inference for high-dimensional mediation analysis with survival outcomes. (Under revision)
Wang, A.*, Yao, M.*, Zhang, G., Shi, X., Liu, Z. (2024+). Identifiability and estimation of regression models with differentially misclassified binary outcome ascertained from electronic health records.
Yao, M. and Liu, Z. (2024+) An introduction to causal inference methods with multi-omics data. (Under review)
Kang, H., Guo, Z., Liu, Z., Small, D. (2024+) Identification and inference with invalid instruments. (Invited Review Paper).
Yang, H., Liu, Z., Wang, R., Lai, E., Schwartz, J., Baccarelli, A., Huang, Y., Lin, X. (2024+) Causal mediation analysis for integrating exposure, genomic and phenotype data. (Invited Review Paper).
Published
# represents corresponding author, * represents mentee (student or postdoc) under Dr. Liu's supervision
Statistical Methodology
Robust Mendelian Randomization for Causal Inference with Invalid Instruments
Ye, T., Liu, Z., Sun, B., Tchetgen Tchetgen, E., (2024). GENIUS-MAWII: For Robust Mendelian Randomization with Many Weak Invalid Instruments. Journal of the Royal Statistical Society: Series B (Statistical Methodology) qkae024.
Sun, B., Liu, Z., Tchetgen Tchetgen, E., (2023). Semiparametric Efficient G-estimation with Invalid Instrumental Variables. Biometrika. 110(4), 953-971.
Liu, Z., Ye, T., Sun, B., Schooling, M., Tchetgen Tchetgen, E., (2022). Mendelian randomization mixed-scale treatment effect robust identification and estimation for causal inference. Biometrics, 79, 2208–2219.
Xu, S.*, Wang P., Fung, W.K., Liu, Z.#, (2022). A Novel Penalized Inverse-Variance Weighted Estimator for Mendelian Randomization with Applications to COVID-19 Outcomes. Biometrics, 79, 2184–2195.
Wang, A*., Liu, W*, Liu, Z.#, (2022). A Two-Sample Robust Bayesian Mendelian Randomization Method Accounting for Linkage Disequilibrium and Idiosyncratic Pleiotropy with Applications to the COVID-19 Outcome. Genetic Epidemiology 46, 159–169. https://doi.org/10.1002/gepi.22445. (one of the top 10 most-cited papers among work published between 1 January 2022 - 31 December 2023 in Genetic Epidemiology.)
Causal Mediation Analysis with Applications
Xu, M.*, Feng, R.*, Liu, Z.*, Zhou, X.*, Chen, Y., Cao, Y., Valeri, L., Li, Z., Liu, Z., Cao, S., Liu, Q., Xie, S., Chang E., Jia, W., Shen, J., Yao, Y., Cai, Y., Zheng, Y., Zhang, Z., Huang, G., Ernberg, I., Tang, M., Ye, W., Adami, H., Zeng, Y., Lin, X. (2024). Host genetic variants, Epstein-Barr virus subtypes and the risk of nasopharyngeal carcinoma: Assessment of interaction and mediation. Cell Genomics. DOI: https://doi.org/10.1016/j.xgen.2023.100474. (Cover paper on Feb. 14, 2024)
Zhou, Y.*, Wang, W.*, Hu, T., Tong, J., Liu, Z#. (2023) Causal mediation analysis for an ordinal outcome with multiple mediators. Structural Equation Modeling-A Multidisciplinary Journal, 31(2), 205-216.
Tian, P*, Yao, M*, Huang T, Liu Z#. (2022). CoxMKF: A knockoff filter for high-dimensional mediation analysis with a survival outcome in epigenetic studies. Bioinformatics 38(23), 5229-5235.
Wang, W.W.*, Yu, P., Zhou, Y., Tong, T., Liu, Z., (2021). Equivalence of two least-squares estimators for indirect effects. Current Psychology. DOI: https://doi.org/10.1007/s12144-021-02034-6.
Wang, W.W.*, Xu, J., Schwartz, J., Baccarelli, A., Liu, Z.#, (2021). Causal mediation analysis with latent subgroups. Statistics in Medicine. 40(25): 5628–5641. DOI: https://doi.org/10.1002/sim.9144.
Liu, Z. #, Shen, J., Barfield, R., Schwartz, J., Baccarelli, A., Lin, X., (2021). Large-Scale hypothesis testing for causal mediation effects with applications in genome-wide epigenetic studies. Journal of the American Statistical Association, 117(537), 67-81, DOI: 10.1080/01621459.2021.1914634
Luo, X., Schwartz, J., Baccarelli, A., Liu, Z. # , 2020. Testing cell-type-specific mediation effects in genome-wide epigenetic studies, Briefings in Bioinformatics, 22(3), bbaa131.
Statistical Genetics and Genomics
Wang, L., Babushkin, N., Liu, Z., Liu, X.# (2024). Trans-eQTL mapping in gene sets identifies network effects of genetic variants. Cell Genomics, 4(4).
Liu, Y., Liu, Z., Lin, X. (2023). Ensemble methods for testing a global null. Journal of the Royal Statistical Society: Series B (Statistical Methodology), qkad131, https://doi.org/10.1093/jrsssb/qkad131
Yang, J.*, Xu, Y.*, Yao, M.*, Wang G., Liu, Z.#. (2023). ERStruct: A Python Package for Inferring the Number of Top Principal Components from Whole Genome Sequencing Data. BMC Bioinformatics. (Yang, J. was a summer RA as an undergrad)
Xu, Y.*, Liu, Z.#, Yao, J., (2022). ERStruct: An eigenvalue ratio approach to inferring population structure from whole genome sequencing data. Biometrics, 79, 891–902. https://doi.org/10.1111/biom.13691
Tian, P*, Hu, Y, Liu, Z.# and Zhang, Y.# (2022). Grace-AKO: A Novel and Stable Knockoff Filter for Variable Selection Incorporating Gene Network Structures. BMC Bioinformatics 23, 478.
Liu, W.*, Xu, Y.*, Wang, A*., Huang, T.#, Liu, Z.#, (2021). The Eigen Higher Criticism and Eigen Berk-Jones Tests for Multiple Trait Association Studies based on GWAS Summary Statistics. Genetic Epidemiology, 46, 89–104. https://doi.org/10.1002/gepi.22439
(one of the top 10 most-cited papers among work published between 1 January 2022 - 31 December 2023 in Genetic Epidemiology.)
Liu, Z., Barnett, I., Lin, X., 2020. A comparison of principal component methods between multiple phenotype regression and multiple SNP regression in genetic association studies, The Annals of Applied Statistics, 14(1), pp.433-451.
Liu, Z. and Lin, X., 2019. A geometric perspective on the power of principal component association tests in multiple phenotype studies, Journal of the American Statistical Association, 114(527), pp.975-990.
Liu, Z. and Lin, X., 2018. Multiple phenotype association tests using summary statistics in genome-wide association studies. Biometrics, 74(1), pp.165-175. (one of the top 20 most downloaded papers in 2017-2018 in Biometrics)
Machine (Deep) Learning and others
Liu Q.*, Wang, Z., Li, X., Ji, X., Zhang, L., Liu, L.#, and Liu Z#. (2024). DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation. International Conference on Machine Learning (Accepted).
Chuwdhury, G., Guo, Y.*, Cheung, C., Lam, K., Kam, N., Liu, Z.#, Dai, W.#, (2024) ImmuneMirror: A Machine Learning-based Integrative Pipeline and Web Server for Neoantigen Prediction. Briefings in Bioinformatics, 25(2), bbae024.
Chen, Y., Lam, K. F., Liu, Z. (2024). High-dimensional Feature Screening for Nonlinear Associations With Survival Outcome Using Restricted Mean Survival Time. Stat.
Xu, S*., Liu, L.# and Liu, Z.# (2022) DeepMed: Semiparametric causal mediation analysis with debiased deep learning. The Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 35, pp. 28238-28251. (Acceptance rate: 25.6% with a total of 10,411 full paper submissions.)
Wang W.W.*, Lu, J., Tong, T., Liu, Z. (2022). Debiased Learning and Forecasting of First Derivative. Knowledge-Based Systems. DOI: https://doi.org/10.1016/j.knosys.2021.107781. (IF=8.038, Computer Science-Artificial Intelligence 16 out of 139).
Health Science Research
Effects of genetically proxied lipid-lowering drugs on acute myocardial infarction: a drug-target Mendelian randomization study. Lipids in Health and Disease.
Zhuang, Z., Li, Y., Zhao, Y., Huang, N., Wang, W., Xiao, W., Du, J., Dong, X., Song, Z., Jia, J., Liu, Z., Clarke, R., Qi, L., & Huang, T. (2024). Genetically determined blood pressure, antihypertensive drug classes, and frailty: A Mendelian randomization study. Aging Cell, 00, e14173. https://doi.org/10.1111/acel.14173
Wong JYY, Blechter B, Liu Z, Shi J, Roger VL. Genetic susceptibility to chronic diseases leads to heart failure among Europeans: the influence of leukocyte telomere length. Hum Mol Genet. 2024 Apr 26:ddae063. doi: 10.1093/hmg/ddae063.
Thomas, A. Ryan, C., Caspi, A., Liu, Z., Moffitt, T., Sugden, K., Zhou, J., Belsky, D., Gu, Y. (2024) Diet, pace of biological aging, and risk of dementia in the Framingham Heart Study. Annals of Neurology: https://doi.org/10.1002/ana.26900
Zhao J#, Yao M.*, Liu, Z.# (2024) Using genetics and proteomics data to identify proteins causally related to COVID-19, healthspan and lifespan: A Mendelian randomization study. Aging (Albany NY). 2024 Apr 3; 16:6384-6416. https://doi.org/10.18632/aging.205711
Zhuang, Z., Zhao, Y., Song, Z., Wang, W., Huang, N., Dong, X., Xiao, W., Li, Y., Jia, J., Liu, Z. and Qi, L., Huang, T. (2023). Leisure-Time Television Viewing and Computer Use, Family History, and Incidence of Dementia. Neuroepidemiology, 57(5), pp.304-315.
Li, Y., Zhang, L, Zeng, Z., Zhuang, Z., Wang, W., Song, Z., Zhao, Y., Dong, X., Xiao, W., Huang, N., Jia, J., Liu, Z., Qi, L., Li, L., Huang, T., (2023). Polysocial and Polygenic Risk Scores and All-cause Dementia, Alzheimer's disease, and Vascular Dementia. Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences, glad262, https://doi.org/10.1093/gerona/glad262.
Kampaktsis P. N., Bohoran, T. A., McLaughlin L., Leb J., Liu Z., Moustakidis S., Siouras A., Singh A., McCann G. P., Giannakidis A. (2023). An attention-based deep learning method for right ventricular quantification using 2D echocardiography: feasibility and accuracy. Echocardiography. 2023; 1-9. https://doi.org/10.1111/echo.15719
Song, Z, Wang, W, Zhao, Y, Xiao, W., Du, J., Liu, Z., Huang T., Tang Y. (2023). Observational and genetic associations of adiposity with cardiopulmonary multimorbidity: Linear and nonlinear Mendelian randomization analysis. Obesity (Silver Spring). 1-11. doi:10.1002/oby.23934
Wang W, Huang N, Zhuang Z, Song Z, Li Y, Dong X, Xiao W, Zhao Y, Jia J, Liu Z, Qi L, Huang T. (2023) Identifying Potential Causal Effects of Telomere Length on Health Outcomes: A Phenome-Wide Investigation and Mendelian Randomization Study, The Journals of Gerontology: Series A, glad128, https://doi.org/10.1093/gerona/glad128
Zhuang Z., Dong X., Jia J., Liu Z., Huang T., Qi L. (2023) Sleep patterns, plasma metabolome and risk of incident type 2 diabetes, The Journal of Clinical Endocrinology & Metabolism, dgad218, https://doi.org/10.1210/clinem/dgad218
Huang X, Yao M*, Tian P, Wong J, Li, Z, Liu Z#, Zhao J#. (2023). Shared genetic etiology and causality between COVID-19 and venous thromboembolism: evidence from genome-wide cross trait analysis and bi-directional Mendelian randomization study. Communications Biology.
Alameda L, Liu Z, Sham P, et al.(2023) Exploring the mediation of DNA methylation across the epigenome between childhood adversity and First Episode of Psychosis – findings from the EU-GEI study. Molecular Psychiatry, https://doi.org/10.1038/s41380-023-02044-9.
Huang, N., Zhuang, Z., Song, Z., Wang, W., Li, Y., Zhao, Y., Xiao, W., Dong, X., Jia, J., Liu, Z. and Smith, C.E., Huang T. (2023) Associations of Modified Healthy Aging Index With Major Adverse Cardiac Events, Major Coronary Events, and Ischemic Heart Disease. Journal of the American Heart Association.
Yao M*, Huang X, Guo Y, Zhao J#, Liu Z#. (2023). Disentangling the common genetic architecture and causality of rheumatoid arthritis and systemic lupus erythematosus with COVID-19 outcomes: genome-wide cross trait analysis and bi-directional Mendelian randomization study. Journal of Medical Virology.
Dong, X., Zhuang, Z., Zhao, Y., Song, Z., Xiao, W., Wang, W, Li, Y., Huang, N., Jia, J., Liu, Z., Qi, L., Huang, T. (2023). Unprocessed red meat and processed meat consumption, plasma metabolome, and risk of ischemic heart disease: a prospective cohort study of UK Biobank. Journal of the American Heart Association.
Huang N, Zhuang Z, Liu Z and Huang T# (2022). Observational and genetic associations of modifiable risk factors with aortic valve stenosis: a prospective cohort study of 0.5 million participants. Nutrients, 14(11), 2273; https://doi.org/10.3390/nu14112273
Zhu Z, Wang K, Hao X, Chen L, Liu Z., Wang C#. (2022). Causal graph between serum lipids and glycemic traits: a Mendelian randomization study. Diabetes (IF=7.720, Endocrinology and Metabolism 9 out of 143)
Zhuang, Z., Li, N., Wang J., Yang, R., Wang, W., Liu, Z., Huang, T., (2022) GWAS-associated bacteria and their metabolites appear to be causally related to the development of inflammatory bowel disease. European Journal of Clinical Nutrition. https://doi.org/10.1038/s41430-022-01074-w
Chen J., Shen S., Li Y., Fan J., Xiong S., Xu J., Zhu C., Lin L., Dong X., Duan W., Zhao Y., Qian X., Liu Z., Wei Y., Christiani D., Zhang R., Chen F., APOLLO: An accurate and independently validated prediction model of lower-grade gliomas overall survival and a comparative study of model performance, EBioMedicine, Volume 79, 2022, 104007, ISSN 2352-3964, https://doi.org/10.1016/j.ebiom.2022.104007. (IF=8.143, Research and Experimental Medicine 17 out of 140)
Zhou, X., Cao, S.M., Cai, Y., Zhang, X., Zhang, S., Feng, G.F., Chen, Y., Feng, Q.S., Chen, Y., Chang, E.T., Liu, Z., Adami, H.O., Liu, J., Ye, W., Zhang, Z., Zeng, Y.X., Xu, M., 2021. A Comprehensive Risk Score for Effective Risk Stratification and Screening of Nasopharyngeal Carcinoma. Nature Communications 12 (1), 1-8. (IF=14.919, Multidisciplinary Sciences 4 out of 73).
Liu, W.*, Zhuang, Z., Wang, W., Huang, T#. and Liu, Z#. 2021. An Improved Genome-Wide Polygenic Score Model for Predicting the Risk of Type 2 Diabetes. Frontiers in Genetics. DOI: https://doi.org/10.3389/fgene.2021.632385.
Wang, W., Wang, J., Zhuang, Z., Gao, M., Yang, R., Liu, Z., & Huang, T. (2021). Assessment of causality between modifiable factors and heart failure: A Mendelian randomization analysis. Asia Pacific Journal of Clinical Nutrition, 30(2). https://search.informit.org/doi/10.3316/informit.935636036920046
Shen, S., Zhang, R., Jiang, Y., Li, Y., Lin, L., Liu, Z., Zhao, Y., Shen, H., Hu, Z., Wei, Y.# and Chen, F.#, 2021. Comprehensive analyses of m6A regulators and interactive coding and non-coding RNAs across 32 cancer types. Molecular Cancer 20 (67). DOI: https://doi.org/10.1186/s12943-021-01362-2 (IF=15.302, Biochemistry & Molecular Biology 5 out of 297, Oncology 10 out of 244).
Zhuang, Z., Yao, M.*, Wong, J. Y.Y., Liu, Z.#, Huang, T#., 2021. Shared genetic etiology and causality between body fat percentage and cardiovascular diseases: a large-scale genome-wide cross-trait analysis. BMC Medicine 19, 100. https://doi.org/10.1186/s12916-021-01972-z (IF=8.775, Medicine, General and Internal 10 out of 155).
Liu, W.*, Guo, Y.*, and Liu, Z.#, 2021. An Omnibus Test for Detecting Multiple Phenotype Associations based on GWAS Summary Level Data. Frontiers in Genetics. DOI: https://doi.org/10.3389/fgene.2021.644419.
Wei, Y., Huang, H., Zhang, R., Zhu, Z., Zhu, Y., Lin, L., Dong, X., Wei, L., Chen, X., Liu, Z., Zhao, Y., Su, L., Chen, F.# and Christiani, D.C.#, 2021. Association of Serum Mannose With Acute Respiratory Distress Syndrome Risk and Survival. JAMA Network Open, 4(1):e2034569. (IF=5.032)
Zhuang, Z., Gao, M., Yang, R., Li, N., Liu, Z., Cao, W. and Huang, T., 2020. Association of physical activity, sedentary behaviours and sleep duration with cardiovascular diseases and lipid profiles: a Mendelian randomization analysis. Lipids in Health and Disease, 19, pp. 1-11.
Zhuang, Z., Gao, M., Yang, R., Liu, Z., Cao, W., and Huang, T., 2020. Causal relationships between gut metabolites and Alzheimer’s disease: a bi-directional Mendelian randomization study. Neurobiology of Aging. DOI: 10.1016/j.neurobiolaging.2020.10.022.
Bind, M.A., Rubin, D.B., Cardenas, A., Dhingra, R., Ward-Caviness, C., Liu, Z., Mirowsky, J., Schwartz, J.D., Diaz-Sanchez, D. and Devlin, R.B., 2020. Heterogeneous ozone effects on the DNA methylome of bronchial cells observed in a crossover study. Scientific Reports, 10(1), 1-15.
Jia, J., Dou, P., Gao, M., Kong, X., Li, C., Liu, Z.# and Huang, T.#, 2019. Assessment of causal direction between gut microbiota-dependent metabolites and cardiometabolic health: A bi-directional Mendelian randomisation analysis. Diabetes, 68(9), pp.1747-1755. (IF=7.720, Endocrinology and Metabolism 9 out of 143)
Peter Brown, RELISH Consortium, Yaoqi Zhou, Large expert-curated database for benchmarking document similarity detection in biomedical literature search, Database, Volume 2019, 2019, baz085, https://doi.org/10.1093/database/baz085
Brunst, K.J., Tignor, N., Just, A., Liu, Z. , Lin, X., Hacker, M.R., Bosquet, E.M., Wright, R.O., Wang, P., Baccarelli, A.A. and Wright, R.J., 2018. Cumulative lifetime maternal stress and epigenome-wide placental DNA methylation in the PRISM cohort. Epigenetics. 13(6):665-681.
Zhang, J., Liu, Z. , Umukoro, P.E., Cavallari, J.M., Fang, S.C., Weisskopf, M.G., Lin, X., Mittleman, M.A. and Christiani, D.C., 2017. An epigenome-wide association analysis of cardiac autonomic responses among a population of welders. Epigenetics, 12(2), pp.71-76.
Liu, X.S., Liu, Z. , Gerarduzzi, C., Choi, D.E., Ganapathy, S., Pandolfi, P.P. and Yuan, Z.M., 2016. Somatic human ZBTB7A zinc finger mutations promote cancer progression. Oncogene, 35(23), p.3071. (IF=7.971)
Qi, Qibin, et al. FTO genetic variants, dietary intake and body mass index: insights from 177 330 individuals. Human Molecular Genetics, 23.25 (2014): 6961-6972. (IF=5.101)