Mu Zhou's research covers broad areas of machine learning and AI, including pathology, radiology, genomics, and clinical data.
Representative publications appear in Nature Machine Intelligence, The Lancet Digital Health, Nature Communications, and npj Precision Oncology, among others.
His publications have received over 5,000 citations, with an h-index of 29 and an i10-index of 47 (as of January 2024).
Mining spatial pathology analytics towards precision oncology
Image-based characterization and disease understanding involve integrative analysis of morphological, spatial, and topological information across biological scales. The development of graph convolutional networks (GCNs) creates exciting opportunities to address this information complexity via graph-driven architectures. GCN-based approaches can perform feature aggregation, interaction, and reasoning with remarkable flexibility and efficiency. These capabilities have spawned a new wave of research in medical image analysis with the goal of improving quantitative disease understanding, monitoring, and diagnosis. To foster cross-disciplinary research, we address technical advances in graph learning, emerging medical applications, model interpretation, and large-scale benchmarks that promise to transform the scope of digital pathology and precision oncology.
[1] Ding K, Zhou M, Wang H, Zhang S, Metaxas DN. Spatially aware graph neural networks and cross-level molecular profile prediction in colon cancer histopathology: a retrospective multi-cohort study. The Lancet Digital Health. 2022 Nov 1;4(11):e787-95.
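The feature aggregation step mentioned above can be illustrated with a minimal sketch. It assumes a plain mean over each node's neighborhood (including the node itself) and leaves out the learned weight matrix, normalization scheme, and nonlinearity that a full GCN layer would apply. The function name `gcn_aggregate` and the toy graph are hypothetical and are not taken from the cited work.

```python
from typing import List, Tuple


def gcn_aggregate(features: List[List[float]],
                  edges: List[Tuple[int, int]]) -> List[List[float]]:
    """One round of mean aggregation over each node's neighborhood.

    Illustrative only: a real GCN layer would also multiply by a learned
    weight matrix and apply a nonlinearity.
    """
    n = len(features)
    # Every node is its own neighbor (self-loop), so it keeps its own signal.
    neighbors = {i: {i} for i in range(n)}
    for u, v in edges:
        neighbors[u].add(v)  # treat the graph as undirected
        neighbors[v].add(u)

    dim = len(features[0])
    aggregated = []
    for i in range(n):
        nbrs = neighbors[i]
        aggregated.append(
            [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        )
    return aggregated


# Toy example: three tissue regions connected in a chain 0 - 1 - 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
aggregated = gcn_aggregate(features, edges)
```

After one round, each node's vector is a blend of its own features and those of its spatial neighbors. Stacking several rounds lets information travel across larger neighborhoods of the graph.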
AI-powered in silico drug discovery linking multi-sourced pharmaceutical data
Mining drug–disease associations and related interactions is essential for in silico drug discovery. Large-scale biological databases are increasingly available for pharmaceutical research, allowing deep characterization for molecular informatics and drug discovery. However, the analysis is challenging because of the molecular heterogeneity of disease and the diversity of drug–disease associations. This area requires deep exploration of a multimodal biological network in an integrative context. We emphasize both exploratory and computational analysis for finding novel drug–disease associations, leading to novel drug repurposing (DR) findings, insight into cancer treatment response, and a comprehensive understanding of the underlying biological mechanisms.
[2] Wang Z, Zhou M, Arnold C. Toward heterogeneous information fusion: bipartite graph convolutional networks for in silico drug repurposing. Bioinformatics. 2020 Jul 1.
Pathology AI for cancer genetic outcome prediction and interpretation
The landscape of AI and pathology is changing rapidly, driven by our evolving ability to process and analyze large-scale whole slide images (WSIs). Meanwhile, a growing volume of molecular profiles, including gene expression, DNA methylation, transcriptomics, and proteomics data, is helping to define cancer mechanisms. We are interested in quantitative image feature extraction and in building strong AI models for numerous downstream clinical tasks.
We develop WSI-based deep-learning classifiers for predicting key mutation outcomes and important biological pathway activities in cancer. We are keen to explore WSI visual interactions between a mutation and its related biological pathway, enabling a head-to-head comparison to reinforce our major findings in cancer research. These findings would greatly aid in identifying targeted therapies, inferring prognostic biomarkers, and enhancing our understanding of cancer heterogeneity and patient outcomes.
[3] Qu H, Zhou M, Yan Z, Wang H, Rustgi VK, Zhang S, Gevaert O, Metaxas DN. Genetic mutation and biological pathway prediction based on whole slide images in breast carcinoma using deep learning. npj Precision Oncology. 2021 Sep 23;5(1):87.
Multi-omics integration for cancer drug response prediction
Accurate prediction of cancer drug response (CDR) is challenging due to the uncertainty of drug efficacy and the heterogeneity of cancer patients. Strong evidence indicates that CDR depends heavily on the tumor genomic and transcriptomic profiles of individual patients. Precise identification of CDR is crucial both for guiding anti-cancer drug design and for understanding cancer biology. We demonstrated that combining multi-omics profiles with an intrinsic graph-based representation of drugs is appealing for assessing drug response sensitivity. Our research highlighted the predictive power of AI modeling and its potential translational value in guiding disease-specific drug design.
[4] Liu Q, Hu Z, Jiang R, Zhou M. DeepCDR: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics. 2020 Dec;36:i911-8.
Cancer imaging and genomics
Imaging genomics is a growing field that studies the association between imaging and genomic characteristics in cancer. Inherent in this definition is the goal of using noninvasive imaging assessment as a surrogate for molecular signatures that were previously available only through molecular testing. We develop a radiogenomics map linking CT image features and gene expression profiles generated by RNA sequencing for patients with non-small cell lung cancer (NSCLC). We define CT semantic features reflecting radiologic information on nodule shape, margin, and texture, as well as the tumor environment and overall lung characteristics. Our study highlighted associations between semantic image features and metagenes representing canonical molecular pathways, which can enable noninvasive identification of molecular properties of NSCLC.
[5] Zhou M, Leung A, Echegaray S, Gentles A, Shrager JB, Jensen KC, Berry GJ, Plevritis SK, Rubin DL, Napel S, Gevaert O. Non–small cell lung cancer radiogenomics map identifies relationships between molecular and imaging phenotypes with prognostic implications. Radiology. 2018 Jan;286(1):307-15.
Transformer architecture for robust medical applications
The Transformer architecture has proven successful in a number of natural language processing tasks. However, its applications to medical vision remain largely unexplored. In this study, we present UTNet, a simple yet powerful hybrid Transformer architecture that integrates self-attention into a convolutional neural network to enhance medical image segmentation. UTNet applies self-attention modules in both the encoder and the decoder to capture long-range dependencies at different scales with minimal overhead. To this end, we propose an efficient self-attention mechanism, along with relative position encoding, that significantly reduces the complexity of the self-attention operation to approximately O(n). UTNet demonstrates superior performance and holds promise for generalizing well across medical image segmentation tasks.
[6] Gao Y, Zhou M, Metaxas DN. UTNet: a hybrid transformer architecture for medical image segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, September 27–October 1, 2021, Proceedings, Part III (pp. 61-71).
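To give a sense of how self-attention cost can drop from quadratic to roughly linear in the number of tokens n, the sketch below reduces the keys and values to a fixed number k of summary rows before computing attention, so each query scores k rows instead of n. The use of average pooling, the function names, and the toy input are assumptions made for illustration. The text above does not specify how UTNet performs the reduction, and relative position encoding is omitted here.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pool_rows(rows: Matrix, k: int) -> Matrix:
    """Average-pool n rows down to k rows (assumes n is divisible by k)."""
    step = len(rows) // k
    dim = len(rows[0])
    return [
        [sum(rows[i][d] for i in range(g * step, (g + 1) * step)) / step
         for d in range(dim)]
        for g in range(k)
    ]


def reduced_attention(queries: Matrix, keys: Matrix, values: Matrix,
                      k: int) -> Matrix:
    """Scaled dot-product attention against k pooled keys/values.

    Each of the n queries is compared with only k rows, so the score
    computation costs O(n * k) instead of O(n * n).
    """
    keys_small = pool_rows(keys, k)      # hypothetical reduction step
    values_small = pool_rows(values, k)
    scale = math.sqrt(len(queries[0]))
    dim_out = len(values_small[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, key)) / scale
                  for key in keys_small]
        weights = softmax(scores)
        output.append(
            [sum(w * v[d] for w, v in zip(weights, values_small))
             for d in range(dim_out)]
        )
    return output


# Toy example: 8 tokens with 2 features each, reduced to k = 2 summary rows.
tokens = [[float(i), 1.0] for i in range(8)]
out = reduced_attention(tokens, tokens, tokens, k=2)
```

With k held fixed as the image grows, the number of query-key comparisons grows linearly with n, which is the sense in which such a mechanism is approximately O(n).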
Trustworthy AI: Distributed learning of multi-center medical data
Overcoming the barriers to multi-center data analysis is challenging because of privacy protection requirements in the healthcare system. The Protected Health Information (PHI) residing in real medical imaging must be handled carefully at each site when data are shared. In this study, we propose the Distributed Synthetic Learning (DSL) architecture to learn across multiple medical centers without leaking sensitive personal information. DSL emphasizes building a homogeneous data center with entirely synthetic medical images via a form of GAN-based synthetic learning. In particular, the DSL architecture is extensible with three key variants: multi-modality learning, missing modality completion learning, and continuous learning over time. The proposed framework demonstrates its strength in integrating multi-center data to support clinical decision making.
[7] Chang Q, Yan Z, Zhou M, Qu H, Zhang H, Baskaran L, Al'Aref S, Metaxas D. Mining Multi-Center Heterogeneous Medical Data with Distributed Synthetic Learning.
Multi-center CT-based models for predicting prognosis of lung cancer patients
Lung cancer is the most common fatal malignancy in adults worldwide, and non-small cell lung cancer (NSCLC) accounts for 85% of lung cancer diagnoses. In this study, we focus on developing LungNet, a shallow convolutional neural network for predicting outcomes of NSCLC patients. We show that outcomes from LungNet are predictive of overall survival in all four independent survival cohorts. LungNet can be used as a noninvasive predictor of prognosis in NSCLC patients and can facilitate interpretation of CT images for lung cancer stratification, diagnosis, and prognostication.
[8] Mukherjee P, Zhou M, Lee E, Schicht A, Balagurunathan Y, Napel S, Gillies R, Wong S, Thieme A, Leung A, Gevaert O. A shallow convolutional neural network predicts prognosis of lung cancer patients in multi-institutional computed tomography image datasets. Nature Machine Intelligence. 2020 May;2(5):274-82.
A complete segmentation mask is crucial for biological image analysis because it delivers important morphological properties such as shapes and volumes. In this research, we propose a region proposal rectification (RPR) module to address the challenging problem of incomplete segmentation. In particular, we offer deep-learning models that gradually introduce visual neighbor information into a series of regions of interest (ROIs). Experimental results demonstrate that the proposed method is effective in both anchor-based and anchor-free top-down instance segmentation approaches, suggesting that it can be applied to general top-down instance segmentation of biological cell images.
[9] Zhangli Q, Yi J, Liu D, He X, Xia Z, Chang Q, Han L, Gao Y, Wen S, Tang H, Wang H. Region proposal rectification towards robust instance segmentation of biological images. In Medical Image Computing and Computer Assisted Intervention–MICCAI: 25th International Conference, Proceedings, Part IV 2022 Sep 16 (pp. 129-139).