Dr. Junhao (Hao) Wen, Columbia U
Abstract:
This talk encompasses three intertwined yet progressive perspectives: i) scrutinizing the reproducibility of AI/ML in neuroimaging research; ii) depicting the neuroanatomical heterogeneity of brain disorders using AI/ML and imaging; and iii) embracing multi-scale (organs and omics) approaches to investigate brain aging and disease beyond the brain. Integrating AI-driven decision support systems into clinical settings to identify potential genetic, proteomic, metabolomic, and imaging biomarkers for future therapeutic interventions is central to his research interests.
Bio:
Junhao (Hao) Wen, PhD, is a computational neuroscientist with expertise in medical image computing, artificial intelligence/machine learning, multi-omics, and multi-organ bioinformatics. He is the director of imaging genetics research at Columbia's Center for Innovation in Imaging Biomarkers and Integrated Diagnostics (CIMBID). He also holds affiliated appointments at Columbia's Department of Biomedical Engineering (BME), the New York Genome Center (NYGC), and Columbia's Data Science Institute (DSI). He is also a visiting faculty member at the Center for AI and Data Science for Integrated Diagnostics (AI2D) at the University of Pennsylvania and the founding co-chair of the Brain Imaging Genetics workgroup of the International Society for Advancing Alzheimer's Disease Research and Treatment, the largest international Alzheimer's research community.
Dr. Wen’s research focuses on developing and applying artificial intelligence/machine learning techniques to analyze multi-organ and multi-omics biomedical data in the context of human aging and disease, with a particular emphasis on clinical and computational neuroscience to advance precision medicine. His work aims to leverage AI’s capabilities to uncover insights beyond human perception, encompassing several key areas. First, he uses artificial intelligence/machine learning to explore the genetic underpinnings of disease-related neuroanatomical variations (imaging genetics), enabling personalized diagnostic and prognostic approaches. Second, his research adopts a holistic multi-organ and multi-omics framework, recognizing the interconnected nature of organ systems to unravel the complexities of brain structure and function. Furthermore, he leads and contributes to initiatives aimed at consolidating and harmonizing large-scale biomedical datasets, such as the MULTI consortium, which integrates multi-organ and multi-omics data to advance holistic human aging and disease modeling. A central objective of Dr. Wen’s research is to integrate AI-driven decision-support systems into clinical practice, identifying genetic, proteomic, and imaging biomarkers that will inform future therapeutic strategies.
Summary
Focus: AI/ML for medical applications using imaging and multi-omics
Project 1: Can ML cause a reproducibility crisis in science?
Analyzed prior modeling research on Alzheimer’s disease
Identified many cases of data leakage in their analysis
Trained CNNs on the imaging data
Separated:
training/validation data from
an independent test set, which was used only after the 2nd round of peer review
Compared a 3D CNN with a simpler linear SVM and showed that their accuracy is similar
Evaluated extrapolation accuracy across different datasets, showing reduced accuracy when the data distribution shifts
Demonstrated that
Splitting data by image rather than by patient results in overfitting;
Robust models require splitting by patient: all images from one patient go to either training or validation, never both
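The patient-level split described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `split_by_patient`, the `(patient_id, image_id)` record layout, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_by_patient(records, val_fraction=0.2, seed=0):
    """Split (patient_id, image_id) records so that no patient spans both sets.

    Hypothetical helper for illustration only.
    """
    # Group every image under the patient it belongs to.
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    # Shuffle patients (not images) with a fixed seed for repeatability.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Reserve a fraction of patients, at least one, for validation.
    n_val = max(1, round(len(patients) * val_fraction))
    val_ids = set(patients[:n_val])

    train = [(p, i) for p in patients if p not in val_ids for i in by_patient[p]]
    val = [(p, i) for p in patients if p in val_ids for i in by_patient[p]]
    return train, val


# Hypothetical toy data: 5 patients with 3 scans each.
records = [(f"patient_{p}", f"scan_{p}_{s}") for p in range(5) for s in range(3)]
train, val = split_by_patient(records)

train_patients = {p for p, _ in train}
val_patients = {p for p, _ in val}
print("patients in both sets:", train_patients & val_patients)
```

Shuffling the individual scans instead of the patient IDs would let scans from the same person land on both sides of the split, which is the leakage pattern the notes describe.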
Project 2: Heterogeneous dynamics of disease
Using multiple data modalities: MRI, genetics
Clustering disease trajectory signatures
MAGIC: Multi-scale heterogeneity analysis and clustering
Clustering of brain images to identify disease subtypes
Spatially breaking down the brain into feature clusters, with different degrees of spatial resolution
Looking at late-life depression
Projecting measurements into a low-dimensional subspace
Tracking patients’ disease progress over time
GAN-based methods for mapping disease heterogeneity
The discriminator compresses imaging data into a low-dimensional latent space
Clustered these latent vectors, which encode the propagation of dementia, into 5 subtypes based on brain imaging and symptoms
Subtypes have distinct genetic markers
Project 3: Linking imaging with genetics
MuSIC: Multi-scale structural imaging covariance atlas
Genome-wide association between genetic features and spatial clusters in images
Genetic architecture of multi-modal brain age
Project 4: multi-omics and multi-organ modeling of human aging and disease
Associating 9 phenotype-based aging clocks
Relating phenotypic and genetic correlations between different disease phenotypes
Often related
However, where environmental factors are dominant these correlations may have different directionalities
Biological aging clocks
Idea:
Train model to predict chronological age from some type of feature
Look at which features are predictive
11 proteome-based ProBAGs
Plasma proteomics data
Can do age-bias correction
Models tend to regress toward the mean age (~45 yo)
This distorts results for diseased patients (e.g. the predicted age of diseased patients is younger than their chronological age)
Correction methods undo this bias by explicitly conditioning on diseased/healthy populations
Organ-specific aging clocks
Observation: prediction of clinical age is not that useful; need to predict actual disease state/diagnosis
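The age-bias correction mentioned under the biological aging clocks can be illustrated with a small worked example. This sketch assumes one common linear correction (fit predicted age against chronological age on healthy controls, then invert that fit for everyone); it is not necessarily the method used in the talk. All numbers are synthetic, and the shrinkage factor of 0.6, the 5-year disease offset, and the helper names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def biased_clock(biological_age, mean_age=45.0, shrink=0.6):
    """Toy aging clock whose predictions regress toward the mean age."""
    return mean_age + shrink * (biological_age - mean_age)


# Synthetic cohorts (assumed): healthy biological age equals chronological
# age; diseased patients are biologically 5 years older.
healthy_age = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
diseased_age = [60.0, 65.0, 70.0]
healthy_pred = [biased_clock(a) for a in healthy_age]
diseased_pred = [biased_clock(a + 5.0) for a in diseased_age]

# Raw age gap (predicted minus chronological) for diseased patients.
raw_gaps = [p - a for p, a in zip(diseased_pred, diseased_age)]

# Estimate the bias on healthy controls only, then invert it.
slope, intercept = fit_line(healthy_age, healthy_pred)
corrected_pred = [(p - intercept) / slope for p in diseased_pred]
corrected_gaps = [p - a for p, a in zip(corrected_pred, diseased_age)]

print("raw gaps:", raw_gaps)
print("corrected gaps:", corrected_gaps)
```

In this toy setup the raw gaps for the diseased group are negative, so those patients look younger than their chronological age, while the corrected gaps recover the 5-year offset that was built into the synthetic data.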