Sir John Aston
The Keith Worsley mindset - Interesting Maths/Stats for Genuine Problems
Heather Shappell
Hidden Semi-Markov Models with Covariate-Dependent Dwell Times: Methods and Application to Brain Network Dynamics in Weight Loss
Hidden semi-Markov models (HSMMs) extend hidden Markov models by explicitly modeling state dwell times, providing greater flexibility for characterizing temporal dynamics. However, existing HSMM formulations for dynamic brain network analyses typically assume that dwell-time distributions are homogeneous across subjects, limiting their ability to assess associations with covariates. In this work, we extend the HSMM framework so that dwell times can depend on subject-level covariates. We do this by introducing a regression structure for the dwell-time distributions, considering both Poisson and Gamma models. Parameters are estimated using an expectation–maximization algorithm, and we use bootstrap resampling across participants to obtain standard errors and carry out inference on the regression coefficients. We first evaluate the approach in simulations, where it is able to recover covariate effects across a range of settings. We then apply the model to a study of weight loss in older adults, focusing on how Power of Food Scale scores relate to the amount of time individuals spend in different brain network states during a food cue task.
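To make the regression structure concrete, here is a minimal sketch of a covariate-dependent Poisson dwell-time distribution. The log link, the shift by one time point (so that dwell times are at least 1), and the names `beta0` and `beta1` are illustrative assumptions; the abstract does not specify the parameterisation, and the EM estimation and bootstrap steps are not shown.

```python
import math

def dwell_rate(x, beta0, beta1):
    """Poisson rate for a subject with covariate x, assuming a log link."""
    return math.exp(beta0 + beta1 * x)

def dwell_pmf(d, x, beta0, beta1):
    """P(dwell time = d) for d = 1, 2, ... under a Poisson shifted by one.

    The shift (an assumption of this sketch) guarantees a state is
    occupied for at least one time point.
    """
    if d < 1:
        return 0.0
    lam = dwell_rate(x, beta0, beta1)
    k = d - 1
    # Work in log space to avoid overflow in the factorial for large k.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def expected_dwell(x, beta0, beta1):
    """Mean dwell time implied by the shifted Poisson model."""
    return 1.0 + dwell_rate(x, beta0, beta1)
```

With a positive `beta1`, subjects with larger covariate values are expected to stay longer in the state; a negative coefficient shortens the expected dwell time.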
Thomas Yeo
Widespread use of invalid statistical tests to compare machine learning models
Machine learning is accelerating biomedical research. Cross-validation is widely used to compare predictive performance – not only to benchmark algorithms, but also to inform scientific applications, such as ranking biomarkers. However, prediction performance estimates across cross-validation folds are not independent. Standard tests for comparing prediction performance (e.g., paired t-test) assume independence and can therefore inflate false positive rates. In a PRISMA-guided meta-analysis of 210 studies (impact factor ≥15, 1 June 2020 – 1 June 2025), we find that 97% ignored fold dependence when comparing prediction performance. This problem is ubiquitous across scientific fields and unaffected by impact factor, rigor-promoting policies, or open science practices. Simulations across 420 scenarios spanning four diverse datasets show that ignoring fold dependence leads to invalid false positive control in most settings. Repeated cross-validation further compounds this problem, with false positive rates rising toward 100% as the number of repetitions grows. Existing fold-dependence-aware tests rely on strong assumptions because the variance of fold-level statistics and the between-fold correlation cannot be disentangled under standard cross-validation. We therefore propose the SHARP (Split-HAlf RePeated) test, a simple modification to standard cross-validation that enables direct estimation of variance and correlation. Benchmarked against 11 tests, SHARP provides the best overall balance of false-positive control, statistical power, and confidence-interval calibration across simulation schemes. We conclude by providing best practices and reporting guidelines for valid model comparison inference in biomedical machine learning and beyond.
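The effect of fold dependence can be illustrated with one existing fold-dependence-aware test, the corrected resampled t-test of Nadeau and Bengio. This sketch is not the SHARP test proposed in the talk, whose details the abstract does not give. The correction approximates the between-fold correlation from the test/train size ratio, which is one of the strong assumptions the abstract refers to. The function and argument names are illustrative.

```python
import math
import statistics

def naive_t(diffs):
    """Paired t-statistic that treats fold-level differences as independent."""
    j = len(diffs)
    return statistics.mean(diffs) / math.sqrt(statistics.variance(diffs) / j)

def corrected_t(diffs, n_train, n_test):
    """Corrected resampled t-statistic (Nadeau and Bengio).

    The variance term 1/J is inflated by n_test / n_train to account
    for overlapping training sets across folds.
    """
    j = len(diffs)
    scale = 1.0 / j + n_test / n_train
    return statistics.mean(diffs) / math.sqrt(scale * statistics.variance(diffs))

# Hypothetical per-fold accuracy differences (model A minus model B), 10-fold CV.
fold_diffs = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01, 0.04, 0.02, 0.01, 0.03]
t_naive = naive_t(fold_diffs)
t_corrected = corrected_t(fold_diffs, n_train=900, n_test=100)
```

Both statistics would be compared against a t distribution with J - 1 degrees of freedom; the corrected statistic is always smaller in magnitude, so it rejects less often.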
Elizabeth DuPre
Improving data reuse in neuroimaging with domain adaptation
Data sharing plays an important role in modern neuroimaging research, with platforms such as OpenNeuro and EBRAINS democratizing data access. However, the small sample sizes, heterogeneous acquisition parameters, and variable annotations that are typical of neuroimaging studies continue to challenge their large-scale re-use, even with modern AI methods. In this talk, I propose to recast this challenge of data re-use as a problem of domain adaptation. With special attention to the problem of inter-individual variability, I will present our work developing a methodological framework, and associated computational tools, for addressing domain adaptation with predictive models for neuroimaging data. I will further outline why dense-sampling datasets provide an ideal environment to evaluate these methods.
Johanna Bayer
Longitudinal normative modelling (across multiple time points)
Brain charts have emerged as a highly useful approach for understanding brain development and aging on the basis of brain imaging and have shown substantial utility in describing typical and atypical brain development with respect to a given reference model. However, existing models are cross-sectional and cannot capture change over time at the individual level. In this talk, we introduce velocity centiles (longitudinal normative models), which directly map change over time and can be overlaid onto cross-sectionally derived population centiles. We demonstrate this by modelling rates of change for 24k scans from 11k healthy individuals with up to 8 longitudinal measurements across the lifespan. We provide a method to detect individual deviations from a stable trajectory, generalising the notion of thrive lines, which are used in pediatric medicine to declare failure to thrive. Using this approach, we predict transition from mild cognitive impairment to dementia more accurately than by using either time point alone, a result replicated across two datasets. Last, we show how this framework can account for multiple time points, and how integrating a person’s history can improve the sensitivity of velocity models for predicting the future trajectory of brain change. This highlights the value of predicting change over time and marks a fundamental step towards precision medicine.
Camille Maumet
On the impact of analytical variability on neuroimaging data reuse
In neuroimaging and functional Magnetic Resonance Imaging (fMRI), many derived data are made openly available in public databases. These can be re-used to increase sample sizes in studies and thus improve robustness. In fMRI studies, raw data are first preprocessed using a given analysis pipeline to obtain subject-level contrast maps, which are then combined into a group analysis. Typically, the subject-level analysis pipeline is identical for all participants. However, derived data shared on public databases often come from different workflows, which can lead to different results. Here, we investigate how this analytical variability, if not accounted for, can induce false positive detections in mega-analyses combining subject-level contrast maps processed with different pipelines. We use the HCP multi-pipeline dataset, containing contrast maps for N=1,080 participants of the HCP Young-Adult dataset, whose raw data were processed and analyzed with 24 different pipelines. We performed between-groups analyses with contrast maps from different pipelines in each group and estimated the rates of pipeline-induced detections. We show that, if not accounted for, analytical variability can lead to inflated false positive rates in studies combining data from different pipelines.
Damien Wasserman
Linking neuroimaging with cognition: the good, the bad, and the not so good
Predicting cognitive traits and states from functional and diffusion MRI is central to understanding how much MRI can actually tell us about the brain — yet the field's go-to tools don't always deliver what their complexity promises. I'll examine what actually drives predictive performance for cognitive labels: which signal components carry information about cognition, and which popular representations — including large pretrained brain foundation models — fail to preserve them. I'll introduce several approaches for distilling informative structure from fMRI and diffusion MRI signals, and discuss how they connect to our broader goal: understanding the link between cognition and brain structure.
Alejandro de la Vega
NiCLIP: Neuroimaging contrastive language-image pretraining model for predicting text from brain activation images
We present NiCLIP, a contrastive language–image model for predicting cognitive tasks, concepts, and domains from brain activation patterns. Trained on more than 23,000 neuroscience articles, NiCLIP improves functional decoding by combining large language models with text-to-brain alignment, outperforming baseline LLM approaches and benefiting from full-text articles and curated cognitive ontologies. It accurately decodes group-level activation maps across multiple Human Connectome Project domains and helps characterize the functional roles of key brain regions.
Bertrand Thirion
Variable importance analysis for black box predictive models & application to brain imaging
Thanks to large population-level and deep datasets, as well as the development of foundation models, the use of machine learning tools for prediction based on brain imaging has increased in complexity in recent years. One ongoing issue is the lack of explicability of these models. In this study, we focus on measuring the contribution of individual variables to the prediction of a black-box model. We discuss the properties that variable importance estimators should possess and present practical algorithms. We consider statistical control, power, and computational efficiency and demonstrate our findings using brain imaging data. Finally, we outline pending challenges.
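As a point of reference for what measuring a variable's contribution to a black-box model can look like, here is a sketch of plain permutation importance, a common baseline. It is not necessarily one of the estimators presented in the talk, and it does not provide the statistical control the abstract discusses. The toy model and data are hypothetical.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of a black-box model on a dataset."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, column, n_repeats=20, seed=0):
    """Average increase in MSE when one input column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in the chosen column.
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(mse(model, permuted, targets) - baseline)
    return sum(increases) / n_repeats

def black_box(row):
    """Hypothetical fitted model: uses the first variable, ignores the second."""
    return 2.0 * row[0]

# Toy data in which only the first variable carries signal.
rows = [[float(i), float((7 * i) % 13)] for i in range(50)]
targets = [2.0 * r[0] for r in rows]
```

Shuffling the first column degrades predictions, while shuffling the ignored second column leaves the error unchanged.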
Rezvan Farahibozorg
Probabilistic Functional Modes: A Bayesian framework for modelling brain networks in populations and individuals
The human brain is a system of networks, each underlying a specific function and interacting with the others. Brain networks at rest, also known as resting-state networks (RSNs), have been very influential in characterising the functional organisation of the brain. The framework of Probabilistic Functional Modes (PROFUMO) is a Bayesian model designed to estimate RSNs for populations and for every individual simultaneously. As a result, the model can capture individual-specific characteristics in brain function, beyond what has been possible using standard group-average-based techniques. In this talk, I will first describe this modelling framework and its various extensions. I will then show some of its applications to large fMRI datasets such as UK Biobank, including the characterisation of multiscale brain modes and the potential for subgroup discovery.
Dylan Nielson
Interpretable factorization of clinical questionnaires
One of the major goals of neuroimaging has been to understand the biological basis of psychiatric disorders. As much as we focus on improving neuroimaging, progress will also depend on having high quality measures of psychopathology. Psychopathology is typically measured through questionnaire scales or subscales. The creation and validation of these scales has canonically relied on factor analysis, but the resulting factors are not guaranteed to be interpretable, and are subject to confounding effects. Additionally, missing data is a common problem in the large datasets necessary for discovery or validation of latent factors of psychopathology. The use of factor analysis therefore requires some form of imputation. We overcome these limitations with a non-negative matrix factorization tailored for questionnaire data, Interpretability Constrained Questionnaire Factorization (ICQF). This method promotes factor interpretability by identifying a sparsely defined set of factors while constraining both weights and loadings to be between 0 and 1. Non-negativity means that factors are strictly additive, preventing the interpretational difficulties posed by cancellation of effects with negative weights. Incorporating a masking matrix allows us to handle missing data (random or non-random) without a separate imputation step. Our optimization procedure has theoretical convergence guarantees, and we have an automated procedure to determine latent dimensionality. We have validated these procedures in realistic synthetic data, as well as in the Healthy Brain Network and Adolescent Brain Cognitive Development studies. ICQF provides more interpretable factors (as evaluated by domain experts) while preserving diagnostic information across a range of disorders and outperforming competing methods in smaller datasets.
Mandy Mejia
Data-driven location- and subject-specific hemodynamic response functions for task fMRI
Task fMRI activation studies typically employ a fixed canonical hemodynamic response function (HRF) to characterize the blood oxygenation level dependent (BOLD) response to task-evoked neural activity. However, there is known variation in the shape of the HRF across brain regions, stimulus types, and individuals. Failing to account for deviations from the canonical form leads to mismodeling, which can cause attenuation of activation amplitudes and reduced power to detect activations. Existing approaches to account for HRF heterogeneity have limitations: inclusion of HRF derivatives allows for only small deviations from the canonical HRF, while finite impulse response and spline models are more flexible but suffer from reduced interpretability and efficiency. Here, we present a data-driven framework for learning location- and subject-specific HRFs for a given task. Given a specific parametric form, such as a difference of two Gamma functions, we learn optimal HRF parameters for each location and subject, constrained within a set of physiologically plausible HRF shapes, employing regularization to avoid overfitting. The learned HRFs can be employed in place of the canonical HRF in a conventional task fMRI GLM. While a large training sample is necessary to obtain reliable location-specific HRFs, once learned, they can be adopted in smaller studies employing similar task paradigms. We validate this approach using multiple tasks from the Human Connectome Project, including block and event-related designs. We find that the learned HRFs result in stronger, more specific activation amplitudes, illustrating the ability of this approach to improve sensitivity, specificity, and power in task fMRI studies.
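The parametric form mentioned above, a difference of two Gamma functions, can be sketched as follows. The default shape parameters and undershoot ratio are commonly used illustrative values, not the parameters learned in this work, and the learning, constraint, and regularization steps are not shown.

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma density at time t (seconds); zero for t <= 0 when shape > 1."""
    if t <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0, scale=1.0):
    """HRF as a positive Gamma response minus a scaled, later undershoot.

    All four parameters are the kind of quantities a location- and
    subject-specific fit would adjust (assumed parameterisation).
    """
    return gamma_pdf(t, peak_shape, scale) - ratio * gamma_pdf(t, under_shape, scale)

# Sample the HRF every 0.1 s over 32 s.
times = [i / 10.0 for i in range(321)]
hrf = [double_gamma_hrf(t) for t in times]
```

Changing `peak_shape` shifts the time to peak, while `ratio` and `under_shape` control the depth and timing of the post-stimulus undershoot.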
Sara Wesolek
Somatosensory Evoked BOLD-Signals with Ultra-High Temporal Resolution
Although the spatial resolution of functional magnetic resonance imaging (fMRI) continues to improve, relatively few methods aim to enhance or leverage temporal resolution to investigate the spatiotemporal dynamics of the blood oxygen level-dependent (BOLD) signal. In our work, we employ a reordering approach to achieve ultra-high temporal resolution (60 ms) in fMRI data acquired during somatosensory stimulation paradigms. We then apply finite impulse response models and group-level ANOVAs to preserve temporal dynamics throughout the statistical analysis. To correct for multiple comparisons, we use 4D nonparametric permutation testing to identify significant signal changes over time across the whole brain. This method introduces a time-resolved approach to BOLD signal analysis, drawing inspiration from grand-average techniques commonly used in EEG research.
Gabriella Chan
Brain Networks and Gene Expression Jointly Shape Grey Matter Changes in Schizophrenia
Schizophrenia is associated with diffuse grey matter volume (GMV) changes across the whole brain which spread along white matter connections. Here, we present a model simulating how pathological processes emerge in vulnerable regions and spread across brain networks to model patterns of GMV change observed in patients.
Our approach accurately recapitulates empirical spatial maps of atrophy across a large multi-site cohort (r=0.715), demonstrating that disease progression is jointly shaped by genetic susceptibility and network architecture. Candidate genes identified by our framework show enrichment in independent genome-wide association and transcriptomic patient datasets, indicating that clinically relevant biological processes are shaping GMV change. We also identify epicentres of structural change in the hippocampus and temporal lobes, highlighting potential targets for therapeutic intervention.
Julia-Katharina Pfarr
Retrospective harmonization of neuropsychiatric questionnaires using expert mappings, semantic textual similarity, and psychometric validation techniques
Diversity in the design of clinical assessment instruments creates fundamental incompatibilities when attempting to use them in a retrospective collaborative research setting (retrospective multi-site consortia, machine learning analyses, federated learning settings, etc.). Previous harmonization approaches for questionnaire data have primarily relied on psychometric linking methods such as Item Response Theory (IRT) or Principal Component Analysis (PCA). While valuable, these methods require overlapping response data between questionnaires, which is often unavailable in retrospective multi-site analyses. This study proposes a structured approach to map individual questionnaire items from multiple instruments to pre-defined symptom dimensions using expert consensus and computational semantic similarity, without requiring overlapping response data between questionnaires. Dimension scores are subsequently transformed to allow comparability across different clinical instruments.
Mehul Gajwani
Correcting for systematic biases in vertex-level spatial data
Spatial statistics are ubiquitous in neuroimaging analyses, yet conventional methods neglect the effects of irregular sampling of the brain. Variations in vertex areas (i.e., larger vertices at gyral crowns and smaller vertices at sulcal fundi) bias the calculation of statistical features such as parcel means and correlations between maps. Here, we introduce a statistical framework for comparing maps that incorporates vertex areas. Analytical and empirical experiments show that area-corrected measures are more accurate than area-naïve measures, changing Pearson correlations between cortical maps by up to 0.1. Area-correction also increases the fidelity of parcellations, where parcel means are no longer dominated by supernumerary small vertices. Implementations for several area-corrected statistics can be found in our Python package neuromodes.
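The idea of area correction can be illustrated with a weighted mean and a weighted Pearson correlation, where each vertex contributes in proportion to its surface area. This is a minimal sketch of the general principle, not the API of the neuromodes package, and the function names are illustrative.

```python
import math

def weighted_mean(values, areas):
    """Mean of vertex values, each weighted by its vertex area."""
    total_area = sum(areas)
    return sum(a * v for v, a in zip(values, areas)) / total_area

def weighted_pearson(x, y, areas):
    """Pearson correlation between two maps with vertex-area weights."""
    mx = weighted_mean(x, areas)
    my = weighted_mean(y, areas)
    cov = sum(a * (xi - mx) * (yi - my) for xi, yi, a in zip(x, y, areas))
    var_x = sum(a * (xi - mx) ** 2 for xi, a in zip(x, areas))
    var_y = sum(a * (yi - my) ** 2 for yi, a in zip(y, areas))
    return cov / math.sqrt(var_x * var_y)
```

With equal areas these reduce to the usual area-naïve mean and correlation; with unequal areas, a parcel mean is no longer dominated by many small vertices.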
Lea Waller
Wonkyconn: data-driven analytic choices in multiverse analyses
Multiverse analyses are one approach for addressing analytic flexibility, as they allow researchers to systematically explore the effects of different analytic choices on study outcomes. However, dissimilar or contradictory results across the multiverse can be difficult to reconcile. Wonkyconn helps researchers identify the most appropriate processing pipeline for any given research question and dataset using benchmark metrics. These include dataset-level metrics of subject motion, the identifiability of brain networks and cortical gradients, and the ability to classify age and sex/gender. Wonkyconn is provided as a BIDS-app that can process BEP017 and HALFpipe outputs to generate an average ranking of processing pipelines across metrics, enabling researchers to report results from multiverse studies using data-driven criteria.
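An average ranking across metrics can be sketched as below. The metric names, scores, and tie handling are hypothetical and are not taken from Wonkyconn itself; the sketch only shows how per-metric ranks might be combined.

```python
def ranks(scores, higher_is_better=True):
    """Rank pipelines for one metric (1 = best); ties share the average rank."""
    names = sorted(scores, key=lambda n: scores[n], reverse=higher_is_better)
    out = {}
    i = 0
    while i < len(names):
        j = i
        # Extend j over the run of pipelines tied with names[i].
        while j + 1 < len(names) and scores[names[j + 1]] == scores[names[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for name in names[i:j + 1]:
            out[name] = average_rank
        i = j + 1
    return out

def average_ranking(metrics, higher_is_better):
    """Return (pipeline, mean rank) pairs sorted from best to worst."""
    per_metric = [ranks(metrics[m], higher_is_better[m]) for m in metrics]
    pipelines = list(per_metric[0])
    mean_rank = {p: sum(r[p] for r in per_metric) / len(per_metric) for p in pipelines}
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Hypothetical benchmark scores for three pipelines.
metrics = {
    "residual_motion": {"A": 0.30, "B": 0.10, "C": 0.20},
    "network_identifiability": {"A": 0.60, "B": 0.80, "C": 0.70},
    "age_accuracy": {"A": 0.70, "B": 0.65, "C": 0.70},
}
higher_is_better = {
    "residual_motion": False,
    "network_identifiability": True,
    "age_accuracy": True,
}
ranking = average_ranking(metrics, higher_is_better)
```

Averaging ranks rather than raw scores keeps metrics on different scales comparable, at the cost of discarding how large the differences between pipelines are.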
Yu-Ping Wang
From linear to deep CCA and beyond
The interactive analysis of multi-modal brain imaging and multi-omics remains an important challenge, and the interaction patterns can serve as an important mechanism underlying brain development and function. In this talk, I will first review our efforts in developing a variety of canonical correlation analysis (CCA) based models motivated by different application tasks, such as class-specific CCA to extract both shared and specific correlated patterns between brain imaging and genomics data. Then I will move from linear CCA models to deep network-based versions of the models to extract complex interactions both within and between modalities. Finally, I will show how deep CCA models can be further incorporated into a regression model to link with cognitive behaviors for comprehensive analysis. Application examples, including improved diagnosis of schizophrenia and the prediction of IQ, will be demonstrated.
Valentina Pacella
Neuroimaging statistics for cognitive neuropsychology: from clinical observation to interpretable brain–behaviour models
Cognitive neuropsychology has historically relied on the careful interpretation of clinical dissociations to infer the organisation of cognitive systems. Neuroimaging and statistical modelling now provide the opportunity to extend this logic, allowing clinical observations to be tested against quantitative models of brain structure and function. In this talk, I will discuss how different statistical approaches can support this transition, from Bayesian modelling of clinical syndromes and single-case inference to dimensionality reduction and probabilistic modelling of large-scale neuroimaging data. Across examples including motor awareness deficits, apraxia, and meta-analytic models of cognitive organisation, the focus will be on how statistical tools can help formalise neuropsychological questions and generate interpretable mappings between behaviour and brain organisation. Rather than treating neuroimaging statistics as an end in itself, I will argue for its role as an inferential bridge: linking individual patients to normative populations, focal symptoms to distributed networks, and classical cognitive constructs to representations of brain function.
James Pang
Linking brain structure and function through computational modelling: Advantages and pitfalls
Understanding how brain structure gives rise to large-scale neural dynamics remains a central challenge in neuroscience. Computational models provide a powerful framework for linking structural features of the brain, such as geometry and connectivity, to observable patterns of activity measured with multimodal neuroimaging. In this talk, I will discuss how whole-brain models are constructed, the assumptions they rely on, and what they have revealed about the relationship between structure and function in both human and non-human brains. I will provide examples of how modelling can move beyond correlation to provide mechanistic explanations. I will also present recent work from our team showing how structural constraints, such as cortical geometry and connectivity, shape large-scale wave-like brain dynamics. I will conclude by discussing both the opportunities and pitfalls of current modelling approaches, and providing an outlook on how integrating multiscale biological data may help advance the next generation of computational brain models.