(Brain) Measurements and the Need for Inference

Neuroimaging techniques, such as functional magnetic resonance imaging (fMRI), have revolutionised the field of neuroscience. They provide non-invasive, or minimally invasive, regionally-specific, in vivo measurements of brain activity [6, 10]. However, these measurements are indirect observations of the underlying neural processes. Consequently, there is a need for mathematical models linking the observations to domain-specific variables of interest, and for algorithms for making inferences given the data and the models. Currently, these models are sometimes based on the physical generative process, but more often on statistical principles (e.g. identifiability of the unknowns, existence of an unbiased estimate, stability of the estimates in terms of variance, robustness, etc.).

From the outset, classical statistical approaches were used for neuroimage analysis. Largely based on known models and data collected under well-controlled and highly-constrained experimental conditions, the approach offered good interpretability (because of simpler models), the possibility to examine the strength of the analysis using well-established hypothesis tests, and a means to determine the functional specialisation of the human brain [8].

However, the brain and its associated biochemical, physiological and cognitive processes are simply too complex for the researcher to manually pose an a priori “correct” model. It has become increasingly clear that constructing statistical models focused primarily on identifying brain regions involved in certain functional tasks cannot provide a complete picture. Furthermore, the approach did not allow for changes in the environment (‘stimuli’ or habituation). In contrast to the classical statistical approach, the emphasis of the machine learning approach is learning the model space, i.e. the “hypothesis” itself, from the data. The models are typically abstract and generic, to widen the model space, and the learning algorithms are generally tuned to obtain optimal prediction performance. An important goal of modern machine learning is often to construct highly efficient (fast) and automated algorithms.

In summary, there are two strands of researchers. One group uses classical statistical tools and is interested in interpreting the data and understanding, or uncovering, the generative neural processes, while the other group focuses on prediction, using only a generic set of model assumptions and employing data-driven methods.

The Debate

The concerns about data-driven methods voiced by practitioners of classical statistical methods include

  • The models’ lack of interpretability and meaningfulness,
  • The need for permutation-based hypothesis testing to assess the strength
    of the predictions,
  • The absence of power calculations for designing future experiments,
  • The poor ability to replicate across trials, subjects, tasks, and studies, and
  • The lack of attention to the small n–large p constraint in neuroscience.
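The permutation-based testing mentioned above can be made concrete with a short sketch. The idea is to compare an observed prediction score against a null distribution obtained by re-scoring under random relabelings of the data. The function and score below are illustrative placeholders, not taken from any specific study:

```python
import numpy as np

def permutation_test(score_fn, X, y, n_permutations=1000, seed=0):
    """Assess whether score_fn(X, y) exceeds chance level.

    The null distribution is built by recomputing the score after
    randomly permuting the labels y, which breaks any true
    association between features and labels.
    """
    rng = np.random.default_rng(seed)
    observed = score_fn(X, y)
    null_scores = np.array(
        [score_fn(X, rng.permutation(y)) for _ in range(n_permutations)]
    )
    # Fraction of permuted scores at least as large as the observed one;
    # the +1 terms give a valid (slightly conservative) p-value.
    p_value = (1 + np.sum(null_scores >= observed)) / (1 + n_permutations)
    return observed, p_value
```

In practice, `score_fn` would typically be a cross-validated prediction accuracy; here any scalar measure of feature–label association works. Note that each permutation requires a full re-evaluation of the predictor, which is part of the computational cost the critics point to.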

Conversely, users of machine learning methods point to

  • The inappropriateness of the restrictive choice of models for biological data,
  • The lack of repeatability of “significant” findings,
  • The post-hoc corrections for multiple hypothesis tests, and
  • The arbitrariness of significance levels.
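The post-hoc corrections referred to in the list above can likewise be sketched briefly. Two standard procedures are the Bonferroni correction, which controls the family-wise error rate, and the Benjamini–Hochberg procedure, which controls the false discovery rate. This is a minimal illustration, not a reference implementation:

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """Bonferroni: reject hypothesis i if p_i <= alpha / m."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg FDR: with p-values sorted ascending, find the
    largest rank i with p_(i) <= (i/m) * alpha and reject the i smallest."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * alpha
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject
```

For the same p-values, Bonferroni is more conservative than Benjamini–Hochberg; with tens of thousands of voxel-wise tests in a typical neuroimaging analysis, the choice of correction strongly affects which findings survive, which is exactly what fuels the arbitrariness objection.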

There is little awareness of the fact that the parties at the heart of this debate each represent an aspect of the scientific method: hypothesis generation, represented by machine learning, on the one hand, and hypothesis validation, represented by classical statistics, on the other. We argue that the voiced objections arise from a lack of frameworks that adequately combine the two aspects of neuroscience methods research. We propose that both approaches can be brought together through advances in both computational statistics and machine learning theory.