Why do people enjoy music? One possibility is that music taps into our intrinsic propensity to form predictions about our external environment. Predictive coding is an influential framework that explains neural information processing in terms of minimising the long-term discrepancy between expected and actual incoming information. My research here is about understanding why and how expectations shape musical pleasure.
In Uncertainty and Surprise Jointly Predict Musical Pleasure and Amygdala, Hippocampus, and Auditory Cortex Activity, we used an unsupervised statistical learning model to analyse the expectancy of over 80,000 pop song chords, and showed that musical pleasure is modulated by the interaction between how confidently we can predict an upcoming chord and the extent to which our expectations are violated. This is in line with predictive coding, and we further showed that expectancy-driven musical pleasure is related to activity in brain regions processing emotion and auditory information.
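The two quantities at the heart of this interaction can be made concrete with a toy model. Below is a minimal Python sketch, assuming a simple bigram model over chord symbols rather than the unsupervised model used in the paper: uncertainty is the entropy of the predicted next-chord distribution before the chord arrives, and surprise is the information content of the chord actually heard.

```python
from collections import Counter, defaultdict
import math

# Toy chord sequences; the actual study trained on over 80,000 chords.
sequences = [["C", "G", "Am", "F", "C"], ["C", "F", "G", "C"], ["Am", "F", "C", "G"]]

# Fit a bigram model: P(next chord | current chord).
counts = defaultdict(Counter)
for seq in sequences:
    for prev, nxt in zip(seq, seq[1:]):
        counts[prev][nxt] += 1

def next_chord_distribution(prev):
    total = sum(counts[prev].values())
    return {c: n / total for c, n in counts[prev].items()}

def surprise(prev, nxt):
    """Information content (surprisal) of hearing `nxt` after `prev`, in bits."""
    return -math.log2(next_chord_distribution(prev)[nxt])

def uncertainty(prev):
    """Shannon entropy of the predicted next-chord distribution, in bits."""
    dist = next_chord_distribution(prev)
    return -sum(p * math.log2(p) for p in dist.values())

print(surprise("C", "G"), uncertainty("C"))
```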
Although our previous work demonstrated the role of expectancy in shaping our appreciation of music, how listeners form musical expectations remained unclear. We addressed this in Distinct roles of cognitive and sensory information in musical expectancy, where we compared various computational models of musical expectancy using Bayesian methods. We found that listeners form musical expectations based on both short-term sensory mechanisms and long-term representations of musical structure, with a larger contribution from the latter.
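To illustrate the kind of comparison involved, here is a minimal sketch with made-up numbers (not the paper's models or its Bayesian machinery): a long-term, corpus-trained prediction and a short-term, within-piece prediction are mixed with a weight, and candidate weights are scored by how well the mixture predicts the chord that was actually heard.

```python
import numpy as np

# Hypothetical predictive distributions over the same chord vocabulary from
# two sources: a long-term model trained on a large corpus (cognitive,
# schematic knowledge) and a short-term model trained on the piece so far
# (sensory memory). All values here are illustrative only.
p_longterm  = np.array([0.60, 0.25, 0.10, 0.05])
p_shortterm = np.array([0.30, 0.30, 0.20, 0.20])

def combine(p_lt, p_st, w):
    """Weighted mixture of the two predictions; w is the long-term weight."""
    return w * p_lt + (1 - w) * p_st

heard = 0  # index of the chord the listener actually heard

# Score candidate weights by the log-likelihood of the heard chord.
for w in (0.0, 0.5, 0.8, 1.0):
    p = combine(p_longterm, p_shortterm, w)
    print(f"w={w:.1f}  log-lik={np.log(p[heard]):.3f}")
```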
The goal of brain decoding is to infer cognitive states and perceived stimulus information directly from subjects' neural activity. Functional magnetic resonance imaging (fMRI) allows us to measure neural activation non-invasively with excellent spatial resolution. I develop and apply methods to decode music-related information from neural activity as recorded using fMRI.
In Decoding musical pitch from human brain activity with automatic voxel-wise whole-brain fMRI feature selection, we proposed a new two-stage thresholding method that automatically selects maximally relevant voxels in the brain for decoding tasks. Our method is interpretable at the voxel level, and capitalises on local functional organisation whilst enabling feature selection across the whole brain. We applied our method to an fMRI dataset in which listeners heard single instrument notes, and demonstrated a two-fold improvement in pitch-height decoding when using features selected with our method compared to the traditional region-of-interest approach.
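The following is a rough sketch of the general idea on synthetic data, not the paper's exact procedure: score every voxel in the whole brain for relevance to the target, keep only voxels passing a threshold (chosen automatically in the paper; a fixed percentile here), and feed the surviving voxels to a decoder.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for fMRI data: n_trials x n_voxels activation matrix
# and a per-trial pitch-height target. Only a few voxels carry signal.
n_trials, n_voxels = 100, 5000
X = rng.standard_normal((n_trials, n_voxels))
pitch = rng.standard_normal(n_trials)
X[:, :20] += pitch[:, None]  # inject signal into 20 "relevant" voxels

# Stage 1: score every voxel in the whole brain by its univariate
# relevance to the target (absolute correlation, as a simple proxy).
scores = np.abs([np.corrcoef(X[:, v], pitch)[0, 1] for v in range(n_voxels)])

# Stage 2: keep only voxels whose score passes a threshold; a fixed
# percentile is used here purely for illustration.
keep = scores > np.percentile(scores, 99.5)
print(f"selected {keep.sum()} of {n_voxels} voxels")

# Decode pitch height from the selected voxels.
r2 = cross_val_score(Ridge(), X[:, keep], pitch, cv=5, scoring="r2")
print(f"cross-validated R^2: {r2.mean():.2f}")
```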
One advantage of studying brain function with multivariate patterns of neural activation, compared to typical mass-univariate approaches, is enhanced sensitivity. In Neocortical substrates of feelings evoked with music in the ACC, insula, and somatosensory cortex, we showed that neural activation in the cingulate cortex, anterior insula, and somatosensory cortex also encodes information about emotions evoked by natural musical stimuli.
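A schematic example on synthetic data of why pattern-based decoding can be more sensitive: a weak signal spread across many voxels may be invisible to any single-voxel test, yet a classifier pooling all voxels can still read it out. The data and effect sizes below are invented for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Synthetic multivoxel patterns from a region of interest:
# n_trials x n_voxels, with a binary felt-emotion label per trial.
n_trials, n_voxels = 80, 300
y = rng.integers(0, 2, n_trials)               # e.g. high vs low valence
X = rng.standard_normal((n_trials, n_voxels))
X[y == 1, :10] += 0.8                          # weak distributed signal

# Multivariate view: a linear classifier pools evidence across all voxels,
# whereas testing each voxel alone would struggle with this weak effect.
acc = cross_val_score(LinearSVC(), X, y, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} (chance = 0.50)")
```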
Music recommendation algorithms typically rely on aggregated user information to generate future recommendations. Consequently, most systems are insensitive to users' current affective states. My goal is to build biosignal-inspired systems that overcome this barrier by flexibly adapting their recommendations based on affective states inferred from users' neurophysiological signals in real time.
We laid the groundwork for biosignal-inspired music recommendation in Rapidly Predicting Music Artistic Expression Preference From Heart Rate and Respiration Rate. By interpreting predictions from a gradient-boosting model, we inferred non-linear relationships between music expression preference and respiration and heart rate.
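A minimal sketch of this style of analysis on synthetic data (the features, labels, and decision rule below are invented, and the paper's interpretation method may differ): fit a gradient-boosting model on heart rate and respiration rate, then trace partial dependence to expose non-linear relationships rather than a single coefficient.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(2)

# Synthetic stand-in features: per-excerpt heart rate (bpm) and respiration
# rate (breaths/min), with a binary preference label.
n = 500
heart = rng.normal(70, 8, n)
resp = rng.normal(14, 3, n)
# Made-up non-linear rule, purely so the model has something to find.
prefer = ((heart > 72) & (resp < 15)).astype(int)

X = np.column_stack([heart, resp])
model = GradientBoostingClassifier().fit(X, prefer)

# Partial dependence traces how the model's output changes with heart rate
# alone, revealing a non-linear shape rather than a single coefficient.
pd = partial_dependence(model, X, features=[0])
print(pd["average"][0][:5])  # model output along the heart-rate grid
```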
Finding the optimal string, position, and finger combination to play each note on the violin is crucial for effective emotional expression and successful performance. However, most violin sheet music contains minimal or no fingering information, making it difficult for beginners to learn independently. This research theme is about automatically generating violin fingerings from the timing and pitch information found in every score.
In Semi-supervised Violin Fingering Generation Using Variational Autoencoders, we proposed a variational autoencoder model that learns to generate violin fingerings from both labelled and unlabelled data. The key insight is to treat missing fingerings as an additional latent variable following a Gumbel-Softmax distribution, which the decoder can use to reconstruct pitch and timing information. We showed that our model successfully replicated the style of a human performer, and achieved state-of-the-art performance with only half the amount of labelled data.
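The Gumbel-Softmax trick is what makes a discrete, missing fingering trainable by backpropagation. Here is a minimal PyTorch sketch of just that component; the encoder, decoder, and semi-supervised objective are omitted.

```python
import torch
import torch.nn.functional as F

n_fingers = 5  # open string plus fingers 1-4

# Unnormalised scores for each finger, e.g. produced by an encoder network.
logits = torch.randn(1, n_fingers, requires_grad=True)

# Soft, differentiable sample during training; `hard=True` snaps to a
# one-hot vector in the forward pass while keeping soft gradients.
soft_sample = F.gumbel_softmax(logits, tau=1.0)
hard_sample = F.gumbel_softmax(logits, tau=1.0, hard=True)

print(soft_sample)   # relaxed one-hot over fingers
print(hard_sample)   # one-hot: the fingering fed to the decoder
```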
In An Interactive Automatic Violin Fingering Recommendation Interface, we provided a simple Python-based GUI that enables users without programming skills to upload and annotate violin sheet music saved in MusicXML. The interface allows both customised and automatically generated fingerings to cater for the unique needs of each musician. Scores with annotated fingerings can then be exported as PDF files.
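For a sense of what such annotation involves under the hood, here is a minimal sketch using music21 (an assumption on my part; the interface's actual internals are not described here). The file path and finger choices are placeholders.

```python
from music21 import converter, articulations

# Parse a MusicXML score; "score.xml" is a placeholder path.
score = converter.parse("score.xml")

# Attach a fingering marking to each note; a real system would predict
# the finger from pitch and timing instead of this placeholder cycle.
for i, note in enumerate(score.recurse().notes):
    finger = (i % 4) + 1
    note.articulations.append(articulations.Fingering(finger))

# Write the annotated score back out as MusicXML; rendering to PDF needs
# an external engraver (e.g. MuseScore) configured with music21.
score.write("musicxml", fp="score_with_fingerings.xml")
```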