Projects

EXAMPLES OF CURRENT PROJECTS

Project #1: Neural coding of auditory timbre and impact of cochlear synaptopathy

INSPECTSYN: Speech-in-noise Intelligibility Deficits: Designing Psychophysical and Electrophysiological Markers of Spectral Shape-Coding Sensitive to Cochlear Synaptopathy

Project funded by the ANR (ANR JCJC, 2023-2026)

In collaboration with Christian Lorenzi (ENS), Laurel Carney (U. Rochester) and Paul Avan (Institut de l’Audition).

NEWS: We have a 2-year funded postdoctoral position available for this project! Full details here: full postdoc offer.

As we age, almost all of us will complain about increased difficulty communicating in noisy environments. An important issue is why some individuals, even without measurable loss in audibility, experience more difficulties than others. At present, it is estimated that over 10% of individuals showing significant difficulty understanding speech-in-noise (SPiN) actually have clinically normal audiograms. In particular, synaptopathy - the loss of synapses connecting the cochlea to the auditory nerve, caused by aging or noise exposure - is thought to be an important factor contributing to this problem. Yet, recent attempts to assess synaptopathy in humans have produced mixed results; these studies mostly focused on impairments in temporal envelope coding.


Strength of FFRs (TFS component), reflecting the neural coding fidelity of complex tones, measured in different groups of subjects (young/old, with/without hearing loss); the results suggest that the coding of the temporal fine structure of such sounds is degraded in these populations.

In a previous project funded by the Fondation pour l'Audition and conducted in collaboration with Sarah Verhulst (Ghent University), we collected frequency following responses (FFRs) to complex tones in various groups of listeners and found that the strength of the responses was significantly reduced with both age and hearing loss (even in frequency regions where the audiogram showed no impairment), suggesting that the FFR could provide a novel perspective for monitoring temporal coding deficits. Yet, the extent to which FFRs reflect the coding of the spectral characteristics of sounds, through their temporal fine structure (TFS), needs to be further addressed.
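To make the measure concrete, here is a minimal sketch of how FFR "strength" can be quantified from averaged EEG epochs, as the spectral magnitude at the target frequency relative to the surrounding noise floor. The function names, epoch layout and SNR-style metric are illustrative assumptions, not the exact pipeline used in the project.

```python
# Hedged sketch: FFR "strength" as spectral SNR at the target frequency,
# computed from the average of EEG epochs time-locked to the stimulus.
import numpy as np

def ffr_strength(epochs, fs, f_target, noise_halfwidth_hz=20.0, exclude_hz=5.0):
    """epochs: array (n_trials, n_samples); fs: sampling rate in Hz."""
    avg = epochs.mean(axis=0)  # time-domain averaging boosts phase-locked activity
    spec = np.abs(np.fft.rfft(avg * np.hanning(avg.size)))
    freqs = np.fft.rfftfreq(avg.size, d=1.0 / fs)

    signal_amp = spec[np.argmin(np.abs(freqs - f_target))]

    # Noise floor: mean magnitude in neighboring bins, excluding bins near the target
    neighbors = (np.abs(freqs - f_target) <= noise_halfwidth_hz) & \
                (np.abs(freqs - f_target) > exclude_hz)
    noise_amp = spec[neighbors].mean()

    return 20.0 * np.log10(signal_amp / noise_amp)  # FFR strength as SNR in dB

# Example with simulated data: 200 epochs of noise plus a weak 220-Hz component
fs, f0, n_trials, n_samples = 4096, 220.0, 200, 4096
t = np.arange(n_samples) / fs
rng = np.random.default_rng(0)
epochs = 0.05 * np.sin(2 * np.pi * f0 * t) + rng.normal(0, 1, (n_trials, n_samples))
print(f"FFR strength at {f0} Hz: {ffr_strength(epochs, fs, f0):.1f} dB")
```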

Although previous work has shown the pivotal role of TFS in SPiN understanding, it is currently unclear how synaptopathy impairs the neural mechanisms coding for the spectral shape of sounds. The INSPECTSYN project will address this fundamental question by following an integrated multidisciplinary approach combining computational modeling, psychophysics and electrophysiology. In particular, we will examine how FFRs reflect the amount of synaptopathy in an impaired ear and use this proxy to provide an evaluation, from a novel perspective, of the contribution of synaptopathy to SPiN deficits.

Project #2: Development of speech processing across the lifespan: combining EEG and computational models

HEARDEVCOMP project, funded by the CNRS (MITI)

In collaboration with Laurianne Cabrera (CNRS / Univ. Paris Cité)

Human language acquisition is based on the development of neural processes for selectively extracting and processing acoustic parameters of speech signals, in particular amplitude and frequency modulations (AM-FM). Previous studies show that auditory processing of these acoustic cues is already in place at birth, but does not reach full efficiency until adulthood. In particular, young children show more difficulty than adults in perceiving speech in noisy environments. Yet, data on the developmental stages of this speech-processing efficiency are still scarce.

In this new interdisciplinary research project at the interface between biological and computational sciences conducted in collaboration between the INCC and STMS laboratories, we are exploring a new combination of electrophysiological measurements (EEG) and computational models of the auditory periphery and the midbrain to translate the empirical changes observed into physiologically plausible, latent parameters. These results should provide further information regarding the origins of auditory development and associated perceptual changes observed from birth to adulthood. 
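As an illustration of the general logic (the actual auditory-periphery and midbrain models are far richer), the sketch below fits a single latent parameter of a toy modulation transfer function to EEG response amplitudes measured at several AM rates, so that developmental differences can be summarized as a change in one physiologically interpretable parameter. The model form, parameter name and AM rates are assumptions.

```python
# Hedged sketch of the model-fitting logic: a toy low-pass modulation transfer
# function with one latent cutoff is fitted to observed EEG amplitudes.
import numpy as np
from scipy.optimize import minimize_scalar

am_rates = np.array([4.0, 8.0, 16.0, 32.0, 64.0])   # AM rates (Hz) probed with EEG

def model_response(am_rate, cutoff_hz):
    """Toy midbrain model: low-pass modulation transfer function with latent cutoff."""
    return 1.0 / np.sqrt(1.0 + (am_rate / cutoff_hz) ** 2)

def fit_latent_cutoff(observed_amplitudes):
    """Least-squares fit of the latent cutoff to observed EEG amplitudes."""
    loss = lambda c: np.sum((model_response(am_rates, c) - observed_amplitudes) ** 2)
    return minimize_scalar(loss, bounds=(1.0, 256.0), method="bounded").x

# Example: simulated "infant" vs "adult" data differing only in the latent cutoff
rng = np.random.default_rng(1)
for label, true_cutoff in [("infant", 8.0), ("adult", 32.0)]:
    data = model_response(am_rates, true_cutoff) + rng.normal(0, 0.02, am_rates.size)
    print(label, "estimated cutoff (Hz):", round(fit_latent_cutoff(data), 1))
```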

Project #3: Psychophysical encoding of spectro-temporal modulations and speech-in-noise perception 

Speech signals carry integrated temporal and spectral variations (e.g., through formants), which can be best modeled using spectro-temporal modulations (see panel A below). There is a growing interest in psychoacoustics in using synthetic spectro-temporal modulations (STM) to address speech processing and its disorders (e.g. Bernstein et al., Trends Hear., 2016). 
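For readers unfamiliar with these stimuli, the following sketch generates a basic "moving ripple", i.e. a bank of log-spaced tones whose envelopes are jointly modulated along the temporal (rate, in Hz) and spectral (density, in cycles/octave) axes. All parameter values are illustrative, not those used in our experiments.

```python
# Minimal sketch of a synthetic spectro-temporal modulation (STM) stimulus.
import numpy as np

def moving_ripple(duration=1.0, fs=44100, f_lo=250.0, f_hi=8000.0, n_tones=100,
                  rate_hz=4.0, density_cyc_per_oct=1.0, depth=0.9):
    t = np.arange(int(duration * fs)) / fs
    freqs = np.geomspace(f_lo, f_hi, n_tones)          # log-spaced tone carriers
    octaves = np.log2(freqs / f_lo)                    # spectral position in octaves
    rng = np.random.default_rng(0)
    sig = np.zeros_like(t)
    for f, x in zip(freqs, octaves):
        # Envelope jointly modulated in time (rate, Hz) and frequency (density, cyc/oct)
        env = 1.0 + depth * np.sin(2 * np.pi * (rate_hz * t + density_cyc_per_oct * x))
        sig += env * np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
    return sig / np.max(np.abs(sig))                   # normalize to avoid clipping

stimulus = moving_ripple()   # e.g., write to file or present in a psychophysical task
```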

Although the way the auditory system integrates these dimensions is crucial for understanding the processing of speech and other biological sounds, the underlying mechanisms have received little attention: most previous work has addressed the mechanisms underlying the detection of these features at threshold. While the discrimination of speech features is a supra-threshold process, almost nothing is known about the capacity of the auditory system to discriminate STMs along the temporal, spectral or spectro-temporal dimensions (see panel B above). The aim of this project is three-fold: (a) to develop novel psychometric tools for characterizing the mechanisms engaged in supra-threshold STM processing in different individuals, with or without hearing loss, (b) to capitalize on current auditory models to examine whether and how these mechanisms are, or need to be, implemented, and (c) to assess their role in speech-in-noise perception. Two main axes will be specifically followed:

(1) Auditory capacities for discriminating formant trajectories 

(2) Perceptual sensitivity to modulation phase 
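As a concrete (and simplified) example of the kind of psychometric tool involved, the sketch below simulates a 2-down/1-up adaptive staircase estimating the smallest discriminable difference in STM rate in a two-interval forced-choice task, with a simulated observer standing in for a real listener. The procedure and its parameters are assumptions, not the project's actual protocol.

```python
# Hedged sketch: 2-down/1-up staircase (converges near 70.7% correct) for an
# STM-rate discrimination threshold, using a simulated observer.
import numpy as np

def simulated_observer(delta_rate, jnd=0.5, rng=np.random.default_rng(2)):
    """Probability of a correct 2AFC response grows with the rate difference (Hz)."""
    p_correct = 0.5 + 0.5 * (1 - np.exp(-delta_rate / jnd))
    return rng.random() < p_correct

def two_down_one_up(start=4.0, step=0.5, n_reversals=12):
    delta, correct_streak, direction, reversals = start, 0, -1, []
    while len(reversals) < n_reversals:
        if simulated_observer(delta):
            correct_streak += 1
            if correct_streak == 2:                 # two correct -> make task harder
                correct_streak = 0
                if direction == +1:
                    reversals.append(delta)
                direction = -1
                delta = max(delta - step, 0.05)
        else:
            correct_streak = 0                      # one error -> make task easier
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step
    return np.mean(reversals[-8:])                   # threshold estimate from last reversals

print("Estimated rate-discrimination threshold (Hz):", round(two_down_one_up(), 2))
```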

Project #4: Computational models of social / emotional judgments from speech prosody

Beyond words, speech carries a lot of information about a speaker through its prosodic structure. Humans have developed a remarkable ability to infer others’ states and attitudes from the temporal dynamics of the different dimensions of speech prosody (i.e. pitch, intensity, timbre, rhythm). However, we still do not have a computational understanding of how high-level social or emotional impressions are built from these low-level dimensions. We recently developed a data-driven approach combining voice-processing techniques (using a specifically-designed audio software) and psychophysical reverse-correlation to expose the mental representations or ‘prototypes’ that underlie such inferences in speech. 
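In essence, the analysis behind this approach is simple: random prosodic perturbations are applied to a word, listeners choose which of two versions best conveys the target impression, and the first-order "kernel" (the mental prototype) is obtained by subtracting the mean rejected stimulus from the mean chosen one. The sketch below illustrates this with simulated pitch-contour data; the segment layout, trial counts and simulated listener are assumptions (the actual studies rely on dedicated voice-processing software to manipulate prosody).

```python
# Hedged sketch of psychophysical reverse correlation on pitch contours.
import numpy as np

n_trials, n_segments = 500, 7                 # pitch perturbations on 7 time segments (assumption)
rng = np.random.default_rng(3)
true_prototype = np.array([-1, -1, -0.5, 0, 0.5, 1, 2.0])   # rising-end "interrogative" template

contours_a = rng.normal(0, 1, (n_trials, n_segments))        # random pitch contours (z-scored)
contours_b = rng.normal(0, 1, (n_trials, n_segments))

# Simulated listener: picks the contour closer to an internal template (template matching)
choose_a = (contours_a @ true_prototype) > (contours_b @ true_prototype)
chosen = np.where(choose_a[:, None], contours_a, contours_b)
rejected = np.where(choose_a[:, None], contours_b, contours_a)

kernel = chosen.mean(axis=0) - rejected.mean(axis=0)         # first-order reverse-correlation kernel
print("Estimated kernel:", np.round(kernel, 2))              # should resemble the rising-end prototype
```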

By deploying this approach to assess how intonation drives social traits in speech, we have been able to demonstrate the existence of robust and shared mental representations of trustworthiness and dominance of a speaker’s voice (see Ponsot et al., PNAS, 2018). This approach offers a principled way to expose the prototypes that underlie any high-level inferences from others’ speech or from music signals, and therefore holds promise for understanding why certain emotional or social judgments differ across cultures and why these inferences may be impaired in individuals suffering from specific pathologies or neurological disorders.


An illustration of the reverse-correlation approach deployed to derive the mental pitch contour of an interrogative voice (top) or the timbre of a smiling voice (bottom). The approach can be deployed similarly to characterize the speech characteristics driving any social or emotional judgment.

In particular, we are currently running experiments to characterize prosody-processing deficits in patients after stroke. One motivating perspective for deploying such an approach in clinical contexts is to provide a finer characterization of impairments, hopefully leading in the long term to the development of novel individualized rehabilitation strategies. Yet, further theoretical work on the approach is also needed. For instance, the reverse-correlation approach as considered here relies on the assumption that the algorithm the brain uses to make these inferences from speech prosody can be approximated by template matching, i.e. a linear comparison of the input stimulus with one single prototype. This is a strong assumption, and further work is needed to understand under which conditions it can reasonably be applied.
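One way to probe this assumption, sketched below under simplified conditions, is to take a prototype (e.g., estimated by reverse correlation), predict each two-alternative choice from a simple dot-product comparison of the two stimuli with that prototype, and measure how well these predictions match held-out responses. Everything in this example (prototype values, noise level, trial counts) is an illustrative assumption.

```python
# Hedged sketch: testing how well a linear template-matching rule predicts choices.
import numpy as np

def predict_choices(contours_a, contours_b, prototype):
    """Linear template matching: choose the contour with the larger projection onto the prototype."""
    return (contours_a @ prototype) > (contours_b @ prototype)

rng = np.random.default_rng(4)
prototype = np.array([-1, -1, -0.5, 0, 0.5, 1, 2.0])        # illustrative prototype
a, b = rng.normal(0, 1, (200, 7)), rng.normal(0, 1, (200, 7))

# Simulated listener with internal noise: choices follow the template only imperfectly
noisy = (a @ prototype + rng.normal(0, 2, 200)) > (b @ prototype + rng.normal(0, 2, 200))
agreement = np.mean(predict_choices(a, b, prototype) == noisy)
print(f"Choices predicted by the linear template model: {agreement:.0%}")
```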