In this section, I provide accessible summaries of my work. This section is still under construction, so please reach out to me if you are interested in a lay abstract that is not yet available.
Disclaimer: Note that, unlike the official abstracts that are part of the published articles, I am the sole author of these lay abstracts. Accordingly, the views and opinions expressed here are my own and do not necessarily reflect those of my co-authors or funding agencies.
The timing of an avatar’s beat gestures biases lexical stress perception in vocoded speech
Authors: Matteo Maran, Renske Uilenreef, Roos Rossen, Hans Rutger Bosker
Year of publication: 2025 (accepted)
Link to the pre-print: https://doi.org/10.31234/osf.io/nvh36_v2
Cochlear implants (CIs) are neural prostheses that can restore some degree of hearing. However, CIs convey a signal that distorts some aspects of speech, for example the intonation (“fundamental frequency”) that provides the emphasis to distinguish between two otherwise identical words (e.g., the noun “CONtent” vs. the adjective “conTENT”, which differ in so-called “lexical stress”). Recent studies show that simple up-and-down flicks of the hands, called “beat gestures”, affect lexical stress perception: a beat gesture falling on the first syllable of a word (e.g., “content”) makes that syllable more prominent, making listeners more likely to hear the word with emphasis on the first syllable (e.g., hearing the noun “CONtent” rather than the adjective “conTENT”). Similarly, a beat gesture falling on the second syllable gives that syllable more emphasis (e.g., hearing the adjective “conTENT” rather than the noun “CONtent”). In this online experiment, we tested whether beat gestures can affect the perceived emphasis of words once access to fundamental frequency information is limited (“vocoded speech”), in a way that might resemble hearing through CIs. Clear speech served as a baseline condition for comparison. Additionally, the beat gestures were made by a human-looking avatar, which could in principle be used in future app-based software to support hearing through CIs. Beat gestures affected the perceived emphasis of words in vocoded speech, especially when the information in the speech signal was ambiguous (e.g., half-way between “CONtent” and “conTENT”) or infrequent (e.g., emphasis on the second syllable, which in the tested language occurs only 12% of the time). In these cases, the effect of beat gestures in vocoded speech was numerically larger than in clear speech. Overall, this study suggests that avatar-made beat gestures might help facilitate speech processing, especially when the speech input is distorted.
_______________________________________________
Beat gestures facilitate lexical access in constraining sentence contexts
Authors: Ronny Bujok, Matteo Maran, Antje S. Meyer, Hans Rutger Bosker
Year of publication: 2025 (accepted)
Link to the pre-print: https://doi.org/10.31234/osf.io/8ntjm_v1
Beat gestures are simple up-and-down flicks of the hand, usually produced together with prominent parts of a sentence. Previous studies (see "How to test gesture-speech integration in ten minutes") showed that beat gestures can help distinguish between words that differ only in the position of prominence (e.g., the noun "CONtent" vs. the adjective "conTENT"). However, these studies only investigated the effect of beat gestures on isolated words (i.e., words out of context, presented alone). Furthermore, they examined the effect of beat gestures only on words that differ in the position of prominence (e.g., "CONtent" vs. "conTENT"), which are not the majority of the words we encounter in everyday communication. In this study, we investigated how beat gestures affect the comprehension of words that 1) do not have a counterpart differing in prominence (e.g., "ARmy", for which "arMY" does not exist) and 2) are presented as part of a full sentence (e.g., "He became a colonel in the army"). Participants watched videos of a speaker producing a sentence and indicated whether the final word (i.e., the target) was a real word (e.g., "army") or not (e.g., "arfy"). The target could be presented without a beat gesture, or with a beat gesture that aligned with either the prominent syllable (e.g., "AR" in "ARmy") or a non-prominent one (e.g., "my" in "ARmy"). Beat gestures facilitated the recognition of the target, independently of their alignment with the prominent syllable. The present results suggest that beat gestures affect the earliest stages of word recognition in speech comprehension.
_______________________________________________
Beat gestures made by human-like avatars affect speech perception
Authors: Matteo Maran, Renske Rötjes, Anna R. E. Schreurs, Hans Rutger Bosker
Year of publication: 2025
Link to the article: https://www.isca-archive.org/interspeech_2025/maran25_interspeech.html
When someone is talking to us face-to-face, we can not only hear what they say, but also see their hands moving. Speakers’ hands often produce so-called “beat gestures”, which are simple up-and-down strokes of the hand. Beat gestures can mark the importance of a word, making it “stand out” from the rest of a sentence (e.g., “The CONTENT of his speech, not the form, was very good”, with the uppercase font indicating higher prominence). Interestingly, beat gestures do something similar even within a word, giving more emphasis to one of its syllables (e.g., the noun “CONtent” vs. the adjective “conTENT”). In particular, if the speaker produces a beat gesture during the first syllable, listeners are more likely to perceive that syllable with emphasis (e.g., hearing the noun “CONtent” rather than the adjective “conTENT”). Similarly, if the hand gesture falls on the second syllable, this syllable is perceived as more prominent (e.g., hearing the adjective “conTENT” rather than the noun “CONtent”). In this experiment, we investigated whether the effect of beat gestures on the perceived emphasis of syllables within words is specific to gestures produced by human or human-looking interlocutors. In the first condition, a word (e.g., “content”) was produced while a beat gesture was made by a human. In the second condition, the gestures were instead made by a human-looking avatar. In the third condition, a disc moved with the same trajectory and timing as the human hands. Participants indicated which word they heard (e.g., “CONtent” or “conTENT”). Human and avatar beat gestures affected the emphasis that participants perceived in the word, but the moving disc failed to do so, despite having the same trajectory and timing as a beat gesture. Furthermore, human gestures had a larger effect than the avatar’s. Overall, these results suggest that there might be something “special” about gestures made by a human or human-looking speaker.
It is possible that in conversation listeners specifically take into account movements made by a plausible speaker (e.g., a human or an avatar), but not those made by an implausible one (e.g., a moving disc).
_______________________________________________
How to test gesture-speech integration in ten minutes
Authors: Matteo Maran, Hans Rutger Bosker
Year of publication: 2024
Link to the article: https://doi.org/10.21437/SpeechProsody.2024-149
When we engage in conversation, the hand gestures made by our interlocutor affect what we hear. Beat gestures are the most common type of hand movement produced by speakers, and consist of simple up-and-down strokes of the hand. Beat gestures can draw attention to specific words in a sentence, making them more prominent (e.g., “The CONTENT of his speech, not the form, was very good.”). They can also give more emphasis to specific syllables of a word, helping to distinguish between similar words (e.g., the noun “CONtent” vs. the adjective “conTENT”). The effect of beat gestures on the prominence of syllables has been investigated in several studies, which were, however, quite long (taking approximately 1 hour). In this online experiment, we tested whether a “mini-test”, taking approximately 10 minutes, would also be able to show how beat gestures affect the perceived prominence within a word (e.g., the noun “CONtent” vs. the adjective “conTENT”). We tested this effect with two different response procedures. In the first one (“two-alternative forced choice” - 2AFC), participants selected one of two options to indicate what they heard (i.e., pressing one of two buttons to indicate that they heard “CONtent” or “conTENT”). In the second one (“visual analog scale” - VAS), participants could provide a graded response by sliding a dot along a line between two extremes (e.g., moving the dot all the way to the left if they were 100% sure that they heard the noun “CONtent”). Both response procedures successfully demonstrated the effect of beat gestures on speech perception. Overall, this experiment showed that the effect of beat gestures on speech perception can be demonstrated in short online experimental sessions, using either of two response modalities.
_______________________________________________
Investigating the neurophysiological correlates of syntactic processing in a visual masked priming paradigm
Authors: Elena Pyatigorskaya, Angela D. Friederici, Emiliano Zaccarella, Matteo Maran
Year of publication: 2025
Link to the article: https://doi.org/10.1080/23273798.2025.2488049
Several studies have shown that the human brain can automatically (i.e., without direct control) detect errors in the grammatical information of a spoken sentence (e.g., “He month” vs. “A month”). The present study employed electroencephalography (EEG) to investigate whether a similar effect can be observed when encountering errors during reading. Participants read two words, which could form a grammatical (e.g., “A month”) or ungrammatical (e.g., “He month”) sequence. In one condition, the first word was clearly visible (i.e., “unmasked”), while in the second condition the first word was presented subliminally (i.e., “masked”) and could therefore only be processed automatically. Interestingly, ungrammatical sequences in both the masked and unmasked conditions elicited a similar effect in the EEG signal (i.e., a P600), showing that the human brain detected an error. The fact that this effect was present also in the masked condition suggests that it stems from an automatic analysis of the word sequence. Overall, this study showed that the human brain automatically processes grammatical information when reading.
_______________________________________________
Online neurostimulation of Broca’s area does not interfere with syntactic predictions: A combined TMS-EEG approach to basic linguistic combination
Authors: Matteo Maran, Ole Numssen, Gesa Hartwigsen, Emiliano Zaccarella
Year of publication: 2022
Link to the article: https://doi.org/10.3389/fpsyg.2022.968836
When someone speaks to us, our brain analyses the grammatical well-formedness of a sentence within approximately 200 milliseconds. It has been suggested that such a fast pace of analysis stems from the fact that grammatical predictions (e.g., expecting a noun after hearing “the”) facilitate speech comprehension. Broca’s area, located in the left anterior part of the human brain, has been proposed as the brain region involved in generating grammatical predictions. In this experiment, we tested whether Broca’s area is involved in predicting upcoming grammatical categories. Using “Transcranial Magnetic Stimulation” (TMS), we temporarily disrupted, in a safe, controlled, and non-invasive way, the functioning of Broca’s area at the moment when it is supposed to generate predictions about upcoming words (e.g., during “the”, when one can predict that a noun will occur). Simultaneously, we used “Electroencephalography” (EEG) to record brain activity with electrodes and detect whether the processing of an upcoming word (e.g., a noun or a verb) was altered by TMS. Contrary to our expectations, TMS over Broca’s area at the time of prediction (e.g., during “the”) did not alter the processing of upcoming grammatical information (e.g., a noun). Overall, our study indicates that Broca’s area might be involved in checking whether grammatical errors occur, rather than in predicting upcoming grammatical information.