A comparative study on audiovisual perception of final boundaries by Chinese and English observers

Author(s): Ran Bi and Marc Swerts


It has been suggested that conversation partners use and interpret both auditory and visual features as markers of the end of an utterance. Previous work on languages such as Dutch and English has shown that speakers and listeners rely on prosodic cues, such as boundary tones, and on variation in eye gaze behavior to pre-signal finality in an utterance. However, little is known about how listeners from different linguistic backgrounds (Chinese and English) use the auditory and visual cues produced by speakers of these languages when perceiving utterance finality, or about whether these cues and their use are language-specific. Using naturally elicited stimuli from Chinese and English speakers, this study conducted a perception experiment that measured Chinese and English participants’ reaction time and accuracy in judging whether a speech fragment occurred in utterance-final position. The participants were exposed to the same stimuli in three formats: audio-only, vision-only and audiovisual. Results revealed that audiovisual stimuli contributed most in both languages, and showed correlations between the two dependent variables (reaction time and accuracy). Additionally, English and Chinese stimuli differed in how easily and accurately they could be judged by observers from both language groups.


