🗣️ Dr Mathieu Beaudouin (University of Cambridge)
📅 26th Feb, 2026 (16:00 - 16:30)
🏫 LB3, Sidgwick Site
Abstract:
Tangut, the language of the rulers of the Western Xia Empire (1038–1227), plays a central role in the historical phonology of the Sino-Tibetan family. It is both the third earliest literary language of the family, after Chinese and Tibetan, and a member of one of its most conservative branches, the Gyalrongic group.
Its phonology is primarily category-based: like Middle Chinese, it is mainly known through rhyme dictionaries, which provide reliable access to phonological categories but remain relatively imprecise about their actual phonetic realisation. At the same time, the existence of transcriptions between Tangut, Medieval Eastern Tibetan, Medieval Northwest Chinese, and liturgical Sanskrit means that any improvement in the reconstruction of Tangut phonology has immediate repercussions for our understanding of the synchronic and diachronic phonology of many other languages.
This talk examines R-like sounds in Sino-Tibetan, beginning with Tangut and progressively expanding outward to more distant relatives. Starting from a Pre-Tangut stage reconstructed through the comparative method—in which syllables can be inferred to contain segmental -r- in various syllabic positions—I predict the distribution of two distinct variables of Tangut phonology, with broader implications for the historical phonology of the Sino-Tibetan family.
🗣️ Dr Hannah Davidson (University of Cambridge)
📅 26th Feb, 2026 (16:30 - 17:00)
🏫 LB3, Sidgwick Site
Abstract:
This study investigates the processing and representation of morphologically complex derived words by German-English bilinguals in their second language (English), compared to English monolinguals. Previous research on native German speakers has found decompositional processing and representation of all derived words irrespective of their semantic transparency, with semantically opaque verbs priming their stems (e.g. verstehen ‘understand’ - STEHEN ‘stand’) comparably to semantically transparent verbs. However, research in English has found no overt priming for opaque relationships between morphologically complex primes and their stem targets, suggesting that only transparent words might be decomposed in English. We therefore investigated whether German-English bilinguals tested in their L2 English behave similarly to English speakers, or whether they also decompose morphologically complex words across the transparency spectrum, as in their native German. The study used a cross-modal priming paradigm, with auditory primes (e.g. disagree) followed by visual targets (AGREE).
Participants completed a lexical decision task across four conditions: semantically opaque, semantically transparent, form/phonological and unrelated. Preliminary results revealed a significant effect of priming and condition, with the priming effect also varying by condition. There was no main effect of group but there was a significant three-way interaction between group, condition, and prime type, indicating that the priming pattern across conditions differed between bilinguals and monolinguals. Priming was considerably more prominent in transparent than in opaque pairs, in line with previous results in English. Yet while bilinguals showed the same overall pattern as monolinguals, their priming effects were of a significantly larger magnitude, suggesting greater sensitivity to word structure in German-English bilinguals. Whether and how L2 proficiency might mediate such effects will be investigated in further analyses.
🗣️ Prof John Williams (University of Cambridge)
📅 26th Feb, 2026 (16:30 - 17:00)
🏫 Little Hall Lecture Theatre, Sidgwick Site
Abstract:
Incidental language learning refers to the process of passively ‘picking up’ aspects of a language without intention to do so in the course of some activity. For example, while reading for pleasure in a foreign language one may spontaneously acquire vocabulary items or linguistic generalisations. Knowledge of grammatical generalisations in particular may remain at an unconscious level (as implicit knowledge) or it may emerge into awareness as spontaneous linguistic insight. The underlying learning mechanism is conceived in terms of general associative / statistical learning, which has been shown to support lexical and syntactic learning in experiments using strings of nonsense syllables or nonwords. Here I report my ongoing efforts to develop an incidental language learning task that lends itself to the investigation of the mental processes involved in attaining spontaneous linguistic insight (e.g., the relationship between insight and prior implicit learning, and the neural markers of linguistic insight as measured by EEG). The vehicle for the investigation is a simple semi-artificial language (modelled on the Amazonian language Karitiana) in which novel overt markers for transitive and intransitive verb usage are combined with English lexis (e.g., Bill ro-ate the pizza, Mark ro-cooked the fish, Dave gi-slept soundly, Ryan gi-danced beautifully). Despite the apparent simplicity of this system, the templatic nature of the items and extensive training (128 sentences), I have found it surprisingly difficult to find a procedure that yields rule awareness in more than about 25% of participants (who mostly report using intentional learning strategies and becoming aware of the system very quickly). Test performance in the rule-unaware majority has been barely above chance. Linguistic variations (of word order and marker positioning) and procedural variations that draw attention to the markers and their associated meanings do not improve the level of learning.
Only by including additional surface-level cues has the level of learning improved. The general failure to learn is surprising from a statistical (or general associative) learning perspective. The purpose of this talk is to share my own surprise at this failure to learn, to consider the results in relation to different learning theories, and to question the power of statistical learning in meaningful linguistic contexts.
🗣️ Prof Nigel Collier (University of Cambridge)
📅 26th Feb, 2026 (16:00 - 16:30)
🏫 Little Hall Lecture Theatre, Sidgwick Site
Abstract:
In this talk I will touch on themes from our recent work on uncertainty and calibration in large language models, exploring how models represent, estimate, and communicate what they do not know. LLMs tend to be over-confident, whether they are right or wrong. As they are increasingly used in open-ended and long-form generation tasks, the ability to quantify and express epistemic uncertainty becomes crucial, not only for safety and reliability, but also for naturalistic interaction. I finish by proposing what I call Sunao intelligence, from the Japanese sunao, meaning open, sincere, and attuned to reality. It reframes calibration as more than a technical goal, envisioning systems that recognise their own limits and express them with humility, transparency, and genuine understanding of human intention.
🗣️ Dr Norma Schifano (University of Cambridge)
📅 12th Feb, 2026 (16:00 - 16:30)
🏫 Little Hall Lecture Theatre, Sidgwick Site
Abstract:
Truth-conditional semantics has been successful in explaining how the meaning of a sentence can be decomposed into the meanings of its parts, and how this allows people to understand new sentences. In this talk, I will discuss how a truth-conditional model can be learnt in practice on large-scale datasets of various kinds (textual, visual, ontological), and how this is empirically useful, compared to non-truth-conditional models. I will then take stock of the bigger picture, and argue it is (unfortunately) computationally intractable to reduce all kinds of language understanding to truth conditions. To enable a more complete account, I will sketch a new approach to probabilistic modelling, which maintains tractability by relaxing the strict demands of Bayesian inference. This has the potential to explain how patterns of language use arise as a result of computationally constrained minds interacting with a computationally demanding world.
🗣️ Prof Ian Roberts (University of Cambridge)
📅 5th Feb, 2026 (16:00 - 17:00)
🏫 Little Hall Lecture Theatre, Sidgwick Site
Abstract:
On standard Chomskyan assumptions, children, armed with Universal Grammar (UG) as the initial state of language acquisition, discover the grammar of their native language (that is, arrive at the final state of acquisition) by somehow “matching” their linguistic experience with the parameters of UG, through exposure to the vocabulary, phonology and closed-class grammatical items of the target language. Nurture (these aspects of the linguistic environment the child encounters) interacts with nature (UG). But what is this “matching” process?
Two types of answer to this question have been proposed. The most widespread, since Chomsky (1957), relies on the idea that children postulate grammars based on the interaction of their native UG with experience and choose the correct grammar following an “evaluation metric” of some kind. But there is another, arguably simpler, approach, originating in earlier North American structuralist linguistics and discussed in Chomsky’s early work (1951, 1955/1975): children discover the grammar for their language by means of a learning procedure, which effects the correct “match” between UG and linguistic experience. To use the old terminology: children apply a discovery procedure to marry experience to a grammar; the discovery procedure is the combination of the learning procedure and UG. It was impossible to pursue a discovery procedure in the 1950s: there was very little understanding of how children learn languages, there were no corpora of linguistic evidence, and the theories of computation and learning were still in their infancy. The discovery-procedure-based approach has been revived recently, really for the first time since 1957, by Charles Yang. Since this approach does not require a separate evaluation metric, Occam’s razor leads us to favour it.
The present project aims to build on Yang’s insights and show how important, well-described aspects of the grammars of a range of languages, focussing on null subjects and word-order variation, can be better understood in terms of discovery procedures than in terms of evaluation metrics.
🗣️ Dr Guy Emerson (University of Cambridge)
📅 13th Nov, 2025 (16:00 - 16:30)
🏫 GR-05, English Faculty Building
Abstract:
Truth-conditional semantics has been successful in explaining how the meaning of a sentence can be decomposed into the meanings of its parts, and how this allows people to understand new sentences. In this talk, I will discuss how a truth-conditional model can be learnt in practice on large-scale datasets of various kinds (textual, visual, ontological), and how this is empirically useful, compared to non-truth-conditional models. I will then take stock of the bigger picture, and argue it is (unfortunately) computationally intractable to reduce all kinds of language understanding to truth conditions. To enable a more complete account, I will sketch a new approach to probabilistic modelling, which maintains tractability by relaxing the strict demands of Bayesian inference. This has the potential to explain how patterns of language use arise as a result of computationally constrained minds interacting with a computationally demanding world.
🗣️ Prof Bert Vaux (University of Cambridge)
📅 16th Oct, 2025 (16:30 - 17:00)
🏫 SG1, Alison Richard Building
Abstract:
This talk draws on case studies from my work to illustrate three complementary prongs of the relatively new field of Language Analysis for Determination of Origin (LADO): predictive, forensic, and asylum. The first prong introduces a Bayesian localization model that uses my crowdsourced big data corpus to predict a speaker’s regional background from linguistic survey responses with surprising accuracy, refining prior methods through probabilistic weighting of feature co-occurrence. The second prong involves authorship identification in a legal context, illustrated by analysis of a deceased billionaire’s disputed will. The third addresses the use of linguistic evidence to establish whether applicants for political asylum are from where they claim.
🗣️ Prof Kirsty McDougall (University of Cambridge)
📅 16th Oct, 2025 (16:00 - 16:30)
🏫 SG1, Alison Richard Building
Abstract:
In certain crimes, a perpetrator’s voice may have been heard by a witness, but not recorded. If the police have identified a suspect, a phonetician may be asked to prepare a ‘voice parade’ to test whether the witness recognises the voice of the suspect as that of the perpetrator heard at the crime scene. Analogous to a visual identity parade, the witness is presented with a line-up of recordings which includes the suspect’s voice and a number of foil voices.
Selection of the foil voices is a challenging aspect of voice parade construction for which theory is still evolving. The foil voices should sound similar to the suspect’s voice to provide a fair comparison, yet the phonetic underpinnings of perceived voice similarity are not well understood, such that the principles for selecting foil voices are not straightforward. This talk will present some experimental work investigating the phonetic correlates of perceived voice similarity in accents of British English, considering the roles played by aspects of speech such as fundamental frequency, formant frequencies, voice quality and articulation rate. Implications of the findings for voice parade construction will be discussed.