Program
10:00–10:45
Introduction
Coffee Break
11:00–12:00
Leonie Weissweiler (LMU Munich, Germany) - Testing the Limits of LLMs with Construction Grammar
Abstract: Construction Grammar (CxG) is a paradigm from cognitive linguistics that emphasises the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity, each with a form and a meaning component. It enables us to find and describe less frequent, idiosyncratic features of language, which can then be investigated with novel probing methodology to exemplify some of the shortcomings of state-of-the-art language modelling.
In this talk, we will briefly introduce Construction Grammar for an NLP audience. We then showcase several of our recent and ongoing studies on the modelling of constructions by large language models, shedding light on the limitations of their current compositionality and generalisation abilities. We will further highlight exciting new developments for Construction Grammar and NLP, from the beginning of a Construction layer in Universal Dependencies to its central role in the current debate in linguistics about how to interpret the recent successes of LLMs.
12:00–13:00
William Merrill (NYU, USA) - The Parallelism Tradeoff: Limitations of Log-Precision Transformers
Abstract: Despite their omnipresence in modern NLP, characterizing the expressive power of transformers to recognize formal languages remains an interesting open question. We prove that transformers whose arithmetic precision is logarithmic in the number of input tokens (and whose feedforward nets are computable using space linear in their input) can be simulated by constant-depth logspace-uniform threshold circuits. This provides insight on the power of transformers using known results in complexity theory. For example, if TC0 ≠ NC1, then transformers cannot recognize all regular languages, and if TC0 ≠ NL, transformers cannot solve graph connectivity questions, which underlie many types of logical reasoning. Our result intuitively emerges from the transformer architecture’s high parallelizability. We thus speculatively introduce the idea of a fundamental parallelism tradeoff: any model architecture as parallelizable as the transformer will obey limitations similar to it. Since parallelism is key to training models at massive scale, this suggests a potential inherent weakness of the scaling paradigm.
13:00–14:00
Lunch Break
14:00–14:45
Poster Session for SAIL PhD students
14:45–15:45
Panel Discussion with Leonie Weissweiler, William Merrill, Silke Schwandt and Philipp Cimiano
Coffee Break
16:00–17:00
Iris van Rooij (School of Artificial Intelligence and Donders Institute for Brain, Cognition and Behaviour, Radboud University, The Netherlands & Department of Linguistics, Cognitive Science, and Semiotics, and the Interacting Minds Centre at Aarhus University, Denmark) - Reclaiming AI as a theoretical tool for cognitive science
17:00–18:00
Nouha Dziri (Allen Institute for AI, USA) - Limits of Transformers on Compositionality