Programme
All sessions to be held in Humanisten, Renströmsgatan 6, room J222
Day 1 — Monday 11th September
09.00 - 09.30 Welcome and introductory remarks from the General Chairs
09.30 - 10.30 Keynote 1
Aurélie Herbelot – Decentralised semantics
Large Language Models (LLMs) are currently the dominant paradigm in the field of Natural Language Processing. But their enormous architecture, coupled with an insatiable hunger for training data, makes them ill-suited for many purposes, ranging from fundamental linguistic research to small business applications. The main argument of this talk is that the monolithic architecture of LLMs, and by extension their reliance on big data, is a direct consequence of a lack of semantic theory in the underlying model. As an alternative, I will explore a modular architecture based on concepts from model theory, which lends itself to decentralised training over small data. Starting from research in linguistics and cognitive science, I will summarise evidence against the view that language competence should 'live' in a single high-dimensional space. I will then review various computational models of meaning at the junction between formal and distributional approaches, and show how they can be combined into a modular system. Finally, I will present a possible implementation where learning takes place over individual situation types, at low dimensionality. This decentralised approach has natural benefits in terms of accessibility and energy efficiency.
10.30 - 11.00 Coffee
11.00 - 12.30 Oral session 1
Improving Few-Shot Learning with Multilingual Transfer and Monte Carlo Training Set Selection (Antonis Maronikolakis, Paul O'Grady, Hinrich Schütze and Matti Lyra) [long paper] ONLINE
Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis (Josip Jukić and Jan Snajder) [long paper] ONLINE
Entrenchment Matters: Investigating Positional and Constructional Sensitivity in Small and Big Language Models (Bastian Bunzeck and Sina Zarrieß) [long paper]
12.30 - 13.30 Lunch
13.30 - 14.30 Oral session 2
Facilitating learning outcome assessment – development of new datasets and analysis of pre-trained language models (Akriti Jindal, Kaylin Kainulainen, Andrew Fisher and Vijay Mago) [long paper] ONLINE
Because is why: Children's acquisition of topoi through why questions (Christine Howes, Ellen Breitholtz and Vladislav Maraev) [long paper]
14.30 - 16.00 Poster session 1 (including coffee)
Do Language Models discriminate between relatives and pseudorelatives? (Adele Henot-Mortier) [student paper]
When Speech Becomes Writing: The Case of Disfluencies (Aida Tarighat and Martin Corley) [extended-abstract]
Preparing a corpus of spoken Xhosa (Eva-Marie Bloom Ström, Onelisa Slater, Aron Zahran, Aleksandrs Berdicevskis and Anne Schumacher) [short paper]
On the Challenges of Training Language Models on Historical Data (Julius Steuer, Marius Mosbach and Dietrich Klakow) [extended-abstract]
Leveraging GPT-Sw3 for Faroese to English Translation (Barbara Scalvini and Iben Nyholm Debess) [extended-abstract]
Machine Translation of Folktales: small-data-driven and LLM-based approaches (Olena Burda-Lassen) [short paper] ONLINE
Example-Based Machine Translation with a Multi-Sentence Construction Transformer Architecture (Haozhe Xiao, Yifei Zhou and Yves Lepage) [student paper]
Reconstruct to Retrieve: Identifying interesting news in a Cross-lingual setting (Boshko Koloski, Blaz Skrlj, Nada Lavrac and Senja Pollak) [student paper]
Linguistic Pattern Analysis in the Climate Change-Related Tweets from UK and Nigeria (Ifeoluwa Wuraola, Nina Dethlefs and Daniel Marciniak) [student paper]
Nut cracking Sledgehammers: Prioritizing Target Language Data over Bigger Language Models for Cross Lingual Metaphor Detection (Jakob Schuster and Katja Markert) [student paper]
16.00 - 17.00 Keynote 2
Danielle Matthews – How children learn to use language through interaction
This talk will chart out pragmatic development with a focus on the real-world experiences that allow infants to start using language for social communication and permit children to use it at ever more complex levels. Following a working definition of pragmatics in the context of human ontogeny, we will trace the early steps of development, from a dyadic phase, through to intentional triadic communication and early word use before briefly sketching out later developments that support adult-like communication at the sentential and multi-sentential levels and in literal and non-literal ways. Evidence will be provided regarding the experiential basis of learning from the study of individual differences, from randomised controlled trials and from deaf infants growing up in families with little prior experience of deafness (and who are thus at risk of reduced access to interaction). This will provide a summary of elements from a forthcoming book: Pragmatic Development: How children learn to use language for social communication.
Day 2 — Tuesday 12th September
09.30 - 10.30 Keynote 3
Tal Linzen – How much data do neural networks need for syntactic generalization?
I will discuss work that examines the syntactic generalization capabilities of contemporary neural network models such as transformers. When trained from scratch to perform tasks such as transforming a declarative sentence into a question, models generalize in ways that are very different from humans. Following self-supervised pre-training (word prediction), however, transformers generalize in line with syntactic structure. Robust syntactic generalization emerges only after exposure to a very large amount of data, but even more moderate amounts of pre-training data begin to steer the models away from their linear inductive biases. Perhaps surprisingly, pre-training on simpler child-directed speech is more data-efficient than on other genres; at the same time, this bias is insufficient for a transformer to learn to form questions correctly just from the data available in child-directed speech.
10.30 - 11.00 Coffee
11.00 - 12.30 Oral session 3
Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions (Konstantinos Kogkalidis and Michael Moortgat) [long paper]
UseClean: learning from complex noisy labels in named entity recognition (Jinjin Tian, Kun Zhou, Meiguo Wang, Yu Zhang, Benjamin Yao, Xiaohu Liu and Chenlei Guo) [long paper] ONLINE
Benchmarking Neural Network Generalization for Grammar Induction (Nur Lan, Emmanuel Chemla and Roni Katzir) [long paper]
12.30 - 13.30 Lunch
13.30 - 14.30 Oral session 4
A Sanskrit grammar-based model to identify and address gaps in Google Translate's Sanskrit-English zero-shot NMT (Amit Rao and Kanchi Gopinath) [long paper]
From web to dialects: how to enhance non-standard Russian lects lemmatisation? (Ilia Afanasev and Olga Lyashevskaya) [long paper] ONLINE
14.30 - 16.00 Poster session 2 (including coffee)
Lexical Semantics with Vector Symbolic Architectures (Adam Roussel) [extended-abstract]
Masked Latent Semantic Modeling: an Efficient Pre-training Alternative to Masked Language Modeling (Gábor Berend) [extended-abstract]
Neural learning from small data using formal semantics (Staffan Larsson, Robin Cooper, Jonathan Ginzburg and Andy Luecking) [extended-abstract]
An Incremental Model of Garden-Path Sentences Using Sheaf Theory (Daphne Wang and Mehrnoosh Sadrzadeh) [extended-abstract]
Improving BERT Pretraining with Syntactic Supervision (Georgios Tziafas, Konstantinos Kogkalidis, Gijs Wijnholds and Michael Moortgat) [short paper]
Few-shot learning of word properties in a trained LSTM (Priyanka Sukumaran, Nina Kazanina and Conor Houghton) [extended-abstract]
Towards Recursion in Emergent Numeral Systems (Jonathan Thomas and Moa Johansson) [extended-abstract]
Low-data Regime Multimodal Learning with Adapter-based Pre-training and Prompting (Wenyan Li, Dong Li, Wanjing Li, Yuanjie Wang, Hai Jie and Yiran Zhong) [short paper] ONLINE
On the role of resources in the age of large language models (Simon Dobnik and John Kelleher) [short paper]
16.00 - 17.00 Keynote: closing talk by CLASP's chief scientist
Shalom Lappin – Assessing the Strengths and Weaknesses of Large Language Models
The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. In this talk I will present an overview of some of the weak arguments that have been presented against LLMs, and I will consider several more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.