Invited Speakers
We are excited to have the following invited speakers at RepL4NLP 2022.
Google Brain
Title: Bridging the representation gap between humans and machines: first steps
Abstract: Machines and humans are likely to understand the world differently. They may share some of their representational spaces, but not all. Bridging the gap between the two representational spaces is key to the future of ML, both in developing machines and in understanding them. To do this, I argue we need to develop a language based on two axes: 1) studying AI machines as scientific objects, in isolation and with humans, and 2) expanding what we know so that we can learn from and be inspired by AI. These studies will not only provide principles for the tools we make, but are also necessary to take our working relationship with AI to the next level. This language will not be perfect (no language is), but it will be useful. As human language is known to shape our thinking, this will also shape us and future AI.
Carnegie Mellon University
Title: Environmental Impacts of Representation Learning: Why, what, how?
Abstract: Large, pre-trained language models (LMs) produce high-quality, general-purpose representations of word (pieces) in context, shifting the paradigm for representation learning in NLP. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to the relatively small set of individuals and organizations with access to substantial computational resources. In this talk I’ll discuss why this is a problem, what we can do to make large pre-trained language models more accessible, and how we can leverage our expertise to help mitigate climate change beyond the operational emissions due to large LM training.
Microsoft Turing, India
Title: Predicting and Explaining Cross-lingual Zero-shot and Few-shot Transfer
Abstract: Given a massively multilingual language model (MMLM), can we predict the accuracy of cross-lingual zero-shot and few-shot transfer for a task on target languages with little or no test data? This seemingly impossible task, if solved, can have several potential benefits. First, we could estimate the performance of a model even in languages where a test set is not available, and/or building one is difficult. Second, one can predict training data configurations that would give certain desired performance across a set of languages, and accordingly strategize data collection plans; this in turn can lead to linguistically fair MMLM-based models. Third, as a byproduct, we would know which factors influence cross-lingual transfer. In this talk, I will give an overview of Project LITMUS (Linguistically Informed Training and Testing of Multilingual Systems), where we build several ML models for performance prediction; besides their applications, I will discuss what we learn about the factors that influence cross-lingual transfer.
Stanford University
Title: Are Foundation Models Castles in the Air?
Abstract: Despite the impressive capabilities of foundation models such as GPT-3 and PaLM, today's models have clear limitations. Some would argue that they will never achieve "true language understanding" because they are so different from humans. In this talk, I will point out four differences between existing language models and humans (multimodality, access to actions, active data collection, and architecture) and ruminate on the importance of each. I argue that the interesting question is how close a foundation model can get to "human-level understanding" in spite of these differences, even if it never gets there completely.
University College London & Meta AI
Title: Parametric vs Nonparametric Knowledge, and what we can learn from Knowledge Bases
Abstract: Traditionally, AI and Machine Learning communities have considered knowledge from the perspective of discrete vs continuous representations, knowledge bases (KBs) vs dense vectors, or logic vs algebra. While these are important dichotomies, in this talk I will argue that we should put more focus on another: parametric vs non-parametric modelling. Roughly, the former uses a fixed set of parameters, while in the latter the parameters grow with the data. I will explain recent advances in knowledge-intensive NLP from this perspective, show the benefit of hybrid approaches, and discuss KBs as non-parametric approaches with relatively crude assumptions about what future information needs will be. By replacing these assumptions with a learnt model, we show that such “modern KBs” are a very attractive alternative or complement to current approaches.