Speakers

Anna Ivanova

MIT

Talk title: Dissociating language and thought in large language models

Abstract: Today’s large language models (LLMs) routinely generate coherent, grammatical, and seemingly meaningful paragraphs of text. This achievement has led to speculation that LLMs have become “thinking machines”, capable of performing tasks that require reasoning and/or world knowledge. In this talk, I will introduce a distinction between formal competence—knowledge of linguistic rules and patterns—and functional competence—understanding and using language in the world. This distinction is grounded in human neuroscience, which shows that formal and functional competence recruit different cognitive mechanisms. I will show that the word-in-context prediction objective has allowed LLMs to essentially master formal linguistic competence; however, pretrained LLMs still lag behind on many aspects of functional linguistic competence, prompting engineers to adopt specialized fine-tuning techniques and/or couple an LLM with external modules. I will illustrate the formal-functional distinction using the domains of English grammar and arithmetic, respectively. I will then turn to generalized world knowledge, a domain where this distinction is much less clear-cut, and discuss our efforts to leverage both cognitive science and NLP to develop systematic ways to probe generalized world knowledge in text-based LLMs. Overall, the formal/functional competence framework clarifies the discourse around LLMs, helps develop targeted evaluations of their capabilities, and suggests ways for developing better models of real-life language use.

Bio: Anna (Anya) Ivanova is a Postdoctoral Associate at MIT Quest for Intelligence and an incoming Assistant Professor at Georgia Tech Psychology (starting Jan 2024). She has a PhD from MIT’s Department of Brain and Cognitive Sciences, where she studied the neural mechanisms underlying language processing in humans. Today, Anya is examining the language-thought relationship not only in human brains, but also in large language models, using her cognitive science training to identify similarities and differences between people and machines. 


David Krueger

University of Cambridge

Talk title: Limitations of Fine-Tuning for Aligning LLMs

Bio: David Krueger is a research director at the UK Frontier AI Task Force and an Assistant Professor of Machine Learning at the University of Cambridge.  His work focuses on reducing the risk of human extinction from artificial intelligence (AI x-risk) through technical research as well as education, outreach, governance, and advocacy.  His research spans many areas of Deep Learning, AI Alignment, AI Safety, and AI Ethics, including alignment failure modes, algorithmic manipulation, interpretability, robustness, and understanding how AI systems learn and generalize.  He has been featured in media outlets including ITV's Good Morning Britain, Al Jazeera's Inside Story, France 24, New Scientist, and the Associated Press.  David completed his graduate studies at the University of Montreal and Mila, working with Yoshua Bengio, Roland Memisevic, and Aaron Courville, and he is a research affiliate of Mila, UC Berkeley's Center for Human-Compatible AI (CHAI), and the Center for the Study of Existential Risk (CSER) at the University of Cambridge.


Vianney Perchet

ENSAE Paris

Talk title: Active and Online Learning with Large (and Combinatorial) Models

Abstract: Active learning consists of sequentially and adaptively constructing a dataset in the hope of improving learning speed, avoiding uninformative data points where the current model is already correct with high probability and focusing instead on regions of uncertainty. During this talk, I will give a short reminder of the potential benefits and pitfalls of active learning, especially in large and combinatorial models.
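As a concrete illustration of the uncertainty-focused querying described in the abstract, here is a minimal pool-based active learning sketch. The synthetic data, the scikit-learn logistic regression, the labelling budget, and the "query the least confident point" rule are all illustrative assumptions, not material from the talk.

```python
# Minimal sketch of pool-based active learning with uncertainty sampling.
# All modelling choices (synthetic 2-D data, logistic regression, a budget
# of 20 queries) are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical unlabelled pool: the true label is the sign of the first coordinate.
X_pool = rng.normal(size=(1000, 2))
y_pool = (X_pool[:, 0] > 0).astype(int)

# Seed with a few labelled points from each class.
labelled = list(np.where(y_pool == 1)[0][:5]) + list(np.where(y_pool == 0)[0][:5])
unlabelled = [i for i in range(len(X_pool)) if i not in labelled]

model = LogisticRegression()
for _ in range(20):  # labelling budget
    model.fit(X_pool[labelled], y_pool[labelled])
    # Skip points the current model already classifies confidently;
    # query the point whose predicted probability is closest to 0.5.
    probs = model.predict_proba(X_pool[unlabelled])[:, 1]
    query = unlabelled[int(np.argmin(np.abs(probs - 0.5)))]
    labelled.append(query)
    unlabelled.remove(query)

print("accuracy on the full pool:", model.score(X_pool, y_pool))
```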

Bio: Vianney Perchet is a professor at the Center for Research in Economics and Statistics (CREST) at ENSAE, a position he has held since October 2019. Mainly focusing on the interplay between machine learning and game theory, his research lies at the junction of mathematics, computer science, and economics. His interests range from pure theory (say, optimal rates of convergence of algorithms) to pure applications (modeling user behavior, optimizing recommender systems, etc.). He is also a part-time principal researcher at the Criteo AI Lab in Paris, working on efficient exploration in recommender systems.

Aaron Schein

University of Chicago

Talk Title: Measurement in the Age of LLMs: An Application to Political Ideology Scaling

Abstract: Much of social science is centered around terms like “ideology” or “power”, which generally elude precise definition, and whose contextual meanings are trapped in surrounding language. This talk explores the use of large language models (LLMs) to flexibly navigate the conceptual clutter inherent to social scientific measurement tasks. We rely on LLMs’ remarkable linguistic fluency to elicit ideological scales of both legislators and text, which accord closely with established methods and our own judgement. A key aspect of our approach is that we elicit such scores directly, instructing the LLM to furnish numeric scores itself. This approach is methodologically “dumb” and shouldn’t “work” according to classical principles of measurement. We nevertheless find surprisingly compelling results, which we showcase through a variety of case studies.
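To make the direct-elicitation idea concrete, below is a hypothetical sketch of prompting a language model for a numeric ideology score. The query_llm helper, the prompt wording, and the -1 to +1 scale are assumptions for illustration only; they are not the authors' actual protocol.

```python
# Hypothetical sketch of eliciting a numeric ideology score directly from an LLM.
# `query_llm` is a stand-in: it returns a canned reply so the sketch runs;
# replace it with a call to whatever LLM API you actually use.
def query_llm(prompt: str) -> str:
    return "0.3"  # placeholder response

def ideology_score(name: str) -> float:
    prompt = (
        f"On a scale from -1 (most liberal) to +1 (most conservative), "
        f"where would you place {name}? Answer with a single number only."
    )
    return float(query_llm(prompt))

# Example usage (scores become meaningful only with a real LLM behind query_llm):
print(ideology_score("Senator Jane Doe"))
```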

Bio: Aaron Schein is an Assistant Professor at the University of Chicago in the Department of Statistics and the Data Science Institute. His research develops methodology in Bayesian statistics, machine learning, and applied causal inference for incorporating modern large-scale data into the social sciences. Prior to joining the University of Chicago, Aaron was a postdoctoral fellow in the Data Science Institute at Columbia University, where he worked with David Blei and Donald Green on digital field experiments to assess the causal effects of friend-to-friend organizing on voter turnout in US elections. Aaron received his PhD in Computer Science in 2019 from UMass Amherst under the guidance of Hanna Wallach. His dissertation developed tensor factorization and dynamical systems models for analyzing large-scale dyadic data of country-to-country interactions. During his PhD, Aaron interned at Microsoft Research, Google, and the MITRE Corporation. Prior to that, he earned his MA in Linguistics and BA in Political Science from UMass Amherst. He is on Twitter @AaronSchein.

Wilfried Wöber

UAS Technikum Wien

Talk Title: Machine learning and morphology: Opportunities and challenges

Abstract: Morphology in evolutionary biology is used to quantify visible characteristics of specimens, a crucial aspect of addressing the biodiversity crisis. To investigate anthropogenic impacts, researchers have constructed extensive image databases. These databases make the field well suited to the integration of machine learning. However, traditional methods used in morphometrics are grounded in diagnostic structures proposed by biologists. In contrast, machine learning approaches autonomously extract features without explicit biological motivation.

This talk focuses on the potential misunderstandings that can arise when applying machine learning in morphometrics. Specifically, it addresses the biological interpretation of machine learning models, exploring instances where models achieve high accuracy yet resist coherent biological interpretation. The presentation showcases experiments that highlight the tension between excellent quantitative results and a frequent lack of biological interpretability.

Bio: Wilfried Wöber earned his master’s degree in robotics in 2013. Over the next four years, he worked as a data scientist in the field of agricultural robotics, focusing on machine learning-based applications. In 2017, Wilfried joined the Faculty of Industrial Engineering at the University of Applied Sciences Technikum Wien, where he established a mobile and service robotics lab.

In 2019, driven by a passion for data analysis in biology, Wilfried began a Ph.D., focusing specifically on morphometrics. Throughout this research, he has encountered various unexpected behaviors of machine learning models.

Panelists


David Krueger

University of Cambridge

Tatiana Likhomanenko

Apple Inc.

Christoph Lampert

ISTA

Aaron Schein

University of Chicago

Moderated by Naomi Saphra.