The First Workshop on Structure & Generalization in Multimodal Language Understanding (SAGE-MLU)
The preliminary program features five keynote speakers, an interactive panel discussion, and a hackathon.
Location on VU Campus:
Monday 10 March: Alma 1 & 2 (in the OZW building)
Tuesday 11 March: NU-5A65 (in the NU building)
Monday 10 March
11:00-11:30 Welcome & Introduction to Workshop
11:30-12:00 Keynote Speaker 1: Juri Opitz
12:00-12:30 Keynote Speaker 2: Najoung Kim
12:30-13:45 Lunch: Roundtable discussions
14:00-14:30 Keynote Speaker 3: Adam Dahlgren Lindström
14:30-15:00 Keynote Speaker 4: Martha Lewis
15:00-15:15 Break
15:15-16:00 Panel Discussion:
Agnes Axelsson, Marius Dorobantu,
Moderated by Jonas Groschwitz
16:00-16:30 Keynote Speaker 5: Kanishka Misra
16:30-17:30 Break
17:30 Dinner (De Veranda)
Tuesday 11 March
9:30-10:00 Walk-in with coffee
10:00-12:30 Hackathon part 1
12:30-13:30 Lunch
13:30-15:30 Hackathon part 2
15:30-16:00 Break
16:00-17:00 Hackathon presentations & discussion
17:00-17:30 Closing & Farewell
Optional evening dinner/social event
Stay tuned for updates!
Keynote Talks & Abstracts:
Juri Opitz, "Structures of Text, Image ... and Radio! Experiences from the Impresso Project"
Najoung Kim, "Evaluating Multimodal Critic Models for Human-AI Co-Creation"
Adam Dahlgren Lindström, "Consistency is Key: Reasoning and Multimodality"
Martha Lewis, "Compositional Approaches to Modelling Language and Concepts"
Recent neural approaches to modelling language and concepts have proven quite effective, with a proliferation of large models trained on correspondingly massive datasets. However, these models still fail on some tasks that humans and symbolic approaches can easily solve. Large neural models are also, to a certain extent, black boxes, particularly those that are proprietary. There is therefore a need to integrate compositional and neural approaches, first to improve the performance of large neural models, and second to analyze and explain the representations these systems use. In this talk I will present results showing that large neural models can fail at tasks that humans are able to do, and discuss alternative, theory-based approaches that have the potential to perform more strongly. I will give applications in language, reasoning, and vision. Finally, I will present some future directions in understanding the types of reasoning or symbol manipulation that large neural models may be performing.
Kanishka Misra, "Similarity and Category Membership in the Property Inferences of Language Models"