The First Workshop on Structure & Generalization in Multimodal Language Understanding (SAGE-MLU)
The preliminary program features five keynote speakers, an interactive panel discussion, and a hackathon.
Location on VU Campus:
Monday 10 March: Alma 1 & 2 (in the OZW building)
Tuesday 11 March: NU-5A65 (in the NU building)
Monday 10 March
11:00-11:30 Welcome & Introduction to Workshop
11:30-12:00 Keynote Speaker 1: Juri Opitz
12:00-12:30 Keynote Speaker 2: Najoung Kim
12:30-13:45 Lunch: Roundtable discussions
14:00-14:30 Keynote Speaker 3: Adam Dahlgren Lindström
14:30-15:00 Keynote Speaker 4: Martha Lewis
15:00-15:15 Break
15:15-16:00 Panel Discussion:
Agnes Axelsson, Marius Dorobantu,
Moderated by Jonas Groschwitz
16:00-16:30 Keynote Speaker 5: Kanishka Misra
16:30-17:30 Break
17:30 Dinner (De Veranda)
Tuesday 11 March
9:30-10:00 Walk-in with coffee
10:00-12:30 Hackathon part 1
12:30-13:30 Lunch
13:30-15:30 Hackathon part 2
15:30-16:00 Break
16:00-17:00 Hackathon presentations & discussion
17:00-17:30 Closing & Farewell
Optional evening dinner/social event
Stay tuned for updates!
Keynote Talks & Abstracts:
Juri Opitz, "Structures of Text, Image ... and Radio! Experiences from the Impresso Project"
Najoung Kim, "Evaluating Multimodal Critic Models for Human-AI Co-Creation"
Adam Dahlgren Lindström, "Consistency is Key: Reasoning and Multimodality"
Martha Lewis, "Compositional Approaches to Modelling Language and Concepts"
Recent neural approaches to modelling language and concepts have proven quite effective, with a proliferation of large models trained on correspondingly massive datasets. However, these models still fail on some tasks that humans and symbolic approaches can easily solve. Large neural models are also, to a certain extent, black boxes, particularly those that are proprietary. There is therefore a need to integrate compositional and neural approaches, first to improve the performance of large neural models, and second to analyze and explain the representations these systems use. In this talk I will present results showing that large neural models can fail at tasks that humans are able to do, and discuss alternative, theory-based approaches that have the potential to perform more strongly. I will give applications in language, reasoning, and vision. Finally, I will present some future directions in understanding the types of reasoning or symbol manipulation that large neural models may be performing.
Kanishka Misra, "Similarity and Category Membership in the Property Inferences of Language Models"