SCHEDULE

Monday, June 19 (second workshop day) 

Room: East 13


All times UTC-7 (PDT, Vancouver)

09:00 am Welcome Remarks: Timo Sämann

09:10 am Invited Talk*: Marcus Rohrbach

09:40 am Invited Talk*: Yuning Chai

10:10 am Long Orals** (IDs 1, 2, 3)

10:40 am Coffee Break

11:00 am Invited Talk*: Dengxin Dai

11:30 am Invited Talk*: Raquel Urtasun

12:00 pm Long Orals** (IDs 4, 5)

12:20 pm Lunch Break

01:30 pm Invited Talk*: Wojciech Samek

02:00 pm Short Orals*** (IDs 6–11)

02:35 pm Coffee Break

02:50 pm Poster Session (IDs 1–11)****

04:00 pm Invited Talk*: Björn Ommer

04:30 pm Best Paper Award (IDs 1–11)

05:00 pm Closing


*Invited Talks: 30 min including Q&A

**Paper Long Orals: 9 min (Q&A at the poster)

***Paper Short Orals: 4 min (Q&A at the poster)

****Poster Session: See accepted papers for more details

Marcus Rohrbach

Artificial Intelligence (AI) is becoming increasingly potent, but at the same time it faces growing concerns in society and the scientific community. The vision of my research is to make AI reliable and trustworthy. Reliable reasoning over multiple modalities is critical, as AI systems often make independent decisions when analyzing large amounts of multimodal data and are becoming omnipresent in assisting humans in their daily lives. In this talk, I will focus on our observation that current multimodal models are unable to answer questions with a low risk of error because they lack self-awareness, both in-distribution and especially in out-of-distribution (OOD) settings. I will show how to train a selector model and introduce our approach, Learning from Your Peers (LYP), which enables models to abstain successfully, i.e., to become more reliable. I will conclude by pointing out that the problem is far from solved and by outlining several other directions of our research toward reliable and safe multimodal AI.

Yuning Chai

Title: Proactive AV Testing via Adversarial Asset and Scenario Generation 

The standard testing approach within the autonomous vehicles (AV) industry has been firmly reactive: people experience issues on the road and fix them. However, as we expand to large-scale deployments of driverless operations across the globe, waiting for incidents to happen organically is no longer viable. In this talk, we will highlight work that leverages advances in adversarial generation to stress test our AV stack. We generate both new assets and poses for perception and new scenarios for behavior, and ensure that they expose potential weaknesses while remaining highly interpretable.

Dengxin Dai



Raquel Urtasun


Title: Next Generation Simulation for the Safe Development and Deployment of Self-Driving Technology 


Wojciech Samek

The emerging field of Explainable AI (XAI) aims to bring transparency to today's powerful but opaque deep learning models. This talk will present Concept Relevance Propagation (CRP), a next-generation XAI technique that explains individual predictions in terms of localized and human-understandable concepts. Unlike the related state of the art, CRP not only identifies the relevant input dimensions (e.g., pixels in an image) but also provides deep insights into the model's representation and reasoning process. This makes CRP a perfect tool for AI-supported knowledge discovery in the sciences. In the talk we will demonstrate, on multiple datasets, model architectures, and application domains, that CRP-based analyses allow one to (1) gain insight into the representation and composition of concepts in the model and quantitatively investigate their role in prediction, (2) identify and counteract Clever Hans filters that focus on spurious correlations in the data, and (3) analyze whole concept subspaces and their contributions to fine-grained decision making. By lifting XAI to the concept level, CRP opens up a new way to analyze, debug, and interact with ML models, which is of particular interest in safety-critical applications and the sciences.

Björn Ommer

Title: Democratizing Generative AI: Stable Diffusion & the Revolution in Visual Synthesis

Recently, deep generative modeling has become the most prominent paradigm for learning powerful representations of our (visual) world and for generating novel samples thereof. This talk will contrast the most commonly used generative models to date, with a particular focus on denoising diffusion probabilistic models. Despite their enormous potential, these models come with their own specific limitations. We will then discuss a solution, latent diffusion models (a.k.a. "Stable Diffusion"), that significantly improves the efficiency of diffusion models: billions of training samples can now be summarized in compact representations that render the approach feasible on consumer hardware. Making high-quality visual synthesis accessible to everyone has revolutionized the way we create visual content and has spurred research and the development of numerous novel applications. We will then discuss recent extensions that cast an interesting perspective on future generative modeling: rather than having powerful likelihood models memorize local image details, we focus their representational power on scene composition. Time permitting, the talk will also cover approaches to post-hoc interpretation of the learned neural representations.