📍 515 W Hastings St, Vancouver, BC V6B 4N6
right beside the Harbour Centre, Downtown Vancouver
(10-min walk from the Vancouver Convention Centre)
Registration is closed—see you on Jul. 14!
We are excited to introduce the Vancouver Vision & Learning Workshop @ ICML 2025! Our workshop is hosted jointly by Simon Fraser University (SFU), the University of British Columbia (UBC), and the Vector Institute, three leading institutions at the forefront of machine learning research. This workshop brings together researchers, students, and engineers to explore the latest developments in computer vision, machine learning, language, and more, with a focus on fostering collaboration and innovation across academia and industry.
Hosted during ICML 2025, our workshop reflects the vibrant learning ecosystem in Vancouver. SFU and UBC researchers have long contributed foundational work across areas such as visual computing, language modeling, and reinforcement learning, and this event will highlight these topics. Attendees can expect a program of invited talks from world-class speakers, cutting-edge research presentations, and opportunities for conversation and socializing over lunch and breaks. We welcome participants from the global ICML community to join us in Vancouver for this exciting addition to the conference week.
Masashi Sugiyama: "Can We Estimate the Bayes Error Accurately?"
The Bayes error represents the lowest possible error achievable by any classifier. Accurately estimating the Bayes error provides valuable insight into the intrinsic difficulty of a classification task—revealing whether there is room for performance improvement or whether the model is overfitting to the test data. In this talk, I will present an overview of our recent advances in Bayes error estimation. Our proposed methods estimate the Bayes error directly from soft labels, without relying on data instances, features, or classifiers.
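To make the soft-label idea concrete, here is a minimal sketch (an illustration under our own assumptions, not code from the talk): since the Bayes-optimal classifier predicts the most probable class, its expected error at a point x is 1 − max_y P(y|x). Given soft labels, i.e., class-posterior probabilities for each test point (e.g., aggregated from multiple annotators), averaging this quantity yields a plug-in estimate with no features or trained classifier involved. The function name below is hypothetical.

```python
import numpy as np

def bayes_error_from_soft_labels(soft_labels: np.ndarray) -> float:
    """Plug-in estimate of the Bayes error from soft labels (illustrative sketch).

    soft_labels: (n, k) array; row i holds P(y = class | x_i), e.g. collected
    from multiple human annotators. The Bayes-optimal classifier predicts the
    argmax class, so its expected error at x_i is 1 - max_y P(y | x_i);
    averaging over points gives the estimate.
    """
    soft_labels = np.asarray(soft_labels, dtype=float)
    per_point_error = 1.0 - soft_labels.max(axis=1)
    return float(per_point_error.mean())

# Toy usage: two mostly confident points and one ambiguous one.
probs = np.array([
    [0.95, 0.05],   # contributes 0.05
    [0.90, 0.10],   # contributes 0.10
    [0.55, 0.45],   # ambiguous point: contributes 0.45
])
print(bayes_error_from_soft_labels(probs))  # (0.05 + 0.10 + 0.45) / 3 = 0.2
```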
Jamie Shotton: "Frontiers in Embodied AI for Autonomous Driving"
Over the last decade, we’ve seen unprecedented progress in AI across many disciplines and applications. However, autonomous vehicles are still far from mainstream even after billions of dollars of investment. In this talk, we’ll explore what has been holding progress back, and how, by adopting a modern embodied AI approach to the problem, Wayve is finally unlocking the potential of scalable autonomous driving across the globe.
We’ll also explore some of our latest research in multimodal learning, which brings the power of large language models to the driving problem, and in controllable generative world models as learned simulators.
Arash Vahdat: "On the Limitations of Generative Diffusion Models"
Diffusion models have achieved remarkable success in generating high-quality and diverse outputs, supported by stable and scalable training procedures. However, when applied to real-world scenarios, they continue to exhibit key limitations. In this talk, I will examine several of these fundamental challenges, including slow sampling, inadequate modeling of distributional tails, and inefficiencies in training. I will also present recent advances aimed at mitigating these issues, focusing on techniques such as accelerated sampling via trajectory and distribution matching objectives, as well as improved diffusion processes designed specifically for video generation. The talk will conclude with a discussion of emerging requirements for generative models, particularly their ability to capture rare events and heavy-tailed data distributions.
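As a rough illustration of the slow-sampling limitation (a sketch under standard DDPM assumptions, not code from the talk): an ancestral sampler must call the denoising network once per timestep, strictly in sequence, so generating a single sample costs hundreds or thousands of forward passes. This sequential loop is precisely what distillation and trajectory/distribution-matching methods aim to collapse into a few steps. The function and the dummy model below are hypothetical.

```python
import numpy as np

def ddpm_sample(eps_model, shape, betas, rng):
    """Simplified ancestral DDPM sampling loop (variance-preserving).

    eps_model(x_t, t) -> predicted noise. Each of the T steps requires one
    sequential network call; this per-step dependency is the source of the
    slow sampling that accelerated samplers try to eliminate.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)            # start from pure noise x_T
    for t in reversed(range(len(betas))):     # T sequential denoising steps
        eps = eps_model(x, t)                 # one network forward pass
        # posterior mean of x_{t-1} given x_t and the predicted noise
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                             # no noise added at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Toy usage with a dummy "network": 1000 steps means 1000 forward passes.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)
sample = ddpm_sample(lambda x, t: np.zeros_like(x), shape=(4,), betas=betas, rng=rng)
print(sample)
```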
Alane Suhr: "The Role of Joint Embodiment in Situated Language-Based Interactions"
Large-scale pretraining has become the standard solution to automated reasoning over text and/or visual perception. But how far does this approach take us toward systems that generalize to language use in realistic multi-agent situated interactions? First, I will talk about existing work on evaluating the spatial and compositional reasoning capabilities of current multimodal language models. Then I will talk about how these benchmarks miss a key aspect of real-world situated interactions: joint embodiment. I will discuss how joint embodiment in a shared world supports perspective-taking, an often-overlooked aspect of situated reasoning, and introduce a new environment and benchmark for studying the influence of perspective-taking on language use in interaction. I will also describe experiments in learning from communicative success in jointly embodied interactions with human users.
Wuyang Chen (SFU), Evan Shelhamer (UBC + Vector Institute), Johannah Thumb (Vector Institute),
Martin Ester (SFU), Ke Li (SFU), Leonid Sigal (UBC + Vector Institute), Angel Chang (SFU + Amii)