📍 515 W Hastings St, Vancouver, BC V6B 4N6
right beside the Harbour Centre, Downtown Vancouver
(10-min walk from the Vancouver Convention Centre)
Registration is closed—see you on Jul. 14!
We are excited to introduce the Vancouver Vision & Learning Workshop @ ICML 2025! Our workshop is hosted jointly by Simon Fraser University (SFU), the University of British Columbia (UBC), and the Vector Institute, three leading institutions at the forefront of machine learning research. This workshop brings together researchers, students, and engineers to explore the latest developments in computer vision, machine learning, language, and more, with a focus on fostering collaboration and innovation across academia and industry.
Hosted during ICML 2025, our workshop reflects the vibrant learning ecosystem in Vancouver. SFU and UBC researchers have long contributed foundational work across areas such as visual computing, language modeling, and reinforcement learning, and this event will highlight these topics. Attendees can expect a program of invited talks from world-class speakers, cutting-edge research presentations, and opportunities for conversation and socializing over lunch and breaks. We welcome participants from the global ICML community to join us in Vancouver for this exciting addition to the conference week.
Masashi Sugiyama: "Can We Estimate the Bayes Error Accurately?"
The Bayes error represents the lowest possible error achievable by any classifier. Accurately estimating the Bayes error provides valuable insight into the intrinsic difficulty of a classification task—revealing whether there is room for performance improvement or whether the model is overfitting to the test data. In this talk, I will present an overview of our recent advances in Bayes error estimation. Our proposed methods estimate the Bayes error directly from soft labels, without relying on data instances, features, or classifiers.
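To make the soft-label idea concrete, here is a minimal sketch (an illustration under our own assumptions, not code from the talk): since the Bayes-optimal classifier predicts the most probable class, its expected error at a point x is 1 − max_y P(y|x). Given soft labels, i.e., class-posterior probabilities for each test point (e.g., aggregated from multiple annotators), averaging this quantity yields a plug-in estimate with no features or trained classifier involved. The function name below is hypothetical.

```python
import numpy as np

def bayes_error_from_soft_labels(soft_labels: np.ndarray) -> float:
    """Plug-in estimate of the Bayes error from soft labels (illustrative sketch).

    soft_labels: (n, k) array; row i holds P(y = class | x_i), e.g. collected
    from multiple human annotators. The Bayes-optimal classifier predicts the
    argmax class, so its expected error at x_i is 1 - max_y P(y | x_i);
    averaging over points gives the estimate.
    """
    soft_labels = np.asarray(soft_labels, dtype=float)
    per_point_error = 1.0 - soft_labels.max(axis=1)
    return float(per_point_error.mean())

# Toy usage: two mostly confident points and one ambiguous one.
probs = np.array([
    [0.95, 0.05],   # contributes 0.05
    [0.90, 0.10],   # contributes 0.10
    [0.55, 0.45],   # ambiguous point: contributes 0.45
])
print(bayes_error_from_soft_labels(probs))  # (0.05 + 0.10 + 0.45) / 3 = 0.2
```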
Jamie Shotton: "Frontiers in Embodied AI for Autonomous Driving"
Over the last decade, we’ve seen unprecedented progress in AI across many disciplines and applications. However, autonomous vehicles are still far from mainstream even after billions of dollars of investment. In this talk, we’ll explore what has been holding progress back, and how, by adopting a modern embodied AI approach to the problem, Wayve is finally unlocking the potential of scalable autonomous driving across the globe.
We’ll also explore some of our latest research in multimodal learning, which brings the power of large language models to the driving problem, and in controllable generative world models as learned simulators.
Arash Vahdat: "On the Limitations of Generative Diffusion Models"
Diffusion models have achieved remarkable success in generating high-quality and diverse outputs, supported by stable and scalable training procedures. However, when applied to real-world scenarios, they continue to exhibit key limitations. In this talk, I will examine several of these fundamental challenges, including slow sampling, inadequate modeling of distributional tails, and inefficiencies in training. I will also present recent advances aimed at mitigating these issues, focusing on techniques such as accelerated sampling via trajectory and distribution matching objectives, as well as improved diffusion processes designed specifically for video generation. The talk will conclude with a discussion of emerging requirements for generative models, particularly their ability to capture rare events and heavy-tailed data distributions.
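As a rough illustration of the slow-sampling limitation (a sketch under standard DDPM assumptions, not code from the talk): an ancestral sampler must call the denoising network once per timestep, strictly in sequence, so generating a single sample costs hundreds or thousands of forward passes. This sequential loop is precisely what distillation and trajectory/distribution-matching methods aim to collapse into a few steps. The function and the dummy model below are hypothetical.

```python
import numpy as np

def ddpm_sample(eps_model, shape, betas, rng):
    """Simplified ancestral DDPM sampling loop (variance-preserving).

    eps_model(x_t, t) -> predicted noise. Each of the T steps requires one
    sequential network call; this per-step dependency is the source of the
    slow sampling that accelerated samplers try to eliminate.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)            # start from pure noise x_T
    for t in reversed(range(len(betas))):     # T sequential denoising steps
        eps = eps_model(x, t)                 # one network forward pass
        # posterior mean of x_{t-1} given x_t and the predicted noise
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                             # no noise added at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Toy usage with a dummy "network": 1000 steps means 1000 forward passes.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)
sample = ddpm_sample(lambda x, t: np.zeros_like(x), shape=(4,), betas=betas, rng=rng)
print(sample)
```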
Alane Suhr: "The Role of Joint Embodiment in Situated Language-Based Interactions"
Large-scale pretraining has become the standard solution to automated reasoning over text and/or visual perception. But how far does this approach take us toward systems that generalize to language use in realistic multi-agent situated interactions? First, I will talk about existing work on evaluating the spatial and compositional reasoning capabilities of current multimodal language models. Then I will talk about how these benchmarks miss a key aspect of real-world situated interactions: joint embodiment. I will discuss how joint embodiment in a shared world supports perspective-taking, an often-overlooked aspect of situated reasoning, and introduce a new environment and benchmark for studying the influence of perspective-taking on language use in interaction. I will also describe experiments in learning from communicative success in jointly embodied interactions with human users.
Wuyang Chen (SFU), Evan Shelhamer (UBC + Vector Institute), Johannah Thumb (Vector Institute),
Martin Ester (SFU), Ke Li (SFU), Leonid Sigal (UBC + Vector Institute), Angel Chang (SFU + Amii)