Simon Fraser University @ ICML 2025

9am-noon, July 21st, 2025

Segal Conference Rooms 1420-1430 (@1st floor)

📍🔗515 W Hastings St, Vancouver, BC V6B 4N6

[right beside the Harbour Centre, Downtown Vancouver]

(10-min walk from Vancouver Convention Centre)

Researchers across Simon Fraser University (SFU) are dedicated to advancing the field of machine learning (ML), spanning from innovative applications to foundational theory. SFU’s work includes groundbreaking research in areas such as visual computing, language processing, reinforcement learning, and beyond. By publishing extensively, collaborating across disciplines, and engaging with the global ML community, SFU aims to foster a rich, collaborative ecosystem that pushes the boundaries of what's possible in ML.

SFU VINCI Institute (Visual and INteractive Computing): https://vinci.sfu.ca/

Machine Learning Team @ SFU: https://ml.cs.sfu.ca/

GrUVi (Graphics U Vision) Team @ SFU: https://gruvi.cs.sfu.ca/

Continued from past years, SFU is proud to host a parallel event along with ICML. This special SFU event will feature invited talks and social events designed to complement the ICML experience. We warmly invite ICML attendees to join us for these sessions, connecting with our speakers and researchers, and engaging in thought-provoking discussions about the latest advances in AI/ML.

Speakers (A-Z by Last Name)

Bernadette Bucher

University of Michigan

Yunzhu Li

Columbia University

Ludwig Schimdt

Stanford University, Anthropic

Max Simchowitz

CMU

Schedule

8:50-9:0 am Pacific

Opening Remarks

9:00-9:30 am Pacific

Slides: Link

Max Simchowitz: "Why AI is harder in the physical world, and what to do about it."

AI has seen tremendous advances in discrete and symbolic reasoning. And yet, reliable AI in the physical world - from robotics, to autonomous driving, to accurate weather prediction to fully automated smart grid - still pose major challenges.

Beyond pragmatic challenges like the availability of data and the costs of operating on physical hardware, this talk will argue that there are unique and fundamental challenges when getting machine learning methods to work on physical systems. Using robotic imitation learning as a didactic example, we will demonstrate mathematical results that reveal error propagation through dynamic environments can lead to exponentially greater data requirements than what we might expect in discrete domains.

On the positive side, we show how certain remedies, like the hitherto-mysterious practice of “action-chunking,” partially surmount these difficulties, and that a mixture of “imperfect” and “expert” data is strictly better than expert data alone. Yet, despite these solutions, we will conclude by describing some preliminary work suggesting that there are still major empirical limitations of the state of art methods in continuous control, and will outline some of the most promising directions to overcoming them.

Relevant Work:

“The Pitfalls of Imitation Learning when Actions are Continuous, ” Max Simchowitz, Daniel Pfrommer, Ali Jadbabaie, COLT 2025. ArXiv: https://arxiv.org/abs/2503.09722

The Pitfalls of Imitation Learning when Actions are Continuous;” Thomas Zhang, Daniel Pfrommer, Nikolai Matni, Max Simchowitz. ArXiv: https://arxiv.org/abs/2503.09722

9:40-10:10 am Pacific

Slides: Link

Bernadette Bucher: "Building Visual Representations with Foundation Models for Mobile Manipulation"

Rapid improvements over the past few years in computer vision have enabled high performing geometric state estimation on moving camera systems in day-to-day environments. Furthermore, recent substantial improvements in language understanding and vision-language grounding have enabled rapid advancements in semantic scene understanding. In this presentation, I will demonstrate how we can build visual representations from these foundational vision-language models to enable new robotic capabilities in navigation, manipulation, and mobile manipulation. I will also discuss new robotics research directions opened up by these advancements in vision-language understanding and alternative directions to building foundation models for robotics.

10:10-10:50 am Pacific

Poster Session

10:50-11:20 am Pacific

Yunzhu Li: "Learning Structured World Models From and For Physical Interactions "

Humans have a strong intuitive understanding of the physical world. Through observations and interactions with the environment, we build a mental model that predicts how the world would change if we applied a specific action (i.e., intuitive physics). My research draws on insights from humans and develops model-based reinforcement learning (RL) agents that learn from their interactions and build predictive models that generalize widely across a range of objects made with different materials. The core idea behind my work is to introduce novel representations and incorporate structural priors (e.g., particle- and graph-based neural dynamics models) into learning systems to better model the dynamics of diverse deformable objects. I will discuss how such structures can make model-based planning algorithms more effective and help robots accomplish complicated manipulation tasks (e.g., manipulating an object pile, shaping deformable foam into a target configuration, and making a dumpling from the dough using various tools). Furthermore, I will demonstrate our recent progress in constructing neural and physics-informed digital twins that jointly capture appearance, geometry, and dynamics, with significant implications for data generation and evaluation in the development of robot learning systems.

11:30am-noon Pacific

Ludwig Schmidt: "DataComp-LM: In Search of the Next Generation of Language Model Training Sets"

Curating high-quality data is key for training language models. In this talk, I will present DataComp-LM (DCLM), a large collaboration to build a fully open state-of-the-art pre-training dataset for language models. We will go over each step in the data curation pipeline and highlight the most important parts. For training 7B parameter language models, our DCLM-Baseline training set performs similar to Llama 3 8B on an average of 53 natural language understanding tasks while being trained with 6x less compute.

noon-12:10 pm Pacific

Closing Remarks

Organizers

Wuyang Chen Angel Chang Ke Li Manolis Savva Martin Ester Oliver Schulte

Page updated

Google Sites

Report abuse