Abstract: In this talk, I’ll cover our prior work on capturing and annotating datasets of unscripted food prep, including annotating causality, recipes, ingredients, interactions, kitchen fixtures and digital twins. I will also discuss our recent work on (1) generating steps in a recipe from an initial image and step descriptions, (2) predicting collaborations in food prep and (3) detecting hand-object interactions from in-the-wild images.
Dr. Dima Damen is a Professor of Computer Vision at the University of Bristol and Senior Research Scientist at Google DeepMind. Dima is currently an EPSRC Fellow (2020-2026), focusing her research on the automatic understanding of object interactions, actions and activities using wearable visual (and depth) sensors. She is best known for her leading work in Egocentric Vision, and has also contributed to novel research questions including mono-to-3D, video object segmentation, assessing action completion, domain adaptation, skill/expertise determination from video sequences, discovering task-relevant objects, dual-domain and dual-time learning as well as multi-modal fusion using vision, audio and language. She is the project lead for EPIC-KITCHENS, the seminal dataset in egocentric vision, with accompanying open challenges and follow-up works: EPIC-Sounds, VISOR and EPIC Fields, as well as the recent HD-EPIC. She is part of the large-scale consortium efforts Ego4D and Ego-Exo4D. She is an ELLIS Fellow, associate editor (AE) of IJCV, and was a program chair for ICCV 2021 and Associate Editor-in-Chief (AEIC) of IEEE TPAMI (2023-2025). She is frequently a Senior Area Chair and Area Chair at major conferences and was selected as Outstanding SAC at ECCV 2024 and Outstanding Reviewer at CVPR 2021, CVPR 2020, ICCV 2017, CVPR 2013 and CVPR 2012. Dima received her PhD from the University of Leeds (2009) and joined the University of Bristol as a Postdoctoral Researcher (2010-2012), then served as Assistant Professor (2013-2018) and Associate Professor (2018-2021), and was appointed as chair in August 2021. She supervises 9 PhD students and 3 Visiting PhD students.
Title: From Kitchen to Dining: Learning to Perceive and Manipulate Food Through Interaction
Abstract: Food is one of the most challenging objects for embodied AI. Unlike rigid objects, foods vary widely in shape, texture, compliance, and physical behavior, and these properties often change over time, making successful manipulation difficult to achieve through vision alone. In this talk, I will present research from the EmPRISE Lab on learning to perceive and manipulate food through interaction, with examples spanning robotic bite acquisition, assistive feeding, and meal preparation. Through these projects, I will discuss how robots can leverage multimodal perception and interaction to reason about the physical properties of food and enable more robust and adaptive food manipulation.
Dr. Tapomayukh Bhattacharjee is an Assistant Professor in the Department of Computer Science at Cornell University, where he directs the EmPRISE Lab. He completed his Ph.D. in Robotics at the Georgia Institute of Technology and was an NIH Ruth L. Kirschstein NRSA postdoctoral research associate in Computer Science & Engineering at the University of Washington. He wants to enable robots to assist people with mobility limitations with activities of daily living. His work spans the fields of human-robot interaction, haptic perception, and robot manipulation and focuses on the fundamental research question of how to leverage robot-world physical interactions in unstructured human environments to perform relevant activities of daily living. He is the recipient of the TRI Young Faculty Researcher Award’24 and the NSF CAREER Award’23, and his work has won Best Paper Award Finalist at HRI’24, Best Demo Award at HRI’24, Best RoboCup Paper Award at IROS’22, Best Paper Award Finalist and Best Student Paper Award Finalist at IROS’22, Best Technical Advances Paper Award at HRI’19, and Best Demonstration Award at NeurIPS’18. His work has also been featured in many media outlets including the BBC, Reuters, New York Times, IEEE Spectrum, and GeekWire, and his robot-assisted feeding work was selected as one of the best interactive designs of 2019 by Fast Company.