Xingdou Fu
Omron
Title: Fast Active Sensing Framework for Picking Under Challenging Conditions
Abstract:
Bin picking under challenging conditions, such as metallic objects, clutter, and heavy occlusion, often requires multi-view and active sensing (e.g., selecting the next best view for fusion or for better estimation of affordances or grasps) to achieve robustness. Using multiple views, especially with a one-shot 3D sensor in a "sensor-on-hand" configuration, is gaining popularity; however, moving the 3D sensor to acquire those views makes the system slow. Starting from an industrial bin picking scenario, we designed a bin picking system that tightly couples a multi-view, active vision scheme with the robot's motion tasks. It speeds up the system by parallelizing the high-speed sensing scheme with the robot's place action, and it also selects the next sensing path to keep the whole picking process continuous. Many of the same challenges arise in other scenarios, such as picking in clutter for warehouses or daily life, especially when time is an important factor, and we will explore adapting the sensing scheme to these broader applications.
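To make the parallelization idea concrete, here is a minimal Python sketch of how next-view sensing can overlap with the robot's place motion; the robot, sensor, and planner interfaces are hypothetical placeholders and not the actual system described in the talk.

    # Illustrative sketch only: overlap sensing for the next pick with the
    # place motion of the current pick, using hypothetical interfaces.
    from concurrent.futures import ThreadPoolExecutor

    def sense_and_plan(sensor, planner):
        # Acquire the next view(s) along the chosen sensing path and plan a grasp.
        return planner.plan(sensor.capture())

    def picking_loop(robot, sensor, planner, n_cycles=10):
        grasp = sense_and_plan(sensor, planner)        # initial view and grasp
        with ThreadPoolExecutor(max_workers=1) as pool:
            for _ in range(n_cycles):
                robot.pick(grasp)
                # Sensing for the next pick runs while the arm executes its
                # place motion, so the cycle is not stalled by acquisition.
                next_grasp = pool.submit(sense_and_plan, sensor, planner)
                robot.place()
                grasp = next_grasp.result()

The single background worker reflects the idea that, with a sensor-on-hand setup, the next sensing path can be traversed while the arm is already moving to place the current part.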
Short bio:
Xingdou is the Chief Technology Specialist at the OMRON Innovation Center in Kyoto, Japan, where he leads R&D, strategic planning, and international collaborations in 3D sensing and robot vision. With over 11 years of experience at OMRON, he has focused on both basic and applied research in 3D sensors, active vision, pose estimation, and SLAM. His work has led to several patents and contributed to innovative products and prototypes across factory automation, healthcare, and driver monitoring systems. Xingdou obtained his Ph.D. in Computer Vision in 2009 from Huazhong University of Science and Technology, with joint training at the University of Illinois Urbana-Champaign. Before joining OMRON, he held various academic positions in Hong Kong SAR and Japan, focusing on areas such as gaze estimation and 3D endoscopic scanning.
Anh Nguyen
University of Liverpool
Title: Language-driven Grasping
Abstract:
Robotic manipulation in unstructured environments remains a core challenge, especially when driven by natural language instructions like “grab the red mug by the handle”. This talk explores recent progress in language-driven grasping, where robots learn to map verbal commands to physical interactions. The talk will highlight emerging datasets and models that integrate vision, language, and action, enabling more flexible and generalizable grasping. Key techniques include diffusion-based grasp prediction, contrastive learning, and mask-guided attention. Finally, open challenges and future directions toward more capable and communicative embodied agents will be discussed.
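As a concrete illustration of one ingredient named above, the sketch below shows a contrastive (InfoNCE-style) objective that pulls matched language and grasp embeddings together; the embedding models, dataset, and actual training setup are assumptions, not details taken from the talk.

    # Minimal contrastive-learning sketch between language and grasp embeddings.
    import numpy as np

    def contrastive_loss(text_emb, grasp_emb, temperature=0.07):
        # text_emb, grasp_emb: (N, D) L2-normalised embeddings; row i of each
        # matrix describes the same instruction/grasp pair.
        logits = text_emb @ grasp_emb.T / temperature             # (N, N) similarities
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))                       # pull matched pairs together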
Short bio:
Anh Nguyen is an Associate Professor and directs the Smart Robotic Lab at the Department of Computer Science, University of Liverpool. He received his Ph.D. from the Italian Institute of Technology (IIT), Italy. Previously, he worked at Imperial College London, Australian National University, Inria, and the University of Adelaide. He is an Associate Editor for IEEE Transactions on Medical Robotics and Bionics. His research goal is to develop methods that enable robots to see and act like humans.
Minas Liarokapis
University of Auckland
and Acumino
Title: Human to Robot Skill Transfer for Reliable, Dexterous In-Hand Manipulation: From Industrial Assembly to Home Robotics
Abstract:
In this talk, we explore the frontier of human-to-robot skill transfer for reliable, dexterous in-hand and bimanual manipulation, a key enabler for deploying intelligent robotic systems across domains—from industrial assembly lines to everyday home environments. While traditional automation has thrived in highly structured settings, enabling robots to perform complex tasks in unstructured and dynamic environments remains a significant challenge. At Acumino, we address this gap by leveraging immersive human-in-the-loop demonstrations, scalable skill learning, and AI-powered robot training pipelines that integrate vision, force feedback, and multimodal data. We present a generalizable framework that transforms human expertise into reusable robot policies, enabling robotic arms and humanoid platforms to acquire fine motor skills with minimal task-specific programming. Case studies include rapid deployment in manufacturing, material handling, and assistive service tasks in the home. The talk will highlight the opportunities and constraints in bridging human intuition with robotic precision, ultimately enabling a new generation of robots to operate safely, skillfully, and autonomously in real-world settings.
Short bio:
Minas Liarokapis is the CEO / CTO of Acumino (www.acumino.ai), an Honorary Associate Professor in the Department of Mechanical and Mechatronics Engineering at the University of Auckland, and Director of the New Dexterity group (www.newdexterity.org). Previously, he was a Lecturer / Senior Lecturer in the same Department and a Postdoctoral Associate in the GRAB Lab at Yale University. Minas has 15 years of experience working on robot grasping and dexterous manipulation, human-robot interaction, and human-machine interfaces.
He is the founder of the Open Bionics (www.openbionics.org) initiative and a co-founder of Open Robot Hardware (www.openrobothardware.org), Hand Corpus (www.handcorpus.org), and HumanDataCorpus (www.humandatacorpus.org). His research has received multiple prestigious awards at international conferences and competitions.
Luis Figueredo
University of Nottingham
Title: Beyond the Grip: Safety and Comfort in Physical Human-Robot Collaborative Grasping and Manipulation
Abstract:
Recent breakthroughs in robotics have significantly narrowed the gap between humans and robots. However, the close physical interaction required for effective collaborative grasping and manipulation remains largely underdeveloped, particularly when safety and comfort must be considered alongside multimodal communication. Achieving human-like collaboration demands seamless communication and a mutual understanding of each other's abilities, qualities essential for safe and effective teamwork. To enable this, robots must not only interpret and respond to natural human language, but also develop a nuanced understanding of human physical capabilities and expectations, safety, ergonomics, and a shared sense of embodiment. This requires integrating responsive, reactive, and safety-certified functionalities into AI-driven multimodal interaction through intuitive language interfaces, ergonomics, and biomechanics-informed interaction. In this talk, I will highlight the key elements of such a comprehensive, safe, and functional approach, the tools that can make humans and robots more comfortable around each other, and the role of benchmarks and competitions in catalysing progress in this field.
Short bio:
Dr. Luis Figueredo is an Assistant Professor at the University of Nottingham, UK. During his PhD, carried out at both the University of Brasilia, Brazil, and the Massachusetts Institute of Technology (MIT), he received multiple awards, including a Best Ph.D. Thesis award and awards for outstanding robot demos at major conferences such as IROS and ICAPS. His pioneering work on biomechanics-aware manipulation planning earned him the prestigious Marie Skłodowska-Curie Individual Fellowship at the University of Leeds, where he also developed open-source AI tools featured by the EU Innovation Radar. Dr. Figueredo has also led contributions to large-scale projects such as the Geriatronics Lighthouse Initiative at the Technical University of Munich (TUM), where he was named the first Associated Fellow of the Munich Institute of Robotics and Machine Intelligence (MIRMI). He was recently recognized as an IEEE ICRA New Generation Star at ICRA 2024. His interdisciplinary research spans physical human-robot interaction, multimodal interfaces grounded in manipulation constraints, cooperative robotics, and geometric methods for reactive and safe robot grasping and manipulation.
Xiang Li
Winning team
Picking in Clutter
Title: Towards Model-Free Universal Dexterous Grasping
Abstract:
This talk presents our championship-winning solution for the Picking from Clutter track of the Robotic Grasping and Manipulation Competition (RGMC) at ICRA 2025, which integrates perception, planning, and a custom-designed end-effector to address dense clutter. Our pipeline employs a hybrid gripper to handle diverse objects, dynamically switching tools based on object categories and spatial contexts. For perception, we combine YOLOv8 for robust detection of occluded objects with MobileSAM for real-time segmentation, enabling precise point cloud extraction. For grasp pose generation, we leverage AnyGrasp and SuctionNet end-to-end, while a heuristic decluttering strategy prioritizes occlusion-prone objects using size, depth, and mask metrics. This pipeline lays the foundation for our future work on model-free universal dexterous grasping.
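As a rough illustration of the kind of heuristic described above, the sketch below ranks segmented objects using size, depth, and mask metrics; the field names, weights, and scoring direction are illustrative assumptions rather than the team's actual implementation.

    # Hypothetical decluttering heuristic: score each detected object and
    # return a picking order, handling occlusion-prone items first.
    import numpy as np

    def declutter_order(objects, w_size=0.4, w_depth=0.4, w_fill=0.2):
        # objects: list of dicts with
        #   'mask'  : HxW boolean instance mask (e.g., from a segmenter)
        #   'depth' : HxW depth image in metres, aligned with the mask
        #   'box'   : (x1, y1, x2, y2) detection box in pixels
        scores = []
        for obj in objects:
            mask, depth = obj["mask"], obj["depth"]
            x1, y1, x2, y2 = obj["box"]
            size = mask.sum() / mask.size                         # normalised visible area
            fill = mask.sum() / max((x2 - x1) * (y2 - y1), 1)     # low fill -> likely occluded
            closeness = 1.0 / (1e-3 + float(depth[mask].mean()))  # items on top are closer
            scores.append(w_size * size + w_depth * closeness + w_fill * (1.0 - fill))
        return list(np.argsort(scores)[::-1])                     # highest priority first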
Short bio:
Xiang Li is an Associate Professor with the Department of Automation, Tsinghua University. His research interests include robotic dexterous manipulation and human-robot collaboration. He has served as an Associate Editor of IJRR since 2024 and of T-RO since 2025. He was a finalist for the IROS Best Application Paper Award in 2017 and the ICRA Best Medical Robotics Paper Award in 2024, and received an RA-L Outstanding Associate Editor award in 2025. He led the team that won first place in the in-hand manipulation track of the 2024 ICRA Robotic Grasping and Manipulation Competition, together with the Most Elegant Solution award across all tracks, as well as first place in the picking-from-clutter track of the 2025 competition.
Yik Lung Pang
Queen Mary University of London
Title: Stereo Hand-Object Reconstruction for Human-to-Robot Handover
Abstract:
Jointly estimating hand and object shape facilitates the grasping task in human-to-robot handovers. Relying on hand-crafted prior knowledge about the geometric structure of the object fails when generalising to unseen objects, and depth sensors fail to detect transparent objects such as drinking glasses. In this work, we propose a method for hand-object reconstruction that combines single-view reconstructions probabilistically to form a coherent stereo reconstruction. We learn 3D shape priors from a large synthetic hand-object dataset and use RGB inputs to better capture transparent objects. We show that our method reduces the object Chamfer distance compared to existing RGB-based hand-object reconstruction methods in both single-view and stereo settings. We process the reconstructed hand-object shape with a projection-based outlier removal step and use the output to guide a human-to-robot handover pipeline with wide-baseline stereo RGB cameras. Our hand-object reconstruction enables a robot to successfully receive a diverse range of household objects from a human.
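For readers unfamiliar with probabilistic view fusion, the snippet below shows a simplified version of the underlying idea: combining per-voxel occupancy probabilities from two views by summing log-odds, assuming the grids are already in a shared frame. The actual shape representation and learned priors used in this work are not shown.

    # Simplified probabilistic fusion of two single-view occupancy predictions.
    import numpy as np

    def fuse_views(p_left, p_right, eps=1e-6):
        # p_left, p_right: arrays of per-voxel occupancy probabilities in [0, 1],
        # already expressed in a common reference frame.
        def logit(p):
            p = np.clip(p, eps, 1.0 - eps)
            return np.log(p / (1.0 - p))
        fused = logit(p_left) + logit(p_right)      # independence assumption across views
        return 1.0 / (1.0 + np.exp(-fused))         # back to probabilities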
Short bio:
Yik Lung Pang is a Research Assistant at Queen Mary University of London. His research interests include human-robot interaction and computer vision for robotics. Previously, he was a PhD student at Queen Mary University of London, working with Dr. Changjae Oh and Prof. Andrea Cavallaro on hand-object reconstruction from images and human-to-robot handover. He received an MSc in Artificial Intelligence from Queen Mary University of London and a BSc in Physics from Imperial College London.
Xidan Zhang
Winning team
In-Hand Manipulation
Title: Manipulating Objects in Complex Scenarios: Competition-Scale Solutions for In-Hand Reconfiguration and Cluttered Picking
Abstract:
From reorienting objects mid-air to locating buried targets in cluttered bins, robots are now expected to handle tasks under real-world constraints with increasing autonomy and precision. In this talk, I present our team's solutions to two challenges from the 2025 Robotic Grasping and Manipulation Competition. In the In-Hand Manipulation track, we solve object translation via closed-loop trajectory optimization and handle rotation using RL-trained skill primitives transferred from simulation to the real world without fine-tuning. For the cluttered picking task, we design a multimodal soft gripper aimed at achieving robust performance across a wide range of object geometries. On the software side, our system combines instance-level perception with a rearrangement policy that leverages stored object positions and occlusion relationships to determine the necessary pre-grasp actions.
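To give a flavour of the closed-loop approach mentioned for the translation sub-task, here is a deliberately simplified Python sketch that repeatedly estimates the in-hand object position and commands a bounded corrective step; the pose tracker, gripper interface, and the team's actual trajectory optimiser are hypothetical placeholders.

    # Simplified closed-loop translation: measure, correct, repeat.
    import numpy as np

    def closed_loop_translate(get_object_pos, send_delta, target,
                              gain=0.5, step_limit=0.005, tol=1e-3, max_iters=200):
        # get_object_pos(): returns the current object position (3,) in metres
        # send_delta(d):    commands a small relative motion d (3,) to the hand
        for _ in range(max_iters):
            error = np.asarray(target, dtype=float) - np.asarray(get_object_pos(), dtype=float)
            if np.linalg.norm(error) < tol:
                return True                           # target reached within tolerance
            step = gain * error
            norm = np.linalg.norm(step)
            if norm > step_limit:
                step *= step_limit / norm             # clamp step size for safety
            send_delta(step)
        return False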
Short bio:
Xidan Zhang is a first-year Master's student in the GRASP Lab at the Department of Mechanical Engineering, Zhejiang University, supervised by Dr. Huixu Dong. She received her B.Eng. degree in Automation from Harbin Institute of Technology in June 2025. Her research interests lie in dexterous manipulation and reinforcement learning, with a particular focus on developing generalizable and robust manipulation policies for real-world deployment. In May 2025, she and her teammates won first place in the In-Hand Manipulation Track and second place in the Picking from Clutter Track of the Robotic Grasping and Manipulation Competition at ICRA 2025.