Abu Dhabi-AIRoC2025

Speakers

Humanoid robotics in the era of foundation models

Abstract: Recent advances in foundation models have shown great promise in imitating teleoperation demonstrations for complex manipulation tasks. These models are built on a vision-language model (VLM) pre-trained on large-scale internet (non-robot) data and are connected to an action module that maps the output of the VLM to robot actions. While very successful, current methods mostly focus on static manipulation problems and fall short in providing a scalable path towards general-purpose humanoid loco-manipulation, mainly because it is impractical to generate a large amount of tele-operation demonstrations for humanoid robots. In my talk, I outline three main developments that, when combined, provide a scalable framework for generating large-scale data required for training humanoid VLAs. The first component is a general optimization-based task and motion planning (TAMP) framework that generates diverse strategies for achieving different tasks. To enable fast and efficient search, second component uses pre-trained VLMs to provide various manipulation sequence proposals (subgoals) given a desired task and environment. These two components together generate a detailed interaction graph between the robot and the environment. Finally, the third component is a generalist RL policy trained to realize any given desired interaction sequence and enable robust sim-to-real execution of the generated behaviours on the real robot.

Bio: Majid Khadiv is an assistant professor in the school of Computation, Information and Technology (CIT) at TUM. He leads the chair of AI Planning in Dynamic Environments and is also a member of the Munich Institute of Robotics and Machine Intelligence (MIRMI). Prior to joining TUM, he was a research scientist at the Empirical Inference Department at the Max Planck Institute for Intelligent systems. Before that he was a postdoctoral researcher in the Machines in Motion, a joint laboratory between New York University and Max Planck Institute. Since the start of his PhD in 2012, he has been performing research on motion planning, control and learning for legged robots ranging from quadrupeds, lower-limb exoskeleton up to humanoid robots.

Majid Khadiv

Professor

TUM

Sylvain Calinon

Senior Research Scientist

EPFL

Frugal learning of robot manipulation skills

Abstract: Despite significant advances in AI, robots still struggle with tasks involving physical interaction. Robots can easily beat humans at board games such as chess or Go but struggle to skillfully move the game pieces by themselves (the part of the task that humans subconsciously succeed in). Learning manipulation skills is both hard and fascinating because the movements and behaviors to acquire are tightly connected to our physical world and to embodied forms of intelligence. I will present an overview of representations and learning approaches to help robots acquire manipulation skills by imitation and self-refinement. I will present the advantages of targeting a frugal learning approach, where the term "frugality" has two goals: 1) learning manipulation skills from only few demonstrations or exploration trials; and 2) learning only the components of the skill that really need to be learned. Toward this goal, I will emphasize the roles of geometry, manifolds, implicit shape representations and distance fields as inductive biases to facilitate human-guided manipulation skill acquisition. I will also discuss how ergodic control can provide a mathematical framework to generate exploration and coverage movement behaviors, which can be exploited by robots as a way to cope with uncertainty in sensing, proprioception and motor control.

Bio: Dr Sylvain Calinon is a Senior Research Scientist at the Idiap Research Institute and a Lecturer at the Ecole Polytechnique Fédérale de Lausanne (EPFL). He heads the Robot Learning & Interaction group at Idiap, with expertise in human-robot collaboration, robot learning from demonstration, geometric representations and optimal control. The approaches developed in his group can be applied to a wide range of applications requiring manipulation skills, with robots that are either close to us (assistive and industrial robots), parts of us (prosthetics and exoskeletons), or far away from us (shared control and teleoperation

Toward Human-Humanoid Cooperative Mobility: Control and Physical Intelligence

Abstract: Humanoid robots are increasingly envisioned as capable partners for physical assistance in everyday human mobility tasks. Among these, the sit-to-stand-to-sit transition remains a particularly challenging whole-body interaction problem, requiring the robot to coordinate contact forces, balance, motion generation, and human-intention understanding in real time. I will present the key algorithmic and control ingredients that enable a humanoid robot to physically assist a human, e.g., in performing sit-to-stand-to-sit motions. We detail the integration of integrating a human mode as part of the whole-body prioritized control, compliant interaction through momentum and force regulation, estimation of human states and intentions, and the modulation of support forces during critical phases of the movement. We also discuss how these components collectively create “physical intelligence” at the human-humanoid interface, enabling adaptive and safe assistance without predefined motion scripts. I will conclude with insights on generalizing these principles toward broader human mobility assistance, shared balance control, and future humanoid capabilities in physical human-robot interaction.

Bio: Abderrahmane Kheddar received his B.S. in Computer Science from the Institut National d’Informatique (now ESIE), Algiers, in 1990, and his M.Sc. and Ph.D. in Robotics from Université Pierre et Marie Curie (Paris 6), now Paris Sorbonne University in 1993 and 1997, respectively. He is Directeur de Recherche at the Centre National de la Recherche Scientifique (CNRS). He created and was the Director of the CNRS-AIST Joint Robotic Laboratory (IRL 3218) in Tsukuba, Japan (2008 - 2021). In France, he created and led the Interactive Digital Humans team at LIRMM, University of Montpellier (2010-2020); since 2020, he is leading the Bionics Platform @CARTIGEN, University Hospital Montpellier. His research interests span humanoid robotics, multi-contact motion and interaction, haptics, brain-machine interfaces, and human-robot collaboration. Professor Kheddar is a Fellow of the IEEE, the Asian Control Association, the AAIA, a member of the National Academy of Technology of France, and a Knight of the National Order of Merit of France. He has served as Associate Editor and Editor for leading robotics journals, including IEEE Transactions on Robotics, IEEE Robotics and Automation Letters, IEEE Transactions on Haptics, etc. He also contributes to several IEEE RAS committees and the IEEE Brain Initiative, and has chaired and co-chaired numerous major robotics conferences and workshops. He was William M.W. Mong distinguished lecturer at the University of Hong Kong in May 2024.

Abderrahmane Kheddar

Research Director

CNRS, University of Montpelliler

Sethu Vijayakumar

Professor

University of Edinburgh

Embodied AI driving the Future of Assistive Technologies

Abstract: Latest advances in Machine Learning and Artificial Intelligence has turbocharged the development, testing and deployment of embodied systems such as humanoid robotics, exoskeletons and assistive systems for daily living. In my talk, I will aim to separate the hype from reality, focusing on key enablers like representational learning, variable impedance actuation and sensorimotor learning and adaptation that is driving this revolution. At the same time, I will focus on yet unsolved and difficult problems that are crucial for scaling these systems to become safe, economically viable and ubiquitous.

Bio: Sethu Vijayakumar is the Professor of Robotics at the University of Edinburgh, UK and the Founding Director of the Edinburgh Centre for Robotics. He has pioneered the use of large-scale machine learning techniques in the real-time control of several iconic robotic platforms such as the SARCOS and the HONDA ASIMO humanoids, KUKA-LWR robot arm and iLIMB prosthetic hand. He has held adjunct faculty positions at the University of Southern California (USC), Los Angeles and the RIKEN Brain Science Institute, Japan. One of his landmark projects (2016) involved a collaboration with NASA Johnson Space Centre on the Valkyrie humanoid robot being prepared for unmanned robotic pre-deployment missions to Mars. Professor Vijayakumar, who has a PhD from the Tokyo Institute of Technology, holds the Royal Academy of Engineering (RAEng) - Microsoft Research Chair at Edinburgh. He has published over 250 peer-reviewed and highly cited articles [H-index 52, Citations > 14,000 as of 2025] on topics covering robot learning, optimal control, and real-time planning in high dimensional sensorimotor systems. He is a Fellow of the Royal Society of Edinburgh, a judge on BBC Robot Wars and winner of the 2015 Tam Dalyell Prize for excellence in engaging the public with science. Professor Vijayakumar helps shape and drive the UK Robotics and Autonomous Systems (RAS) agenda in his recent role as the Programme Director for Robotics and Human AI Interfaces at The Alan Turing Institute, the UK’s national institute for data science and AI.

DLR Humanoids, from Earth to the Moon.

Abstract: Robots are not only machines which are supposed to relieve humans from dangerous or routine work – they are also a scientific endeavour attempting to better understand human and animal motion and intelligence in a synthetizing way, by using the system analytic tools of engineering and computer science. The exploding commercial interest in humanoids in the last two years, with billions of dollars of investment and a large number of companies building such robots, is definitely a hype in the short run. On the other hand, the huge investment of resources and talents induced a transformation in robotics which is here to stay, and which will transform our society massively in the long run. From mechatronics and control standpoint, humanoids became a quite mature technology during the last years, and still, the development in this field continues at a high pace of innovation. The convergence of these developments with rapidly evolving artificial intelligence techniques has created the foundation for a new generation of cognitive, adaptive, multi-purpose machines - commonly referred to as Physical AI or Embodied Intelligence. or AI-Powered Robotics. In this talk, I will give an overview on humanoid robot developments at DLR, covering the entire bandwidth from design, control, over perception and cognition, up to several application examples in space and on earth.

Bio: Alin Albu-Schäffer received his M.S. in electrical engineering from the Technical University of Timisoara, Romania in 1993 and his Ph.D. in automatic control from the Technical University of Munich in 2002. Since 2012 he is the head of the Institute of Robotics and Mechatronics at the German Aerospace Center (DLR). Moreover, he is a professor at the Technical University of Munich, holding the Chair for "Sensor Based Robotic Systems and Intelligent Assistance Systems" at the School of Computation, Information and Technology. His personal research interests include robot design, modeling and control, nonlinear control, flexible joint and variable compliance robots, impedance and force control, physical human-robot interaction, bio-inspired robot design and control. He received several awards, including the IEEE King-Sun Fu Best Paper Award of the Transactions on Robotics in 2012 and 2014; several ICRA and IROS Best Paper Awards as well as the DLR Science Award. He was strongly involved in the development of the DLR light-weight robot and its commercialization through technology transfer to KUKA. He is the coordinator of euROBIN, the European network of excellence on intelligent robotics, IEEE Fellow and RAS-AdCom member.

Alin Albu-Schäffer

Professor

DLR, TUM

Fan Shi

Assistant Professor

NUS

Synthetic Data for Humanoid Learning: Wins, Fails, and the Next

Abstract: Robot learning fundamentally depends on access to abundant, high-quality, and low-cost data. Humanoid robots present unique challenges and opportunities—combining locomotion over rigid terrains (mostly) with manipulation of diverse, often deformable, objects. While synthetic data has driven remarkable progress in locomotion through deep reinforcement learning, manipulation remains limited by data scarcity and simulation fidelity.

In this talk, I will discuss our recent advances in simulation technology inspired by our breakthroughs in computer graphics, aimed at enabling more effective humanoid learning for complex loco-manipulation tasks. Our new simulation engine delivers over 100× improvements in both speed and accuracy for deformable object dynamics, unlocking a wide range of contact-rich tasks previously deemed infeasible. I will conclude by outlining how these advances may shape the next frontier of humanoid intelligence, where realistic synthetic data bridges the gap between simulation and the real world.

Bio: Fan Shi is an Assistant Professor in the Department of Electrical and Computer Engineering at NUS, where he holds the prestigious NUS Presidential Young Professorship. His research focuses on AI for robotics, with particular interests in physical simulation and robot learning. He has received several international recognitions, including awards and support from leading organizations such as the NVIDIA Academic Grant Program, Google Research Funding, and Swiss AI Initiative. Before joining NUS, he was a Postdoctoral Researcher at ETH Zurich. He earned his Ph.D. and M.S. degrees at the University of Tokyo, and his B.S. degree at Peking University.

The quest for human-level physical intelligence

Abstract: The field of robotics is advancing at an unprecedented pace, driven by promising applications in manufacturing, logistics, healthcare and personal assistance. Humanoid robots, in particular, are poised to take center-stage in our society in the near future. However, before they can live up to this promise, robots must be able to perform a wide range of everyday tasks with skill, adaptability and autonomy. While recent breakthroughs in machine learning and control have enabled impressive capabilities, scalable autonomy demands the ability to learn from humans, acquiring and refining skills just as a person would when training for a new job. In this talk, I will discuss our recent progress and open challenges in enabling robots to learn from the limited and often imperfect demonstrations humans can provide.

Bio: Prof. Stelian Coros is the director of the Computational Robotics Lab (CRL), a research group within the Institute for Intelligent Interactive Systems at ETH Zurich. His research bridges the fields of Computer Graphics, Robotics and Computational Fabrication. He draws insights from computer science, applied mathematics and control theory to establish the foundations for algorithms that address a variety of computational problems in robotics. Applications of his work range from studying the principles of dexterous manipulation and legged locomotion to computation-driven design for novel types of robots.

Stelian Coros

Associate Professor

ETH Zurich

Danfei Xu

Assistant Professor

Georgia Tech

Human Data as a Foundation for Generalist Robots

Abstract: The foundation of modern AI is scalable knowledge transfer from humans to machines. While Computer Vision and NLP can glean from exabytes of human-generated data on the Internet, Robot Learning still heavily relies on resource-intensive processes such as teleoperation. Can we capture how humans interact with the physical world as effortlessly as the Internet captures the virtual world? We propose that leveraging human data is a crucial step toward this future. Just as the Internet evolved into an unintentional data repository for AI, we envision systems that effortlessly capture rich embodied experiences from human activities, without humans’ conscious participation. In this talk, I will present a line of research on enabling robots to learn from egocentric human data. I will conclude by sharing our vision of human-centric robot learning, where machines can better understand and interact with humans and human environments by taking a human perspective.

Bio: Danfei Xu is an Assistant Professor in the School of Interactive Computing at Georgia Tech and a researcher at NVIDIA AI. His research focuses on developing machine learning methods for robotics, with particular emphasis on manipulation planning and imitation learning. His work has received Best Paper Awards and nominations at venues including CoRL and ICRA, and he is a recipient of the CoRL Early Career Award and the NSF CAREER Award.

Robust, Social, and Superhuman: Advances in Humanoid Intelligence

Abstract: Humanoid robotics is rapidly evolving beyond isolated capabilities—locomotion, manipulation, and navigation—toward integrated, intelligent agents that reason about terrain robustness, whole-body morphology, and safe and socially compliant interaction within complex environments. This talk presents several complementary strands of recent work that collectively advance a vision of learning-driven, socially compliant, and morphologically augmented humanoid autonomy. I will first introduce a novel Learn-to-Teach reinforcement learning (RL) co-training framework, which unifies teacher and student policy learning, and Opt2Skill, which combines trajectory optimization with RL to enable dynamic, contact-rich loco-manipulation. I will then highlight two emerging directions: an emotion-aware social navigation framework that balances whole-body motion feasibility with pedestrian comfort, and augmenting humanoids with supernumerary robotic limbs, which explores morphological augmentation to expand the humanoid’s action space through coordinated multi-arm control.

Bio: Ye Zhao is an Associate Professor at The George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology. He was a Postdoctoral Fellow at Harvard and received his Ph.D. degree from UT Austin in 2016. At Georgia Tech, he leads the Laboratory for Intelligent Decision and Autonomous Robots. His research interest focuses on planning, learning, and decision-making of highly dynamic and contact-rich robots. He received the NSF CAREER Award, ONR Young Investigator Program Award, Woodruff Faculty Fellow, and Woodruff School Faculty Research Award. He serves as an Associate Editor of T-RO, TMECH, RA-L, and L-CSS. His co-authored work has received multiple paper awards at ICRA and NeurIPS.

Ye Zhao

Associate Professor

Georgia Tech

Christian Ott

Professor

Technical University of Wienna

The benefit of analytic whole-body control for agile and learning-based humanoid locomotion

Abstract: Humanoid robotics is currently receiving a strong attention based on large investments of big tech companies. The envisioned application areas range from well controlled industry environments to unstructured household environments. In order to act reliably in many practical situations, these systems need to be robust, reactive, and compliant in their interaction with humans and their environment. In this talk I will give some examples on our current research, which aims at providing humanoid robots with the necessary motion skills to perform a wide range of everyday tasks. We demonstrate how learning-based locomotion can benefit from a tight interaction with analytic model-based whole-body control. Particular focus is put on the achievement of faster execution speeds while handling variable contact situations and contact transitions, which is needed so that humanoid robots can serve as effective and interactive assistants for our future societies.

Bio: Christian Ott currently is Full Professor for Robotics at TU Wien, Vienna, Austria. He received his Dipl.-Ing. degree from the University of Linz, Austria, in 2001 and the Dr.-Ing. degree in control engineering from Saarland University, Saarbruecken, Germany, in 2005. From 2001 to 2007, he was working as a researcher at the German Aerospace Center (DLR), Wessling, Germany. From 2007 to 2009, he was a Project Assistant Professor at the Department of Mechano-Informatics, University of Tokyo, Japan. After that he has been a team leader and was head of the department for “Analysis and Control of Advanced Robotic Systems” in the Institute of Robotics and Mechatronics at DLR. He has served as Associate Editor for the IEEE Transactions on Robotics (TRO), was Co-Editor-in-Chief for IFAC Mechatronics, and is currently serving as Senior Editor for the International Journal of Robotics Research (IJRR). Since 2024, he is Editor-In-Chief of the ICRA Conference Editorial Board. He has been involved in the Organizing Committees of several international conferences and was General Chair of Humanoids 2020 in Munich, Germany. He is IEEE Fellow since 2023. He serves as IFAC Council Member for the triennial 2024-2026. His research interests include nonlinear robot control, elastic robots, whole-body control, impedance control, and control of humanoid robots.

Do autonomous agentic humanoids find the sense of self?

Abstract: Permeation of Agentic AI is in every detail of human-computer systems and changing the soft infrastructure of the society. Large Language Models trained using the vast document data can translate, rephrase, paraphrase the request, ask for missing information, suggest approaches and solutions. The psychological and emotional nuance can be computed once it is expressed by text or patterns of multimodality. The physical constraint of space and time is computable in the same framework. Humanoids are an ideal embodiment of Agentic AI because it is consistent with most data about the human, using which once trained they are universal at every situation where a human needs an agent. We have started discussion in our team on a problem of designing a computational pipeline for Humanoids as Embodied Autonomous Agentic AI. The current discussion is whether autonomy requests to understand human relationship. Then, when a humanoid is situated in the relationship, does it request the humanoid to have the sense of self? Though our discussion is still primitive, we would like to introduce our discussion.

Bio: Yoshihiko Nakamura is Professor and Chair of Robotics Department, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE. He is Professor Emeritus of the University of Tokyo and CEO of Kinescopic, Inc. He received PhD degree from Kyoto University. He held faculty positions at Kyoto University, University of California, Santa Barbara, and University of Tokyo before joining MBZUAI in 2024. Prof. Nakamura’s fields of research are humanoid robotics, biomechanics, human digital twin and their computational algorithms. He received King-Sun Fu Memorial Best Transactions Paper Award, IEEE Transaction of Robotics and Automation in 2001 and 2002. He is also a recipient of JSME Medal for Distinguished Engineers in 2019, Pioneer Award of IEEE Robotics and Automation Society in 2021, Tateisi Prize Achievement Award in 2022, and the Japan Academy Prize in 2025.

Yoshi Nakamura

Professor and Chair

MBZUAI

Hang Zhao

Assistant Professor

Tsinghua University

Foundation Models for Embodied AI

Abstract: Dr. Hang Zhao will first talk about Galaxea Open-World dataset, the largest robot dataset collected in the open-world environments, and the G0 dual-system VLA models for robot manipulation. Then he will introduce SLAM-former, a spatial foundation model that unifies the front-end tracking and mapping with back-end optimization and loop closure in SLAM.

Bio: Hang Zhao is an Assistant Professor at Tsinghua University, and a co-founder of a robotics startup Galaxea. Dr. Zhao received his Ph.D. from MIT and previously worked as a Research Scientist at Waymo. His honors include a Best System Paper Nomination at CoRL, a Best Paper Award at ICCP, being named one of MIT Technology Review's Innovators Under 35 (TR35), and SAIL Star Award, which is the highest honor at the World AI Conference.

From Video Understanding to Embodied AI

Abstract: Computer vision has recently excelled on a wide range of tasks from image segmentation to detailed image captioning and realistic video generation. This impressive progress now drives the emergence of new industries and yet, current methods still fall short in supporting simple embodied tasks. How to plant a tree, to assemble a chair or to arrange glassware without breaking it? Systems that can resolve such questions from visual inputs in real-world environments are expected to unlock immense potential of robotics and embodied decision making. Following this motivation, in this talk I will address models and learning methods that enable controlled video generation, physically-plausible animation and zero-shot solutions to manipulation tasks.

Bio: Ivan Laptev is a full professor of computer vision at Mohamed bin Zayed University of Artificial Intelligence in Abu Dhabi. Before joining MBZUAI he directed an INRIA research lab in Paris. He holds a PhD degree from the Royal Institute of Technology in Sweden and a HdR degree from Ecole Normale Superieure in France. Ivan's main research interests include physically-grounded models for visual recognition, animation and robotics. He has published over 150 technical papers on computer vision and machine learning. He is an Associate Editor in Chief of IEEE TPAMI and previously served as an Associate Editor of IEEE TPAMI, IJCV and IVC. He also served as a Program Chair for CVPR'18, ICCV'23 and ACCV'24, and will serve as a General Chair for ICCV’29. Beyond his academic career, Ivan has co-founded a computer vision startup that grew beyond 300 employees. In 2017 Ivan was awarded a Helmholtz prize for significant impact on computer vision research.

Ivan Laptev

Professor

MBZUAI

Zewen He

Postdoctoral Associate

MBZUAI

Towards Adaptive Compliance in Humanoid Robots

Abstract: Humanoid robots are gradually moving beyond the laboratory and into factories and even daily life. One of key drivers of this transition is imitation learning from human data. However, because most current human datasets lack measured force data, and learning-based robot control is largely position-based, achieving appropriate compliance during interaction with real environments remains challenging. In this talk, I introduce our ongoing work: Compliant Task Pipeline - a pipeline that leverages compliance information in the learning-based structure of humanoid robots. In current work, a dual-agent reinforcement learning framework combined with model-based compliance control for humanoid robots is proposed. This framework can be integrated with LLMs and VLMs to realize controllable compliance in humanoid robots.

Bio: Dr. Zewen He is a Postdoctoral Researcher in the Department of Robotics at MBZUAI. He received his PhD from the Department of Mechano-informatics, the University of Tokyo in 2024. He has published over 10 peer-reviewed publications, most in top-tier robotics venues including ICRA, IROS, and Humanoids. He is also a reviewer of flagship journals/conference proceedings including IEEE TASE/T-MECH/RA-Letter, ICRA, and IROS.

Humanoid That Care: Toward Safe and Autonomous Assistance

Abstract: Humanoid robots are increasingly expected to operate in environments where precision, safety, and physical interaction are essential. This talk traces a humanoid’s journey from high-precision industrial manipulation to safe and autonomous physical assistance for frail individuals. We begin with a contact-rich manufacturing scenario, in which a humanoid robot operates an aircraft circuit-breaker panel using a task-space control and task-aware posture-planning framework. Building on these foundations, the talk transitions to applications in human care environments, where physical interaction requires not only accuracy but also sensitivity. I present methods for proprioceptive contact detection, compliant motion regulation, and multi-contact planning on the human body. Together, these capabilities enable humanoid robots to perceive contact through their own joints, adapt their motion safely. By following this progression from aircraft manufacturing to care facilities, the talk highlights how a unified control and perception framework can scale from high-precision industrial tasks to gentle, human-centered physical assistance - pointing toward a future in which humanoids can reliably support us across both technical and caregiving domains. Together, these results illuminate a path toward humanoids that truly care - robots that not only understand us, but can reach out and help.

Bio: Anastasia Bolotnikova received her Master’s degree in Computer Science from the University of Tartu, Estonia, in 2017. During this time, she was a recipient of the Skype and IT Academy Master’s Scholarship for excellent academic performance. Her master’s thesis, focused on humanoid robot control in aircraft manufacturing, was recognized with the Best Student Paper Award at the international conference IEEE CASE in 2017. She obtained her Doctorate degree in Robotics from the University of Montpellier, France, in 2021. Her doctoral research, conducted in collaboration with the industrial partner SoftBank Robotics Europe, focused on using humanoid robots for physical assistance to frail individuals. This work was recognized with the SoftBank Robotics Shanghai Innovation Prize at IEEE RO-MAN in 2018 and the L’Oréal–UNESCO For Women in Science Young Talents France Award in 2019. From 2021 to 2024, as part of her postdoctoral research with the RRL and BioRob laboratories, she led the Intelligent Assistive Environment project at the Center for Intelligent Systems (CIS) at EPFL, Lausanne, Switzerland. She is currently a permanent CNRS researcher in the Robotics and InteractionS (RIS) team at the Laboratory for Analysis and Architecture of Systems (LAAS), Toulouse, France.

Anastasia Bolotnikova

Scientist

CNRS, LAAS

Sajjad Hussain

PhD Researcher

University of Brighton

Multimodal Human–Robot Interaction Using Gesture and Language Models

Abstract:This talk presents a unified multimodal framework for intelligent human–robot interaction that combines real-time gesture recognition with large-language-model-driven task understanding. The first part introduces AI-RTGM, an adaptive pipeline integrating continuous gesture perception, trajectory generation, and 7-DoF robotic control. Using a CNN-based vision model and ROS/libfranka, the system achieves below 70 ms latency for smooth and compliant motion replication. The second part focuses on multilingual LLM-based robotic manipulation, demonstrating how English and German instructions can be translated into executable robotic actions for tasks such as coffee and tea preparation with 83% accuracy. Together, these contributions establish a scalable vision-and-language pipeline for intuitive, natural, and adaptive human–robot collaboration.

Bio: Sajjad Hussain is a PhD researcher in Robotics and Artificial Intelligence at the University of Brighton, UK. His research focuses on multimodal human–robot interaction, including real-time gesture mapping, large language models for task generation, and intelligent control of high-degree-of-freedom robotic manipulators.

Humanoids Behaving Like Humans: Social Navigation, Whole-Body Multi-Object Manipulation, and Perceptive Imitative Locomotion

Abstract: Humanoid robots share similar morphology with humans, allowing them to act and interact in human-like ways. In this talk, I will present three behaviors that enable humanoid robots to imitate human decision-making, manipulation, motion, and perception. First, I will introduce a social navigation framework that allows humanoid robots to make decisions based on human emotions and social interactions. Next, to explore the potential of whole-body manipulation, I will present a framework that enables humanoid robots to carry multiple objects simultaneously using their entire body. Finally, I will discuss our recent work on perceptive locomotion integrated with imitation learning, which allows humanoid robots to traverse rough terrains with more natural, human-like gaits.

Bio: Wei Zhu is an Assistant Professor in the Department of Robotics, School of Engineering, Tohoku University. He previously served as a Postdoctoral Fellow at the Georgia Institute of Technology. He earned his Ph.D. from Tohoku University and received his Master’s and Bachelor’s degrees from Nankai University. His research focuses on Embodied AI, with particular interests in social navigation for legged robots and humanoid loco-manipulation.

Wei Zhu

Assistant Professor

Tohoku University

Page updated

Google Sites

Report abuse