Diffusion Co-Policy for Synergistic Human-Robot Collaborative Tasks

Abstract

Modeling multimodal human behavior has been a key barrier to increasing the level of interaction between humans and robots, particularly for collaborative tasks. Our key insight is that an effective learned robot policy for human-robot collaborative tasks must be able to express a high degree of multimodality, predict actions in a temporally consistent manner, and recognize a wide range of frequencies of human actions in order to seamlessly integrate with a human in the control loop. We present Diffusion Co-policy, a method for planning sequences of robot actions that synergize well with humans at test time. The co-policy is a denoising diffusion probabilistic model with a Transformer-based decoder trained on a dataset of collaborative human-human demonstrations. We demonstrate in both simulation and real environments that the method outperforms other state-of-the-art learning methods on the task of human-robot table-carrying with a human in the loop. Moreover, we qualitatively highlight compelling robot behaviors that demonstrate evidence of true human-robot collaboration, including mutual adaptation, shared task understanding, leadership switching, learned partner behaviors, and low levels of wasteful interaction forces arising from dissent.
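
To make concrete how a denoising diffusion probabilistic model can plan a sequence of actions, the following is a minimal sketch of the standard DDPM reverse (sampling) loop applied to an action sequence. It is illustrative only and is not the implementation described in this paper: the linear noise schedule, the number of diffusion steps, the horizon, the action dimension, and the names `make_schedule`, `toy_noise_predictor`, and `sample_action_sequence` are all assumptions, and `toy_noise_predictor` is a hypothetical placeholder for a trained Transformer-based denoiser.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alphas up to each step
    return betas, alphas, alpha_bars


def toy_noise_predictor(noisy_actions, step, observation):
    """Hypothetical stand-in for a trained denoiser conditioned on observations.

    A real model would predict the noise contained in `noisy_actions`;
    this placeholder simply predicts zero noise.
    """
    return np.zeros_like(noisy_actions)


def sample_action_sequence(observation, horizon=8, action_dim=4, num_steps=50,
                           noise_predictor=toy_noise_predictor, seed=0):
    """Run the DDPM reverse process to produce a (horizon, action_dim) action plan."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure Gaussian noise over the whole action sequence.
    x = rng.standard_normal((horizon, action_dim))

    for t in range(num_steps - 1, -1, -1):
        eps_hat = noise_predictor(x, t, observation)
        # Posterior mean of the previous, less noisy sample given the noise estimate.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Add fresh noise at every step except the final one.
        noise = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * noise

    return x


if __name__ == "__main__":
    plan = sample_action_sequence(observation=np.zeros(6))
    print(plan.shape)
```

Because the whole sequence is denoised jointly, every action in the horizon is refined together at each step, which is one way a diffusion-based policy can produce temporally consistent plans. Drawing different initial noise (here, a different `seed`) yields different samples, which is how such a model can represent multimodal behavior.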