Max Yang, Yijiong Lin, Alex Church, John Lloyd,
Dandan Zhang, David A.W. Barton*, Nathan F. Lepora*
Department of Engineering Mathematics and Bristol Robotics Laboratory, University of Bristol, Bristol BS8 1UB, U.K.
email: {max.yang, david.barton, n.lepora}@bristol.ac.uk
Object pushing presents a key non-prehensile manipulation problem that is illustrative of more complex robotic manipulation tasks. Because pushing involves a partially observable system with difficult-to-model physics, developing a system for general object pushing remains an unsolved challenge. While deep reinforcement learning (RL) methods have demonstrated impressive learning capabilities using visual input, a lack of tactile sensing limits their capability for fine control during manipulation. Here we propose a deep RL approach to object pushing that uses tactile sensing without visual input, which we term tactile pushing. We present a goal-conditioned formulation that allows both model-free and model-based RL to obtain accurate policies for pushing an object to a goal. To achieve real-world performance, we adopt a sim-to-real approach. Our results demonstrate that it is possible to train on a single object and a limited sample of goals to produce precise and reliable policies that generalize to a variety of unseen scenarios without domain randomization. We test the trained agents under harsh pushing conditions and show that, with significantly more training samples, a model-free policy can outperform a model-based planner, generating shorter and more reliable pushing trajectories despite large disturbances. The simplicity of our training environment and the effectiveness of the real-world performance highlight the value of rich tactile information for fine manipulation.
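To make the goal-conditioned formulation concrete, the following is a minimal sketch, not the authors' implementation: it assumes a tactile feature vector, a planar pose representation, and a simple distance-based shaping reward, all of which are illustrative choices.

    # Minimal sketch (illustrative assumptions, not the paper's code) of a
    # goal-conditioned observation and reward for tactile pushing.
    import numpy as np

    def make_observation(tactile_feature: np.ndarray,
                         ee_pose: np.ndarray,
                         goal_pose: np.ndarray) -> np.ndarray:
        """Concatenate tactile features with the goal expressed relative
        to the end-effector, so a single policy can serve many goals."""
        goal_rel = goal_pose - ee_pose  # relative goal: (x, y, heading)
        return np.concatenate([tactile_feature, goal_rel])

    def shaping_reward(object_pos: np.ndarray, goal_pos: np.ndarray,
                       prev_dist: float) -> tuple[float, float]:
        """Dense reward: positive when the object moves toward the goal."""
        dist = float(np.linalg.norm(object_pos - goal_pos))
        return prev_dist - dist, dist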
Simulation Experiments
We train the RL agents to push a cube on a limited set of goals, then test the final policies on a range of novel objects and goals not seen during training. The table reports the success rate for each agent.
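As a minimal sketch of such an evaluation, the snippet below assumes a Gym-style environment that samples unseen objects and goals and exposes a hypothetical object_to_goal_dist entry in info; the success tolerance is an illustrative assumption, not the paper's reported criterion.

    # Evaluation sketch: success rate of a trained policy on unseen
    # objects/goals. `env`, `policy`, and the info key are assumptions.
    def success_rate(env, policy, n_episodes: int = 100,
                     goal_tolerance: float = 0.02) -> float:
        successes = 0
        for _ in range(n_episodes):
            obs = env.reset()
            done, info = False, {"object_to_goal_dist": float("inf")}
            while not done:
                action = policy(obs)                  # trained pushing policy
                obs, _, done, info = env.step(action)
            # episode succeeds if the object ends within tolerance of the goal
            if info["object_to_goal_dist"] < goal_tolerance:
                successes += 1
        return successes / n_episodes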
To bridge the sim-to-real gap, we train a separate observation model for each type of tactile observation.
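One common way to realize such an observation model is real-to-sim tactile image translation: a small convolutional network trained on paired real and simulated tactile images, so that real sensor readings can be mapped into the domain the policy was trained in. The sketch below is illustrative only; the architecture, image size, and loss are assumptions rather than the paper's specification.

    # Sketch of a real-to-sim tactile observation model (illustrative
    # architecture and loss; one such model per tactile observation type).
    import torch
    import torch.nn as nn

    class RealToSimTactile(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(  # 1x128x128 in, 1x128x128 out
                nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
            )

        def forward(self, real_img: torch.Tensor) -> torch.Tensor:
            return self.net(real_img)

    model = RealToSimTactile()
    loss_fn = nn.MSELoss()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    # One training step on a paired batch (random tensors stand in for data).
    real, sim = torch.rand(8, 1, 128, 128), torch.rand(8, 1, 128, 128)
    loss = loss_fn(model(real), sim)   # match the simulated tactile image
    opt.zero_grad(); loss.backward(); opt.step()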