CV-MPC

Dynamic Non-Prehensile Object Transport via
Model-Predictive Reinforcement Learning
Neel Jawale, Byron Boots, Balakumar Sundaralingam, Mohak Bhardwaj

Code

Coming soon

We investigate the problem of teaching a robot manipulator to perform dynamic non-prehensile object transport, also known as the ‘robot waiter’ task, from a limited set of real-world demonstrations. We propose an approach that combines batch reinforcement learning (RL) with model-predictive control (MPC) by pretraining an ensemble of value functions from demonstration data, and utilizing them online within an uncertainty-aware MPC scheme to ensure robustness to limited data coverage. Our approach is straightforward to integrate with off-the-shelf MPC frameworks and enables learning solely from task space demonstrations with sparsely labeled transitions, while leveraging MPC to ensure smooth joint space motions and constraint satisfaction. We validate the proposed approach through extensive simulated and real-world experiments on a Franka Panda robot performing the robot waiter task and demonstrate robust deployment of value functions learned from 50-100 demonstrations. Furthermore, our approach enables generalization to novel objects not seen during training and can improve upon suboptimal demonstrations. We believe that such a framework can reduce the burden of providing extensive demonstrations and facilitate rapid training of robot manipulators to perform non-prehensile manipulation tasks.

Real World Videos

Against Gravity - The robot must reach target locations that lie upward, against gravity, while keeping the tray stable so that the objects on it stay balanced.

Towards Gravity - The robot must carefully reach target positions that lie downward, moving in the direction of gravity.

Difficult to reach poses - The robot must reach locations that are laterally distant and may lie either against or towards gravity, which increases the overall difficulty of the task.

Lateral Reaching - The robot must transport the object on the tray laterally from one end of the workspace to the other, which pushes the robot to its limits.

Failure Videos

These videos show objects sliding on the tray, indicating that static friction between the objects and the tray is low in these experiments. Consequently, the objects are prone to slipping if the constraints are violated or if the motion is even slightly jerky.

MPC Demonstrator

While our approach does not make any assumptions about the source of demonstrations, in this work we use an algorithmic demonstrator because it enables us to easily collect data for different ablation studies. We employ a Model Predictive Control (MPC)-based demonstrator that uses friction cone constraints to enforce object stability, building on the formulation described in [1].

We model the object as a rigid body that adheres to the Newton-Euler equation, which states that the sum of the gravitoinertial wrench and the contact wrench equals zero. The gravitoinertial wrench represents the combined effects of gravity and inertia on the object, while the contact wrench represents the forces and torques exerted on the object due to contact with the environment.

Our goal is to create an expert that implicitly satisfies the Newton-Euler equation and the friction cone constraints to prevent the object from sliding significantly while reaching the target goal. For this, we assume access only to the robot states and nominal values for the object's friction and inertial properties.

We utilize these dynamic equations to formulate a cost function that penalizes end-effector states violating the constraints, while assigning zero cost elsewhere. The gravitoinertial wrench is computed based on the object's mass, inertia, velocities, and accelerations, taking gravity into account. By assuming the object moves minimally on the tray, we approximate its orientation to be similar to that of the end effector. This allows us to rewrite the Newton-Euler equations in the end-effector frame.
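To make the gravitoinertial wrench concrete, the following is a minimal sketch in plain Python, not the implementation used in this work. It assumes all quantities are expressed in the end-effector (body) frame, that the inertia tensor is taken about the object's center of mass, and that the sign convention is chosen so that the gravitoinertial wrench plus the contact wrench equals zero. The function name `gravito_inertial_wrench` and its arguments are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def mat_vec(M, v):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def gravito_inertial_wrench(mass, inertia, omega, alpha, lin_acc, gravity):
    """Return the 6D gravitoinertial wrench [force, torque] in the body frame.

    Assumed convention: w_GI + w_C = 0, so the contact wrench must supply
    m * (a - g) in force and I * alpha + omega x (I * omega) in torque.
    """
    # Force part: gravity minus the inertial (m * a) term.
    force = [mass * (gravity[i] - lin_acc[i]) for i in range(3)]

    # Torque part: negative of the rate of change of angular momentum.
    I_omega = mat_vec(inertia, omega)
    I_alpha = mat_vec(inertia, alpha)
    gyroscopic = cross(omega, I_omega)
    torque = [-(I_alpha[i] + gyroscopic[i]) for i in range(3)]

    return force + torque


# Example: a 0.5 kg object at rest on a level tray (nominal values, assumed).
inertia = [[1e-3, 0.0, 0.0], [0.0, 1e-3, 0.0], [0.0, 0.0, 1e-3]]
w_rest = gravito_inertial_wrench(
    0.5, inertia, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -9.81]
)
```

At rest, only the weight appears in the wrench; as the end effector accelerates, the inertial terms grow and demand larger tangential contact forces.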

To ensure the object does not slip on the tray surface, we require the gravitoinertial wrench to be balanced by the contact wrench in the object's body frame. To model the wrench resulting from the object's contact with the tray, we calculate forces at predefined contact points on the object's surface using a point contact with friction model as used in [2]. We consider a set of n contact points, and for each point, we define a force vector consisting of tangential and normal force components.

We relate the stacked contact forces to the gravitoinertial wrench using the inverse of the grasp matrix. The grasp matrix encapsulates the relationship between individual contact forces and the resultant wrench acting on the object. It is constructed using adjoint transformation matrices that relate each contact point to the object frame, along with basis matrices that project the transmissible components of the contact forces into a six-dimensional space. This relationship enables us to compute the contact forces resulting from end-effector motions.
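The relations described above can be written compactly as follows. This is a sketch under assumed notation, not the authors' exact formulation: the symbols w_GI, w_C, G, Ad_i, B_i, and f_c are introduced here for illustration. The text refers to the inverse of the grasp matrix; since G has dimensions 6 × 3n and is generally not square, the sketch uses the Moore-Penrose pseudoinverse as one plausible reading. Whether the adjoint appears transposed depends on the frame convention adopted.

```latex
% Wrench balance on the object, expressed in the body frame
w_{GI} + w_{C} = 0, \qquad w_{C} = G \, f_{c}

% Stacked contact forces for n point contacts with friction
% (f_{i,x}, f_{i,y}: tangential components; f_{i,z}: normal component)
f_{c} = \begin{bmatrix} f_{1}^{\top} & \cdots & f_{n}^{\top} \end{bmatrix}^{\top} \in \mathbb{R}^{3n},
\qquad
f_{i} = \begin{bmatrix} f_{i,x} & f_{i,y} & f_{i,z} \end{bmatrix}^{\top}

% Grasp matrix built from adjoint transforms Ad_i and wrench bases B_i
G = \begin{bmatrix} \mathrm{Ad}_{1}^{\top} B_{1} & \cdots & \mathrm{Ad}_{n}^{\top} B_{n} \end{bmatrix} \in \mathbb{R}^{6 \times 3n},
\qquad
B_{i} = \begin{bmatrix} I_{3} \\ 0_{3 \times 3} \end{bmatrix}

% Contact forces implied by a given end-effector motion
f_{c} = -\, G^{\dagger} \, w_{GI}
```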

To prevent slipping, the contact forces at each contact point must satisfy the friction cone constraints. These constraints specify that the magnitude of the tangential forces must be less than or equal to the product of the friction coefficient and the normal force component. Additionally, the normal force must be non-negative to ensure contact is maintained. To optimize robot trajectories that satisfy these constraints, we formulate a cost function that penalizes any violations of the friction cone constraints. This cost function is zero when the constraints are satisfied and increases when they are not. In our implementation, we integrate this cost into the running cost within the STORM framework, allowing us to collect demonstrations for various experiments. Importantly, while the algorithmic demonstrator assumes access to the object's inertial and friction properties, the learned value function does not. In our experiments, we also investigate how learning can improve upon suboptimal demonstrations when the nominal object properties are incorrect.
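A cost with the properties described above (zero inside the friction cone, increasing outside it) could be sketched as below. This is an illustrative hinge-style penalty in plain Python, not the cost used in STORM or by the authors: the function name `friction_cone_cost`, the `weight` parameter, and the choice of a linear penalty are all assumptions.

```python
import math


def friction_cone_cost(contact_forces, mu, weight=1.0):
    """Penalize friction cone violations over a set of contact points.

    contact_forces: iterable of (f_tx, f_ty, f_n) tuples, one per contact,
                    with two tangential components and one normal component.
    mu:             nominal friction coefficient.
    weight:         hypothetical scaling factor for the penalty.
    """
    cost = 0.0
    for f_tx, f_ty, f_n in contact_forces:
        tangential = math.hypot(f_tx, f_ty)
        # Slip condition: tangential magnitude must not exceed mu * normal force.
        slip_violation = max(0.0, tangential - mu * f_n)
        # Contact condition: normal force must be non-negative.
        separation_violation = max(0.0, -f_n)
        cost += weight * (slip_violation + separation_violation)
    return cost


# Inside the cone: no cost. Outside the cone: cost grows with the violation.
cost_ok = friction_cone_cost([(0.1, 0.0, 1.0)], mu=0.5)
cost_slip = friction_cone_cost([(1.0, 0.0, 1.0)], mu=0.5)
```

Added to the running cost of a sampling-based MPC, a term of this shape leaves constraint-satisfying rollouts unaffected and ranks violating rollouts by how far they leave the cone.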

[1] A. Heins and A. P. Schoellig, “Keep it upright: Model predictive control for nonprehensile object transportation with obstacle avoidance on a mobile manipulator,” IEEE Robotics and Automation Letters, 2023.

[2] M. Selvaggio, J. Cacace, C. Pacchierotti, F. Ruggiero, and P. R. Giordano, “A shared-control teleoperation architecture for nonprehensile object transportation,” IEEE Transactions on Robotics, 2022.
