Equivariant Reinforcement Learning under Partial Observability

Hai Nguyen, Andrea Baisero, David Klee, Dian Wang, Robert Platt, Christopher Amato

Khoury College of Computer Sciences, Northeastern University, Boston, MA, United States

Abstract

Incorporating inductive biases is a promising approach for tackling challenging robot learning domains with sample-efficient solutions. This paper identifies partially observable domains where symmetries can be a useful inductive bias for efficient learning. Specifically, by encoding equivariance with respect to specific group symmetries into the neural networks, our actor-critic reinforcement learning agents can reuse past solutions in related scenarios. Consequently, our equivariant agents significantly outperform non-equivariant approaches in both sample efficiency and final performance, as demonstrated through experiments on a range of robotic tasks in simulation and on real hardware.
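As an illustration of what equivariance means here, the following minimal sketch uses the two-element reflection group acting on a scalar observation and a scalar action. It is not the architecture used in the paper: `raw_policy`, `reflect`, and `symmetrize` are hypothetical names, and averaging over the group is just one simple way to obtain an equivariant map. The point is only that an equivariant policy, once it has a good action for an observation, automatically has the mirrored action for the mirrored observation.

```python
import math


def raw_policy(obs):
    """Arbitrary, non-equivariant map from a 1D observation to a 1D action
    (hypothetical stand-in for an unconstrained network)."""
    return math.tanh(0.7 * obs + 0.3)


def reflect(x):
    """Action of the non-identity element of the reflection group on a scalar.
    The same transformation is assumed for observations and actions."""
    return -x


def symmetrize(policy):
    """Average the policy over the reflection group so that the result
    satisfies equivariant(reflect(o)) == reflect(equivariant(o))."""
    def equivariant(obs):
        return 0.5 * (policy(obs) + reflect(policy(reflect(obs))))
    return equivariant


equivariant_policy = symmetrize(raw_policy)

if __name__ == "__main__":
    for obs in (-2.0, -0.5, 0.0, 1.3):
        lhs = equivariant_policy(reflect(obs))
        rhs = reflect(equivariant_policy(obs))
        print(obs, lhs, rhs)
```

In this sketch the constraint holds by construction, so nothing about the mirrored scenario has to be learned separately.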

OpenReview  Code

Learned Policies in Simulation

policy_block_picking-(2).mp4

Block-Picking

Out of two identical blocks, the agent must pick the only movable block

policy_block_pulling-(2).mp4

Block-Pulling

Out of two identical blocks, the agent must pull the only movable block so that the two blocks are in contact

policy_block_pushing-(2).mp4

Block-Pushing

Out of two identical blocks, the agent must push the only movable block to a goal pad

policy_drawer_opening-(2).mp4

Drawer-Opening

Out of two identical drawers, the agent must open the only unlocked drawer

policy_symm_carflag_1d.mp4

CarFlag-1D

The car must reach the green flag, which can be at either the leftmost or the rightmost end. The car can observe which side the green flag is on only when it is at the blue flag
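The reflection symmetry of this task can be sketched with a toy version of the domain. Everything concrete below is an assumption made for illustration and is not taken from the paper: the integer grid, the track half-length `size`, the blue flag placed at position 0, the reward values, and the names `CarFlag1D`, `reflect_obs`, and `rollout`. The sketch only shows that mirroring the start state and every action yields the mirrored observations with identical rewards.

```python
from dataclasses import dataclass


@dataclass
class CarFlag1D:
    car: int = 2        # car position on an integer line (assumed discretization)
    goal_side: int = 1  # +1: green flag at the right end, -1: at the left end
    size: int = 5       # track ends at -size and +size (assumed)
    info_pos: int = 0   # blue flag position (assumed to be the center)

    def observe(self):
        # The goal side is revealed only at the blue flag; 0 means "unknown".
        side = self.goal_side if self.car == self.info_pos else 0
        return (self.car, side)

    def step(self, action):
        # action is -1 (move left) or +1 (move right)
        self.car = max(-self.size, min(self.size, self.car + action))
        done = abs(self.car) == self.size
        reward = 0.0
        if done:
            reward = 1.0 if self.car == self.goal_side * self.size else -1.0
        return self.observe(), reward, done


def reflect_obs(obs):
    """Mirror an observation: flip the position and the observed side."""
    pos, side = obs
    return (-pos, -side)


def rollout(env, actions):
    """Apply actions until the episode ends; return (obs, reward, done) tuples."""
    trace = [(env.observe(), 0.0, False)]
    for action in actions:
        trace.append(env.step(action))
        if trace[-1][2]:
            break
    return trace


if __name__ == "__main__":
    actions = [-1, -1, 1, 1, 1, 1, 1]  # visit the blue flag, then drive right
    original = rollout(CarFlag1D(car=2, goal_side=1), actions)
    mirrored = rollout(CarFlag1D(car=-2, goal_side=-1), [-a for a in actions])
    for (o, r, d), (mo, mr, md) in zip(original, mirrored):
        print(o, mo, r == mr, d == md)
```

Under these assumptions, a solution found with the green flag on one side carries over to the other side by negating positions and actions, which is the kind of reuse an equivariant agent gets for free.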

policy_symm_carflag_2d.mp4

CarFlag-2D

The agent (red) must reach the green cell. The agent can observe the coordinates of the goal cell only while it is inside the blue region

Zero-shot Transfer for Robot Domains

corl23_real_picking.mp4

Block-Picking

Out of two identical blocks, the agent must pick the only movable block

corl23_real_pulling.mp4

Block-Pulling

Out of two identical blocks, the agent must pull the only movable block so that the two blocks are in contact

corl23_real_pushing.mp4

Block-Pushing

Out of two identical blocks, the agent must push the only movable block to a goal pad

corl23_real_opening.mp4

Drawer-Opening

Out of two identical drawers, the agent must open the only unlocked drawer