Contrastive Variational Reinforcement Learning for Complex Observations

Xiao Ma, Siwei Chen, David Hsu, Wee Sun Lee

National University of Singapore

Abstract

Deep reinforcement learning (DRL) has achieved significant success in various robot tasks, such as manipulation and navigation. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves performance comparable to state-of-the-art model-based DRL methods on standard Mujoco tasks. It significantly outperforms them on Natural Mujoco tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows.
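The discriminative objective mentioned in the abstract can be illustrated with a generic InfoNCE-style contrastive loss, which gives a lower bound on the mutual information between paired latent states and observation embeddings. The following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation: the dot-product critic and the function names `info_nce_loss` and `mutual_information_lower_bound` are hypothetical choices made for illustration.

```python
import numpy as np


def info_nce_loss(latents, obs_embeddings):
    """InfoNCE loss for a batch where row i of each array forms a matching pair.

    Assumption: a plain dot-product critic; the paper's critic may differ.
    """
    # scores[i, j] = similarity between latent i and observation embedding j
    scores = latents @ obs_embeddings.T
    # Numerically stable log-softmax over each row
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # Matching (positive) pairs sit on the diagonal; all other entries
    # in the row act as negatives.
    return -np.mean(np.diag(log_probs))


def mutual_information_lower_bound(latents, obs_embeddings):
    """InfoNCE bound: I(z; o) >= log(N) - loss, with N the batch size."""
    batch_size = latents.shape[0]
    return np.log(batch_size) - info_nce_loss(latents, obs_embeddings)


if __name__ == "__main__":
    # Toy batch: four orthogonal latents paired with identical embeddings.
    z = 5.0 * np.eye(4)
    print("loss (aligned pairs):  ", info_nce_loss(z, z))
    print("loss (mismatched pairs):", info_nce_loss(z, z[::-1]))
    print("MI lower bound:        ", mutual_information_lower_bound(z, z))
```

Minimizing this loss only requires telling the matching observation apart from the other observations in the batch, so no pixel-level reconstruction of the observation is needed. The bound is capped at log(N), which is why contrastive objectives of this kind usually benefit from larger batches.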

Paper

Xiao Ma, Siwei Chen, David Hsu, Wee Sun Lee

Contrastive Variational Reinforcement Learning for Complex Observations

In Proceedings of the 4th Conference on Robot Learning (CoRL), 2020 [PDF][Code]

@inproceedings{ma2020contrastive,
  author = {Xiao Ma and Siwei Chen and David Hsu and Wee Sun Lee},
  booktitle = {Proceedings of the 4th Conference on Robot Learning},
  title = {Contrastive Variational Reinforcement Learning for Complex Observations},
  year = {2020}
}

Model Overview


Experiment Results

Natural Mujoco Visualizations

Natural Walker Walk

Natural Quadruped Walk

Natural Pendulum Swingup

Natural Cup Catch

Natural Cartpole Balance

Natural Finger Spin

Box Pushing Visualizations

Talk

Related Works

Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee, Nan Ye

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

ICLR 2020, [PDF]

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

Dream to Control: Learning Behaviors by Latent Imagination

ICLR 2020, [PDF]