Towards Generalization and Simplicity in Continuous Control


Aravind Rajeswaran* Kendall Lowrey* Emanuel Todorov Sham Kakade

In Advances in Neural Information Processing Systems (NIPS) 2017

This work shows that policies with simple linear and RBF parameterizations can be trained to solve a variety of continuous control tasks, including the OpenAI gym benchmarks. The performance of these trained policies is competitive with state-of-the-art results obtained with more elaborate parameterizations such as fully connected neural networks. Furthermore, existing training and testing scenarios are shown to be very limited and prone to over-fitting, thus giving rise to only trajectory-centric policies. Training with a diverse initial state distribution is shown to produce more global policies with better generalization. This allows for interactive control scenarios where the system recovers from large on-line perturbations, as shown in the supplementary video.
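To make the two parameterizations concrete, the following is a minimal NumPy sketch of a linear Gaussian policy, an RBF-style feature map built from random Fourier features, and a widened initial state sampler. It is an illustration under stated assumptions, not the authors' implementation: the class names, the `bandwidth` and `log_std` parameters, the sinusoidal feature form, and the uniform sampling of initial states are all choices made here for the example.

```python
import numpy as np


class LinearGaussianPolicy:
    """Gaussian policy whose mean is an affine function of its input."""

    def __init__(self, in_dim, act_dim, log_std=-0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = np.zeros((act_dim, in_dim))       # trainable weights
        self.b = np.zeros(act_dim)                 # trainable bias
        self.log_std = np.full(act_dim, log_std)   # per-action log std

    def mean(self, x):
        return self.W @ x + self.b

    def act(self, x):
        # Sample an action around the mean for exploration.
        noise = self.rng.standard_normal(self.b.shape)
        return self.mean(x) + np.exp(self.log_std) * noise


class RandomFourierFeatures:
    """Fixed random features approximating an RBF kernel (assumed form)."""

    def __init__(self, obs_dim, num_features, bandwidth=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.P = rng.standard_normal((num_features, obs_dim))
        self.phase = rng.uniform(-np.pi, np.pi, size=num_features)
        self.bandwidth = bandwidth

    def __call__(self, obs):
        return np.sin(self.P @ obs / self.bandwidth + self.phase)


def sample_initial_state(nominal, spread, rng):
    """Draw a start state uniformly within +/- spread of a nominal state."""
    return nominal + rng.uniform(-spread, spread, size=nominal.shape)


# An RBF policy is a linear policy acting on the random features.
features = RandomFourierFeatures(obs_dim=4, num_features=16)
rbf_policy = LinearGaussianPolicy(in_dim=16, act_dim=2)
state = sample_initial_state(np.zeros(4), spread=0.5, rng=np.random.default_rng(1))
action = rbf_policy.act(features(state))
```

In this sketch only `W`, `b`, and `log_std` would be trained; the feature projection stays fixed after it is drawn, so the RBF variant remains linear in its trainable parameters. Increasing `spread` is one hypothetical way to widen the initial state distribution during training.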

Bibliography

@INPROCEEDINGS{Rajeswaran-NIPS-17,
    AUTHOR    = {Aravind Rajeswaran and Kendall Lowrey and Emanuel Todorov and Sham Kakade},
    TITLE     = "{Towards Generalization and Simplicity in Continuous Control}",
    BOOKTITLE = {NIPS},
    YEAR      = {2017},
}