References

Search-Based Structured Prediction. Daumé III, Langford & Marcu, Machine Learning 2009

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. Ross, Gordon & Bagnell, AISTATS 2011

Efficient Reductions for Imitation Learning. Ross & Bagnell, AISTATS 2010

A Reduction from Apprenticeship Learning to Classification. Syed & Schapire, NIPS 2010

Apprenticeship Learning via Inverse Reinforcement Learning. Abbeel & Ng, ICML 2004

A Game-Theoretic Approach to Apprenticeship Learning. Syed & Schapire, NIPS 2007

Apprenticeship Learning Using Linear Programming. Syed, Bowling & Schapire, ICML 2008

Maximum Entropy Inverse Reinforcement Learning. Ziebart, Maas, Bagnell & Dey, AAAI 2008

Generative Adversarial Imitation Learning. Ho & Ermon, NIPS 2016

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization. Finn, Levine & Abbeel, ICML 2016

Learning for Control from Multiple Demonstrations. Coates et al., ICML 2008

An Application of Reinforcement Learning to Aerobatic Helicopter Flight. Abbeel et al., NIPS 2006

Planning-Based Prediction for Pedestrians. Ziebart et al., IROS 2009

Data-Driven Ghosting using Deep Imitation Learning. Le et al., MIT Sloan Sports Analytics Conference 2017

A Deep Learning Approach for Generalized Speech Animation. Taylor et al., SIGGRAPH 2017

Learning Policies for Contextual Submodular Optimization. Ross et al., ICML 2013

Learning to Search in Branch and Bound Algorithms. He et al., NIPS 2014

Learning to Search via Retrospective Imitation. Song et al., arXiv 2018

Learning to Search Better than Your Teacher. Chang et al., ICML 2015

Reinforcement and Imitation Learning via Interactive No-Regret Learning. Ross & Bagnell, arXiv 2014

Residual Loss Prediction: Reinforcement Learning With No Incremental Feedback. Daumé III et al., ICLR 2018

Truncated Horizon Policy Search: Combining Reinforcement Learning and Imitation Learning. Sun et al., ICLR 2018

Sequence Level Training with Recurrent Neural Networks. Ranzato et al., ICLR 2016

Smooth Imitation Learning for Online Sequence Prediction. Le et al., ICML 2016

Programmatically Interpretable Reinforcement Learning. Verma et al., ICML 2018

Safe Imitation Learning for Autonomous Driving. Zhang & Cho, AAAI 2017

One Shot Imitation Learning. Duan et al., NIPS 2017

One-Shot Visual Imitation Learning via Meta-Learning. Finn et al., CoRL 2017

One-Shot Imitation from Watching Videos. Yu & Finn, RSS 2018

Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World. Tobin et al., IROS 2017

Repeated Inverse Reinforcement Learning. Amin et al., NIPS 2017

Coordinated Multi-Agent Imitation Learning. Le et al., ICML 2017

Generative Multi-Agent Behavioral Cloning. Zhan et al., arXiv 2018

InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations. Li et al., NIPS 2017

Cooperative Inverse Reinforcement Learning. Hadfield-Menell et al., NIPS 2016

An Efficient, Generalized Bellman Update for Cooperative Inverse Reinforcement Learning. Malik et al., ICML 2018

Showing versus Doing: Teaching by Demonstration. Ho et al., NIPS 2016

Hierarchical Imitation and Reinforcement Learning. Le et al., ICML 2018

Deep Reinforcement Learning from Human Preferences. Christiano et al., NIPS 2017

Programming by Feedback. Akrour et al., ICML 2014

A Bayesian Approach for Policy Learning from Trajectory Preference Queries. Wilson et al., NIPS 2012

Learning Trajectory Preferences for Manipulators via Iterative Improvement. Jain et al., NIPS 2013

Interactive Learning from Policy-Dependent Human Feedback (COACH). MacGlashan et al., ICML 2017

Interactively Shaping Agents via Human Reinforcement: The TAMER Framework. Knox & Stone, K-CAP 2009