Week 11: Model-Based vs. Model-Free RL

Core Readings

Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel, Model-Ensemble Trust-Region Policy Optimization, ICLR 2018

Ignasi Clavera*, Jonas Rothfuss*, John Schulman, Yasuhiro Fujita, Tamim Asfour, Pieter Abbeel, Model-Based Reinforcement Learning via Meta-Policy Optimization, CoRL 2018

Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine, When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019

Extended Readings

Julian Schrittwieser, et al., MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model, Nature 2020

Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao, Mastering Atari Games with Limited Data, NeurIPS 2021

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi, Dream to Control: Learning Behaviors by Latent Imagination, ICLR 2020

Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba, Mastering Atari with Discrete World Models, ICLR 2021

Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski, Model-Based Reinforcement Learning for Atari, ICLR 2020

Aravind Rajeswaran, Sarvjeet Ghotra, Sergey Levine, Balaraman Ravindran, EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, ICLR 2017

Marc Deisenroth, Carl Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML 2011

Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa, Tom Erez, Learning Continuous Control Policies by Stochastic Value Gradients, NeurIPS 2015