Deep RL Bootcamp

26-27 August 2017 | Berkeley, CA


Prelab instructions: Set up your computer for all labs.

Lab 1: Markov Decision Processes. You will implement value iteration, policy iteration, and tabular Q-learning and apply these algorithms to simple environments including tabular maze navigation (FrozenLake) and controlling a simple crawler robot.
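To give a flavor of the first algorithm named in Lab 1, here is a minimal sketch of value iteration on a hypothetical three-state chain. It is not the lab's starter code: the transition table `P`, its `(probability, next_state, reward)` tuple layout, the discount factor, and the function names are all assumptions made for illustration.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeat the Bellman optimality backup until the values stop changing."""
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(max_iters):
        new_V = [
            max(q_value(P, V, s, a, gamma) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = [
        max(range(len(P[s])), key=lambda a, s=s: q_value(P, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical chain MDP: P[state][action] = [(probability, next_state, reward)].
# Action 0 moves left, action 1 moves right; state 2 is absorbing (assumption).
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 0.0)]],  # state 0
    [[(1.0, 0, 0.0)], [(1.0, 2, 1.0)]],  # state 1: moving right earns reward 1
    [[(1.0, 2, 0.0)], [(1.0, 2, 0.0)]],  # state 2: absorbing, no reward
]

V, policy = value_iteration(P)
print(V, policy)
```

In this toy chain the greedy policy moves right from states 0 and 1, since the only reward is collected on entering state 2.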

Lab 2: Introduction to Chainer. You will implement deep supervised learning using Chainer, and apply it to the MNIST dataset.

Lab 3: Deep Q-Learning. You will implement the DQN algorithm and apply it to Atari games.

Lab 4: Policy Optimization Algorithms. You will implement various policy optimization algorithms, including policy gradient, natural policy gradient, trust-region policy optimization (TRPO), and asynchronous advantage actor-critic (A3C). You will apply these algorithms to classic control tasks, Atari games, and Roboschool locomotion environments.

The labs are released under an MIT license. The license file is included in the downloaded folder for each lab.