Teaching

Introductory Course on Reinforcement Learning (8 lectures - 2011/12)


Part of the Data Mining and Machine Learning (DMML) course, this 8-lecture series introduces Reinforcement Learning. It covers the basics of Markovian states, V* values, Q functions and Q-learning, based on Chapter 13 of Machine Learning by Tom M. Mitchell (1997 edition, publishers: McGraw-Hill). It also draws on material from Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (1998, MIT Press), specifically n-step TD updates, eligibility traces, function approximation, direct policy search and Actor-Critic methods.
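For readers who want a feel for Q-learning before the first lecture, the following is a minimal sketch only, not taken from the slides or the assignment. It applies the deterministic update Q(s,a) <- r + gamma * max_a' Q(s',a') to a small corridor world. The corridor, the function name q_learning and all parameter values are assumptions made up for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical deterministic corridor.

    States are 0..n_states-1 and the rightmost state is an absorbing goal.
    Actions: 0 = move left, 1 = move right (moving left from state 0 stays put).
    Reward is 1 for the step that enters the goal and 0 otherwise.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Q table initialised to zero: q[state][action]
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(goal)  # start each episode in a random non-goal state
        while s != goal:
            a = rng.randrange(2)  # purely random exploration
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Deterministic-world update: no learning rate is needed
            q[s][a] = r + gamma * max(q[s_next])
            s = s_next
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning()):
        print(state, round(left, 4), round(right, 4))
```

In this toy world the learned values for "move right" should decay by a factor of gamma for each extra step away from the goal, so the greedy policy heads straight for it. Stochastic worlds need a learning rate in the update, which this sketch deliberately leaves out.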

The RL assignment (Assignment 2) is now available here.

I strongly suggest reading Sections 13.1 to 13.4 of Machine Learning and the introductory sections (Sections 1.1-1.3) of Reinforcement Learning: An Introduction.

Copies of the 2011-12 slides (uploaded after each lecture) and some demonstration videos can be found below. For those who missed my introduction to confidence intervals, the following YouTube video may be of help: http://www.youtube.com/watch?v=Hn6C21GC0vA

  • ch13-RL-2011-lecture8.pdf   1156k - 11 Nov 2011 09:02 by Paul Crook (v1)
    ‎Introduction to RL - Lecture 8 (last) in 2011/12 series.‎
  • ch13-RL-2011-lecture7.pdf   276k - 8 Nov 2011 02:55 by Paul Crook (v1)
    ‎Introduction to RL - Lecture 7 in 2011/12 series. Includes sketch on equivalence of SARSA(lambda) and "forwards view" of averaging n-step TD updates.‎
  • ch13-RL-2011-lecture6.pdf   459k - 28 Oct 2011 13:45 by Paul Crook (v1)
    ‎Introduction to RL - Lecture 6 in 2011/12 series.‎
  • ch13-RL-2011-lecture5.pdf   74k - 14 Oct 2011 07:19 by Paul Crook (v1)
    ‎Introduction to RL - Lecture 5 in 2011/12 series.‎
  • ch13-RL-2011-lecture4.pdf   481k - 7 Oct 2011 12:48 by Paul Crook (v1)
    Introduction to RL - Lecture 4 in 2011/12 series.
  • ch13-RL-2011-lecture3.pdf   416k - 30 Sep 2011 07:08 by Paul Crook (v2)
    ‎Introduction to RL - Lecture 3 in 2011/12 series. ‎
  • ch13-RL-2011-lecture2.pdf   568k - 30 Sep 2011 07:08 by Paul Crook (v2)
    ‎Introduction to RL - Lecture 2 in 2011/12 series. ‎
  • ch13-RL-2011-lecture1.pdf   916k - 30 Sep 2011 07:08 by Paul Crook (v2)
    ‎Introduction to RL - Lecture 1 in 2011/12 series. ‎
  • MPI_Promo_RL_snippet.avi   6092k - 19 Jan 2011 08:32 by Paul Crook (v1)
    ‎RL of motor skills (Max Planck Inst. Robot Learning Lab)‎
  • RL-Baseball.mov   15443k - 19 Jan 2011 08:25 by Paul Crook (v1)
    ‎RL of motor skills (Jan Peters & Stefan Schaal)‎
  • dipper-pomdp-before-training.flv   11925k - 19 Jan 2011 08:00 by Paul Crook (v1)
    ‎Spoken dialogue management (Heriot-Watt Uni. Edinburgh)‎
  • dipper-pomdp-after-training.flv   9436k - 19 Jan 2011 07:59 by Paul Crook (v1)
    ‎Spoken dialogue management (Heriot-Watt Uni. Edinburgh)‎
  • before_learning_far_view1.flv   1494k - 19 Jan 2011 07:58 by Paul Crook (v1)
    Bipedal locomotion (ATR, Japan)
  • after_learning_far_view_metal1.flv   4139k - 19 Jan 2011 07:58 by Paul Crook (v1)
    ‎Bipedal locomotion (ATR, Japan) ‎