Algorithms for Imitation Learning

Organizer: Machine Learning and Robotics (MLR) Lab (webpage)
Lecturer: Dr. Jim Mainprice (webpage)
Examiner: Prof. Dr. Marc Toussaint

Each student will give a 20-minute presentation followed by 10 minutes of questions (3 presentations per week), starting on June 17th. Presentation dates will be uploaded on Friday, May 10th. Attendance is mandatory for all participants and will be checked. The quality of the questions asked will be part of the grade. The deadline for handing in the Python notebook is the end of the lecture period.

Introductory lectures are now held in Room 38.03 (we previously met in 0.457):

Mon Apr 8th, 17:30 : Imitation Learning (pdf)
Mon Apr 15th, 17:30 : Behavior Cloning (pdf)
Mon Apr 29th, 17:30 : Inverse Reinforcement Learning (pdf)

You are encouraged to read the following survey paper, as the seminar will largely be based on it: https://arxiv.org/abs/1811.06711

@article{osa2018algorithmic,

  title={An algorithmic perspective on imitation learning},

  author={Osa, Takayuki and Pajarinen, Joni and Neumann, Gerhard and Bagnell, J Andrew and Abbeel, Pieter and Peters, Jan},

  journal={Foundations and Trends{\textregistered} in Robotics},

  volume={7},

  number={1-2},

  pages={1--179},

  year={2018},

  publisher={Now Publishers, Inc.}

}

  Paper Assignment

       (Student)     -> PaperID

       (Erdenezul)   -> 3

       (Cihan)       -> 12

       (Jan)         -> 6

       (Ralf)        -> 5

       (Katharina)   -> 13

       (Carola)      -> 7

       (Maximilian)  -> 8

       (Domas)       -> 2

       (Peter)       -> 4

       (Josua)       -> 9

       (Marc)        -> 10

       (Jiayao)      -> 11

  Presentation Dates:

       (Student)    -> Date  : (Paper)

   1 - (Marc)       -> 17/06 : (10) Calinon-10

   2 - (Erdenezul)  -> 17/06 : (3)  Pomerleau-89

   3 - (Jan)        -> 17/06 : (6)  Paraschos-13

   4 - (Carola)     -> 24/06 : (7)  Ratliff-06

   5 - (Domas)      -> 24/06 : (2)  Doerr-15

   6 - (Maximilian) -> 24/06 : (8)  Ziebart-08

   7 - (Katharina)  -> 01/07 : (13) Syed-08

   8 - (Jiayao)     -> 01/07 : (11) Mombaur-10

   9 - (Josua)      -> 01/07 : (9)  Boularias-11

  10 - (Peter)      -> 08/07 : (4)  Schaal-98

  11 - (Ralf)       -> 08/07 : (5)  Ijspeert-02

  12 - (Cihan)      -> 08/07 : (12) Levine-11

The following papers will be discussed:

1 - @article{Englert:2017jv,

author = {Englert, Peter and Vien, Ngo Anh and Toussaint, Marc},

title = {{Inverse KKT: Learning cost functions of manipulation tasks from demonstrations}},

journal = {The International Journal of Robotics Research},

year = {2017},

exercise = {Implement an unconstrained version of the algorithm on a synthetic or real dataset}

}

2 - @article{Doerr2015,

author = {Doerr, Andreas and Ratliff, Nathan D and Bohg, Jeannette and Toussaint, Marc and Schaal, Stefan},

title = {{Direct Loss Minimization Inverse Optimal Control}},

year = {2015},

exercise = {Use an implementation of CMA-ES to solve a toy problem in a grid world}

}

3 - @inproceedings{pomerleau1989alvinn,

  title={Alvinn: An autonomous land vehicle in a neural network},

  author={Pomerleau, Dean A},

  booktitle={Advances in neural information processing systems},

  year={1989},

  exercise = {Implement a version of the network in TensorFlow and train it on synthetic or real data}

}

4 - @article{schaal1998constructive,

  title={Constructive incremental learning from only local information},

  author={Schaal, Stefan and Atkeson, Christopher G},

  journal={Neural computation},

  year={1998},

  exercise = {Implement a version of the algorithm using the OpenAI Gym pole balancing environment}

}

5 - @inproceedings{ijspeert2002movement,

  title={Movement imitation with nonlinear dynamical systems in humanoid robots},

  author={Ijspeert, Auke Jan and Nakanishi, Jun and Schaal, Stefan},

  booktitle={Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292)},

  year={2002},

  exercise = {Implement a version of the algorithm using the OpenAI Gym pole balancing environment}

}

6 - @inproceedings{paraschos2013probabilistic,

  title={Probabilistic movement primitives},

  author={Paraschos, Alexandros and Daniel, Christian and Peters, Jan R and Neumann, Gerhard},

  booktitle={Advances in neural information processing systems},

  year={2013},

  exercise = {Implement a version of the algorithm using the OpenAI Gym pole balancing environment}

}

7 - @inproceedings{ratliff2006maximum,

  title={Maximum margin planning},

  author={Ratliff, Nathan D and Bagnell, J Andrew and Zinkevich, Martin A},

  booktitle={Proceedings of the 23rd international conference on Machine learning},

  year={2006},

  exercise = {Implement a version of the algorithm on a grid world}

}

8 - @inproceedings{ziebart2008maximum,

  title={Maximum entropy inverse reinforcement learning},

  author={Ziebart, Brian D and Maas, Andrew L and Bagnell, J Andrew and Dey, Anind K},

  year={2008},

  exercise = {Implement a version of the algorithm on a grid world}

}

9 - @article{Boularias:2011wp,

author = {Boularias, A and Kober, J and Peters, J R},

title = {{Relative entropy inverse reinforcement learning}},

journal = {International {\ldots}},

year = {2011},

exercise = {Implement a version of the algorithm on a grid world}

}

10 - @article{calinon2010learning,

  title={Learning and reproduction of gestures by imitation},

  author={Calinon, Sylvain and D'halluin, Florent and Sauser, Eric L and Caldwell, Darwin G and Billard, Aude G},

  journal={IEEE Robotics \& Automation Magazine},

  year={2010},

  exercise = {Implement a version of the algorithm on a synthetic or real dataset}

}

11 - @article{Mombaur:2010hg,

author = {Mombaur, Katja and Truong, Anh and Laumond, Jean-Paul},

title = {{From human to humanoid locomotion an inverse optimal control approach}},

journal = {Autonomous Robots},

year = {2010},

exercise = {Implement a version of the algorithm on a synthetic or real dataset}

}

12 - @article{Levine:2011,

author = {Levine, Sergey},

title = {{Nonlinear Inverse Reinforcement Learning with Gaussian Processes}},

year = {2011},

exercise = {Implement a version of the algorithm on a grid world}

}

13 - @book{Syed:2008fq,

author = {Syed, Umar and Bowling, Michael and Schapire, Robert E},

title = {{Apprenticeship learning using linear programming}},

publisher = {ACM},

year = {2008},

exercise = {Implement a version of the algorithm on a grid world}

}
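
Several of the exercises above ask for an implementation on a grid world. As a possible starting point, the following is a minimal sketch of a deterministic grid world with value iteration in plain Python. It is not part of the seminar material: the function names (step, value_iteration, greedy_policy, rollout), the convention that the reward is collected on entering a cell, the absence of terminal states, and the discount factor of 0.9 are all assumptions chosen for illustration.

```python
"""Illustrative grid world sketch; all names and conventions are assumptions."""
from typing import Dict, List, Tuple

State = Tuple[int, int]  # (row, column)

# Four deterministic moves; moving into a wall leaves the state unchanged.
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def step(state: State, action: str, n_rows: int, n_cols: int) -> State:
    """Apply an action and clip the result to the grid boundaries."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), n_rows - 1)
    col = min(max(state[1] + d_col, 0), n_cols - 1)
    return (row, col)


def value_iteration(reward: List[List[float]], gamma: float = 0.9,
                    tol: float = 1e-8) -> List[List[float]]:
    """Synchronous value iteration; reward[r][c] is paid on entering (r, c)."""
    n_rows, n_cols = len(reward), len(reward[0])
    values = [[0.0] * n_cols for _ in range(n_rows)]
    while True:
        new_values = [[0.0] * n_cols for _ in range(n_rows)]
        for r in range(n_rows):
            for c in range(n_cols):
                successors = [step((r, c), a, n_rows, n_cols) for a in ACTIONS]
                new_values[r][c] = max(
                    reward[nr][nc] + gamma * values[nr][nc]
                    for nr, nc in successors
                )
        # Largest change over all cells in this sweep.
        delta = max(
            abs(new_values[r][c] - values[r][c])
            for r in range(n_rows) for c in range(n_cols)
        )
        values = new_values
        if delta < tol:
            return values


def greedy_policy(reward: List[List[float]], values: List[List[float]],
                  gamma: float = 0.9) -> Dict[State, str]:
    """Pick, in every cell, the action with the highest one-step lookahead."""
    n_rows, n_cols = len(reward), len(reward[0])
    policy: Dict[State, str] = {}
    for r in range(n_rows):
        for c in range(n_cols):
            def q_value(action: str, state: State = (r, c)) -> float:
                nr, nc = step(state, action, n_rows, n_cols)
                return reward[nr][nc] + gamma * values[nr][nc]
            policy[(r, c)] = max(ACTIONS, key=q_value)
    return policy


def rollout(policy: Dict[State, str], start: State, horizon: int,
            n_rows: int, n_cols: int) -> List[State]:
    """Follow the policy for `horizon` steps and return the visited states."""
    trajectory = [start]
    for _ in range(horizon):
        current = trajectory[-1]
        trajectory.append(step(current, policy[current], n_rows, n_cols))
    return trajectory


# Hypothetical 3x3 example with a single rewarding cell in the bottom-right corner.
reward = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 1.0]]
values = value_iteration(reward)
policy = greedy_policy(reward, values)
demo = rollout(policy, (0, 0), horizon=4, n_rows=3, n_cols=3)
print(demo)
```

Rollouts generated this way could serve as synthetic demonstrations for the grid-world exercises, with the reward table playing the role of the unknown quantity to be recovered.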