Machine Learning Reading Group at UArizona
Welcome to our Machine Learning Reading Group (MLRG) at the University of Arizona, organized by Kwang-Sung Jun, Jason Pacheco, and Chicheng Zhang. We have been meeting weekly since Fall 2019. The goal is to pick a specific topic to focus on each semester, read papers together, and hopefully apply them to your research, collaborate, and write papers together. By the end of the semester, we will at least understand the fundamental concepts and challenges of the topic, what people have proposed so far, and what open problems are out there!
mailing list: https://list.arizona.edu/sympa/info/mlrg
Fall 2020: Reinforcement learning and its friends
Time: 4:30pm-6pm Thursdays
Location: zoom (see the mailing list for the address)
We will cover topics in reinforcement learning (RL) and related areas, including imitation learning, apprenticeship learning, and inverse RL.
09/24/2020
The first meeting of the semester.
discussions
Cheat sheet summarizing various RL settings and algorithms
A common place for code snippets and cheat sheets: a GitHub repo? or a Google Drive folder?
A GitHub repo can have a landing page -- this could replace our Google Sites page and also serve as the one place that has everything.
Google Drive might be better for binaries.
RL frameworks: from OpenAI and UC Berkeley (a minimal usage sketch follows this list)
adaptive experiment design, dynamic treatment
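To make the framework discussion concrete, here is a minimal sketch of the standard agent-environment loop using the classic OpenAI Gym API; the environment name and the random policy are placeholders, and note that later Gym versions (0.26+) changed the reset/step signatures.

```python
# Minimal agent-environment loop with the classic OpenAI Gym API (pre-0.26).
# "CartPole-v1" and the random policy are placeholders for illustration.
import gym

env = gym.make("CartPole-v1")
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()          # placeholder: uniformly random policy
    obs, reward, done, info = env.step(action)  # classic 4-tuple step API
    total_reward += reward
env.close()
print("episode return:", total_reward)
```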
Fall 2019 - Spring 2020: Imitation Learning
Time: 3-4:15pm Fridays
Location: Gould-Simpson 856
To get started on imitation learning, we can read Hal Daumé III's book chapter and the MDP basics (e.g., Sutton and Barto, Chapter 3). This ICML workshop has many useful resources: https://sites.google.com/view/icml2018-imitation-learning/
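For those brushing up on MDP basics, the following is a minimal value iteration sketch on a toy two-state, two-action MDP; all the transition and reward numbers are made up for illustration. It repeatedly applies the Bellman optimality backup until the values converge, then reads off the greedy policy.

```python
# Value iteration on a toy 2-state, 2-action MDP; the numbers are illustrative only.
import numpy as np

gamma = 0.9
# P[a, s, t] = Pr(next state t | state s, action a); R[s, a] = expected reward.
P = np.array([[[0.8, 0.2],
               [0.1, 0.9]],
              [[0.5, 0.5],
               [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(2)
for _ in range(500):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_t P(t|s,a) * V(t)
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new
print("optimal values:", V)
print("greedy policy (action per state):", Q.argmax(axis=1))
```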
Here is a list of papers on imitation learning that Chicheng collected; you are welcome to find more papers that interest you (both theory and applications) and send them to us.
Interactive imitation learning:
Hal Daumé III, John Langford, Daniel Marcu. Search-based Structured Prediction. Machine Learning Journal 2009.
Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. AISTATS 2011. (A schematic sketch of the DAgger loop from this paper appears after this list.)
Stephane Ross, J. Andrew Bagnell. Reinforcement and Imitation Learning via Interactive No-Regret Learning. NIPS 2014.
Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell. Deeply AggreVaTeD: differentiable imitation learning for sequential prediction. ICML 2017.
Ching-An Cheng and Byron Boots. Convergence of Value Aggregation for Imitation Learning. AISTATS 2018.
Wen Sun, Anirudh Vemula, Byron Boots, J. Andrew Bagnell. Provably Efficient Imitation Learning from Observation Alone. ICML 2019.
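To preview the interactive flavor of this line of work, here is a schematic sketch of the DAgger loop from Ross et al. (2011). The one-dimensional task, the expert, and the threshold learner below are invented purely for illustration, and the expert-mixing parameter is fixed to β = 1 at iteration 0 and β = 0 afterwards, a common practical choice.

```python
# Schematic DAgger loop (Ross et al., 2011) on a toy 1-D task; everything
# below (dynamics, expert, threshold learner) is a stand-in for illustration.
import numpy as np

rng = np.random.default_rng(0)

def expert_action(s):
    return 1 if s < 0 else 0  # toy expert: push right when left of the origin

def rollout(policy, horizon=30):
    s, visited = rng.normal(), []
    for _ in range(horizon):
        visited.append(s)
        s += (0.5 if policy(s) == 1 else -0.5) + 0.1 * rng.normal()
    return visited

def fit_policy(states, actions):
    # stand-in supervised learner: threshold at the midpoint of the class means;
    # a real implementation would fit a proper classifier to (state, label) pairs
    s, a = np.array(states), np.array(actions)
    thr = 0.5 * (s[a == 1].mean() + s[a == 0].mean())
    return lambda x: 1 if x < thr else 0

# DAgger: roll out the current policy, have the expert label the visited
# states, aggregate everything into one dataset, and retrain on it all.
D_states, D_actions = [], []
policy = expert_action  # iteration 0 rolls out the expert (beta = 1)
for _ in range(5):
    visited = rollout(policy)
    D_states += visited
    D_actions += [expert_action(s) for s in visited]  # expert labels learner's states
    policy = fit_policy(D_states, D_actions)
print("trained on", len(D_states), "expert-labeled states")
```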
Inverse reinforcement learning:
Andrew Y. Ng and Stuart Russell. Algorithms for Inverse Reinforcement Learning. ICML 2000.
Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, Anind K. Dey. Maximum Entropy Inverse Reinforcement Learning. AAAI 2008. (Summarized briefly after this list.)
Brian D. Ziebart, J. Andrew Bagnell, Anind K. Dey. Modeling Interaction via the Principle of Maximum Causal Entropy. ICML 2010.
Jonathan Ho, Stefano Ermon. Generative Adversarial Imitation Learning. NIPS 2016.
Kareem Amin, Nan Jiang, Satinder Singh. Repeated Inverse Reinforcement Learning. NIPS 2017.
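As a one-glance summary of the Ziebart et al. (2008) entry above (an informal restatement in the deterministic-dynamics form, not the paper's exact notation): trajectories are exponentially more likely the higher their cumulative reward, and the log-likelihood gradient matches feature expectations. Here f(τ) is the cumulative feature vector of trajectory τ and f̃ is the empirical feature expectation of the demonstrations.

```latex
% MaxEnt IRL (Ziebart et al., 2008), informal restatement:
P(\tau \mid \theta) = \frac{\exp\!\big(\theta^\top f(\tau)\big)}{Z(\theta)},
\qquad
\nabla_\theta \log L(\theta) = \tilde{f} \;-\; \mathbb{E}_{\tau \sim P(\cdot \mid \theta)}\big[f(\tau)\big].
```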
Apprenticeship learning:
Pieter Abbeel and Andrew Y. Ng. Apprenticeship Learning via Inverse Reinforcement Learning. ICML 2004. (The core idea is restated after this list.)
Umar Syed and Robert E. Schapire. A Game-Theoretic Approach to Apprenticeship Learning. NIPS 2007.
Umar Syed, Michael H. Bowling, Robert E. Schapire. Apprenticeship Learning Using Linear Programming. ICML 2008.
Alekh Agarwal, Ashwinkumar Badanidiyuru, Miroslav Dudík, Robert E. Schapire, Aleksandrs Slivkins. Robust Multi-objective Learning with Mentor Feedback. COLT 2014.
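The idea common to the apprenticeship papers above, restated informally in our own notation (not the papers'): if the reward is linear in known features, then matching the expert's discounted feature expectations bounds the performance gap, by Cauchy-Schwarz.

```latex
% Feature-expectation matching (informal; cf. Abbeel & Ng, 2004):
% assume R(s) = w^\top \phi(s) with \|w\|_2 \le 1.
\mu(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t \ge 0} \gamma^t \phi(s_t)\Big],
\qquad
\big|V(\pi) - V(\pi_E)\big|
  = \big|w^\top \big(\mu(\pi) - \mu(\pi_E)\big)\big|
  \le \big\|\mu(\pi) - \mu(\pi_E)\big\|_2 .
```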
Schedule
(confirmed future speakers: Tianchi, Kwang (online learning), Jason & Reza (probabilistic interpretation of RL), Chicheng (after June), Helen's group or people in the Stat/Math GIDP)
05/15/2020
Kwang will present online learning and its relation to imitation learning, part 2.
05/08/2020
Kwang presented online learning and its relation to imitation learning, part 1.
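For those who missed part 1, here is a minimal online gradient descent sketch on a stream of toy quadratic losses (the loss sequence, step size, and comparator are illustrative choices); no-regret learners of exactly this kind are what the imitation learning reductions we have been reading plug in.

```python
# Online gradient descent on toy quadratic losses f_t(x) = (x - z_t)^2.
# The loss sequence and step size are illustrative; the point is the
# no-regret behavior that DAgger-style reductions rely on.
import numpy as np

rng = np.random.default_rng(0)
T = 1000
z = rng.normal(size=T)              # the adversary's sequence (here just random)
x, total_loss = 0.0, 0.0
for t in range(1, T + 1):
    total_loss += (x - z[t - 1]) ** 2
    grad = 2.0 * (x - z[t - 1])     # gradient of the revealed loss at our play
    x -= grad / (2.0 * np.sqrt(t))  # decaying step size eta_t = 1 / (2 sqrt(t))
best_fixed = z.mean()               # best fixed action in hindsight (squared loss)
regret = total_loss - ((best_fixed - z) ** 2).sum()
print("average regret:", regret / T)  # should shrink toward 0 as T grows
```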
05/01/2020
Tianchi presented Sun et al., Provably Efficient Imitation Learning from Observation Alone.
04/24/2020
Priyamvadha discussed Cheng & Boots, 2017.
04/10/2020 (no meeting)
Skipped and attended Mihai's talk.
04/03/2020
Reyan discussed policy gradient methods based on these slides (see also the video series).
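For reference, here is a minimal REINFORCE-style policy gradient sketch on a toy two-armed bandit; the bandit, the softmax parameterization, and the learning rate are illustrative choices, not taken from the slides.

```python
# REINFORCE with a softmax policy on a toy 2-armed bandit (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])  # arm 1 pays more; the agent must discover this
theta = np.zeros(2)                # policy logits
lr = 0.1

for _ in range(2000):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()                # softmax policy pi(a) over the two arms
    a = rng.choice(2, p=probs)
    r = rng.normal(true_means[a], 0.1)  # stochastic reward from the chosen arm
    grad_log = -probs
    grad_log[a] += 1.0                  # gradient of log pi(a) wrt the logits
    theta += lr * r * grad_log          # REINFORCE update: r * grad log pi(a)
print("learned action probabilities:", probs)
```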
03/27/2020
Ryn discussed the Deeply AggreVaTeD paper.
03/06/2020
Clayton Morrison gave a lecture on the reinforcement learning basics (2/2).
02/28/2020
Clayton Morrison gave a lecture on the reinforcement learning basics (1/2).
02/21/2020
Stephane Ross, J. Andrew Bagnell. Reinforcement and Imitation Learning via Interactive No-Regret Learning. NIPS 2014.
Presenter: Jason Pacheco
02/14/2020
Iqbal did not make it today. Reza covered Ross'11 very closely.
02/07/2020
Iqbal covered Ross'11. However, there was a lingering question about how its theoretical guarantee works.
Next time, Iqbal will discuss his research problem, and then Reza will lead the discussion of the theoretical guarantees of Ross'11.
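For next time, the contrast at the heart of Ross'11's analysis, restated informally (ε is the supervised classification error, T the horizon, and u a bound on how much a single mistake can increase future cost; see the paper for the precise statements):

```latex
% Informal restatement of the guarantees discussed in Ross et al. (2011):
% behavior cloning lets errors compound quadratically in the horizon T,
J_{\text{BC}}(\hat{\pi}) \le J(\pi^\star) + T^2 \varepsilon ,
% while DAgger, via the no-regret reduction, keeps the dependence linear:
J_{\text{DAgger}}(\hat{\pi}) \le J(\pi^\star) + u\,T\,\varepsilon_N + O(1).
```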
01/31/2020
Iqbal covered Ross'11 up to section 2.
12/06/2019
Chicheng finished the discussion of the basic MDP, covering how to find the optimal policy.
Iqbal presented the setup of imitation learning (Ross et al., 2011)
This is the last meeting of the semester.
11/22/2019
Chicheng led the discussion on the basic MDP, based on Sutton & Barto book chapter 3.
11/15/2019 (no meeting)
11/08/2019
From now on, we meet 9am-10:15am at GS 906.
Continued watching the tutorial video by Yisong Yue and had discussions along the way; we watched up to 1h 10m 26s.
11/01/2019
As an easy start, we watched the tutorial video by Yisong Yue and had discussions along the way.
We watched up to 32m 29s.
10/25/2019
The first meeting.
Possible topics discussed: (1) Monte Carlo tree search (2) imitation learning (3) sequential information maximization (4) Bayesian sparse structural learning (5) Bayesian optimization (6) Bayesian deep learning
We eventually chose imitation learning by voting.
Homework: read the tutorial by Daumé III.