Reinforcement Learning
Channel: #reinforcement-learning
Co-leads:
Thang - @Thang on Discord, @ThangChu77 on Twitter
Rahul - @RaNa on Discord
Gusti - @Gusti Winata on Discord
Rassem - @Rassem on Discord
Goal:
We will start by building a robust foundation in Reinforcement Learning, studying one chapter of Sutton and Barto per week.
We aim to cover Reinforcement Learning at every level, from introductory to advanced, from both theoretical and practical perspectives.
We want to create a space for constructive and engaging discussions, with the aim of deepening our understanding of Reinforcement Learning.
Logistics:
Anyone can present!
Prerequisite: Preferably read the material at least one week in advance, but we will cover the prerequisites at the beginning.
Meeting: Introduction (10 mins), Main material (30-40 mins), Discussions (10-15 mins)
Occurrences: Bi-weekly, Saturdays 8:30 PM (GMT+7)
Recordings of Recent Sessions
May 18, 2024
May 11, 2024
April 13, 2024
April 6, 2024
Thang and Rassem lead a discussion on "Policy Gradient Methods for Reinforcement Learning with Function Approximation" (https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf)
RL Foundations Study Group - "Lecture 10: RL in games"
RL Foundations Study Group - "Lecture 9: Exploration and Exploitation" (part 2)
Akifumi Wachi presents "Safe RL"
RL Foundations Study Group - "Lecture 9: Exploration and Exploitation" (part 1)
RL Foundations Study Group - "Lecture 7: Policy Gradient"
RL Foundations Study Group - "Lecture 6: Value Function Approximation"
Costa Huang, Machine Learning Engineer at Hugging Face, presents "Cleanba: A Reproducible Distributed Deep Reinforcement Learning Platform"
RL Foundations Study Group - "Lecture 3: Model Free Control"
Max Schwarzer discusses "Sample-Efficient RL Through Scaling", presenting his work on two papers: "Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier" and its successor, "Bigger, Better, Faster: Human-Level Atari with Human-Level Efficiency".
RL Foundations Study Group - "Lecture 2: Dynamic Programming"
RL Foundations Study Group - "Lecture 1: Introduction to RL and MDP"
RL Learning Resources
General:
Credit Assignment:
https://uwixdsall4.joplinusercontent.com/shares/bzk5dLJN3VlH82TWc2woa3
On Learning Rewards:
Paper: On the Expressivity of Markov Reward
Reward is enough for convex MDPs https://arxiv.org/abs/2106.00661
Learning Resources:
https://rltheorybook.github.io/
https://simons.berkeley.edu/programs/DataDriven2022
https://onlinecourses.nptel.ac.in/noc19_cs55/preview
Course by Pascal Poupart
Key Papers in Deep RL:
https://spinningup.openai.com/en/latest/spinningup/keypapers.html