Reinforcement Learning

Channel: #reinforcement-learning

Co-leads:

Thang - @Thang on Discord, @ThangChu77 on Twitter
Rahul - @RaNa on Discord
Gusti - @Gusti Winata on Discord
Rassem - @Rassem on Discord

Goal:

We will start by building a robust foundation in Reinforcement Learning, studying one chapter of Sutton and Barto per week.
We aim to cover Reinforcement Learning from introductory to advanced levels, from both theoretical and practical perspectives.
We want to create a space for constructive, engaging discussions that deepen our understanding of Reinforcement Learning.

Logistics:

Anyone can present!
Prerequisite: Preferably read the material at least one week ahead, but we will cover the prerequisites at the beginning.
Meeting: Introduction (10 mins), Main material (30-40 mins), Discussions (10-15 mins)

Occurrences: Bi-weekly, Saturdays 8:30 PM (GMT+7)

Recordings of Recent Sessions

C4AI - Reinforcement Learning Group (2024-07-13 15:37 GMT+1)

July 13, 2024

C4AI - Reinforcement Learning Group (2024-06-22 07:34 GMT-7)

June 22, 2024

C4AI - Reinforcement Learning Group (2024-06-08 07:35 GMT-7)

June 8, 2024

C4AI - Reinforcement Learning Group (2024-05-18 07:35 GMT-7)

May 18, 2024

C4AI - Reinforcement Learning Group (2024-05-11 07:41 GMT-7)

May 11, 2024

C4AI - Reinforcement Learning Group (2024-04-13 07:36 GMT-7)

April 13, 2024

C4AI - Reinforcement Learning Group (2024-04-06 07:36 GMT-7)

April 6, 2024

C4AI - Reinforcement Learning Group (2024-03-30 05:53 GMT-7)

Thang and Rassem lead a discussion on "Policy Gradient Methods for Reinforcement Learning with Function Approximation" (https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf)

C4AI - Reinforcement Learning Group (2024-02-24 05:48 GMT-8)

RL Foundations Study Group - "Lecture 10: RL in games"

C4AI - Reinforcement Learning Group (2024-02-17 05:38 GMT-8)

RL Foundations Study Group - "Lecture 9: Exploration and Exploitation" (part 2)

Akifumi Wachi presents "Safe RL" (Reinforcement Learning) (2024-02-13 00:06 GMT-8)

Akifumi Wachi presents "Safe RL"

C4AI - Reinforcement Learning Group (2024-02-10 05:40 GMT-8)

RL Foundations Study Group - "Lecture 9: Exploration and Exploitation" (part 1)

C4AI - Reinforcement Learning Group (2023-11-25 05:40 GMT-8)

RL Foundations Study Group - "Lecture 7: Policy Gradient"

stc-iaig-prb (2023-10-08 12:36 GMT-7)

RL Foundations Study Group - "Lecture 6: Value Function Approximation"

Costa Huang - Cleanba: A Reproducible Distributed Deep Reinforcement Learning Platform (RL Group) (2023-10-02 11:31 GMT-7)

Costa Huang, Machine Learning Engineer at Hugging Face, presents "Cleanba: A Reproducible Distributed Deep Reinforcement Learning Platform"

stc-iaig-prb (2023-09-24 20:34 GMT+1)

RL Foundations Study Group - "Lecture 3: Model Free Control"

Max Schwarzer - Sample-Efficient RL Through Scaling (RL Group) (2023-09-15 13:01 GMT-7)

Max Schwarzer discusses "Sample-Efficient RL Through Scaling", presenting his work on two papers: "Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier" and its successor, "Bigger, Better, Faster: Human-Level Atari with Human-Level Efficiency".

C4AI - Reinforcement Learning Group (2023-08-10 07:35 GMT-4)

RL Foundations Study Group - "Lecture 2: Dynamic Programming"

C4AI - Reinforcement Learning Group (2023-07-27 07:37 GMT-4)

RL Foundations Study Group - "Lecture 1: Introduction to RL and MDP"

RL Learning Resources

General:

Building machines that learn and think like people https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/article/building-machines-that-learn-and-think-like-people/A9535B1D745A0377E16C590E14B94993

Credit Assignment:

https://uwixdsall4.joplinusercontent.com/shares/bzk5dLJN3VlH82TWc2woa3

On Learning Rewards:

Paper: On the Expressivity of Markov Reward

Reward is enough for convex MDPs https://arxiv.org/abs/2106.00661