SIG of Reinforcement Learning

"AWAC: Accelerating Online Reinforcement Learning with Offline Datasets" [slides]

Nair, Ashvin, et al. "AWAC: Accelerating online reinforcement learning with offline datasets." arXiv preprint arXiv:2006.09359 (2020).
Junseok Lee, 2026-02-23
Takeaway
- AWAC enables stable and data-efficient online fine-tuning by combining off-policy critic learning with advantage-weighted policy updates.
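
The advantage-weighted policy update mentioned in the takeaway can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the temperature `lam`, and the weight clipping `max_weight` are assumptions made for this example. The idea is that each dataset action's log-likelihood is weighted by exp(advantage / lam), so actions the critic rates above the state's baseline value are imitated more strongly.

```python
import numpy as np

def awac_weights(q_sa, v_s, lam=1.0, max_weight=20.0):
    """Exponentiated-advantage weights for a batch of (state, action) pairs.

    q_sa: critic estimates Q(s, a) for the actions stored in the batch.
    v_s:  baseline values V(s) for the same states.
    lam and max_weight are hypothetical hyperparameters; the clipping is an
    assumption added here for numerical safety.
    """
    advantage = np.asarray(q_sa, dtype=float) - np.asarray(v_s, dtype=float)
    return np.minimum(np.exp(advantage / lam), max_weight)

def awac_policy_loss(log_prob, q_sa, v_s, lam=1.0, max_weight=20.0):
    """Weighted negative log-likelihood of the batch actions under the policy."""
    weights = awac_weights(q_sa, v_s, lam, max_weight)
    return -np.mean(weights * np.asarray(log_prob, dtype=float))

# Usage: two transitions, the first with a positive advantage.
loss = awac_policy_loss(log_prob=[-1.0, -2.0], q_sa=[1.0, 0.0], v_s=[0.0, 0.0])
```

In a full training loop the weights would be treated as constants (no gradient flows through the critic), and the critic itself would be trained separately with an off-policy temporal-difference loss.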

"Off-Policy Deep Reinforcement Learning without Exploration" [slides]

Fujimoto, Scott, David Meger, and Doina Precup. "Off-policy deep reinforcement learning without exploration." International Conference on Machine Learning. PMLR, 2019.
Junseok Lee, 2026-01-28
Takeaway
- BCQ stabilizes offline Q-learning by restricting policy actions to the support of the behavior dataset.
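
The support restriction in the takeaway can be illustrated with a simplified discrete-action sketch. This is not the paper's method as such: the paper handles continuous actions with a learned generative model, whereas the example below assumes a small discrete action set, a given estimate of the behavior policy's action probabilities, and a hypothetical threshold `tau`. It only shows the core idea that the greedy action is chosen among actions the dataset plausibly contains.

```python
import numpy as np

def constrained_greedy_action(q_values, behavior_probs, tau=0.3):
    """Pick the highest-Q action among actions the behavior data supports.

    q_values:       Q(s, a) for every discrete action in state s.
    behavior_probs: estimated probability that the dataset's policy takes
                    each action in s (an assumed input for this sketch).
    tau:            hypothetical threshold; an action is allowed when its
                    probability relative to the most likely action exceeds tau.
    """
    q_values = np.asarray(q_values, dtype=float)
    behavior_probs = np.asarray(behavior_probs, dtype=float)
    allowed = behavior_probs / behavior_probs.max() > tau
    # Unsupported actions get -inf so argmax can never select them.
    masked_q = np.where(allowed, q_values, -np.inf)
    return int(np.argmax(masked_q))

# Usage: action 1 has the highest Q-value but is almost absent from the data.
action = constrained_greedy_action([1.0, 5.0, 2.0], [0.60, 0.01, 0.39])
```

Unconstrained Q-learning would pick action 1 here, relying on a value estimate that the dataset barely supports; the constrained rule falls back to the best well-supported action instead.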

"CS234 Reinforcement Learning" [link]

Taught by Prof. Emma Brunskill
Stanford University (US), Winter 2019.

"Lecture notes for Reinforcement Learning" [link]

Written by Prof. Haim Permuter
Ben-Gurion University (Israel)
