Reinforcement Learning for Processing Networks seminar
We meet in Room 502, Dao Yuan Building, from 4:00pm to 6:00pm on Mondays.
This week:
Schedule:
Previous presentations:
Title: Algorithms for exploration in reinforcement learning
Discussion leader: Xiuyuan (Lucy) Lu
Date: 08/06/2018
References:
- Jaksch, Ortner, Auer (2010). "Near-optimal Regret Bounds for Reinforcement Learning". Journal of Machine Learning Research 11, pp. 1563-1600. Link
- Osband, Russo, Van Roy (2013). "(More) Efficient Reinforcement Learning via Posterior Sampling". Link
- Osband, Van Roy (2017). "Why is Posterior Sampling Better than Optimism for Reinforcement Learning?". Proceedings of the 34th International Conference on Machine Learning. Link
- Osband, Van Roy, Russo, Wen (2018). "Deep Exploration via Randomized Value Functions". Link
presentation.pdf
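The Osband, Russo, and Van Roy references above all build on posterior sampling for reinforcement learning (PSRL): at the start of each episode, sample an MDP from the posterior over models, then act greedily with respect to the sampled model. The following is a minimal sketch of that episode loop; the Dirichlet transition prior, the crude Gaussian reward posterior, and the array shapes are illustrative assumptions, not the papers' exact setup.

import numpy as np

def psrl_episode(counts, reward_sums, S, A, H, rng):
    """Sample an MDP from the posterior and solve it for a greedy policy.
    counts: (S, A, S) visit counts; reward_sums: (S, A) summed rewards."""
    # Transition posterior: Dirichlet(1 + visit counts) for each (s, a) pair.
    P = np.array([[rng.dirichlet(1 + counts[s, a]) for a in range(A)]
                  for s in range(S)])                    # shape (S, A, S)
    # Reward posterior: crude Gaussian around the empirical mean reward.
    n = counts.sum(axis=2)                               # visits to each (s, a)
    r_hat = reward_sums / np.maximum(n, 1)
    R = rng.normal(r_hat, 1.0 / np.sqrt(n + 1))          # shape (S, A)
    # Solve the sampled MDP by finite-horizon backward induction.
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V                                    # (S, A) action values
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

Committing to one sampled model for a whole episode, rather than resampling every step, is what produces the deep exploration emphasized in the 2018 randomized value functions paper.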
Title: An Information-Theoretic Analysis of Thompson Sampling
Discussion leaders: Xiuyuan (Lucy) Lu and Vikranth Dwaracherla
Date: 07/23/2018
References:
- Russo, Van Roy (2016). "An Information-Theoretic Analysis of Thompson Sampling". Journal of Machine Learning Research 17(68), pp. 1–30. Link
information-theoretic-analysis.pdf
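The central object of this paper is the information ratio: the squared expected one-step regret divided by the information that the period's action-observation pair reveals about the optimal action A*. Whenever the ratio stays below some bound, Thompson sampling's cumulative regret is controlled by the entropy of the optimal action:

\[
  \Gamma_t \;=\; \frac{\big(\mathbb{E}_t[\mathrm{regret}_t]\big)^2}{I_t\!\big(A^{*};\,(A_t, Y_t)\big)},
  \qquad
  \Gamma_t \le \overline{\Gamma} \;\; \forall t
  \;\;\Longrightarrow\;\;
  \mathbb{E}\big[\mathrm{Regret}(T)\big] \;\le\; \sqrt{\overline{\Gamma}\, H(A^{*})\, T}.
\]

For example, the paper bounds the ratio by K/2 for the classical K-armed bandit, and H(A*) is at most log K, which recovers a regret bound of order the square root of K T log K.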
Title: Inpatient Overflow: An Approximate Dynamic Programming Approach
Discussion leader: Pengyi Shi
Date: 07/16/2018
References:
- Dai, Shi (2018). "Inpatient Overflow: An Approximate Dynamic Programming Approach". Forthcoming in Manufacturing and Service Operations Management (MSOM). Link
CUHK_readingSeminar.pdf
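At a high level, this style of approximate dynamic programming replaces the intractable value function of the patient-flow MDP with a linear combination of basis functions and fits the weights iteratively. The sketch below shows the generic fitted-value-iteration template such methods instantiate; the feature map phi, the step interface, and the discount factor are assumptions for illustration, not the paper's model.

import numpy as np

def fitted_value_iteration(states, phi, step, gamma=0.95, sweeps=50):
    """states: a sample of states; phi: state -> feature vector;
    step: state -> (immediate cost, next state) under the current policy."""
    Phi = np.array([phi(s) for s in states])          # (N, d) design matrix
    w = np.zeros(Phi.shape[1])
    for _ in range(sweeps):
        # One-step Bellman targets under the current estimate V(s) = phi(s) @ w.
        targets = np.array([c + gamma * (phi(s2) @ w)
                            for c, s2 in map(step, states)])
        # Project the targets back onto the span of the basis (least squares).
        w = np.linalg.lstsq(Phi, targets, rcond=None)[0]
    return w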
Title: A Finite Time Analysis of TD Learning with Linear Function Approximation
Discussion leader: Mark Gluzman
Date: 07/09/2018
References:
- Bhandari, Russo, Singal (2018). "A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation". https://arxiv.org/abs/1806.02450 Link
RLPN presentation3.pdf
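The algorithm analyzed in this paper is TD(0) with linear function approximation for evaluating a fixed policy. A minimal sketch follows, assuming a simple env.reset()/env.step() interface, a uniform behavior policy, and a constant step size (the paper's analysis covers this and other step-size schemes):

import numpy as np

def td0_linear(env, phi, d, alpha=0.05, gamma=0.9, steps=10_000, seed=0):
    """Evaluate the uniform policy: learn theta with V(s) = phi(s) @ theta."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(d)
    s = env.reset()
    for _ in range(steps):
        a = rng.integers(env.num_actions)        # fixed uniform behavior policy
        s_next, r, done = env.step(a)
        # TD error: one-step bootstrapped target minus the current estimate.
        target = r + (0.0 if done else gamma * (phi(s_next) @ theta))
        delta = target - phi(s) @ theta
        theta += alpha * delta * phi(s)          # semi-gradient TD(0) update
        s = env.reset() if done else s_next
    return theta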