Reinforcement Learning for Processing Networks Seminar

We meet in Room 502, Dao Yuan Building, from 4:00pm to 6:00pm on Mondays.

This week:

Schedule:

Previous presentations:

Title: Algorithms for exploration in reinforcement learning

Discussion leader: Xiuyuan (Lucy) Lu

Date: 08/06/2018

References:

  • Jaksch, Ortner, Auer (2010). “Near-optimal Regret Bounds for Reinforcement Learning”. Journal of Machine Learning Research 11, pp. 1563–1600. Link
  • Osband, Russo, Van Roy (2013). “(More) Efficient Reinforcement Learning via Posterior Sampling”. Link
  • Osband, Van Roy (2017). “Why is Posterior Sampling Better than Optimism for Reinforcement Learning?”. Proceedings of the 34th International Conference on Machine Learning. Link
  • Osband, Van Roy, Russo, Wen (2018). “Deep Exploration via Randomized Value Functions”. Link
presentation.pdf

Title: An Information-Theoretic Analysis of Thompson Sampling

Discussion leaders: Xiuyuan (Lucy) Lu and Vikranth Dwaracherla

Date: 07/23/2018

References:

  • Russo, Van Roy (2016). “An Information-Theoretic Analysis of Thompson Sampling”. Journal of Machine Learning Research 17(68), pp. 1–30. Link
information-theoretic-analysis.pdf

Title: Inpatient Overflow: An Approximate Dynamic Programming Approach

Discussion leader: Pengyi Shi

Date: 07/16/2018

References:

  • Dai, Shi (2018). “Inpatient Overflow: An Approximate Dynamic Programming Approach”. Forthcoming in Manufacturing and Service Operations Management (MSOM). Link
CUHK_readingSeminar.pdf

Title: A Finite Time Analysis of TD Learning with Linear Function Approximation

Discussion leader: Mark Gluzman

Date: 07/09/2018

References:

  • Bhandari, Russo, Singal (2018). “A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation”. https://arxiv.org/abs/1806.02450 Link
RLPN presentation3.pdf