Reinforcement Learning for Processing Networks seminar

We meet in 502 Dao Yuan Building on Mondays from 4:00pm to 6:00pm.

This week:


Previous presentations:

Title: Algorithms for exploration in reinforcement learning

Discussion leader: Xiuyuan (Lucy) Lu

Date: 08/06/2018


  • Jaksch, Ortner, Auer (2010). "Near-optimal Regret Bounds for Reinforcement Learning". Journal of Machine Learning Research 11, pp. 1563-1600. Link
  • Osband, Russo, Van Roy (2013). "(More) Efficient Reinforcement Learning via Posterior Sampling". Link
  • Osband, Van Roy (2017). "Why is Posterior Sampling Better than Optimism for Reinforcement Learning?". Proceedings of the 34th International Conference on Machine Learning. Link
  • Osband, Van Roy, Russo, Wen (2018). "Deep Exploration via Randomized Value Functions". Link

Title: An Information-Theoretic Analysis of Thompson Sampling

Discussion leaders: Xiuyuan (Lucy) Lu and Vikranth Dwaracherla

Date: 07/23/2018


  • Russo, D. and B. Van Roy. 2016. “An Information-Theoretic Analysis of Thompson Sampling”. Journal of Machine Learning Research. 17(68): 1–30. Link

Title: Inpatient Overflow: An Approximate Dynamic Programming Approach

Discussion leader: Pengyi Shi

Date: 07/16/2018


  • J. Dai, P. Shi, Inpatient Overflow: An Approximate Dynamic Programming Approach (February 20, 2018). Forthcoming, Manufacturing and Service Operations Management (MSOM). Link

Title: A Finite Time Analysis of TD Learning with Linear Function Approximation

Discussion leader: Mark Gluzman

Date: 07/09/2018


  • J. Bhandari, D. Russo, R. Singal (2018). A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation. Link
RLPN presentation3.pdf