Online and Reinforcement Learning

In Block 3 of the spring semester of 2025, Online and Reinforcement Learning (OReL) will be taught by Christian Igel, Yevgeny Seldin, and Sadegh Talebi. The course covers the theoretical foundations and algorithms of sequential decision making within the online learning and reinforcement learning frameworks.

The course assumes knowledge of linear algebra, calculus, and probability theory, as well as basic programming skills. Please check the self-assessment assignment to test whether you have the necessary qualifications for the course. If you have difficulties solving it, see our "preparing yourself for the course" page.

In the online learning part, we will cover the following basic settings, algorithms, and their analyses:

Basic concepts and definitions: notions of regret, types of feedback (full information vs. bandit)
Follow the Leader algorithm
Prediction with expert advice: the Hedge / Exponential Weights algorithm
Stochastic and adversarial multi-armed bandits: the UCB1 and EXP3 algorithms
Contextual bandits: the EXP4 algorithm
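
To give a flavor of the online learning part, here is a minimal sketch of the Hedge / Exponential Weights algorithm in the full-information setting. It is an illustration only, not course material: the function names, the toy loss sequence, and the choice of learning rate are assumptions made for this example.

```python
import math


def hedge_distribution(cumulative_losses, eta):
    """Return probabilities proportional to exp(-eta * cumulative loss)."""
    # Subtracting the minimum keeps the exponentials numerically stable
    # without changing the normalized distribution.
    best = min(cumulative_losses)
    weights = [math.exp(-eta * (c - best)) for c in cumulative_losses]
    total = sum(weights)
    return [w / total for w in weights]


def run_hedge(loss_rounds, eta):
    """Play Hedge against a sequence of loss vectors with entries in [0, 1].

    Returns (expected loss of the learner, cumulative loss of each expert).
    """
    num_experts = len(loss_rounds[0])
    cumulative = [0.0] * num_experts
    learner_loss = 0.0
    for losses in loss_rounds:
        p = hedge_distribution(cumulative, eta)
        learner_loss += sum(pi * li for pi, li in zip(p, losses))
        # Full-information feedback: every expert's loss is observed.
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return learner_loss, cumulative


# Hypothetical toy data: 3 experts, 50 rounds, expert 0 is best overall.
rounds = [[0.0, 1.0, 0.5] if t % 5 else [1.0, 0.0, 0.5] for t in range(50)]
eta = math.sqrt(2 * math.log(3) / len(rounds))  # assumed learning rate
learner_loss, expert_losses = run_hedge(rounds, eta)
regret = learner_loss - min(expert_losses)
print(f"expected loss {learner_loss:.2f}, regret {regret:.2f}")
```

With losses in [0, 1] and the learning rate eta = sqrt(2 ln K / T) used above, the standard analysis bounds the expected regret against the best of K experts over T rounds by sqrt(2 T ln K).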

In the reinforcement learning part, we will cover the following basic settings, algorithms, and their analyses:

Markov Decision Processes (MDPs)
Monte Carlo Methods for reinforcement learning
Policy evaluation and off-policy evaluation (model-based methods and Temporal Difference)
Off-policy optimization (model-based methods, Q-learning, Double Q-learning)
Deep Learning (Deep Q-Learning, Proximal Policy Optimization)
Online reinforcement learning: performance measures
Online reinforcement learning: Regret minimization in average-reward MDPs
Online reinforcement learning: PAC exploration in discounted MDPs

Practicalities

Programming Language: The working language of the course is Python. See the programming exercise in the self-assessment assignment to verify whether you are ready.

Course dates: The next round of the course will run from the 4th of February 2025 until the 27th of March 2025.

Lectures: The lectures will take place on Thursdays 9:15-12:00 and 13:15-16:00. In addition, there will be a course introduction lecture on Tuesday, 4th of February, 9:15-12:00. The lectures will be held at the University of Copenhagen (North Campus); they will also be streamed via Zoom, and video recordings will be uploaded to the internal course page. This means that it is possible to take the course fully remotely.

TA classes: We expect to have 3-4 TA groups/classes, depending on the number of registrations. Students may join any TA class they like and, if needed, attend more than one TA session. The TA sessions will be 3 hours long and will focus on going through solutions to submitted assignments as well as on help with ongoing course material. Exact details about the times of the TA classes will be provided later. We guarantee TA classes on Tuesdays (morning) and Fridays (afternoon), and there will be some other time slots to accommodate those taking other courses in the same schedule as OReL. To support remote participation, we will have one TA session over Zoom (but with no recording).

Home assignments: There will be weekly home assignments, including both theoretical and practical questions. We expect to have 6-7 assignments in total.

Assessment: There will be no final exam. The final grade will be determined by an overall assessment, e.g., as the average of the grades of the assignments, excluding the one with the lowest grade.
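
As a rough illustration of the example rule above (average of the assignment grades with the lowest one dropped), here is a small Python sketch. The function name and the numeric scores are hypothetical; the actual overall assessment is decided by the course staff and may not follow this formula.

```python
def average_without_lowest(scores):
    """Average the scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    # Sort ascending and discard the first (lowest) score.
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


# Hypothetical scores for four assignments; the lowest (60) is dropped.
print(average_without_lowest([100, 80, 60, 90]))  # → 90.0
```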

Registration

The course welcomes applications from students enrolled at other universities as well as from people in industry. It is also open to anyone interested in the foundations of reinforcement learning. All elements of the course can be followed fully remotely, so the course can in principle be taken by anyone.

Relevant registration links:

Credit Students (for those enrolled at a Danish educational institute)
Exchange Students (enrolled at a non-Danish educational institute with an exchange agreement with the University of Copenhagen)
Guest Students (enrolled at a non-Danish educational institute without an exchange agreement with the University of Copenhagen):
1. EU Students (enrolled at an EU/EEA or Swiss educational institute)
2. Non-EU Students (enrolled at an educational institute outside EU/EEA or Switzerland)
Continuing Education Applicants (for applicants from industry or individuals not enrolled at any educational institute, etc.)

Contact

In case of questions, please contact the course coordinator, Sadegh Talebi (m.shahi@di.ku.dk / mstalebi.edu@gmail.com).
