Python (scientific computing): Focus on array operations, broadcasting, and basic linear algebra with NumPy, and on plotting with Matplotlib.
Python basics: https://www.w3schools.com/python/
NumPy, SciPy, and Matplotlib: https://python-course.eu/numerical-programming/introduction-to-numpy.php
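As a rough self-check for the NumPy topics listed above, the short sketch below exercises broadcasting and basic linear algebra. The arrays and variable names (X, A, b, w) are made up for illustration and are not taken from any of the linked tutorials.

```python
import numpy as np

# Broadcasting: subtract each column's mean from a (3, 2) array.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
col_means = X.mean(axis=0)      # shape (2,)
X_centered = X - col_means      # (3, 2) minus (2,) broadcasts across rows

# Basic linear algebra: solve A w = b, then take a Euclidean norm.
A = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([2.0, 8.0])
w = np.linalg.solve(A, b)       # solution of the linear system
w_norm = np.linalg.norm(w)      # Euclidean (L2) norm of w
```

If the shapes and results here look unsurprising, the NumPy background is probably sufficient.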
PyTorch: Focus on tensors, automatic differentiation, neural network modules, and training loops.
Example PyTorch code: https://github.com/yunjey/pytorch-tutorial
PyTorch tutorials: https://docs.pytorch.org/tutorials/index.html
Linear Algebra: Focus on vectors, matrices, eigenvalues, norms, and gradients.
MIT OCW 18.06 Linear Algebra: https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/
Khan Academy Linear Algebra: https://www.khanacademy.org/math/linear-algebra
Concise lecture notes: https://drive.google.com/file/d/1avcmfGNo_WsuG_e0UhkByzGmGjyZ5EqR/view
Probability: Focus on random variables, expectation, variance, conditional probability, and common distributions.
Probability notes: https://drive.google.com/file/d/1WbwDVSIaWLm84D8t6WyQpI_ApIofbfng/view
General reinforcement learning:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
OpenAI Spinning Up in Deep Reinforcement Learning: https://spinningup.openai.com/en/latest/
Reinforcement learning taxonomy: Reinforcement learning improves behaviour from evaluative feedback
Multi-armed bandits:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto, Chapter 2
Bandit Algorithms by Tor Lattimore and Csaba Szepesvári, Chapters 1, 4-7, 11, 19 (introduction to bandits, regret, concentration inequalities, Explore-then-Commit, upper confidence bounds, EXP3, LinUCB)
Thompson Sampling
LinUCB
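To make the upper-confidence-bound idea from the bandit readings concrete, here is a minimal sketch of UCB1 on Bernoulli arms, using only the standard library. The function name ucb1, the arm means, the horizon, and the exploration bonus sqrt(2 ln t / n) are assumptions chosen for illustration. The cited chapters analyse several variants with different constants.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return the pull count of each arm.

    arm_means is a hypothetical list of success probabilities that the
    learner never sees directly; it only observes sampled 0/1 rewards.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total reward per arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull every arm once to initialise the estimates
        else:
            # Pick the arm with the largest empirical mean plus bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

pull_counts = ucb1([0.2, 0.8], horizon=2000)
```

With a large gap between the arm means, most pulls should go to the better arm, while the bonus term keeps the worse arm from being abandoned entirely.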
Model-free reinforcement learning: Q-function-based methods
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto, Chapters 3-7, 12
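As one illustration of a Q-function based method, the sketch below runs tabular Q-learning with epsilon-greedy exploration on a toy five-state chain. The environment (step, N_STATES), the hyperparameters, and the function name q_learning are all hypothetical and chosen only to keep the example small. They are not taken from the book.

```python
import random

N_STATES = 5        # states 0..4; reaching state 4 ends the episode
ACTIONS = (0, 1)    # 0 = move left, 1 = move right

def step(state, action):
    """Toy deterministic chain: reward 1 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)    # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in ACTIONS if q[state][a] == best]
                )
            next_state, reward, done = step(state, action)
            # Bootstrapped target: no future value after a terminal step.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q

q_table = q_learning()
```

After training, the greedy action in every non-terminal state should be "right", and the learned values should decay by roughly a factor of gamma per step away from the goal.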
Model-free reinforcement learning: policy gradients
Model-based reinforcement learning
Learning from demonstration
Offline RL:
Advantage Weighted Actor Critic (AWAC)
Conservative Q-learning (CQL)