Master 2 course, Probabilités et Finance, Sorbonne Université and Ecole Polytechnique
This course presents recent developments on the interplay between optimal control and machine learning, with an emphasis on the continuous-time setting and financial applications.
Part I: Foundations
Introduction to stochastic control and machine learning
Overview of stochastic control
ML meets stochastic control
Applications: finance, energy systems, robotics
Basics of reinforcement learning (RL)
Fundamental concepts: MDP, value functions, Bellman equation
Value-based and policy-based algorithms
Transition to continuous-time RL: limitations of the discrete-time framework
Continuous-time RL: a stochastic control approach
Control with randomized policies
HJB with entropy regularizer
Gibbs measure for the optimal randomized policy
Part II: Machine learning techniques for control
Neural network algorithms for PDEs and HJB equations
Deep Galerkin method and physics-informed neural networks
Deep BSDE
Deep backward dynamic programming scheme
Policy gradient methods in continuous time
Policy gradient representation
Actor/critic algorithms
Q-learning and approximations in continuous time
q-learning and Hamiltonian function
q-learning actor/critic algorithms
Part III: Generative modeling with Schrödinger bridge diffusion
Fundamentals of Schrödinger bridge (SB) and connection to optimal transport/control
Introduction to SB
Solution to SB, Schrödinger system and Sinkhorn algorithm
SB as generative model
Deep generative learning
Optimal interpolation for time series
References
Chen, Georgiou, Pavon: Stochastic control liaisons: Richard Sinkhorn meets Gaspard Monge on a Schrödinger bridge, 2021, SIAM Review, 63(2)
Germain, Pham, Warin: Neural networks-based algorithms for stochastic control and PDEs, 2023, Machine learning and data sciences for financial markets: a guide to contemporary practices, CUP, eds. A. Capponi and C-A. Lehalle.
Hambly, Xu, Yang: Recent advances in Reinforcement learning in finance, 2021, arXiv:2112.04553
Hamdouche, Henry-Labordère, Pham: Generative modeling for time series via Schrödinger bridge, 2023, arXiv:2304.05093
Jia, Zhou: q-Learning in continuous time, 2023, JMLR
Wang, Jiao, Xu, Wang, Yang: Deep generative learning via Schrödinger bridge, 2021, ICML
Wang, Zariphopoulou, Zhou: Reinforcement learning in continuous time and space: a stochastic control approach, 2020, JMLR, 21, 1-34