Master 2 Probabilités et Finance course, Sorbonne Université and École Polytechnique
Schedule 2025/2026
Lectures:
Tuesday 13 January 2026, 9:00-12:00, room 101, tower 15/25
Thursday 15 January 2026, 9:00-12:00, room 113, tower 16/26
Tuesday 20 January 2026, 9:00-12:00, room 101, tower 15/25
Tuesday 10 February 2026, 9:00-12:00, room 102, tower 15/25
Thursday 12 February 2026, 9:00-12:00, room 106, tower 14/15
Labs: (Samy Mekkaoui and Alexandre Alouadi)
Tuesday 27 January 2026, 9:00-12:00, room 102, tower 15/25
Tuesday 3 February 2026, 9:00-12:00, room 102, tower 15/25
Thursday 19 February 2026, 9:00-12:00, room 106, tower 14/15
This course presents recent developments on the interplay between optimal control and machine learning, with an emphasis on the continuous-time setting and financial applications. It is illustrated with practical lab sessions. Documents, slides, and labs are posted on the Moodle page.
Part I: Foundations
Introduction to stochastic control and machine learning
Overview of stochastic control
ML meets stochastic control
Applications: finance, energy systems, robotics
Basics of Reinforcement learning (RL)
Fundamental concepts: MDP, value functions, Bellman equation
Value-based and policy-based algorithms
Transition to continuous-time RL: limitations of the discrete-time framework
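As a pointer to the first lab topics, the Bellman optimality equation can be illustrated by value iteration on a toy MDP (the two-state example below is hypothetical, not taken from the course material):

```python
import numpy as np

# Toy 2-state, 2-action MDP (hypothetical illustrative example).
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount factor

# Value iteration: repeatedly apply the Bellman optimality operator
#   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
V = np.zeros(2)
for _ in range(500):
    Q = R + gamma * P @ V          # Q[s, a], state-action values
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)          # greedy policy extracted from Q
print(V, policy)
```

The same fixed point underlies both value-based methods (which learn Q) and policy-based methods (which parametrize the argmax directly).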
Reinforcement learning for continuous-time stochastic control
Control with randomized policies
HJB with entropy regularizer
Gibbs measure for the optimal randomized policy
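The three items above combine into one explicit formula; in the standard exploratory-control formulation (cf. Wang-Zariphopoulou-Zhou), the optimal randomized policy is a Gibbs measure built from the Hamiltonian (the notation below is generic, not taken from the course slides):

```latex
% Entropy-regularized (exploratory) control with temperature \lambda > 0.
% With Hamiltonian H(x,a,p,M) = b(x,a)\cdot p
%   + \tfrac12 \operatorname{tr}\!\big(\sigma\sigma^{\top}(x,a)\,M\big) + f(x,a),
% the exploratory HJB equation takes a log-integral (soft-max) form:
\[
\partial_t v(t,x)
  + \lambda \log \int_{A} \exp\!\Big(\tfrac{1}{\lambda}\,
      H\big(x,a,\partial_x v(t,x),\partial^2_x v(t,x)\big)\Big)\,\mathrm{d}a = 0,
\]
% and the optimal randomized policy is the Gibbs measure
\[
\pi^{*}(a \mid t,x)
  = \frac{\exp\!\big(\tfrac{1}{\lambda} H(x,a,\partial_x v,\partial^2_x v)\big)}
         {\int_{A} \exp\!\big(\tfrac{1}{\lambda} H(x,a',\partial_x v,\partial^2_x v)\big)\,\mathrm{d}a'}.
\]
```

As \(\lambda \to 0\) the Gibbs measure concentrates on the Hamiltonian's maximizers, recovering the classical (non-randomized) HJB equation.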
Part II: Machine learning techniques for control
Neural network algorithms for PDEs and HJB equations
Deep Galerkin method and physics-informed neural networks
Deep BSDE
Deep backward dynamic programming scheme
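The deep BSDE and deep backward schemes both rest on the nonlinear Feynman-Kac correspondence between semilinear PDEs and BSDEs; a compact reminder in standard notation (not taken from the course slides):

```latex
% Nonlinear Feynman--Kac: the semilinear PDE
% \partial_t u + b\cdot\partial_x u
%   + \tfrac12\operatorname{tr}\!\big(\sigma\sigma^{\top}\partial^2_x u\big)
%   + f(t,x,u,\sigma^{\top}\partial_x u) = 0, \qquad u(T,\cdot) = g,
% is associated with the BSDE
\[
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\,\mathrm{d}s
            - \int_t^T Z_s \cdot \mathrm{d}W_s,
\qquad Y_t = u(t, X_t), \quad Z_t = \sigma^{\top}(t,X_t)\,\partial_x u(t, X_t).
\]
% The deep BSDE scheme parametrizes Y_0 and the process (Z_t) by neural
% networks and minimizes the terminal loss \mathbb{E}\,|Y_T - g(X_T)|^2;
% backward dynamic programming schemes instead regress (Y, Z) step by step.
```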
Policy gradient methods in continuous time
Policy gradient representation
Actor/critic algorithms
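The score-function (policy gradient) representation can be sketched on a one-step toy problem (hypothetical Gaussian policy and quadratic reward, chosen purely for illustration; not the course's actor/critic setting):

```python
import numpy as np

# Minimal REINFORCE sketch (hypothetical toy example): actions
# a ~ N(theta, sigma^2), reward r(a) = -(a - a_star)^2, so the
# optimal policy mean is a_star.
rng = np.random.default_rng(0)
a_star, sigma = 2.0, 0.5
theta, lr, batch = 0.0, 0.05, 256

for _ in range(400):
    a = theta + sigma * rng.standard_normal(batch)   # sample actions
    r = -(a - a_star) ** 2                           # rewards
    baseline = r.mean()                              # variance-reduction baseline
    # Score-function estimator of the policy gradient:
    #   grad_theta log N(a; theta, sigma^2) = (a - theta) / sigma^2
    grad = np.mean((r - baseline) * (a - theta) / sigma**2)
    theta += lr * grad                               # gradient ascent

print(theta)   # should approach a_star = 2.0
```

Replacing the Monte Carlo return by a learned value function turns this estimator into an actor/critic scheme.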
Q-learning and approximations in continuous time
q-learning and Hamiltonian function
q-learning actor/critic algorithms
Part III: Generative modeling with Schrödinger bridge diffusion
Fundamentals of Schrödinger bridge (SB) and connection to optimal transport/control
Introduction to SB
Solution to the SB problem: the Schrödinger system and the Sinkhorn algorithm
SB as a generative model
Deep generative learning
Optimal interpolation for time series
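The Sinkhorn algorithm alternately rescales a Gibbs kernel until both marginal constraints of the Schrödinger system hold; a minimal discrete sketch (toy grids and marginals, purely illustrative):

```python
import numpy as np

# Sinkhorn iterations for entropic optimal transport between two
# discrete distributions (toy example; eps is the entropic regularization).
n, m, eps = 5, 7, 0.2
x = np.linspace(0.0, 1.0, n)
y = np.linspace(0.0, 1.0, m)
mu = np.full(n, 1.0 / n)                 # source marginal
nu = np.arange(1.0, m + 1.0)
nu /= nu.sum()                           # target marginal
C = (x[:, None] - y[None, :]) ** 2       # quadratic cost matrix
K = np.exp(-C / eps)                     # Gibbs kernel

# Alternate scalings so that diag(u) K diag(v) has marginals (mu, nu):
# this is exactly the iterative solution of the Schrödinger system.
u, v = np.ones(n), np.ones(m)
for _ in range(2000):
    u = mu / (K @ v)
    v = nu / (K.T @ u)

P = u[:, None] * K * v[None, :]          # entropic transport plan
print(P.sum(axis=1), P.sum(axis=0))      # marginals: approx. mu and nu
```

In the dynamic SB formulation the same fixed-point structure appears in function space, with the heat/diffusion semigroup playing the role of the kernel K.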
References
Chen, Georgiou, Pavon, Stochastic control liaisons: Richard Sinkhorn meets Gaspard Monge on a Schrödinger bridge, 2021, SIAM Review, 63(2).
De Bortoli, Thornton, Heng, Doucet: Diffusion Schrödinger bridge with applications to score-based generative modeling, 2021, NeurIPS
Germain, Pham, Warin: Neural networks-based algorithms for stochastic control and PDEs, 2023, Machine learning and data sciences for financial markets: a guide to contemporary practices, CUP, eds. A. Capponi and C-A. Lehalle.
Hambly, Xu, Yang: Recent advances in reinforcement learning in finance, 2021, arXiv:2112.04553
Hamdouche, Henry-Labordère, Pham: Generative modeling for time series via Schrödinger bridge, 2023, arXiv:2304.05093
Jia, Zhou: q-Learning in continuous time, 2023, JMLR
Sutton, Barto: Reinforcement learning: an introduction, 2nd edition, MIT Press, 2018.
Wang, Jiao, Xu, Wang, Yang: Deep generative learning via Schrödinger bridge, ICML, 2021.
Wang, Zariphopoulou, Zhou: Reinforcement learning in continuous time and space: a stochastic control approach, JMLR, 2020, 21, 1-34