Machine learning and stochastic control

Master 2 course, Probabilités et Finance, Sorbonne Université and Ecole Polytechnique

This course presents recent developments on the interplay between optimal control and machine learning, with emphasis on the continuous-time setting and financial applications.

Part I: Foundations

Introduction to stochastic control and machine learning

Overview of stochastic control
ML meets stochastic control
Applications: finance, energy systems, robotics

Basics of Reinforcement learning (RL)

Fundamental concepts: MDP, value functions, Bellman equation
Value-based and policy-based algorithms
Transition to continuous-time RL: limitations of the discrete-time framework

Continuous-time stochastic control and RL

Control with randomized policies
HJB with entropy regularizer
Gibbs measure for the optimal randomized policy
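
As an illustration of the three items above, here is a sketch of the entropy-regularized HJB equation and the resulting Gibbs policy. It assumes a one-dimensional, infinite-horizon discounted problem; the symbols (drift b, volatility σ, reward r, discount rate ρ, temperature λ > 0, action space A) are notation chosen for this example and are not fixed by the outline.

```latex
% Assumed setting: dX_t = b(X_t, a_t)\,dt + \sigma(X_t, a_t)\,dW_t,
% actions sampled from a randomized policy \pi(\cdot \mid x) on A.
\[
  H(x, a, p, q) = r(x, a) + b(x, a)\, p + \tfrac{1}{2}\sigma^2(x, a)\, q .
\]
% HJB equation with entropy regularizer (temperature \lambda):
\[
  \rho\, v(x) = \sup_{\pi \in \mathcal{P}(A)} \int_A
    \bigl[ H\bigl(x, a, v'(x), v''(x)\bigr) - \lambda \ln \pi(a) \bigr]\, \pi(a)\, da .
\]
% The supremum is attained by a Gibbs measure:
\[
  \pi^*(a \mid x) =
    \frac{\exp\bigl( \tfrac{1}{\lambda} H(x, a, v'(x), v''(x)) \bigr)}
         {\int_A \exp\bigl( \tfrac{1}{\lambda} H(x, a', v'(x), v''(x)) \bigr)\, da'} ,
\]
% and substituting it back gives a log-integral-exp form:
\[
  \rho\, v(x) = \lambda \ln \int_A
    \exp\bigl( \tfrac{1}{\lambda} H(x, a, v'(x), v''(x)) \bigr)\, da .
\]
```

As λ tends to 0 the Gibbs measure concentrates on the maximizers of the Hamiltonian H, which is consistent with the classical (non-randomized) HJB equation.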

Part II: Machine learning techniques for control

Neural network algorithms for PDEs and HJB equations

Deep Galerkin and physics-informed NN
Deep BSDE
Deep backward dynamic programming scheme

Policy gradient methods in continuous time

Policy gradient representation
Actor/critic algorithms

Q-learning and approximations in continuous time

q-learning and Hamiltonian function
q-learning actor/critic algorithms

Part III: Generative modeling with Schrödinger bridge diffusion

Fundamentals of Schrödinger bridge (SB) and connection to optimal transport/control

Introduction to SB
Solution to SB, Schrödinger system and Sinkhorn algorithm
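
As an illustration of the Sinkhorn algorithm mentioned above, here is a minimal sketch for the static, discrete version of the problem (entropic optimal transport between two histograms). The function name `sinkhorn`, the quadratic cost, and the values of `eps` and `n_iter` are assumptions made for this example, not part of the course material.

```python
import numpy as np

def sinkhorn(mu, nu, cost, eps=0.1, n_iter=500):
    """Alternately rescale the Gibbs kernel K = exp(-cost / eps) so that the
    coupling P = diag(u) K diag(v) has marginals mu (rows) and nu (columns)."""
    K = np.exp(-cost / eps)      # reference (prior) kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)         # enforce the row marginal mu
        v = nu / (K.T @ u)       # enforce the column marginal nu
    return u[:, None] * K * v[None, :]

# Hypothetical toy data: two histograms on a 5-point grid of [0, 1].
x = np.linspace(0.0, 1.0, 5)
mu = np.full(5, 0.2)
nu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
cost = (x[:, None] - x[None, :]) ** 2
P = sinkhorn(mu, nu, cost)
```

The two scaling vectors `u` and `v` play the role of the discrete Schrödinger potentials: each update solves one of the two equations of the Schrödinger system while holding the other potential fixed.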

SB as generative model

Deep generative learning
Optimal interpolation for time series

References

Chen, Georgiou, Pavon: Stochastic control liaisons: Richard Sinkhorn meets Gaspard Monge on a Schrödinger bridge, 2021, SIAM Review, 63(2).
Germain, Pham, Warin: Neural networks-based algorithms for stochastic control and PDEs, 2023, Machine learning and data sciences for financial markets: a guide to contemporary practices, CUP, eds. A. Capponi and C-A. Lehalle.
Hambly, Xu, Yang: Recent advances in Reinforcement learning in finance, 2021, arXiv:2112.04553

Hamdouche, Henry-Labordère, Pham: Generative modeling for time series via Schrödinger bridge, 2023, arXiv:2304.05093
Jia, Zhou: q-Learning in continuous time, 2023, JMLR
Wang, Jiao, Xu, Wang, Yang: Deep generative learning via Schrödinger bridge, ICML, 2021.
Wang, Zariphopoulou, Zhou: Reinforcement learning in continuous time and space: a stochastic control approach, JMLR, 2020, 21, 1-34
