Stochastic approximation methods (SA), originally proposed more than 70 years ago by Robbins and Monro, have emerged as important components in signal processing (SP), machine learning (ML), and are used in a wide variety of applications. Among the myriad applications, most existing studies have focused on the gradient-based SA algorithms, commonly referred to as stochastic gradient algorithms (SG). However, this perspective overlooks the rich diversity and generality inherent in SA schemes. The latter actually encompasses broader scenarios in which the goal goes beyond solving optimization problems driven by a gradient vector field.
This tutorial provides a comprehensive introduction to the stochastic approximation (SA) scheme, especially with a focus on treating them as general non-SG algorithms:
The first part of the tutorial serves as an essential foundation, acquainting participants with the background and rationale of the SA scheme. It establishes a connection between SA and the challenge of locating roots of mean field functions. Moreover, we discuss practical applications of the SA scheme. Through illustrative examples drawn from the realms of signal processing and machine learning, the versatility of SA in analyzing and understanding classical algorithms will be showcased.
The second part of the tutorial delves deeper into the presented application examples, inspiring the introduction of a general convergence theory of SA algorithms. This broadens the scope of SA, accommodating scenarios with non-gradient updates and biased oracles. We will emphasize on modern convergence analyses such as explicit control on the sample complexity of SA.
The final section of the tutorial introduces participants to contemporary developments within the realm of SA. Prominent among these advances are the reduction of computational complexity through applying variance reduction to general SA, federated or distributed extensions of SA, Markovian noise, and tight high-probability bounds. By incorporating these latest innovations, participants will be well equipped to explore emerging frontiers in applying SA in their research.
This tutorial is based upon our Overview Paper published in IEEE Transactions on Signal Processing in 2023.
-- Opening (5 minutes) --
[Speaker: Gersende Fort & Hoi-To Wai]
[Speaker: Eric Moulines & Gersende Fort & Hoi-To Wai]
-- Coffee Break (30 minutes) --
[Speaker: Eric Moulines & Gersende Fort & Hoi-To Wai]
-- Conclusions & The End --
Contact: Gersende Fort [gersende dot fort at math dot univ-toulouse dot fr], Eric Moulines [eric dot moulines at polytechnique dot edu], Hoi-To Wai [htwai at cuhk dot edu dot hk]