where the next iterate, θ_n, appears on both sides of the update. This implicit formulation adds stability and robustness. The contrast between SGD and ISGD (sketched in the example after the list below) has natural connections to:
- forward vs. backward Euler methods in numerical analysis;
- standard vs. proximal methods in optimization; and
- LMS vs. NLMS in signal processing.
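To make the contrast concrete, below is a minimal sketch (in Python, not the sgd R package's implementation) of the explicit and implicit updates for least-squares regression. For this model the implicit fixed-point equation can be solved in closed form, rescaling the step by 1/(1 + γ_n‖x_n‖²), which is exactly the NLMS-style normalization. The step-size schedule γ_n = 1/n and the simulated data are illustrative assumptions, not taken from the papers below.

```python
import numpy as np

def sgd_step(theta, x, y, lr):
    """Explicit SGD: gradient evaluated at the current iterate theta_{n-1}."""
    residual = y - x @ theta
    return theta + lr * residual * x

def isgd_step(theta, x, y, lr):
    """Implicit SGD: theta_n = theta_{n-1} + lr * (y - x' theta_n) * x.
    For least squares this fixed-point equation has a closed form:
    the step is rescaled by 1 / (1 + lr * ||x||^2), as in NLMS."""
    residual = y - x @ theta
    return theta + (lr / (1.0 + lr * (x @ x))) * residual * x

# Illustrative simulation (assumed setup, for demonstration only).
rng = np.random.default_rng(0)
p, n = 5, 10_000
theta_true = rng.normal(size=p)
theta_sgd = np.zeros(p)
theta_isgd = np.zeros(p)
for i in range(1, n + 1):
    x = rng.normal(size=p)
    y = x @ theta_true + rng.normal()
    lr = 1.0 / i                     # Robbins-Monro step size gamma_n = 1/n
    theta_sgd = sgd_step(theta_sgd, x, y, lr)
    theta_isgd = isgd_step(theta_isgd, x, y, lr)

print("SGD  error:", np.linalg.norm(theta_sgd - theta_true))
print("ISGD error:", np.linalg.norm(theta_isgd - theta_true))
```

Because the implicit step is automatically shrunk when ‖x_n‖ is large, ISGD stays stable even when the initial learning rate is too aggressive for plain SGD.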
- The Proximal Robbins-Monro Method (2019, pdf)
- Asymptotic and finite-sample properties of estimators based on stochastic gradients (Annals of Statistics, 2017, pdf, slides)
- Statistical analysis of stochastic gradient methods for generalized linear models (ICML 2014, pdf, slides)
- Scalable estimation strategies based on stochastic approximations: Classical results and new insights (Statistics and Computing, 2015, pdf)
- Stochastic gradient methods for principled estimation with large datasets (Handbook of Big Data, 2016, eds. Bühlmann et al., www)
- sgd R package on CRAN: https://cran.r-project.org/web/packages/sgd/index.html
- Source on GitHub: https://github.com/airoldilab/sgd