where the next iterate, θ_n, appears on both sides of the update. This implicit formulation adds stability and robustness. The contrast between SGD and ISGD has natural connections to:
- forward vs. backward Euler methods in numerical analysis;
- standard vs. proximal gradient methods in optimization; and
- LMS vs. NLMS in signal processing (made concrete in the sketch below).
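The implicit update is a fixed-point equation in θ_n, but for least squares it can be solved in closed form, which is exactly the NLMS-style normalization. Below is a minimal sketch, assuming simulated Gaussian data and an illustrative learning-rate schedule; it is not the implementation from the papers or the sgd package.

```r
# Explicit SGD vs. implicit SGD (ISGD) on simulated least-squares data.
set.seed(1)
n <- 10000; p <- 5
X <- matrix(rnorm(n * p), n, p)
theta_star <- rep(1, p)
y <- as.vector(X %*% theta_star + rnorm(n))

theta_sgd  <- rep(0, p)   # explicit (forward Euler-like) iterate
theta_isgd <- rep(0, p)   # implicit (backward Euler-like) iterate
for (i in 1:n) {
  gamma <- 1 / (1 + 0.1 * i)   # illustrative decaying learning rate
  x <- X[i, ]
  # Explicit update: gradient evaluated at the current iterate.
  theta_sgd <- theta_sgd + gamma * (y[i] - sum(x * theta_sgd)) * x
  # Implicit update: theta_i = theta_{i-1} + gamma * (y_i - x' theta_i) * x.
  # Solving for theta_i gives the closed-form, NLMS-like step:
  resid <- y[i] - sum(x * theta_isgd)
  theta_isgd <- theta_isgd + (gamma / (1 + gamma * sum(x^2))) * resid * x
}
c(sgd = sqrt(sum((theta_sgd - theta_star)^2)),
  isgd = sqrt(sum((theta_isgd - theta_star)^2)))
```

The normalizing factor 1/(1 + γ‖x‖²) shrinks the effective step size automatically, which is why the implicit iterate remains stable even when γ is set too large for explicit SGD.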
- The Proximal Robbins-Monro Method (2019) [pdf]
- Asymptotic and finite-sample properties of estimators based on stochastic gradients (Annals of Statistics, 2017) [pdf, slides]
- Statistical analysis of stochastic gradient methods for generalized linear models (ICML '14) [pdf, slides]
- Scalable estimation strategies based on stochastic approximations: classical results and new insights (Statistics and Computing, 2015) [pdf]
- Stochastic gradient methods for principled estimation with large datasets (Handbook of Big Data, 2016, eds. Bühlmann et al.) [www]
sgd R package: CRAN (https://cran.r-project.org/web/packages/sgd/index.html) and GitHub (https://github.com/airoldilab/sgd)
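A usage sketch for the sgd package is below; the call follows the package's documented interface (a formula, a data frame, a model family, and an sgd.control list), but argument names and available methods should be verified against the current CRAN manual.

```r
# Hedged sketch: fitting a linear model with implicit SGD via the sgd package.
# The sgd.control "method" argument and its "implicit" option are taken from
# the package documentation; check the CRAN reference manual before relying
# on these exact names.
library(sgd)
set.seed(2)
d <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
d$y <- 1 + d$x1 - 2 * d$x2 + rnorm(1000)
fit <- sgd(y ~ ., data = d, model = "lm",
           sgd.control = list(method = "implicit"))
fit$coefficients   # estimated intercept and slopes
```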