Implicit stochastic gradient descent (ISGD) is a variant of standard SGD in which the main update of the algorithm is implicit: the gradient is evaluated at the next iterate rather than the current one. This adds numerical stability and makes the procedure robust to the specification of the learning rate, a known weakness of the standard method.

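For intuition, here is a minimal sketch (an assumed least-squares example, not taken from the text; all names are illustrative) contrasting the two update rules. It uses the standard fact that under squared error the implicit update can be solved in closed form, which shrinks the effective step size and prevents divergence:

```python
import numpy as np

def explicit_sgd_step(theta, x, y, lr):
    # Standard SGD: gradient of 0.5*(y - x'theta)^2 evaluated
    # at the current iterate theta.
    residual = y - x @ theta
    return theta + lr * residual * x

def implicit_sgd_step(theta, x, y, lr):
    # ISGD: gradient evaluated at the *next* iterate. For squared
    # error the fixed-point equation has a closed form, scaling the
    # step by 1 / (1 + lr * ||x||^2), so the update stays bounded
    # even for large learning rates.
    residual = y - x @ theta
    return theta + (lr / (1.0 + lr * (x @ x))) * residual * x

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
theta_e = np.zeros(2)
theta_i = np.zeros(2)
lr = 5.0  # deliberately large learning rate

for _ in range(100):
    x = rng.normal(size=2)
    y = x @ theta_true + 0.1 * rng.normal()
    theta_e = explicit_sgd_step(theta_e, x, y, lr)
    theta_i = implicit_sgd_step(theta_i, x, y, lr)

print("explicit:", theta_e)  # typically overflows at this lr
print("implicit:", theta_i)  # remains close to theta_true
```
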
Specifically, standard SGD performs updates of the form: