2/13/2015

Post date: Feb 15, 2015 7:12:43 PM

Speaker: Ravi Varadhan, Division of Biostatistics and Bioinformatics, SKCCC, JHU

"Part 1: EM, MM, and other monotone algorithms and their slow convergence"

[abstract]

The expectation-maximization (EM) algorithm (Dempster et al., 1977) is a highly popular approach for obtaining maximum likelihood estimates (MLEs) in incomplete-data problems arising in a wide array of applications. The key idea behind EM is that, while direct maximization of the observed-data likelihood can be difficult, augmenting the observed data with the missing information yields a simpler maximization problem. The popularity of EM is due to its stability (monotone increase of the likelihood). However, in many applications this stability is achieved at the expense of slow, linear convergence, with a rate of convergence that is inversely related to the proportion of missing information. Such slow convergence limits the analyst's ability to perform computer-intensive tasks such as model exploration and validation, sensitivity analysis, bootstrapping, and simulation. Slow convergence also limits the usefulness of EM, especially in modern applications involving complex statistical models with high-dimensional parameter spaces and large-scale data.
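
As a concrete illustration of both points, the simplicity of the augmented problem and the slow, monotone progress of the iterates (this sketch is not part of the abstract), here is a minimal Python implementation of EM on the classic multinomial example used by Dempster et al. (1977); the starting value and stopping tolerance are illustrative choices.

    import numpy as np

    # Classic multinomial example from Dempster, Laird & Rubin (1977):
    # y ~ Multinomial(197; 1/2 + t/4, (1-t)/4, (1-t)/4, t/4).
    # Splitting the first cell into latent sub-cells with probabilities
    # 1/2 and t/4 gives a closed-form complete-data MLE.
    y = np.array([125, 18, 20, 34])      # observed cell counts

    def loglik(t):
        p = np.array([0.5 + t / 4, (1 - t) / 4, (1 - t) / 4, t / 4])
        return np.sum(y * np.log(p))     # up to an additive constant

    t = 0.5                              # illustrative starting value
    for it in range(100):
        x2 = y[0] * (t / 4) / (0.5 + t / 4)              # E-step: expected latent count
        t_new = (x2 + y[3]) / (x2 + y[1] + y[2] + y[3])  # M-step: complete-data MLE
        print(f"iter {it:2d}  theta = {t_new:.8f}  loglik = {loglik(t_new):.6f}")
        if abs(t_new - t) < 1e-9:        # stop when the updates stall
            break
        t = t_new

Printing the successive iterates shows the error in theta shrinking by a roughly constant factor per iteration (linear convergence), while the log-likelihood never decreases.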

The EM algorithm may be viewed as a special case of a more general class of algorithms known as MM algorithms. Specific MM algorithms often have nothing to do with missing data. The first M step of an MM algorithm creates a surrogate function, which is then optimized in the second M step. In minimization problems, MM stands for majorize–minimize; in maximization problems, it stands for minorize–maximize. This two-step process always drives the objective function in the desired direction. Construction of MM algorithms relies on recognizing and exploiting inequalities rather than on calculating conditional expectations. MM algorithms are of much more recent vintage than EM, but, like EM, they are afflicted by slow, linear convergence.
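
For a self-contained illustration of the majorize–minimize idea (again, not part of the abstract), the sketch below minimizes f(m) = sum_i |x_i - m|, whose minimizer is the sample median, by majorizing each absolute-value term with a quadratic via the inequality |r| <= r^2/(2|r_k|) + |r_k|/2, which is tight at r = r_k; the data, starting value, and tolerances are arbitrary.

    import numpy as np

    # MM for f(m) = sum_i |x_i - m| (the sample median).
    # At the current iterate m_k, each |x_i - m| is majorized by the
    # quadratic (x_i - m)^2 / (2|x_i - m_k|) + |x_i - m_k| / 2, which
    # touches it at m = m_k.  Minimizing the quadratic surrogate gives
    # a weighted-mean update, and each update can only decrease f.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(101)         # illustrative data

    def f(m):
        return np.sum(np.abs(x - m))

    m = x.mean()                         # starting value
    for it in range(200):
        w = 1.0 / np.maximum(np.abs(x - m), 1e-12)   # weights from the majorizer
        m_new = np.sum(w * x) / np.sum(w)            # minimize the surrogate
        print(f"iter {it:3d}  m = {m_new:.8f}  f(m) = {f(m_new):.6f}")
        if abs(m_new - m) < 1e-10:
            break
        m = m_new
    print("np.median(x) =", np.median(x))

Here the objective f(m) decreases monotonically at every step, while the iterates creep toward the median, mirroring the slow but steady progress described above for EM.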