The story of EM and variational Bayes

Post date: Jun 02, 2015 10:30:7 PM

EM and variational Bayes [1] share the same mathematical trick: decomposing a log probability into the sum of a free-energy term, denoted $F$, and a KL-divergence term.

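For EM, the goal is to maximize the intractable log-likelihood $\log p(x \mid \theta)$, where $x$ is the observed data, $z$ the latent variables, and $\theta$ the parameters. For any distribution $q(z)$, it follows that

$$
\begin{aligned}
\log p(x \mid \theta) &= \mathbb{E}_{q(z)}\left[\log p(x \mid \theta)\right] \\
&= \mathbb{E}_{q(z)}\left[\log \frac{p(x, z \mid \theta)}{p(z \mid x, \theta)}\right] \\
&= \mathbb{E}_{q(z)}\left[\log \frac{p(x, z \mid \theta)}{q(z)}\right] + \mathbb{E}_{q(z)}\left[\log \frac{q(z)}{p(z \mid x, \theta)}\right] \\
&= F(q, \theta) + \mathrm{KL}\left(q(z) \,\|\, p(z \mid x, \theta)\right)
\end{aligned}
$$

(the left-hand side is constant with respect to the expectation over $z$, which gives the first equality)

where $F(q, \theta) = \mathbb{E}_{q(z)}\left[\log \frac{p(x, z \mid \theta)}{q(z)}\right]$ is the free energy.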

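Since the KL term is non-negative and the left-hand side does not depend on $q$, the free energy is maximized over $q$ by setting the KL term to zero:

$q^{(t+1)}(z) = \arg\max_{q} F(q, \theta^{(t)}) = p(z \mid x, \theta^{(t)})$ (This is the (conditional) expectation step). Then,

$\theta^{(t+1)} = \arg\max_{\theta} F(q^{(t+1)}, \theta)$ (This is the maximization step)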
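The two steps can be checked numerically on a toy problem. The sketch below is only an illustration under assumptions not in the post: a two-component 1-D Gaussian mixture with unit variances and equal mixing weights, where only the two means are unknown, and a small made-up data set `xs`. It verifies that the log-likelihood equals free energy plus KL for an arbitrary $q$, and then runs EM as alternating maximization of the free energy.

```python
import math

def log_joint(x, k, mus):
    """log p(x, z=k | theta): equal weights 1/2, unit-variance Gaussians (assumed model)."""
    return math.log(0.5) - 0.5 * math.log(2 * math.pi) - 0.5 * (x - mus[k]) ** 2

def log_likelihood(xs, mus):
    """log p(x | theta), summed over independent data points."""
    return sum(math.log(sum(math.exp(log_joint(x, k, mus)) for k in range(2)))
               for x in xs)

def posterior(xs, mus):
    """p(z | x, theta) for every data point (the E-step solution)."""
    out = []
    for x in xs:
        w = [math.exp(log_joint(x, k, mus)) for k in range(2)]
        out.append([wk / sum(w) for wk in w])
    return out

def free_energy(xs, q, mus):
    """F(q, theta) = E_q[log p(x, z | theta) - log q(z)]."""
    return sum(qi[k] * (log_joint(x, k, mus) - math.log(qi[k]))
               for x, qi in zip(xs, q) for k in range(2) if qi[k] > 0)

def kl_to_posterior(xs, q, mus):
    """KL(q(z) || p(z | x, theta))."""
    return sum(qi[k] * (math.log(qi[k]) - math.log(pi[k]))
               for qi, pi in zip(q, posterior(xs, mus)) for k in range(2) if qi[k] > 0)

def m_step(xs, q):
    """argmax over the means of F(q, theta): responsibility-weighted averages."""
    return [sum(qi[k] * x for x, qi in zip(xs, q)) / sum(qi[k] for qi in q)
            for k in range(2)]

xs = [-2.1, -1.9, -2.4, 1.8, 2.2, 2.0]   # made-up data
mus = [-0.5, 0.5]                        # initial guess for the two means

# The decomposition holds for an arbitrary q, not only for the posterior.
q_any = [[0.3, 0.7] for _ in xs]
gap = log_likelihood(xs, mus) - (free_energy(xs, q_any, mus)
                                 + kl_to_posterior(xs, q_any, mus))

history = []
for _ in range(20):
    q = posterior(xs, mus)               # E-step: KL becomes 0, so F equals log-likelihood
    history.append(log_likelihood(xs, mus))
    mus = m_step(xs, q)                  # M-step: raise F with q held fixed
```

Here `gap` should be zero up to rounding, and `history` should never decrease, because each step can only raise the free energy and the E-step makes the bound tight.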

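For variational Bayes, the goal is to approximate the intractable posterior $p(\theta \mid x)$ by a tractable distribution $q(\theta)$ via KL-divergence minimization. Consider the log evidence

$$
\begin{aligned}
\log p(x) &= \mathbb{E}_{q(\theta)}\left[\log p(x)\right] \\
&= \mathbb{E}_{q(\theta)}\left[\log \frac{p(x, \theta)}{p(\theta \mid x)}\right] \\
&= \mathbb{E}_{q(\theta)}\left[\log \frac{p(x, \theta)}{q(\theta)}\right] + \mathbb{E}_{q(\theta)}\left[\log \frac{q(\theta)}{p(\theta \mid x)}\right] \\
&= F(q) + \mathrm{KL}\left(q(\theta) \,\|\, p(\theta \mid x)\right)
\end{aligned}
$$

(the left-hand side is constant with respect to the expectation over $\theta$)

where $F(q) = \mathbb{E}_{q(\theta)}\left[\log \frac{p(x, \theta)}{q(\theta)}\right]$ is the free energy.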

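Since $\log p(x)$ does not depend on $q$, minimizing $\mathrm{KL}\left(q(\theta) \,\|\, p(\theta \mid x)\right)$ can be done by maximizing the free energy $F(q)$ with respect to $q$. This is tractable because $F(q)$ involves only the joint $p(x, \theta)$ and not the intractable posterior $p(\theta \mid x)$.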
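A small numerical check of this equivalence follows. It is a sketch under assumptions not in the post: a coin-bias parameter restricted to a 50-point grid with a uniform prior, made-up counts of 7 heads and 3 tails, and a handful of hypothetical candidate distributions `candidates`. On such a small grid the posterior is actually computable, which is what lets the code compare the two criteria directly.

```python
import math

thetas = [(i + 0.5) / 50 for i in range(50)]   # grid over the coin bias (assumed)
prior = [1.0 / len(thetas)] * len(thetas)      # uniform prior on the grid
heads, tails = 7, 3                            # made-up observations

def log_joint(i):
    """log p(x, theta_i) = log prior + log likelihood of the observed flips."""
    t = thetas[i]
    return math.log(prior[i]) + heads * math.log(t) + tails * math.log(1 - t)

log_evidence = math.log(sum(math.exp(log_joint(i)) for i in range(len(thetas))))
posterior = [math.exp(log_joint(i) - log_evidence) for i in range(len(thetas))]

def free_energy(q):
    """F(q) = E_q[log p(x, theta) - log q(theta)]; needs only the joint."""
    return sum(qi * (log_joint(i) - math.log(qi)) for i, qi in enumerate(q) if qi > 0)

def kl(q, p):
    """KL(q || p) for two distributions on the same grid."""
    return sum(qi * (math.log(qi) - math.log(pi)) for qi, pi in zip(q, p) if qi > 0)

def beta_shaped(a, b):
    """Hypothetical candidate q: theta^a (1 - theta)^b, normalized on the grid."""
    w = [t ** a * (1 - t) ** b for t in thetas]
    return [wi / sum(w) for wi in w]

candidates = {ab: beta_shaped(*ab) for ab in [(1, 1), (3, 2), (7, 3), (5, 5), (9, 1)]}

# Ranking by free energy never touches the posterior; ranking by KL does.
best_by_f = max(candidates, key=lambda ab: free_energy(candidates[ab]))
best_by_kl = min(candidates, key=lambda ab: kl(candidates[ab], posterior))
gaps = [log_evidence - (free_energy(q) + kl(q, posterior)) for q in candidates.values()]
```

Both selections should pick the same candidate, and every entry of `gaps` should be zero up to rounding, since the free energy and the KL term always sum to the same log evidence.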

[1] Kay Brodersen, http://people.inf.ethz.ch/bkay/talks/Brodersen_2013_03_22.pdf