Estimating mutual information

Post date: Feb 7, 2013 9:24:54 PM

Normally, you might consider computing mutual information (MI) from densities obtained with kernel density estimation, e.g. Gaussian kernel estimation with a Parzen window. However, such plug-in approaches are quite slow, especially when the dimension of the instances is high. It is therefore often a better idea to use a direct estimator of MI instead, and here is an interesting approach:
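
To see why, here is a minimal sketch of the plug-in route (a "resubstitution" estimate: average log p(x,y)/(p(x)p(y)) over the samples, with every density from Gaussian KDE). The function name is my own choice for illustration; only scipy's gaussian_kde is assumed:

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_mutual_information(x, y):
    """Plug-in MI estimate from Gaussian kernel (Parzen window) densities.

    x, y : 1-D arrays of paired samples. Returns MI(X, Y) in nats,
    estimated as the sample average of log p(x, y) / (p(x) p(y)).
    """
    xy = np.vstack((x, y))
    p_xy = gaussian_kde(xy)(xy)  # joint density evaluated at each sample
    p_x = gaussian_kde(x)(x)     # marginal density of x at each sample
    p_y = gaussian_kde(y)(y)     # marginal density of y at each sample
    return np.mean(np.log(p_xy / (p_x * p_y)))
```

Each KDE evaluation sums over all N kernels, so this costs O(N^2) kernel evaluations, and choosing a good bandwidth only gets harder in higher dimensions; that is exactly what makes this route slow.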

"Estimating mutual information" by Alexander Kraskov, Harald Stögbauer, and Peter Grassberger

http://arxiv.org/pdf/cond-mat/0305641.pdf

The paper provides a nice way to estimate MI(X,Y) using k-nearest neighbors, where X is d-dimensional and Y is a continuous variable; a sketch of the estimator follows.
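
Here is a minimal Python sketch of their first estimator (psi below is the digamma function); the function name and the scipy-based implementation details are my own choices, not from the paper:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=3):
    """KSG estimator (Kraskov et al., algorithm 1) of MI(X, Y) in nats.

    x : (N, dx) array, y : (N, dy) array of paired continuous samples;
    k : number of nearest neighbors (small k: low bias, higher variance).
    """
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack((x, y))

    # Chebyshev (max-norm) distance to the k-th neighbor in the joint
    # space; k + 1 because each point is returned as its own neighbor.
    eps = cKDTree(joint).query(joint, k=k + 1, p=np.inf)[0][:, -1]

    # n_x(i), n_y(i): points strictly closer than eps[i] in each marginal
    # space, not counting the point itself.
    tx, ty = cKDTree(x), cKDTree(y)
    nx = np.array([len(tx.query_ball_point(x[i], np.nextafter(eps[i], 0),
                                           p=np.inf)) - 1 for i in range(n)])
    ny = np.array([len(ty.query_ball_point(y[i], np.nextafter(eps[i], 0),
                                           p=np.inf)) - 1 for i in range(n)])

    # KSG estimate: psi(k) + psi(N) - <psi(n_x + 1) + psi(n_y + 1)>
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
```

As a quick sanity check, for a bivariate Gaussian with correlation rho the true MI is -0.5*log(1 - rho^2), and the estimate should land close to it:

```python
rng = np.random.default_rng(0)
rho = 0.8
xy = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=5000)
print(ksg_mutual_information(xy[:, :1], xy[:, 1:]))  # roughly 0.51
print(-0.5 * np.log(1 - rho ** 2))                   # exact: 0.5108...
```

For classification applications, Y will mostly be discrete, and a version of MI for that case is formulated in the paper below: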

"Mutual information-based feature selection enhances fMRI brain activity classification" by Vincent Michel, Cécilia Damon, Bertrand Thirion:

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4541065

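The exact formulation is in the paper, but to give the flavor: one common kNN-style variant for continuous X and discrete labels Y replaces the marginal neighbor count for Y with a same-class neighbor count. Below is a sketch of that variant in the same spirit (function name and details are mine; not necessarily the estimator exactly as written in the paper):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def knn_mi_discrete(x, y, k=3):
    """kNN estimate of MI(X, Y) in nats for continuous X, discrete labels y.

    x : (N, d) feature array, y : (N,) label array.
    Assumes every class has more than k samples.
    """
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y)
    n = len(x)
    radius = np.empty(n)
    class_size = np.empty(n)

    for label in np.unique(y):
        mask = y == label
        class_size[mask] = mask.sum()
        # Max-norm distance to the k-th nearest neighbor within the same
        # class (k + 1 because each point is its own nearest neighbor).
        radius[mask] = cKDTree(x[mask]).query(x[mask], k=k + 1,
                                              p=np.inf)[0][:, -1]

    # m(i): points of any class strictly inside that radius, self included.
    full = cKDTree(x)
    m = np.array([len(full.query_ball_point(x[i], np.nextafter(radius[i], 0),
                                            p=np.inf)) for i in range(n)])

    return (digamma(n) + digamma(k)
            - np.mean(digamma(class_size)) - np.mean(digamma(m)))
```

For 1-D X the max-norm is just the absolute difference, so scoring each feature by knn_mi_discrete(feature, labels) gives the kind of MI-based feature ranking used for classification.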

There is also a very rich and informative MATLAB toolbox, ITE (Information Theoretical Estimators), which covers these and many other estimators:

https://bitbucket.org/szzoli/ite/