May 31: Bayesian Model Selection with data-dependent priors

Post date: May 2, 2012 4:53:43 PM

Marc Boulle

This talk studies the asymptotic consistency of data grid models applied to the joint density estimation of two categorical variables.

Data grid models group the values of each variable into a partition. The Cartesian product of these two partitions forms a grid whose cells provide a summary of the contingency table of the two variables.
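
As an illustration of the grid construction only, here is a minimal Python sketch. It assumes a hypothetical toy sample and hand-picked groupings (the names grid_counts, cell_frequencies, groups_x and groups_y are illustrative, not taken from the talk), and it does not perform any model selection: it only shows how two value groupings summarize the contingency table into grid cells.

```python
from collections import Counter

def grid_counts(pairs, groups_x, groups_y):
    """Count the samples falling in each grid cell.

    A cell is a pair (group of x, group of y), i.e. one element of the
    Cartesian product of the two value partitions.
    """
    counts = Counter()
    for x, y in pairs:
        counts[(groups_x[x], groups_y[y])] += 1
    return counts

def cell_frequencies(pairs, groups_x, groups_y):
    """Empirical probability of each non-empty grid cell."""
    n = len(pairs)
    counts = grid_counts(pairs, groups_x, groups_y)
    return {cell: c / n for cell, c in counts.items()}

# Hypothetical toy sample of two categorical variables.
pairs = [("a", "u"), ("a", "v"), ("b", "u"),
         ("c", "w"), ("c", "w"), ("b", "v")]

# Assumed groupings: values a and b share a group, as do u and v.
groups_x = {"a": 0, "b": 0, "c": 1}
groups_y = {"u": 0, "v": 0, "w": 1}

counts = grid_counts(pairs, groups_x, groups_y)
freqs = cell_frequencies(pairs, groups_x, groups_y)
```

With these groupings the 3 x 3 contingency table collapses to a 2 x 2 grid in which only the two diagonal cells are non-empty.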

These models can be interpreted both as joint density estimators and as coclustering models, with numerous applications in domains such as text mining, web mining, graph mining, and marketing.

The best bivariate grouping model is selected by means of a MAP (maximum a posteriori) approach, with the heretical property of exploiting both a model family and a prior distribution that are data-dependent.
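
In generic notation, the MAP criterion can be written as follows. The symbols are assumptions introduced here for illustration and are not the speaker's: D denotes the data sample, \mathcal{M}_D a model family that depends on D, and P_D a prior over that family that also depends on D.

```latex
\[
  M^{*} \;=\; \arg\max_{M \in \mathcal{M}_D} P(M \mid D)
        \;=\; \arg\max_{M \in \mathcal{M}_D} P_D(M)\, P(D \mid M)
\]
```

The second equality is Bayes' rule with the normalizing constant P(D) dropped, since it does not depend on M.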

These models are in essence models of the data sample, not of the underlying distribution.

We demonstrate the consistency of the approach, which behaves as a universal estimator of joint density that asymptotically converges towards the true joint distribution.