
Schedule and abstracts

7:45-8:00 - Overview by Organizers
8:00-8:30 - Invited Talk: Kernel Bayes' Rule (Kenji Fukumizu, The Institute of Statistical Mathematics)
Talk Abstract:
A nonparametric kernel-based method for realizing Bayes' rule is proposed, based on representations of probabilities in reproducing kernel Hilbert spaces. Probabilities are uniquely characterized by the mean of the canonical map to the RKHS. The prior and conditional probabilities are expressed in terms of RKHS functions of an empirical sample: no explicit parametric model is needed for these quantities. The posterior is likewise an RKHS mean of a weighted sample. The estimator for the expectation of a function of the posterior is derived, and rates of consistency are shown. Some representative applications of the kernel Bayes' rule are presented, including Bayesian computation without likelihood and filtering with a nonparametric state-space model.
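The core computational step behind this construction, estimating a conditional expectation from an empirical RKHS embedding, can be sketched in a few lines of numpy. This is a toy illustration only, assuming an RBF kernel on scalar data and the identity test function; all function names are invented, not from the talk:

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    # Gram matrix k(a, b) = exp(-(a - b)^2 / (2 sigma^2)) for 1-D inputs
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def conditional_expectation(X, Y, x0, lam=1e-4, sigma=0.3):
    # Empirical conditional mean embedding: the expectation
    # E[f(Y) | X = x0] is approximated by w @ f(Y) with weights
    # w = (K_X + n * lam * I)^{-1} k_X(x0); here f is the identity.
    n = len(X)
    K = rbf_gram(X, X, sigma)
    kx = rbf_gram(X, np.array([x0]), sigma)[:, 0]
    w = np.linalg.solve(K + n * lam * np.eye(n), kx)
    return w @ Y
```

The full kernel Bayes' rule composes two such regularized operators, so the posterior embedding is itself a weighted version of the empirical sample, as the abstract describes.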

8:40-9:00 - Talk: FastFood: Approximating Kernel Expansions in Loglinear Time (Alex Smola, Research at Google)
Talk Abstract: 
The ability to evaluate nonlinear function classes rapidly is crucial for nonparametric estimation. We propose an improvement to random kitchen sinks that offers O(n log d) computation and O(n) storage for n basis functions in d dimensions without sacrificing accuracy. We show how one may adjust the regularization properties of the kernel simply by changing the spectral distribution of the projection matrix. Experiments show that we achieve identical accuracy to full kernel expansions and random kitchen sinks 100x faster with 1000x less memory.
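For reference, the random kitchen sinks construction that FastFood accelerates can be sketched as follows. This is a toy numpy version with invented names; FastFood's contribution is replacing the dense Gaussian matrix W below with a product of Hadamard and diagonal matrices, so that computing W @ x costs O(n log d) instead of O(n d):

```python
import numpy as np

def rff_features(X, n_features, sigma=1.0, rng=None):
    # Random kitchen sinks: z(x) = sqrt(2/D) * cos(W x + b) with
    # W ~ N(0, 1/sigma^2) and b ~ U[0, 2*pi), so that z(x) . z(y)
    # approximates the RBF kernel exp(-||x - y||^2 / (2 sigma^2)).
    rng = np.random.default_rng(rng)
    W = rng.normal(0.0, 1.0 / sigma, size=(n_features, X.shape[1]))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W.T + b)
```

With enough features, the inner product of two feature vectors converges to the exact kernel value, which is why the expansion can match full kernel accuracy.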
8:40-9:00 - Coffee Break
9:00-9:30 - Invited Talk: Small-Variance Asymptotics, Nonparametric Bayes, and Kernel k-means (Brian Kulis, Ohio State University)
Talk Abstract:
It is well known that mixture-of-Gaussians and k-means are related through asymptotics on the variance of the clusters---as the variance tends to zero, the EM algorithm becomes the k-means algorithm, and the complete-data log likelihood becomes the k-means objective function.  As shown recently, such asymptotics can also be applied to Bayesian nonparametric models, leading to simple and scalable k-means-like algorithms for a host of problems including clustering, latent feature models, topic models, and others.  In this talk, I will overview these results, with a focus on the connections to kernel methods.  In particular, I will discuss how an existing equivalence between kernel k-means and graph clustering can be used in conjunction with the asymptotics of Bayesian nonparametric models to obtain a class of novel and scalable kernel-based algorithms for problems such as overlapping graph clustering and graph clustering when the number of clusters is not fixed. 
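The flavor of these k-means-like limits can be seen in the DP-mixture case, where the small-variance limit yields a k-means-style algorithm that opens a new cluster whenever a point is farther than a penalty lam from every existing centroid. A toy sketch (names and details are illustrative, not the speaker's code):

```python
import numpy as np

def dp_means(X, lam, n_iters=25):
    # Small-variance asymptotics of a DP mixture gives a k-means-like
    # objective with penalty lam per cluster: a point joins its nearest
    # centroid unless the squared distance exceeds lam, in which case
    # it opens a new cluster.
    centroids = [X.mean(axis=0)]
    for _ in range(n_iters):
        labels = np.empty(len(X), dtype=int)
        for i, x in enumerate(X):
            d2 = np.array([np.sum((x - c) ** 2) for c in centroids])
            j = int(np.argmin(d2))
            if d2[j] > lam:
                centroids.append(x.copy())
                j = len(centroids) - 1
            labels[i] = j
        # recompute centroids of the non-empty clusters
        centroids = [X[labels == j].mean(axis=0)
                     for j in range(len(centroids)) if np.any(labels == j)]
    return np.array(centroids)
```

Unlike k-means, the number of clusters is not fixed in advance; it emerges from the penalty, mirroring the nonparametric prior.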

9:30-10:00 - Invited Talk: Kernel methods in nonparametric Bayesian models (Lawrence Carin, Duke University) 

Talk Abstract: 
For handling large-scale problems, methods like Gaussian processes can be computationally challenging. In this talk, we discuss how alternative kernel methods can be employed to accelerate computations without loss of modeling power. We examine this in the context of general nonparametric Bayesian models, with specific applications within the Beta process. The theoretical and algorithmic issues are discussed, with demonstration via several examples.

10:00-10:15 - Contributed Talk: Kernel Embeddings of Dirichlet Process Mixtures (Krikamol Muandet, Max Planck Institute for Biological Cybernetics)

10:15-16:00 - Break

16:00-16:10 - Contributed Talk: Kernel Methods for Learning Motion Patterns (Lachlan McCalman, University of Sydney)  
16:10-16:20 - Contributed Talk: Kernels for Protein Structure Prediction (Narges Razavian, Carnegie Mellon University)
16:20-16:30 - Short Coffee Break
16:30-17:00 - Invited Talk: Kernel Topic Models (Thore Graepel, Microsoft Research Cambridge)
Talk Abstract:
Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents' mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is efficient approximate inference on the latent Gaussian. We present an approximate algorithm cast around a Laplace approximation in a transformed basis. The KTM can also be interpreted as a type of Gaussian process latent variable model, or as a topic model conditional on document features, uncovering links between earlier work in these areas. This is joint work with Philipp Hennig (first author), David Stern, and Ralf Herbrich.
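The basic construction, in which documents with similar features receive correlated topic proportions via a squashed Gaussian, can be illustrated with a toy sketch. A softmax serves here as a stand-in for the paper's squashing function, and all names are mine:

```python
import numpy as np

def correlated_topic_weights(doc_features, n_topics, length_scale=1.0, rng=None):
    # Draw one Gaussian-process sample per topic over the documents'
    # features, then squash each document's topic scores through a
    # softmax onto the simplex: documents with nearby features (e.g.
    # timestamps) receive correlated topic proportions.
    rng = np.random.default_rng(rng)
    d2 = (doc_features[:, None] - doc_features[None, :]) ** 2
    K = np.exp(-d2 / (2.0 * length_scale**2)) + 1e-8 * np.eye(len(doc_features))
    F = rng.multivariate_normal(np.zeros(len(doc_features)), K, size=n_topics).T
    E = np.exp(F - F.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)  # each row is on the simplex
```

Swapping the Dirichlet for a squashed Gaussian is what lets the kernel over document features carry structure between documents.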
17:00-17:30 - Invited Talk: Nonparametric Variational Inference (Matt Hoffman, Adobe)
Talk Abstract:
Variational methods are widely used for approximate posterior inference. However, their use is typically limited to families of distributions that enjoy particular conjugacy properties. To circumvent this limitation, we propose a family of variational approximations inspired by nonparametric kernel density estimation. The locations of these kernels and their bandwidth are treated as variational parameters and optimized to improve an approximate lower bound on the marginal likelihood of the data. Unlike most other variational approximations, using multiple kernels allows the approximation to capture multiple modes of the posterior. We demonstrate the efficacy of the nonparametric approximation with a hierarchical logistic regression model and a nonlinear matrix factorization model. We obtain predictive performance as good as or better than more specialized variational methods and MCMC approximations. The method is easy to apply to graphical models for which standard variational methods are difficult to derive.
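A toy version of this variational family, a uniform-weight Gaussian mixture whose component locations and shared bandwidth are the variational parameters, can be fit by maximizing a naive Monte-Carlo estimate of the ELBO with fixed noise draws. The paper itself uses a different, deterministic approximation to the bound, so the sketch below (all names mine) is illustration only:

```python
import numpy as np
from scipy.optimize import minimize

def log_p(x):
    # Unnormalized bimodal target: equal mixture of N(-2, 1) and N(2, 1).
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)

def fit_kernel_variational(n_kernels=4, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n_samples, n_kernels))  # fixed noise draws

    def neg_elbo(params):
        mu, log_s = params[:n_kernels], params[n_kernels]
        s = np.exp(log_s)
        x = mu + s * eps  # reparameterized samples, one column per kernel
        # log q(x) under the uniform-weight mixture with shared bandwidth s
        lq = np.logaddexp.reduce(
            -0.5 * ((x[:, :, None] - mu[None, None, :]) / s) ** 2
            - np.log(s) - 0.5 * np.log(2.0 * np.pi), axis=2) - np.log(n_kernels)
        return -(log_p(x) - lq).mean()  # negative Monte-Carlo ELBO

    init = np.concatenate([rng.normal(size=n_kernels), [0.0]])
    res = minimize(neg_elbo, init, method="Nelder-Mead",
                   options={"maxiter": 2000})
    return res.x[:n_kernels], np.exp(res.x[n_kernels]), -res.fun, -neg_elbo(init)
```

Because the family is a mixture, several kernel locations can settle in different modes of the target, which a single Gaussian approximation cannot do.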
17:30-18:00 - Coffee Break
18:00-18:30 - Invited Talk: Determinantal Point Processes (Ben Taskar, University of Pennsylvania)
Talk Abstract: 
Determinantal point processes (DPPs) arise in random matrix theory and quantum physics as models of random variables with negative correlations. Among many remarkable properties, they offer tractable algorithms for exact inference, including computing marginals, computing certain conditional probabilities, and sampling. DPPs are a natural model for subset selection problems where diversity is preferred. For example, they can be used to select diverse sets of sentences to form document summaries, or to return relevant but varied text and image search results, or to detect non-overlapping multiple object trajectories in video. In our recent work, we discovered a novel factorization and dual representation of DPPs that enables efficient inference for exponentially-sized structured sets. We developed a new inference algorithm based on Newton identities for DPPs conditioned on subset size. We also derived efficient parameter estimation for DPPs from several types of observations. We demonstrated the advantages of the model on several natural language and vision tasks: extractive document summarization, diversifying image search results and multi-person articulated pose estimation problems in images.
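The tractable-marginals property mentioned above follows from the marginal kernel K = L(L + I)^{-1}: every inclusion probability is the determinant of a submatrix of K. A minimal numpy sketch (function names mine):

```python
import numpy as np

def marginal_kernel(L):
    # For an L-ensemble DPP, the marginal kernel is K = L (L + I)^{-1};
    # then P(A is contained in the sample) = det(K_A).
    return L @ np.linalg.inv(L + np.eye(L.shape[0]))

def inclusion_prob(K, A):
    # Determinant of the submatrix of K indexed by the item set A.
    return np.linalg.det(K[np.ix_(A, A)])
```

For two highly similar items, the joint inclusion probability falls below the product of the singleton probabilities; this negative correlation is what makes DPPs favor diverse subsets.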
18:30-19:00 - Invited Talk: Bayesian Interpretations and Extensions of Kernel Mean Embedding Methods (David Duvenaud, Cambridge University)
Talk Abstract: 
We give a simple interpretation of mean embeddings as expectations under a Gaussian process prior. Methods such as kernel two-sample tests, the Hilbert-Schmidt Independence Criterion, and kernel herding are all based on distances between mean embeddings, also known as the Maximum Mean Discrepancy (MMD). This Bayesian interpretation allows a derivation of optimal herding weights, principled methods of kernel learning, and sheds light on the assumptions necessary for MMD-based methods to work in practice. In the other direction, the MMD interpretation gives tight, closed-form bounds on the error of Bayesian estimators.
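The MMD quantity at the center of this interpretation is simply the RKHS distance between two empirical mean embeddings. A minimal (biased) estimator in numpy, with invented names:

```python
import numpy as np

def mmd2_biased(X, Y, sigma=1.0):
    # Biased estimate of MMD^2 = ||mu_X - mu_Y||^2 in the RKHS of an RBF
    # kernel: mean within-X kernel + mean within-Y kernel - 2 * cross mean.
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma**2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()
```

Kernel two-sample tests, HSIC, and herding all reduce to comparisons of such embedding distances, which is what makes the Gaussian-process reading of the mean embedding apply across all three.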
19:00-19:30 - Open Discussion on Current Challenges and Future Directions