Guy Bresler. Bonnie and Marty (1964) Tenenbaum Career Development Assistant Professor in EECS, MIT.
Title: Community Detection and Invariance to Distribution
Abstract: We consider the problem of recovering a hidden community of size K from a graph in which edges between members of the community have labels drawn i.i.d. according to a distribution P and all other edges have labels drawn i.i.d. according to Q. The information limits for this problem were characterized by Hajek-Wu-Xu in 2016 in terms of the KL-divergence between P and Q. We complement their work by showing that for a broad class of distributions P and Q, the computational difficulty is also determined by the KL-divergence. We additionally show how to reduce general P and Q to the case P = Ber(p) and Q = Ber(q) and vice versa, giving a direct computational equivalence (up to polynomial time).
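To make the setup concrete, here is a minimal sketch of the Bernoulli special case P = Ber(p), Q = Ber(q) (a planted dense subgraph), together with the KL divergence that governs the information limits. This is a toy illustration of the model with made-up function and parameter names, not code from the talk.

```python
# Toy sampler for the hidden-community model with P = Ber(p), Q = Ber(q).
# All names here (sample_hidden_community, kl_bernoulli, ...) are illustrative.
import numpy as np

def kl_bernoulli(p, q):
    """KL divergence D(Ber(p) || Ber(q))."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def sample_hidden_community(n, K, p, q, seed=None):
    """n-node graph; edges inside a hidden size-K community ~ Ber(p), others ~ Ber(q)."""
    rng = np.random.default_rng(seed)
    members = rng.choice(n, size=K, replace=False)
    in_comm = np.zeros(n, dtype=bool)
    in_comm[members] = True

    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if (in_comm[i] and in_comm[j]) else q
            A[i, j] = A[j, i] = int(rng.random() < prob)
    return A, np.sort(members)

A, members = sample_hidden_community(n=200, K=30, p=0.6, q=0.1, seed=0)
print("D(P||Q) =", kl_bernoulli(0.6, 0.1))
```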
Arthur Gretton. Reader (Associate Professor), Gatsby Computational Neuroscience Unit, UCL.
Title: Conditional Densities and Efficient Models in Infinite Exponential Families
Abstract: The exponential family is one of the most powerful and widely used classes of models in statistics. A method was recently developed to fit this model when the natural parameter and sufficient statistic are infinite dimensional, using a score matching approach. The infinite exponential family is a natural generalisation of the finite case, much like the Gaussian and Dirichlet processes generalise their respective finite models.
In this talk, I'll describe two recent results which make this model more applicable in practice, by reducing the computational burden and improving performance for high-dimensional data. The first is a Nyström-like approximation to the full solution. We prove that this approximate solution has the same consistency and convergence rates as the full-rank solution (exactly in Fisher distance, and nearly in other distances), with guarantees on the degree of cost and storage reduction. The second result is a generalisation of the model family to the conditional case, again with consistency guarantees. In experiments, the conditional model generally outperforms a competing approach with consistency guarantees, and is competitive with a deep conditional density model on datasets that exhibit abrupt transitions and heteroscedasticity.
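As background on the first result, the generic Nyström idea is to approximate an n x n Gram matrix using m << n landmark points, so that cost and storage are governed by m rather than n. The sketch below shows this generic approximation for a Gaussian kernel; it is an illustration of the cost-reduction principle under those assumptions, not the estimator analysed in the talk.

```python
# Generic Nystrom-style low-rank approximation of a Gaussian-kernel Gram matrix.
# Function names and parameters are illustrative.
import numpy as np

def gaussian_gram(X, Y, bandwidth):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def nystrom_gram(X, m, bandwidth, seed=None):
    """Approximate the full Gram matrix as K_nm @ pinv(K_mm) @ K_nm.T using m landmarks."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = gaussian_gram(X, X[idx], bandwidth)       # n x m
    K_mm = gaussian_gram(X[idx], X[idx], bandwidth)  # m x m
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

X = np.random.default_rng(0).normal(size=(500, 5))
K_full = gaussian_gram(X, X, bandwidth=1.0)
K_approx = nystrom_gram(X, m=50, bandwidth=1.0, seed=1)
print("relative error:", np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full))
```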
Negar Kiyavash. Associate Professor, ECE, UIUC.
Title: Recovering Latent Causal Relations from Time Series Data
Abstract: Discovering causal relationships from data is a challenging problem that is exacerbated when some of the variables of interest are latent. In this talk, we discuss the problem of learning the support of the transition matrix between random processes in a Vector Autoregressive (VAR) model from samples when a subset of the processes is latent. It is well known that ignoring the effect of the latent processes may lead to very different estimates of the influences, even among the observed processes. We are not only interested in identifying the influences among the observed processes, but also aim to learn those between the latent ones and those from the latent to the observed ones. We show that the support of the transition matrix among the observed processes and the lengths of all latent paths between any two observed processes can be identified under certain conditions on the VAR model. Our results apply to both non-Gaussian and Gaussian cases, and experimental results on various synthetic and real-world datasets validate our theoretical findings.
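The omitted-variable effect described above is easy to see in a small simulation. The sketch below uses a made-up 3-process VAR(1) in which process 2 is latent and drives the two observed processes, then fits an ordinary least-squares VAR on the observed coordinates only; the estimated off-diagonal influences are typically nonzero even though the true observed-to-observed block is diagonal. This is an illustrative toy, not the identification procedure from the talk.

```python
# Toy VAR(1) with a latent common driver; names and numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

# Full transition matrix over 3 processes; process 2 is treated as latent.
A = np.array([[0.5, 0.0, 0.4],
              [0.0, 0.5, 0.4],
              [0.0, 0.0, 0.5]])
T = 5000
X = np.zeros((T, 3))
for t in range(1, T):
    X[t] = A @ X[t - 1] + rng.normal(scale=0.1, size=3)

# Least-squares VAR fit using only the observed processes (columns 0 and 1).
obs = X[:, :2]
B, *_ = np.linalg.lstsq(obs[:-1], obs[1:], rcond=None)
# B.T estimates the observed-block transition matrix; the off-diagonal entries
# are biased away from zero by the omitted latent driver.
print(np.round(B.T, 3))
```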
Caroline Uhler. Henry L. and Grace Doherty Assistant Professor, IDSS, MIT.
Title: Your dreams may come true with MTP2
Abstract: We study maximum likelihood estimation for exponential families that are multivariate totally positive of order two (MTP2). Such distributions appear in the context of ferromagnetism in the Ising model and in various latent models, such as the Brownian motion tree models used in phylogenetics. We show that maximum likelihood estimation for MTP2 exponential families is a convex optimization problem. For quadratic exponential families such as Ising models and Gaussian graphical models, we show that MTP2 implies sparsity of the underlying graph without the need for a tuning parameter. In addition, we characterize a subgraph and a supergraph of Gaussian graphical models under MTP2. Moreover, we show that the MLE always exists even in the high-dimensional setting. These properties make MTP2 constraints an intriguing alternative to methods for learning sparse graphical models such as the graphical lasso.
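In the Gaussian case, MTP2 corresponds to the precision matrix being an M-matrix (nonpositive off-diagonal entries), so the MLE can be written as the usual Gaussian log-likelihood maximized under sign constraints. Below is a minimal sketch of that convex program using cvxpy (assumed to be installed); it is a generic rendering of the convexity statement, not code from the talk.

```python
# Gaussian MTP2 MLE as a convex program: maximize log det(K) - tr(S K)
# subject to K_ij <= 0 for i != j. Sketch only; names are illustrative.
import cvxpy as cp
import numpy as np

def mtp2_gaussian_mle(S):
    """MLE of the precision matrix under the M-matrix (MTP2) constraint."""
    p = S.shape[0]
    K = cp.Variable((p, p), symmetric=True)
    sign_constraints = [K[i, j] <= 0 for i in range(p) for j in range(p) if i != j]
    objective = cp.Maximize(cp.log_det(K) - cp.trace(S @ K))
    cp.Problem(objective, sign_constraints).solve()
    return K.value

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
S = np.cov(X, rowvar=False)
K_hat = mtp2_gaussian_mle(S)
print(np.round(K_hat, 3))  # off-diagonal entries <= 0; zeros define the estimated graph
```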
Tandy Warnow. Founder Professor in Bioengineering and Computer Science, UIUC.
Title: Mathematical and Computational Challenges in Reconstructing Evolution
Abstract: Reconstructing evolutionary histories is a basic step in much biological discovery, as well as in historical linguistics and other domains. Inference methods based on mathematical models of evolution have been used to make substantial advances, including understanding the early origins of life, predicting protein structures and functions, and addressing questions such as "Where did the Indo-European languages begin?" In this talk, I will describe the current state of the art in phylogeny estimation in these domains, explain what is understood from a mathematical perspective, and identify fascinating open problems where novel mathematical research - drawing from graph theory, algorithms, and probability theory - is needed. This talk will be accessible to mathematicians, computer scientists, and probabilists, and does not require any background in biology.