Past Talks and Recordings

Monday, May 24, 2021, Georgy Scholten (NCSU):

- Speaker: Georgy Scholten (NCSU)
- Title: Moment cones and the coalescence manifold
- Abstract: The main topics of this talk are cones of univariate moments generated by piecewise-constant density functions on the interval [0,1] and how they appear in inference problems in population genetics. We show that, up to closure, any collection of n-th moments is achieved by a step function with at most n-1 breakpoints and that this bound is tight. Both the moment cones and the coalescence manifold are projected spectrahedra and we describe the problem of finding a nearest point on them as a semidefinite program.

Georgy.pdf

Monday, May 10, 2021, Liam Solus (KTH):

- Speaker: Liam Solus (KTH)
- Title: On the Geometry of Causal Discovery
- Abstract: The basic problem of causal discovery is to learn, in the form of a directed acyclic graph (DAG), the cause-effect relations that hold within a system of random variables based solely on data drawn from their underlying joint distribution. While a variety of approaches to this problem have been explored, algorithms that greedily transform one candidate DAG into another given a fixed set of moves have been particularly successful. In this talk, we will observe that the standard greedy approaches to causal discovery are all one and the same from a geometric perspective: they are each a different type of greedy edge-walk along a specific convex polytope. We will see how the method of proof for this result offers natural extensions of classical greedy algorithms, extensions that tend to perform better on simulated data. We will end with a discussion of the doors opened by this result: a new direction of research in discrete geometry for better causal discovery methods.


Liam.pdf

Monday, April 26, 2021, Jonathan Niles-Weed (NYU):

- Speaker: Jonathan Niles-Weed (NYU)
- Title: Matrix concentration for products and the streaming k-PCA problem
- Abstract: We develop nonasymptotic concentration bounds for products of independent random matrices. Our bounds exactly match those available for scalar random variables and continue the program, initiated by Ahlswede-Winter, Oliveira, and Tropp, of extending familiar concentration bounds to the noncommutative setting. The argument relies on the optimal uniform smoothness properties of the Schatten trace class. We apply our results to analyze a non-convex stochastic descent algorithm for streaming PCA due to Erkki Oja. Despite its simplicity, this algorithm has resisted optimal analysis outside of the rank-one case. Based on joint work with Huang, Tropp, and Ward.


Monday, April 12, 2021, Hideyuki Ishi (Osaka City University):

- Speaker: Hideyuki Ishi (Osaka City University)
- Title: Gaussian Group Convex Models
- Abstract: A Gaussian group model is a statistical model consisting of centered multivariate Gaussian distributions whose covariance matrices are of the form g g', where g is an element of a matrix group G. Such a statistical model is recently studied by C. Améndola, K. Kohn, P. Reichenbach, and A. Seigal in terms of Geometric Invariant Theory. In this talk, we consider a Gaussian group model whose parameter set is convex in the vector space of symmetric matrices in a different way. Using the theory of affine homogeneous convex domains and left-symmetric algebras established by E. Vinberg, we shall describe the parameter set as a set of symmetric matrices with specific block decompositions. Then we give the smallest number of samples such that the maximum likelihood estimator exists with probability one. Furthermore, the MLE is unique if it exists, and is expressed explicitly as a rational function of samples.

Ishi.pdf

Monday, March 29, 2021, Jane Coons (NCSU):

- Speaker: Jane Coons (NCSU)
- Title: Quasi-Independence Models with Rational Maximum Likelihood Estimator
- Abstract: Let X and Y be random variables. Quasi-independence models are log-linear models that describe a situation in which some states of X and Y cannot occur together, but X and Y are otherwise independent. We characterize which quasi-independence models have rational maximum likelihood estimator, or MLE, based on combinatorial features of the bipartite graph associated to the model. In this case, we give an explicit formula for the maximum likelihood estimate. We also show that if a log-linear model has rational MLE, then so do all of its facial submodels.

Coons.pdf

Monday, March 15, 2021: Simon Telen (MPI-MiS Leipzig)

- Speaker: Simon Telen (MPI-MiS Leipzig)
- Title: Likelihood Equations and Scattering Amplitudes
- Abstract: We identify the scattering equations from particle physics as the likelihood equations for a particular statistical model. The scattering potential plays the role of the log-likelihood function. We employ recent methods from numerical algebraic geometry for solving rational function equations to compute and certify all critical points, and show that the same methods can be used to study the ML degree of low rank tensor models. We revisit physical theories proposed by Arkani-Hamed, Cachazo and their collaborators, introducing positive statistical models and their string amplitudes. Joint work with Bernd Sturmfels.


Telen.pdf

Monday, March 1, 2021: Yihong Wu (Yale University)

- Speaker: Yihong Wu (Yale University)
- Title: Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models
-
Abstract: Introduced by Kiefer and Wolfowitz 1956, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture models and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme form of overparameterization. In this work we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size n has O(\log n) atoms (mass points), significantly improving the deterministic upper bound of n due to Lindsay 1983. Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with O(\log n) components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, NPMLE automatically chooses the right model complexity, a property we term self-regularization. Statistical applications and extensions to other exponential families will be given. Time permitting, we will discuss some recent results on optimal regret in empirical Bayes and the role of NPMLE. This is based on joint work with Yury Polyanskiy (MIT)

Wu.pdf

Monday, February 15, 2021: Fatemeh Mohammadi (Ghent University)

- Speaker: Fatemeh Mohammadi (Ghent University)
- Title: Geometry of Conditional Independence Models with Hidden Variables
-
Abstract: Conditional independence (CI) is an important tool in statistical modeling, as, for example, it gives a statistical interpretation to graphical models. In general, given a list of dependencies among random variables, it is difficult to say which constraints are implied by them. Moreover, it is important to know what constraints on the random variables are caused by hidden variables. On the other hand, the CI statements are corresponding to some determinantal conditions on the tensor of joint probabilities of the observed random variables. Hence, the geometric analogue of the inference question relates to determinantal varieties and their irreducible decompositions. I will demonstrate how the decompositions of CI varieties lead to interesting algebraic and combinatorial questions about point configurations in matroid theory and incidence geometry. This, in particular, leads to effective computational approaches for decomposing more general determinantal varieties.

Mohammadi.pdf

Monday, February 1, 2021: Wenxuan Guo (U Chicago)

  • - Speaker: Wenxuan Guo (U Chicago)
    - Title: Shepp p-product
    - Abstract: In 1962, Shepp famously discovered a product of normal random variables that preserves normality. The Shepp product, which takes the form XY/(X^2 + Y^2)^1/2, has since been thoroughly studied and has found numerous connections to other areas of statistics. Among other things, it has an extension to n normal variables, gives a multiplicative analogue of central limit theorem, and applies unexpectedly to genomics as a test statistics for alignment-free sequence analysis. The Shepp product is evidently the p = 2 special case of XY/(X^p + Y^p)^1/p that we call the Shepp p-product. We will show that the Shepp p-product, particularly when p = 1 and ∞ (the latter in a limiting sense), is no less fascinating and applicable than the original p = 2 case. Just as the Shepp 2-product preserves normal distributions, the Shepp 1-product preserves Cauchy distributions while the Shepp ∞-product preserves exponential distributions. In fact, the converse is also true in an appropriate sense, allowing us to characterize the Cauchy, normal, and exponential distributions as the unique distributions preserved by the Shepp p-product for p = 1, 2, ∞ respectively. We will study the multiplicative analogue of infinite divisibility with respect to the Shepp p-product, establish an asymptotic theory for the Shepp p-product of n i.i.d. random variables, and estimate the rates of convergence in Kolmogorov distance. Alongside our study of convergence rates, we define the domain of normal attraction of extremal distributions and establish a new rate of uniform convergence to Frechet distribution and reverse Weibull distribution. Some of our results are new even for the p = 2 case. We will also discuss new applications of the Shepp p-product in statistics, computational biology, and statistical physics. This is joint work with Lek-Heng Lim.


Guo.pdf

Monday, January 18, 2021: Eliana Duarte (OvGU Magdeburg)

  • - Speaker: Eliana Duarte (OvGU Magdeburg)
    - Title: Algebraic Geometry of Discrete Interventional Models
    -
    Abstract: The Markov equivalence class of a discrete DAG model can be described parametrically via the recursive factorization property or implicitly by polynomial ideals which are defined via Markov properties of the DAG. We address the problem of describing the Markov equivalence classes of discrete DAG models with interventions using polynomial parameterizations and vanishing ideals. We show that the algebraic and combinatorial properties of these models are captured via an interventional staged tree model representation. This point of view leads us to a graphical characterization of the discrete interventional DAG models that are defined by binomial equations. This is joint work with Liam Solus (KTH, Sweden).

Duarte.pdf

Friday, December 4, 2020: Jose Rodriguez (University of Wisconsin - Madison)

  • - Speaker: Jose Rodriguez (University of Wisconsin - Madison)
    - Title: Galois Groups in Statistics
    -
    Abstract: Solving systems of polynomial equations is at the center of applied algebraic geometry. A common theme of this field is to study a family of systems by allowing some of its coefficients to vary. In algebraic statistics, the role of coefficients is played by data and the solutions we find yield maximum likelihood estimates, critical points, and/or important information about a statistical model. An important invariant of a family of systems is the Galois (monodromy) group. This captures important symmetries within a system and has applications across kinematics, computer vision, power engineering and statistics. In each case, the Galois group gives a description of how a system's solutions can vary with data.

In this talk, I will present three short stories about Galois groups appearing in statistics. The first story emphasizes the idea of treating data as coefficients of a polynomial system. We will visualize the monodromy group acting in a nearest point problem where Euclidean distance (ED) degrees make an appearance. The next story involves Gaussian mixtures and decomposable systems. If time permits, I will share a third story on how decomposable sparse systems play a role in solving the likelihood equations.

Rodriguez.pdf

Friday, November 20, 2020: Kaie Kubjas (Aalto)

  • - Speaker: Kaie Kubjas (Aalto University)
    - Title: Uniqueness of nonnegative matrix factorizations
    -
    Abstract: Nonnegative matrix factorizations are often encountered in data mining applications where they are used to explain datasets by a small number of parts. For many of these applications it is desirable that there exists a unique nonnegative matrix factorization up to trivial modifications given by scalings and permutations. This means that model parameters are uniquely identifiable from the data. Different sufficient conditions for the uniqueness of nonnegative matrix factorizations have been well studied, however, a little is known about necessary conditions. We will give so far the strongest necessary condition for the uniqueness of a nonnegative factorization. The talk is based on the joint work with Robert Krone.


Kubjas.pdf

Friday, November 6, 2020: Zehua Lai (U Chicago)

  • - Speaker: Zehua Lai (University of Chicago)
    - Title: Recht–Re Noncommutative Arithmetic-Geometric Mean Conjecture is False
    -
    Abstract: Stochastic optimization algorithms have become indispensable in modern machine learning. An important question in this area is the difference between with-replacement sampling and without-replacement sampling --- does the latter have superior convergence rate compared to the former? A paper of Recht and Re reduces the problem to a noncommutative analogue of the arithmetic-geometric mean inequality where n positive numbers are replaced by n positive definite matrices. If this inequality holds for all n, then without-replacement sampling (also known as random reshuffling) indeed outperforms with-replacement sampling in some important optimization problems. In this talk, We will explain basic ideas and techniques in polynomial optimization and the theory of noncommutative Positivstellensatz, which allows us to reduce the conjectured inequality to a semidefinite program and the validity of the conjecture to certain bounds for the optimum values. Finally, we show that Recht--Re conjecture is false as soon as n=5. We will also discuss some of the consequences of our main theorem. This is a joint work with Lek-Heng Lim.


Lai.pdf

Thursday, October 22, 2020: Rina Foygel Barber (U Chicago)

  • - Speaker: Rina Foygel Barber (University of Chicago)
    - Title: Testing goodness-of-fit and conditional independence with approximate co-sufficient sampling
    -
    Abstract: Goodness-of-fit (GoF) testing is ubiquitous in statistics, with direct ties to model selection, confidence interval construction, conditional independence testing, and multiple testing, just to name a few applications. While testing the GoF of a simple (point) null hypothesis provides an analyst great flexibility in the choice of test statistic while still ensuring validity, most GoF tests for composite null hypotheses are far more constrained, as the test statistic must have a tractable distribution over the entire null model space. A notable exception is co-sufficient sampling (CSS): resampling the data conditional on a sufficient statistic for the null model guarantees valid GoF testing using any test statistic the analyst chooses. But CSS testing requires the null model to have a compact (in an information-theoretic sense) sufficient statistic, which only holds for a very limited class of models; even for a null model as simple as logistic regression, CSS testing is powerless. In this paper, we leverage the concept of approximate sufficiency to generalize CSS testing to essentially any parametric model with an asymptotically-efficient estimator; we call our extension “approximate CSS” (aCSS) testing. We quantify the finite-sample Type I error inflation of aCSS testing and show that it is vanishing under standard maximum likelihood asymptotics, for any choice of test statistic. We apply our proposed procedure both theoretically and in simulation to a number of models of interest to demonstrate its finite-sample Type I error and power. This work is joint with Lucas Janson.


Foygel.pdf

Friday, October 9, 2020: Serkan Hoşten (SFSU)

  • - Speaker: Serkan Hosten (San Francisco State University)
    - Title: Two themes on (Gram) spectrahedra: central curves and symmetry
    -
    Abstract: This talk is based on two collaborations: one with Alex Heaton and Isabelle Shankar on symmetry adapted Gram spectrahedra, and the other with Isabelle Shankar and Angelica Torres on the degree of the central curve in semidefinite programming (SDP). The objects in common are spectrahedra. The question for the degree of the central curve in SDP (where feasible regions are spectrahedra) has its answer in algebraic statistics as the ML degree of related linear concentration models and the relevant geometry of complete quadrics. On the symmetry adapted Gram spectrahedra side, we use reductions in complexity of Gram spectrahedra for symmetric polynomials to understand the geometry of these convex sets. Here I will focus on concrete families and examples.

Hosten.pdf

Friday, September 25, 2020: Thomas Kahle (OvGU Magdeburg)

  • - Speaker: Thomas Kahle (OvGU Magdeburg)
    - Title: Central Limit Theorems for Permutation Statistics
    -
    Abstract: The number of inversions or descents of a random permutation in a large symmetric group is asymptotically normally distributed. We discuss extensions of this principle to arbitrary families of finite Coxeter groups of increasing rank. As a prerequisite we find uniform formulas for the means and variances in terms of Coxeter group data. The main gadget for central limit theorems is the Lindeberg-Feller theorem for triangular arrays. Transferring the Lindeberg condition to the combinatorial setting, one finds that the validity of a central limit theorem depends on the growth of the dihedral subgroups in the sequence.


Kahle.pdf

Friday, September 11, 2020: Joe Kileel (UT Austin)

  • - Speaker: Joe Kileel (UT Austin)
    - Title: Fast Symmetric Tensor Decomposition
    -
    Abstract: Symmetric tensor decomposition arises when applying the method of moments for statistical parameter estimation. In this talk, we will present a new method for low-rank decomposition, building on Sylvester’s catalecticant method and the higher-order power method. The approach achieves a speed-up over the state-of-the-art by roughly one order of magnitude. We will also describe an “implicit” variant of the algorithm for the case of moment tensors, which avoids the explicit formation of higher-order moments. The ideas are quite flexible. We will sketch how to adapt them for generalized principal component analysis (learning subspace arrangements from samples). Time permitting, we will also indicate an exciting connection to scientific imaging, specifically single-particle reconstruction in cryo-electron microscopy.


Kileel.pdf

Friday, August 28, 2020: Seth Sullivant (NCSU)

  • - Speaker: Seth Sullivant (North Carolina State University)
    - Title: Identifiability in Phylogenetics using Algebraic Matroids
    -
    Abstract: Identifiability is a crucial property for a statistical model since distributions in the model uniquely determine the parameters that produce them. In phylogenetics, the identifiability of the tree parameter is of particular interest since it means that phylogenetic models can be used to infer evolutionary histories from data. In this paper we introduce a new computational strategy for proving the identifiability of discrete parameters in algebraic statistical models that uses algebraic matroids naturally associated to the models. We then use this algorithm to prove that the tree parameters are generically identifiable for 2-tree CFN and K3P mixtures. We also show that the k-cycle phylogenetic network parameter is identifiable under the K2P and K3P models. This is joint work with Benjamin Hollering.


MatroidIdentifiability.pdf

Friday, July 17, 2020: Ngoc Tran (UT Austin)

  • - Speaker: Ngoc Tran (UT Austin)
    - Title: Graphical Models for Extreme Events with Tropical Algebra
    -
    Abstract: Motivated by extreme value theory, max-linear Bayesian networks have been recently introduced and studied as an alternative to linear structural equation models. However, for max-linear systems the classical independence results for Bayesian networks are far from exhausting valid conditional independence statements. We use tropical linear algebra to derive a compact representation of the conditional distribution given a partial observation, and exploit this to obtain a complete description of all conditional independence relations. Our analysis opens up several interesting questions concerning conditional independence and tropical geometry. Joint work with Carlos Améndola, Claudia Klüppelberg, and Steffen Lauritzen.

Slides-Ngoc.pdf

Friday, July 3, 2020: Anna Seigal (University of Oxford)

  • - Speaker: Anna Seigal (University of Oxford)
    - Title: Invariant Theory for Maximum Likelihood Estimation
    -
    Abstract: The task of fitting data to a model is fundamental in statistics; a widespread approach to this is via maximum likelihood estimation. In invariant theory, the capacity problem looks for a point of minimal norm in an orbit under a group action. We show that maximum likelihood estimation is equivalent to a capacity problem in two statistical settings: log-linear models and Gaussian transformation families. This enables us to characterise the existence of a maximum likelihood estimate using stability notions under a corresponding group action. We see how algorithms from statistics can be used in invariant theory, and vice versa. This talk is based on joint work with Carlos Améndola, Kathlén Kohn, and Philipp Reichenbach.

AnnaSeigal.pdf

Friday, June 19, 2020: Bernd Sturmfels (MPI Leipzig/UC Berkeley)

  • - Speaker: Bernd Sturmfels (MPI Leipzig/UC Berkeley)
    - Title: Statistical Models with Rational Maximum Likelihood Estimator
    -
    Abstract: A statistical model for discrete data is a subset of a probability simplex. Its maximum likelihood estimator (MLE) is a retraction from the simplex to the model. We characterize all models for which the MLE is a rational function. This rests on real algebraic geometry results, mostly due to Huh and Kapranov. Our discussion centers around models used in practise, like Bayesian networks, graphical models, and staged trees. This is joint work with Eliana Duarte and Orlando Marigliano.

BerndSturmfels.pdf

Friday, June 5, 2020: Caroline Uhler (MIT/ETH)

  • - Speaker: Caroline Uhler (MIT/ETH)
    - Title: Permutations and Posets for Causal Structure Discovery
    -
    Abstract: Gene knockout experiments allow performing interventions in large-scale systems. This represents a unique opportunity for causal structure discovery, since it allows testing algorithms with real data and in relevant settings. We discuss the rich combinatorial, algebraic and geometric questions underlying causal structure discovery. In particular, we show that viewing causal structure discovery as an optimization problem over permutations (in the fully observed setting) or posets (in the presence of unobserved variables) can lead to algorithms with stronger consistency guarantees than previously known, which translates into better performance in terms of predicting the effect of a gene knockout experiment.

CarolineUhler.pdf