Research

My research lies in the interplay of algebraic geometry, combinatorics and representation stability with statistics and data science*. It concerns discovering algebraic and geometric structures in statistical models and using those structures to gain insight into the models. More specifically, with collaborators, I:

My work covers both discrete and multivariate Gaussian models. This includes hierarchical models, staged tree models, no-three-way interaction models, phylogenetic models, and (colored) Gaussian graphical models. Each research project involves software: Julia, Macaulay2, or SageMath. Projects in 1 and 2 are supported by an NSF standard grant.

 *The survey Nonlinear Algebra and Applications that I have coauthored includes an overview of applications of algebraic geometry to statistics.

Here are two recent recorded presentations related to work on algebraic approaches to Brownian and Gaussian graphical models. 

Publications 

The number to the right of each title is the preprint year. All papers are available on arXiv and Google Scholar.

12. Contrastive Independent Component Analysis (2024)

Visualizing data and finding patterns in data are ubiquitous problems in the sciences. Increasingly, applications seek signal and structure in a contrastive setting: a foreground dataset relative to a background dataset. For this purpose, we propose contrastive independent component analysis (cICA). This generalizes independent component analysis to independent latent variables across a foreground and background. We propose a hierarchical tensor decomposition algorithm for cICA. We study the identifiability of cICA and demonstrate its performance visualizing data and finding patterns in data, using synthetic and real-world datasets, comparing the approach to existing contrastive methods.

11. ML Degrees of Brownian Motion Tree Models: Star Trees and Root Invariance (2024)

A Brownian motion tree (BMT) model is a Gaussian model whose associated set of covariance matrices is linearly constrained according to common ancestry in a phylogenetic tree. We study the complexity of inferring the maximum likelihood (ML) estimator for a BMT model by computing its ML-degree. Our main result is that the ML-degree of the BMT model on a star tree with n+1 leaves is 2^{n+1}-2n-3, which was previously conjectured by Améndola and Zwiernik. We also prove that the ML-degree of a BMT model is independent of the choice of the root. The proofs rely on the toric geometry of concentration matrices in a BMT model. Toward this end, we produce a combinatorial formula for the determinant of the concentration matrix of a BMT model, which generalizes the Cayley-Prüfer theorem to complete graphs with weights given by a tree.
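As a small numerical illustration of the star-tree formula above, the following Python sketch tabulates 2^{n+1}-2n-3 for a few values of n. The function name `star_tree_ml_degree` is hypothetical, and the code only evaluates the closed formula; it does not solve the likelihood equations of a BMT model.

```python
def star_tree_ml_degree(n: int) -> int:
    """Evaluate 2^(n+1) - 2n - 3, the ML-degree formula stated above
    for the BMT model on a star tree with n + 1 leaves.

    Assumption: n >= 2, so that the value is a positive integer.
    """
    if n < 2:
        raise ValueError("this sketch assumes n >= 2")
    return 2 ** (n + 1) - 2 * n - 3


if __name__ == "__main__":
    # Print (number of leaves, value of the formula) for small star trees.
    for n in range(2, 7):
        print(n + 1, star_tree_ml_degree(n))
```

The exponential term dominates quickly, which gives a rough sense of how the algebraic complexity of maximum likelihood estimation grows with the number of leaves.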

Talk__IMSI_Brownian_AidaMaraj (3).pdf

10. Lawrence Lifts, Matroids, and Maximum Likelihood Degrees (2023)

We express the maximum likelihood (ML) degrees of a family of toric varieties in terms of Möbius invariants of matroids. The family of interest consists of those varieties parametrized by monomial maps given by Lawrence lifts of totally unimodular matrices with even circuits. Specifying these matrices to be vertex-edge incidence matrices of bipartite graphs gives the ML degrees of some hierarchical models and three-dimensional quasi-independence models. Included in this list are the no-three-way interaction models with one binary random variable, for which we give closed formulae.
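To make the objects in this abstract concrete, here is a minimal Python sketch that builds the vertex-edge incidence matrix of a bipartite graph and a Lawrence lift of it. It assumes one common convention, in which the Lawrence lift of a d x n matrix A is the (d+n) x 2n block matrix [[A, 0], [I, I]]; the paper's conventions may differ. The function names and the example graph (a 4-cycle) are hypothetical choices for illustration.

```python
def incidence_matrix(num_vertices, edges):
    """Vertex-edge incidence matrix: rows are vertices, columns are edges."""
    matrix = [[0] * len(edges) for _ in range(num_vertices)]
    for col, (u, v) in enumerate(edges):
        matrix[u][col] = 1
        matrix[v][col] = 1
    return matrix


def lawrence_lift(a):
    """Return the block matrix [[A, 0], [I, I]] (assumed convention)."""
    n = len(a[0])
    # Top block: A followed by an all-zero block of the same width.
    top = [row + [0] * n for row in a]
    # Bottom block: two copies of the n x n identity, side by side.
    bottom = [[1 if j == i else 0 for j in range(n)] * 2 for i in range(n)]
    return top + bottom


if __name__ == "__main__":
    # A 4-cycle viewed as a bipartite graph with parts {0, 1} and {2, 3}.
    edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
    lift = lawrence_lift(incidence_matrix(4, edges))
    for row in lift:
        print(row)
```

The columns of the lifted matrix are the exponent vectors of a monomial map, which is how such a matrix gives rise to a toric variety.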

9. Symmetry Lie Algebras of Varieties with Applications to Algebraic Statistics (2023)

The motivation for this paper is to detect when an irreducible projective variety V is not toric. We do this by analyzing a Lie group and a Lie algebra associated to V. If the dimension of V is strictly less than the dimension of the above-mentioned objects, then V is not a toric variety. We provide an algorithm to compute the Lie algebra of an irreducible variety and use it to provide examples of non-toric statistical models in algebraic statistics.

AMS_Talk_Howard (2).pdf

8. Shift Invariant Algebras, Segre Products and Regular Languages (2022)

Motivated by results on the rationality of equivariant Hilbert series of some hierarchical models in algebraic statistics, we introduce the Segre product of formal languages and apply it to establish rationality of equivariant Hilbert series in new cases. To this end, we show that the Segre product of two regular languages is again regular. We also prove that every filtration of algebras given as a tensor product of families of algebras with rational equivariant Hilbert series has a rational equivariant Hilbert series. The term equivariant is used broadly to include the action of the monoid of nonnegative integers by shifting variables. Furthermore, we exhibit a filtration of shift invariant monomial algebras that has a rational equivariant Hilbert series, but whose presentation ideals do not stabilize.

Talk_SegreLanguages.pdf

7. Symmetrically Colored Gaussian Graphical Models with Toric Vanishing Ideals (2021)

A colored Gaussian graphical model is a linear concentration model in which equalities among the concentrations are specified by a coloring of an underlying graph. The model is called RCOP if this coloring is given by the edge and vertex orbits of a subgroup of the automorphism group of the graph. We show that RCOP Gaussian graphical models on block graphs are toric in the space of covariance matrices and we describe Markov bases for them. To this end, we learn more about the combinatorial structure of these models and their connection with Jordan algebras.

texasTalk.pdf

6. Staged Tree Models with Toric Structure (2021)

A staged tree model is a discrete statistical model encoding relationships between events. These models are realised by directed trees with coloured vertices. In algebro-geometric terms, the model consists of points inside a toric variety. For certain trees, called balanced, the model is in fact the intersection of the toric variety and the probability simplex. This gives the model a straightforward description, and has computational advantages. In this paper we show that the class of staged tree models with a toric structure extends far outside of the balanced case, if we allow a change of coordinates. It is an open problem whether all staged tree models have toric structure.

Talk_AlgStat_Hawaii.pdf

5. Nonlinear Algebra and Applications (2021)

We showcase applications of nonlinear algebra in the sciences and engineering. Our survey is organized into eight themes: polynomial optimization, partial differential equations, algebraic statistics, integrable systems, configuration spaces of frameworks, biochemical reaction networks, algebraic vision, and tensor decompositions. Conversely, developments on these topics inspire new questions and algorithms for algebraic geometry.

AStat why relevant?.pdf

4. Reciprocal Maximum Likelihood Degrees of Brownian Motion Tree Models (2020)

We give an explicit formula for the reciprocal maximum likelihood degree of Brownian motion tree models. To achieve this, we connect them to certain toric (or log-linear) models, and express the Brownian motion tree model of an arbitrary tree as a toric fiber product of star tree models.

YoungStats-2.pdf

3. Generalized Cut Polytopes for Binary Hierarchical Models (2020)

Marginal polytopes are important geometric objects that arise in statistics as the polytopes underlying hierarchical log-linear models. These polytopes can be used to answer geometric questions about these models, such as determining the existence of maximum likelihood estimates or the normality of the associated semigroup. Cut polytopes of graphs have been useful in analyzing binary marginal polytopes in the case where the simplicial complex underlying the hierarchical model is a graph. We introduce a generalized cut polytope that is isomorphic to the binary marginal polytope of an arbitrary simplicial complex. This polytope is full dimensional in its ambient space and has a natural switching operation among its facets that can be used to deduce symmetries between the facets of the correlation and binary marginal polytopes. We use this switching operation along with Bernstein and Sullivant's characterization of unimodular simplicial complexes to find a complete H-representation for many unimodular simplicial complexes.
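For readers unfamiliar with cut polytopes, the sketch below enumerates the vertices of the classical cut polytope of a graph: for each subset S of the vertex set, the cut vector has a 1 in the coordinate of every edge with exactly one endpoint in S. This illustrates only the graph case mentioned above, not the generalized cut polytope introduced in the paper; the function name `cut_vectors` is hypothetical.

```python
def cut_vectors(num_vertices, edges):
    """Return the distinct 0/1 cut vectors of a graph, sorted.

    Vertices are labeled 0, ..., num_vertices - 1, and each coordinate of a
    cut vector corresponds to one edge, in the order given by `edges`.
    """
    vectors = set()
    for mask in range(2 ** num_vertices):
        # side[v] is 1 if vertex v lies in the subset S encoded by mask.
        side = [(mask >> v) & 1 for v in range(num_vertices)]
        # An edge is cut when its endpoints lie on different sides.
        vectors.add(tuple(int(side[u] != side[v]) for u, v in edges))
    return sorted(vectors)


if __name__ == "__main__":
    triangle = [(0, 1), (0, 2), (1, 2)]
    for vector in cut_vectors(3, triangle):
        print(vector)
```

A subset S and its complement give the same cut vector, so a connected graph on k vertices has 2^(k-1) distinct cut vectors; the cut polytope is their convex hull.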

2. Ph.D. Thesis: Algebraic and Geometric Properties of Hierarchical Models (2020)

In this dissertation, filtrations of ideals arising from hierarchical models in statistics related by a group action are studied. These filtrations lead to ideals in polynomial rings in infinitely many variables, which require innovative tools. Regular languages and finite automata are used to prove and explicitly compute the rationality of some multivariate power series that record important quantitative information about the ideals. Some work regarding Markov bases for non-reducible models is shown, together with advances in the polyhedral geometry of binary hierarchical models.

Hierarchical_Models.pdf

1. Equivariant Hilbert Series for Hierarchical Models (2019)

Toric ideals of hierarchical models are invariant under the action of a product of symmetric groups. Taking the number of factors, say m, into account, we introduce and study invariant filtrations and their equivariant Hilbert series. We present a condition that guarantees that the equivariant Hilbert series is a rational function in m + 1 variables with rational coefficients. Furthermore, we give explicit formulas for the rational functions with coefficients in a number field and an algorithm for determining the rational functions with rational coefficients. A key is to construct finite automata that recognize languages corresponding to invariant filtrations.
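The link between finite automata and rational series can be seen in a toy univariate case. The sketch below counts the words of each length accepted by a small deterministic automaton; such counting sequences satisfy a linear recurrence, which is equivalent to their generating function being rational. The automaton (words over {a, b} with no two consecutive a's) and the function name are hypothetical choices for illustration, not the automata or the multivariate equivariant series constructed in the paper.

```python
def count_accepted_words(transitions, start, accepting, max_length):
    """Return counts, where counts[n] is the number of accepted words of length n.

    `transitions` maps a state to a dict {letter: next_state}; missing letters
    are treated as transitions to an implicit rejecting sink state.
    """
    paths = {start: 1}  # number of words of the current length ending in each state
    counts = []
    for _ in range(max_length + 1):
        counts.append(sum(c for state, c in paths.items() if state in accepting))
        next_paths = {}
        for state, c in paths.items():
            for target in transitions.get(state, {}).values():
                next_paths[target] = next_paths.get(target, 0) + c
        paths = next_paths
    return counts


if __name__ == "__main__":
    # State 0: the word is empty or ends in b. State 1: the word ends in a.
    no_double_a = {0: {"a": 1, "b": 0}, 1: {"b": 0}}
    print(count_accepted_words(no_double_a, 0, {0, 1}, 8))
```

For this automaton the counts satisfy c_n = c_{n-1} + c_{n-2}, so the generating function is (1 + t) / (1 - t - t^2).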

Slides__Equivariant_Hilbert_Series.pdf