Research

My research lies in the interplay of algebraic geometry, combinatorics and representation stability with statistics and data science*. It concerns discovering algebraic and geometric structures of statistical models, and using these structures to gain insight into the models. More specifically, with collaborators, I:

1. investigate whether the algebraic structure of a statistical model is toric. This includes finding Markov bases and Gröbner bases for the associated ideal and providing tools for detecting toricness; most recently, we proposed symmetry Lie algebras of varieties as such a tool.
2. use tools from algebraic geometry to advance maximum likelihood (ML) estimation problems. This involves optimization and computational algebraic geometry, algebraic degree, Gröbner basis theory, intersection theory, toric fiber products, matroid theory, localization and primary decomposition. On occasion, this involves the polyhedral geometry of the associated polytopes.
3. investigate asymptotic behaviors of infinitely many ideals, algebras, polytopes, or statistical models, related through a group action. In this work, we connect questions about them with formal language and automata theory from computer science. Most recently, we introduced the Segre and tensor products of formal languages and used them to advance problems in asymptotic algebra.

My work studies both discrete and multivariate Gaussian models. This includes hierarchical models, staged tree models, no-three-way interaction models, phylogenetic models, and (colored) Gaussian graphical models. Each research project involves software such as Julia, Macaulay2, or SageMath. Projects in 1 and 2 are supported by an NSF standard grant.

*The survey Nonlinear Algebra and Applications that I have coauthored includes an overview of applications of algebraic geometry to statistics.

Here are two recent recorded presentations related to work on algebraic approaches to Brownian and Gaussian graphical models, and a presentation on toric geometry via Lie theory.

Publications

The number to the right of each title is the preprint year. All papers are on arXiv and Google Scholar.

Halfspace Representations of Path Polytopes of Trees (2025)

joint work with Amer Goel and Alvaro Ribot
arXiv: 2502.21204
Subjects: Combinatorics (math.CO); Statistics Theory (math.ST)

Given a tree T, its path polytope is the convex hull of the edge indicator vectors for the paths between any two distinct leaves in T. These polytopes arise naturally in polyhedral geometry and applications, such as phylogenetics, tropical geometry, and algebraic statistics. We provide a minimal halfspace representation of these polytopes. The construction is made inductively using toric fiber products.
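As an illustration of the definition above, the following minimal Python sketch lists the vertices of a path polytope, that is, the edge indicator vectors of all leaf-to-leaf paths. The function name `path_polytope_vertices`, the edge-list input format, and the star-tree example are assumptions made here for illustration; they are not taken from the paper, and the sketch does not compute the halfspace representation.

```python
from itertools import combinations


def path_polytope_vertices(edges):
    """Return the edge indicator vectors of all leaf-to-leaf paths in a tree.

    `edges` is a list of (u, v) pairs describing a tree; coordinate i of each
    returned vector corresponds to edges[i].
    """
    # Adjacency list that remembers the index of each edge.
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))

    # Leaves are the nodes with exactly one neighbor.
    leaves = sorted(node for node, nbrs in adj.items() if len(nbrs) == 1)

    def path_edges(src, dst):
        # Depth-first search; in a tree the path between two nodes is unique.
        stack = [(src, None, [])]
        while stack:
            node, parent, used = stack.pop()
            if node == dst:
                return used
            for nxt, idx in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, used + [idx]))
        raise ValueError("the input graph is not connected")

    vertices = []
    for a, b in combinations(leaves, 2):
        used = set(path_edges(a, b))
        vertices.append(tuple(1 if i in used else 0 for i in range(len(edges))))
    return vertices


# Hypothetical example: a star tree with center 0 and leaves 1, 2, 3.
star = [(0, 1), (0, 2), (0, 3)]
star_vertices = path_polytope_vertices(star)
```

For this star tree, each path uses exactly two of the three edges, so the path polytope is the convex hull of the three 0/1 vectors with two ones.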

Toric Multivariate Gaussian Models from Symmetries in a Tree (2024)

joint work with Emma Cardwell and Alvaro Ribot
to appear in Advances in Applied Mathematics (arXiv: 2412.00895)
Subjects: Algebraic Geometry (math.AG); Combinatorics (math.CO); Statistics Theory (math.ST); Populations and Evolution (q-bio.PE)

Given a rooted tree $\T$ on $n$ non-root leaves with colored and zeroed nodes, we construct a linear space $L_\T$ of $n\times n$ symmetric matrices with constraints determined by the combinatorics of the tree. When $L_\T$ represents the covariance matrices of a Gaussian model, it provides natural generalizations of Brownian motion tree (BMT) models in phylogenetics. When $L_\T$ represents a space of concentration matrices of a Gaussian model, it gives certain colored Gaussian graphical models, which we refer to as BMT derived models. We investigate conditions under which the reciprocal variety $L_\T^{-1}$ is toric. Relying on the birational isomorphism of the inverse matrix map, we show that if the BMT derived graph of $\T$ is vertex-regular and a block graph, under the derived Laplacian transformation, $L_\T^{-1}$ is the vanishing locus of a toric ideal. This ideal is given by the sum of the toric ideal of the Gaussian graphical model on the block graph, the toric ideal of the original BMT model, and binomial linear conditions coming from vertex-regularity. To this end, we provide monomial parametrizations for these toric models realized through paths among leaves in $\T$.

Contrastive Independent Component Analysis for Salient Patterns and Dimensionality Reduction (2024)

joint work with Kexin Wang and Anna Seigal
Proceedings of the National Academy of Sciences (arXiv: 2407.02357)
Subjects: Statistics Theory (math.ST); Algebraic Geometry (math.AG); Machine Learning (stat.ML)

Visualizing data and finding patterns in data are ubiquitous problems in the sciences. Increasingly, applications seek signal and structure in a contrastive setting: a foreground dataset relative to a background dataset. For this purpose, we propose contrastive independent component analysis (cICA). This generalizes independent component analysis to independent latent variables across a foreground and background. We propose a hierarchical tensor decomposition algorithm for cICA. We study the identifiability of cICA and demonstrate its performance visualizing data and finding patterns in data, using synthetic and real-world datasets, comparing the approach to existing contrastive methods.

ML Degrees of Brownian Motion Tree Models: Star Trees and Root Invariance (2024)

joint work with Jane Ivy Coons, Shelby Cox, and Ikenna Nometa
Journal of Symbolic Computation (arXiv: 2402.10322)
Subjects: Statistics Theory (math.ST); Algebraic Geometry (math.AG); Populations and Evolution (q-bio.PE)

A Brownian motion tree (BMT) model is a Gaussian model whose associated set of covariance matrices is linearly constrained according to common ancestry in a phylogenetic tree. We study the complexity of inferring the maximum likelihood (ML) estimator for a BMT model by computing its ML-degree. Our main result is that the ML-degree of the BMT model on a star tree with n+1 leaves is 2^{n+1}-2n-3, which was previously conjectured by Améndola and Zwiernik. We also prove that the ML-degree of a BMT model is independent of the choice of the root. The proofs rely on the toric geometry of concentration matrices in a BMT model. Toward this end, we produce a combinatorial formula for the determinant of the concentration matrix of a BMT model, which generalizes the Cayley-Prüfer theorem to complete graphs with weights given by a tree.
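To get a feel for how quickly the star-tree ML-degree grows, the closed formula from the abstract can be evaluated directly. The sketch below only does this arithmetic; the function name `star_tree_ml_degree` and the range of n shown are choices made here for illustration, and the sketch does not compute ML-degrees from the model itself.

```python
def star_tree_ml_degree(n):
    """Evaluate 2^(n+1) - 2n - 3, the ML-degree stated for the BMT model
    on a star tree with n + 1 leaves."""
    return 2 ** (n + 1) - 2 * n - 3


# Tabulate the formula for a few small values of n (range chosen for illustration).
values = {n: star_tree_ml_degree(n) for n in range(2, 6)}
for n, degree in values.items():
    print(f"n = {n}: ML-degree {degree}")
```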

Talk__IMSI_Brownian_AidaMaraj (3).pdf

Lawrence Lifts, Matroids, and Maximum Likelihood Degrees (2023)

joint work with Taylor Brysiewicz
Algebraic Statistics (arXiv: 2310.13064)
Subjects: Combinatorics (math.CO); Algebraic Geometry (math.AG); Statistics Theory (math.ST)

We express the maximum likelihood (ML) degrees of a family of toric varieties in terms of Möbius invariants of matroids. The family of interest consists of those parametrized by monomial maps given by Lawrence lifts of totally unimodular matrices with even circuits. Specifying these matrices to be vertex-edge incidence matrices of bipartite graphs gives the ML degrees of some hierarchical models and three-dimensional quasi-independence models. Included in this list are the no-three-way interaction models with one binary random variable, for which we give closed formulae.

Symmetry Lie Algebras of Varieties with Applications to Algebraic Statistics (2023)

joint work with Arpan Pal
to appear in SIAM Journal on Applied Algebra and Geometry (arXiv: 2309.10741)
Subjects: Algebraic Geometry (math.AG); Statistics Theory (math.ST)
SageMath Code

The motivation for this paper is to detect when an irreducible projective variety V is not toric. We do this by analyzing a Lie group and a Lie algebra associated to V. If the dimension of V is strictly less than the dimension of the above-mentioned objects, then V is not a toric variety. We provide an algorithm to compute the Lie algebra of an irreducible variety and use it to provide examples of non-toric statistical models in algebraic statistics.

AMS_Talk_Howard (2).pdf

Shift Invariant Algebras, Segre Products and Regular Languages (2022)

joint work with Uwe Nagel
Journal of Algebra (arXiv: 2204.07849)
Subjects: Commutative Algebra (math.AC); Formal Languages and Automata Theory (cs.FL)
SageMath Code

Motivated by results on the rationality of equivariant Hilbert series of some hierarchical models in algebraic statistics, we introduce the Segre product of formal languages and apply it to establish rationality of equivariant Hilbert series in new cases. To this end, we show that the Segre product of two regular languages is again regular. We also prove that every filtration of algebras given as a tensor product of families of algebras with rational equivariant Hilbert series has a rational equivariant Hilbert series. The term equivariant is used broadly to include the action of the monoid of nonnegative integers by shifting variables. Furthermore, we exhibit a filtration of shift invariant monomial algebras that has a rational equivariant Hilbert series, but whose presentation ideals do not stabilize.

Talk_SegreLanguages.pdf

Symmetrically Colored Gaussian Graphical Models with Toric Vanishing Ideals (2021)

joint work with Jane Ivy Coons, Pratik Misra, and Miruna-Stefana Sorea
SIAM Journal on Applied Algebra and Geometry (arXiv: 2111.14817)
Subjects: Combinatorics (math.CO); Algebraic Geometry (math.AG); Statistics Theory (math.ST)

A colored Gaussian graphical model is a linear concentration model in which equalities among the concentrations are specified by a coloring of an underlying graph. The model is called RCOP if this coloring is given by the edge and vertex orbits of a subgroup of the automorphism group of the graph. We show that RCOP Gaussian graphical models on block graphs are toric in the space of covariance matrices and we describe Markov bases for them. To this end, we learn more about the combinatorial structure of these models and their connection with Jordan algebras.
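The definition of a colored Gaussian graphical model can be made concrete with a small sketch that builds a concentration matrix from a coloring: entries sharing a color share a parameter, and non-edges are zero. The function name `colored_concentration_matrix`, the dictionary-based input format, and the numerical parameter values are hypothetical choices for illustration. The example uses the path graph 0 - 1 - 2 with the coloring given by the orbits of the reflection that swaps vertices 0 and 2, which is an RCOP coloring on a block graph.

```python
def colored_concentration_matrix(n, vertex_colors, edge_colors, params):
    """Build an n x n symmetric concentration matrix from a graph coloring.

    vertex_colors maps a vertex to its color, edge_colors maps an edge (u, v)
    to its color, and params maps each color to its shared parameter value.
    Entries for non-edges stay zero.
    """
    K = [[0] * n for _ in range(n)]
    for v, color in vertex_colors.items():
        K[v][v] = params[color]
    for (u, v), color in edge_colors.items():
        K[u][v] = params[color]
        K[v][u] = params[color]
    return K


# Path graph 0 - 1 - 2; the reflection swapping 0 and 2 has vertex orbits
# {0, 2}, {1} and a single edge orbit {01, 12}.
vertex_colors = {0: "a", 1: "b", 2: "a"}
edge_colors = {(0, 1): "c", (1, 2): "c"}
params = {"a": 2.0, "b": 3.0, "c": -1.0}  # illustrative values only

K = colored_concentration_matrix(3, vertex_colors, edge_colors, params)
```

Here the model has three free parameters instead of the five of the uncolored graphical model on the same graph.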

texasTalk.pdf

Staged Tree Models with Toric Structure (2021)

joint work with Christiane Görgen and Lisa Nicklasson
Journal of Symbolic Computation (arXiv: 2107.04516)
Subjects: Commutative Algebra (math.AC); Statistics Theory (math.ST)
Macaulay2 and Mathematica Code (MathRepo)

A staged tree model is a discrete statistical model encoding relationships between events. These models are realised by directed trees with coloured vertices. In algebro-geometric terms, the model consists of points inside a toric variety. For certain trees, called balanced, the model is in fact the intersection of the toric variety and the probability simplex. This gives the model a straightforward description, and has computational advantages. In this paper we show that the class of staged tree models with a toric structure extends far outside of the balanced case, if we allow a change of coordinates. It is an open problem whether all staged tree models have toric structure.

Talk_AlgStat_Hawaii.pdf

Nonlinear Algebra and Applications (2021)

joint work with Paul Breiding, Türkü Özlüm Çelik, Timothy Duff, Alexander Heaton, Anna-Laura Sattelberger, Lorenzo Venturello, and Oğuzhan Yürük
Numerical Algebra, Control and Optimization (arXiv: 2103.16300)
Subjects: Algebraic Geometry (math.AG)

We showcase applications of nonlinear algebra in the sciences and engineering. Our survey is organized into eight themes: polynomial optimization, partial differential equations, algebraic statistics, integrable systems, configuration spaces of frameworks, biochemical reaction networks, algebraic vision, and tensor decompositions. Conversely, developments on these topics inspire new questions and algorithms for algebraic geometry.

AStat why relevant?.pdf

Reciprocal Maximum Likelihood Degrees of Brownian Motion Tree Models (2020)

joint work with Tobias Boege, Jane Ivy Coons, Christopher Eur, and Frank Röttger
Le Matematiche (arXiv: 2009.11849)
Subjects: Statistics Theory (math.ST); Algebraic Geometry (math.AG)
Video talk at the One World Webinar organized by YoungStatS
Video talk at the Workshop on AlgStats for Biological and Ecological Systems organized by IMSI.

We give an explicit formula for the reciprocal maximum likelihood degree of Brownian motion tree models. To achieve this, we connect them to certain toric (or log-linear) models, and express the Brownian motion tree model of an arbitrary tree as a toric fiber product of star tree models.

YoungStats-2.pdf

Generalized Cut Polytopes for Binary Hierarchical Models (2020)

joint work with Jane Ivy Coons, Joseph Cummings and Ben Hollering
to appear in Algebraic Statistics (arXiv: 2008.00043)
Subjects: Combinatorics (math.CO); Statistics Theory (math.ST)
SageMath Code

Marginal polytopes are important geometric objects that arise in statistics as the polytopes underlying hierarchical log-linear models. These polytopes can be used to answer geometric questions about these models, such as determining the existence of maximum likelihood estimates or the normality of the associated semigroup. Cut polytopes of graphs have been useful in analyzing binary marginal polytopes in the case where the simplicial complex underlying the hierarchical model is a graph. We introduce a generalized cut polytope that is isomorphic to the binary marginal polytope of an arbitrary simplicial complex. This polytope is full dimensional in its ambient space and has a natural switching operation among its facets that can be used to deduce symmetries between the facets of the correlation and binary marginal polytopes. We use this switching operation along with Bernstein and Sullivant's characterization of unimodular simplicial complexes to find a complete H-representation for many unimodular simplicial complexes.

Ph.D. Thesis: Algebraic and Geometric Properties of Hierarchical Models (2020)

https://doi.org/10.13023/etd.2020.232

In this dissertation, filtrations of ideals arising from hierarchical models in statistics related by a group action are studied. These filtrations lead to ideals in polynomial rings in infinitely many variables, which require innovative tools. Regular languages and finite automata are used to prove and explicitly compute the rationality of some multivariate power series that record important quantitative information about the ideals. Some work regarding Markov bases for non-reducible models is shown, together with advances in the polyhedral geometry of binary hierarchical models.

Hierarchical_Models.pdf

Equivariant Hilbert Series for Hierarchical Models (2019)

joint work with Uwe Nagel
Algebraic Statistics, Vol. 12 (2021), No. 1, 21–42 (arXiv: 1909.13026)
Subjects: Commutative Algebra (math.AC); Formal Languages and Automata Theory (cs.FL); Statistics Theory (math.ST)
Macaulay2 Code.
Video Talk at Nonlinear Algebra Seminar Online--MPI MiS

Toric ideals associated to hierarchical models are invariant under the action of a product of symmetric groups. Taking the number of factors, say m, into account, we introduce and study invariant filtrations and their equivariant Hilbert series. We present a condition that guarantees that the equivariant Hilbert series is a rational function in m + 1 variables with rational coefficients. Furthermore, we give explicit formulas for the rational functions with coefficients in a number field and an algorithm for determining the rational functions with rational coefficients. A key is to construct finite automata that recognize languages corresponding to invariant filtrations.

Slides__Equivariant_Hilbert_Series.pdf
