Below is a summary of projects I wish to work on with interested graduate students in the near term. Master's students at SFSU will typically work on their theses for about 1.5 years on a portion of any of these projects. PhD students, working 3+ years, will bite off a much bigger chunk. If you are an interested graduate student, contact me in person or via email.
Likelihood Geometry studies an optimization problem in statistics (maximum likelihood estimation) using tools from convex, combinatorial, and computational algebraic geometry. The maximum likelihood (ML) degree of a variety is the number of complex critical points of the likelihood function for generic data. You can find out about the fundamentals here, here, here, and here.
The ML degree of a toric variety (discrete exponential model) is at most its degree. A toric variety twisted by generic weights achieves this upper bound. The locus of nongeneric weights that gives rise to toric models with lower ML degree is the principal A-determinant (see here). Based on the principal A-determinant, characterize the ML-degree stratification of all (smooth) toric surfaces (see this for a start); do the same for Segre products of projective spaces (the varieties of rank-one tensors, i.e., independence models) and, more generally, for decomposable graphical models (see here), as well as for Veronese varieties. What does this stratification have to do with strict linear precision? Characterize all weights for the above families that bring the ML degree down to one (see this and this). Compute the ML degree stratification for the toric variety given by an associahedron; in particular, compute the minimum ML degree achievable. See Section 7.1 here for a concrete conjecture, and read this on scattering amplitudes and likelihood geometry for context.
Is there a dual maximum likelihood estimator, a type of M-estimator (see Section 3 here), for toric varieties? If yes, is there a dual ML degree? Take all the questions that have been asked (and maybe answered) for the ML degree of toric varieties, and ask and answer them for the dual ML degree.
The set of m x n matrices of rank at most r is an algebraic variety defined by determinantal (polynomial) equations. The ML degree of such varieties has been studied very carefully here and here, though we still do not know the ML degree of these determinantal varieties in general. A (statistically) more interesting set consists of m x n matrices of nonnegative rank at most r. Such a matrix M has nonnegative entries and factors as M = AB, where A is m x r and B is r x n, both with nonnegative entries. The likelihood optimization problem over this set requires understanding the algebraic boundary of the set inside the variety of m x n matrices of rank at most r. These algebraic boundaries are unions of algebraic varieties about which we know very little. The analogous problem can be stated for tensors. For some initial work see this, this, and this. Start with small values of m, n, and r (or tensors of small size) and compute/characterize the components of these algebraic boundaries. Compute the ML degree of these components.
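A minimal numerical sketch of the objects in question (NumPy; the sizes and random factors are arbitrary): multiplying entrywise-nonnegative factors A and B produces exactly the matrices of nonnegative rank at most r. The hard direction, and the source of the boundary questions above, is deciding when a given nonnegative matrix of rank r admits such a factorization with inner dimension r.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 4, 5, 2

# nonnegative factors sampled at random (a hypothetical instance)
A = rng.random((m, r))   # m x r, entrywise nonnegative
B = rng.random((r, n))   # r x n, entrywise nonnegative
M = A @ B                # entrywise nonnegative, nonnegative rank <= r

print(np.all(M >= 0))            # True
print(np.linalg.matrix_rank(M))  # 2: generic factors give rank exactly r
```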
An m-dimensional centered Gaussian distribution is characterized by its covariance matrix, which is an m x m symmetric positive definite matrix. Algebraic Gaussian models are defined by polynomial equations in the entries of these covariance matrices. Again, the ML degree of such a model is the number of complex critical points of the Gaussian likelihood function over the model. For a good introduction see Section 2.1 of this. There has been an amazing amount of work on the ML degree and the likelihood geometry of Gaussian models such as concentration models (see also the following important papers: 1, 2, 3). An outstanding open problem asks for the characterization of algebraic Gaussian models with ML degree one. This might be too hard. Then how about characterizing one-dimensional Gaussian models with ML degree one (see this for the discrete case)?
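As a toy illustration of the Gaussian setup (SymPy; the sample covariance entries are hypothetical, and the model is chosen only for its small size), consider the one-parameter family of 2 x 2 correlation matrices. The score equation turns out to be a cubic in the correlation, so generic data see three complex critical points:

```python
import sympy as sp

rho = sp.symbols('rho')
s11, s12, s22 = 2, sp.Rational(3, 4), 1  # hypothetical sample covariance S

Sigma = sp.Matrix([[1, rho], [rho, 1]])
S = sp.Matrix([[s11, s12], [s12, s22]])

# centered Gaussian log-likelihood, up to constants and scaling
loglik = -sp.log(Sigma.det()) - (S * Sigma.inv()).trace()
score = sp.numer(sp.together(sp.diff(loglik, rho)))
sols = sp.solve(sp.Eq(score, 0), rho)
print(sp.degree(score, rho), len(sols))  # 3 3
```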
Every maximum likelihood problem starts with data and a model X. For generic data, the number of complex critical points of the likelihood function is the ML degree mld(X). As the data vary smoothly, these critical points move smoothly. If one takes a closed path in the data space and moves along this path from the starting point back to the same point, the critical points are permuted. These permutations make up the monodromy group, a subgroup of the symmetric group on mld(X) elements. Computing/understanding monodromy groups is a crucial task in numerical algebraic geometry (see here). Compute monodromy groups in the likelihood geometry of algebraic statistical models. Characterize models where the monodromy group is not the full symmetric group. Find geometric reasons why this happens.
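The simplest possible instance of this phenomenon (a NumPy sketch, with the family z^2 = t standing in for a likelihood system): tracking the two roots while t traverses the unit circle once swaps them, so the monodromy group of this family is the full symmetric group S_2.

```python
import numpy as np

# naive path tracking for the roots of z^2 - t = 0 as t loops once
# around the branch point t = 0 along the unit circle
steps = 400
tracked = np.array([1.0 + 0j, -1.0 + 0j])  # the two roots at t = 1
for k in range(1, steps + 1):
    t = np.exp(2j * np.pi * k / steps)
    r = np.sqrt(t)              # principal square root
    roots = np.array([r, -r])   # the exact roots at this t
    # continue each tracked path to the nearest current root
    tracked = np.array([roots[np.argmin(np.abs(roots - z))] for z in tracked])

# after one loop the critical points are permuted: 1 -> -1 and -1 -> 1
print(np.allclose(tracked, [-1, 1]))  # True
```

Serious computations replace the nearest-root matching with certified homotopy continuation, but the idea is the same.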
A spectrahedron is the intersection of the convex cone of m x m symmetric positive semidefinite (psd) matrices with an affine linear space in the vector space of all m x m symmetric matrices. Spectrahedra are generalizations of convex polyhedra (=intersections of the cone of nonnegative vectors in R^m with affine linear spaces). They are the feasible sets of optimization problems called semidefinite programs (SDPs). For a good introduction to spectrahedra and SDPs, check out Chapter 2 of this book.
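A two-dimensional sketch (NumPy; the matrix pencil is a hypothetical choice): the spectrahedron {(x, y) : A0 + x*A1 + y*A2 is psd} below is the unit disk, and membership is just an eigenvalue check.

```python
import numpy as np

A0 = np.eye(2)
A1 = np.array([[1.0, 0.0], [0.0, -1.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])
# A0 + x*A1 + y*A2 = [[1 + x, y], [y, 1 - x]], psd iff x^2 + y^2 <= 1

def in_spectrahedron(x, y, tol=1e-9):
    # a point lies in the spectrahedron iff the pencil has no negative eigenvalue
    pencil = A0 + x * A1 + y * A2
    return bool(np.linalg.eigvalsh(pencil).min() >= -tol)

print(in_spectrahedron(0.3, 0.4))  # True: inside the disk
print(in_spectrahedron(0.8, 0.7))  # False: outside
```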
Consider a Gaussian model (see above) with a unique maximum likelihood estimator, or more strongly, with ML degree one. For each point on the model, the set of all sample covariance matrices (=data) with maximum likelihood estimator equal to this point is the so-called log-Voronoi spectrahedron of that point (see this recent paper). As the point varies, the log-Voronoi spectrahedron varies. How does the combinatorics/geometry of these spectrahedra change? It is still not very clear what we mean by the combinatorics of a spectrahedron. For some interesting ideas see this paper. Nevertheless, try this question for Gaussian models that are linear. How about for one-dimensional models?
A polynomial with real coefficients is called a sum of squares (SOS) if it is a sum of squares of a finite collection of polynomials. A polynomial is SOS if and only if it admits at least one psd Gram matrix. The set of all Gram matrices associated to an SOS polynomial is a spectrahedron, the so-called Gram spectrahedron of the polynomial; see Chapters 3 and 4 here. The topic of SOS polynomials is a classic subject to which Hilbert contributed important results. At the same time, there has been a recent revival in relation to polynomial optimization and real algebraic geometry. There are open questions even for the most basic objects. For instance, what is the combinatorics/geometry of the Gram spectrahedra of SOS binary forms (=homogeneous polynomials in two variables)? See this and this as a starter. A first step for more general polynomials is this one about ternary quartics. The field is wide open otherwise. Some more progress could be achieved if we assume that the SOS polynomial in question is a symmetric polynomial (or invariant under a group action). For some ideas in this direction, consult this and a recent SFSU master's thesis.
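As a tiny worked case (SymPy/NumPy; the form x^4 + y^4 is chosen only for illustration): its Gram matrices with respect to the monomials (x^2, xy, y^2) form a one-parameter family, and the psd members, which form a line segment, make up its Gram spectrahedron.

```python
import numpy as np
import sympy as sp

x, y, a = sp.symbols('x y a')
v = sp.Matrix([x**2, x*y, y**2])

# matching coefficients shows every Gram matrix of x^4 + y^4 with
# respect to (x^2, xy, y^2) has this form
G = sp.Matrix([[1, 0, a], [0, -2*a, 0], [a, 0, 1]])
p = sp.expand((v.T * G * v)[0])
print(p == sp.expand(x**4 + y**4))  # True for every value of a

# G(a) is psd exactly for -1 <= a <= 0, so the Gram spectrahedron is
# a segment; check an interior point numerically
Ga = np.array(G.subs(a, -sp.Rational(1, 2)), dtype=float)
print(np.linalg.eigvalsh(Ga).min() >= 0)  # True: an SOS certificate
```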
If a convex polytope is complicated, in particular from the point of view of linear optimization, one can look for a less complicated spectrahedron that projects onto the polytope. Such a spectrahedral lift is related to the positive semidefinite factorization of an m x n nonnegative matrix M (or of the associated polytope). Such a factorization consists of k x k symmetric psd matrices A_1, ..., A_m and B_1, ..., B_n so that M_ij = trace(A_i B_j). The smallest k for which a psd factorization exists is the psd rank of M (see here). We do not know much about the psd rank of particular classes of polytopes. For instance, what is the psd rank of p x q transportation polytopes? For each p and q, there are only finitely many combinatorial types, but the psd rank is not necessarily constant within a combinatorial type. Also, given a nonnegative matrix M, one can study the space of all psd factorizations of M. This space is a semi-algebraic set, and its boundary has been studied in small cases; see here. Can rigidity theory help in understanding the boundary of the space of psd factorizations, as it did for the space of all nonnegative factorizations (see nonnegative rank above)?
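The defining identity M_ij = trace(A_i B_j) is easy to play with (a NumPy sketch; the sizes and random psd factors are arbitrary), and it makes visible why every psd factorization yields a nonnegative matrix: the trace of a product of two psd matrices is nonnegative.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 3, 4, 2

def random_psd(k):
    X = rng.standard_normal((k, k))
    return X @ X.T  # X X^T is symmetric psd

A = [random_psd(k) for _ in range(m)]
B = [random_psd(k) for _ in range(n)]

# M_ij = trace(A_i B_j): a matrix with a psd factorization of size k
M = np.array([[np.trace(Ai @ Bj) for Bj in B] for Ai in A])

print(M.shape)         # (3, 4)
print(np.all(M >= 0))  # True: trace(A B) >= 0 for psd A, B
```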