Research

Recent Work

Consider coloring the integers {1, 2, ..., n} red and blue while trying to avoid 3 equally-spaced integers being colored the same.  Van der Waerden's Theorem is a classic result in Ramsey Theory stating that for a fixed k and large enough n, you cannot avoid coloring k equally-spaced dots the same.  But, how can you minimize the number of these monochromatic arithmetic progressions?

We prove a known 2-coloring of the integers [N]:={1,2,3,…,N} minimizes the number of monochromatic arithmetic 3-progressions under certain restrictions. A monochromatic arithmetic progression is a set of equally-spaced integers that are all the same color. Previous work by Parrilo, Robertson and Saracino conjectured an optimal coloring for large N that involves 12 colored blocks. Here, we prove that the conjecture is optimal among anti-symmetric colorings with 12 or fewer colored blocks. We leverage a connection to the coloring of the continuous interval [0,1] used by Parrilo, Robertson, and Saracino as well as by Butler, Costello and Graham, using BCG diagrams like the one on the left. Our proof identifies classes of colorings with permutations, then counts the permutations using mixed integer linear programming.

With the tools from Analytic Combinatorics in Several Variables, it is well-understood when multivariate arrays can be encoded in multivariate rational generating functions (GFs), and how to extract asymptotics from these generating functions.  However, not much is known about how to extract asymptotics for more general GFs.  Algebraic GFs are one step broader than rational GFs, and they encode arrays defined by recursive structures described by context-free languages, such as binary trees and constrained (random) walks; RNA secondary structures; applications of the kernel method, the quadratic method for maps, and generalizations. 


We present a strategy for computing asymptotics of coefficients of d-variate algebraic generating functions. Using known constructions, we embed the coefficient array into an array represented by a rational generating functions in d+1 variables, and then apply ACSV theory to analyse the latter. This method allows us to give systematic results in the multivariate case, seems more promising than trying to derive analogs of the rational ACSV theory for algebraic GFs, and gives the prospect of further improvements as embedding methods are studied in more detail.

The Euclidean first-passage percolation model of Howard and Newman is a rotationally invariant percolation model built on a Poisson point process. It is known that the passage time between 0 and ne_1 obeys a diffusive upper bound: Var T(0, ne_1) <= Cn, and in this paper we improve this inequality to Cn/log n.


The methods follow the strategy used for sublinear variance proofs on the lattice, using the Falik-Samorodnitsky inequality and a Bernoulli encoding, but with substantial technical difficulties. To deal with the different setup of the Euclidean model, we represent the passage time as a function of Bernoulli sequences and uniform sequences, and develop several "greedy lattice animal" arguments.

A growing number of RNA sequences are now known to have distributions of multiple stable sequences. Recent algorithms use the list of nucleotides in a sequence and auxiliary experimental data to predict such distributions. Although the algorithms are largely successful in identifying a distribution's constituent structures, it remains challenging to recover their relative weightings. In this paper, we quantify this issue using a total variation distance. Then, we prove under a Nussinov-Jacobson model that a large proportion of RNA structure pairs cannot be jointly reconstructed with low total variation distance. Finally, we characterize the uncertainty in predicting conformational ratios by analyzing the amount of information in the auxiliary data.