research

current research

Learning of general Metric Spaces and dynamical Fermat distances

In this project, we extend the scope of sample Fermat distances (e.g., Groisman et al., 2022; Hwang et al., 2016) in two significant directions. First, we move beyond the traditional setting of manifolds to accommodate more general metric spaces, with the ultimate goal of including discrete spaces. Second, we introduce a time component into the definition of sample Fermat distances, establishing a connection with dynamical first passage percolation. This contrasts with the time-homogeneous first passage percolation commonly discussed in the current literature. This extended framework has applications in machine learning, particularly in clustering and topological data analysis, enabling more robust analysis of diverse and dynamic datasets.
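For readers unfamiliar with the static object being generalized, the following is a minimal sketch of the usual (unnormalized) sample Fermat distance on a Euclidean point cloud: the cheapest path through sample points, where each hop costs its Euclidean length raised to a power alpha >= 1. The function name `sample_fermat_distances` and the use of the Floyd-Warshall algorithm are illustrative assumptions. The sketch does not implement the metric-space or time-dependent extensions described above.

```python
import numpy as np


def sample_fermat_distances(points, alpha):
    """All-pairs sample Fermat distances for a Euclidean point cloud.

    points: array of shape (n, d); alpha: power >= 1 applied to each hop.
    Hypothetical helper for illustration; no normalization in n is applied.
    """
    points = np.asarray(points, dtype=float)
    # Cost of a direct hop between two sample points: |x - y|^alpha.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1) ** alpha
    # Floyd-Warshall: allow paths through intermediate sample point k.
    for k in range(len(points)):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


# Three collinear points: with alpha = 2 the path through the middle
# point (cost 1 + 4) is cheaper than the direct hop (cost 9).
cloud = np.array([[0.0], [1.0], [3.0]])
fermat = sample_fermat_distances(cloud, alpha=2.0)
```

With alpha = 1 the result reduces to the Euclidean distance. Larger values of alpha penalize long hops, so optimal paths tend to follow regions where the sample is dense.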

non-parametric Bayesian statistics

Using tools from mathematical population genetics and self-similar Markov processes—such as duality and random time changes—reveals surprising connections with non-parametric Bayesian statistics and offers new methods for analyzing statistical models.

Self-Similarity: A New Perspective in Mathematical Population Genetics

We propose replacing the conventional branching property in random population models with the self-similarity property. By leveraging self-similarity techniques, we improve upon key results in the literature, such as the renowned work of Birkner et al. (2005), paving the way for re-examining this field from a fresh perspective. Unlike branching models, which assume independence among individuals, self-similarity allows for the study of complex reproductive dynamics, particularly in scenarios where populations compete for limited resources, making it a more realistic approach.

Furthermore, the study of our newly introduced self-similar measure-valued processes offers an ideal framework for advancing the theory of self-similar Markov processes in infinite dimensions. This is achieved through tools such as duality and particle representations, which are commonly used in mathematical population genetics.

preprints and published work

phase-type distributions and inhomogeneous coalescent processes

https://doi.org/10.1101/2024.09.19.613917

Joint work with Arno Siri-Jégousse (IIMAS - UNAM), Lizbeth Peñaloza (UMAR), and Matthias Steinruecken (Univ. of Chicago)

In this project, we leveraged the simplicity and effectiveness of phase-type distributions to investigate the time to the most recent common ancestor (TMRCA) in population models. In particular, we characterized the density of the TMRCA for populations whose total size evolves deterministically over time. We also demonstrated that the TMRCA has significant potential for distinguishing between competing evolutionary models in practical applications, and that our explicit formula for the density can be used in inference schemes based on maximum likelihood estimation.
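As a rough illustration of the phase-type viewpoint, the sketch below covers only the textbook Kingman coalescent and is not the formula from the preprint. For n lineages and constant population size, the TMRCA is a sum of independent exponential times with rates k(k-1)/2 for k = 2, ..., n, which is a phase-type distribution. A deterministic relative size function is handled by the classical time change with cumulative intensity Lambda(t), the integral of 1/size from 0 to t. The function names and the exponential-growth example are assumptions made for illustration.

```python
import numpy as np


def kingman_tmrca_density(t, n):
    """TMRCA density for n lineages, constant-size Kingman coalescent.

    Uses the closed form for a sum of independent exponentials with
    distinct rates k(k-1)/2, k = 2..n. Time is in coalescent units.
    The alternating sum loses precision for large n.
    """
    rates = np.array([k * (k - 1) / 2 for k in range(2, n + 1)], dtype=float)
    t = np.asarray(t, dtype=float)
    density = np.zeros_like(t)
    for i, lam in enumerate(rates):
        others = np.delete(rates, i)
        weight = np.prod(others / (others - lam))
        density += weight * lam * np.exp(-lam * t)
    return density


def tmrca_density_variable_size(t, n, size, cumulative_intensity):
    """TMRCA density when the relative population size at time t in the
    past is size(t) and cumulative_intensity(t) is the integral of
    1/size over [0, t] (Kingman case only; both callables are assumed)."""
    t = np.asarray(t, dtype=float)
    return kingman_tmrca_density(cumulative_intensity(t), n) / size(t)


# Hypothetical example: a population that has grown exponentially at
# rate b, so looking backwards its relative size is exp(-b * t).
b = 0.5
grid = np.linspace(0.0, 20.0, 2001)
growth_density = tmrca_density_variable_size(
    grid,
    n=5,
    size=lambda s: np.exp(-b * s),
    cumulative_intensity=lambda s: np.expm1(b * s) / b,
)
```

In the constant-size case the mean of this density is 2(1 - 1/n), which gives a quick numerical sanity check. A density of this kind is what a maximum likelihood scheme would evaluate at observed or simulated TMRCA values.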

Self-similarity: a new perspective in population genetics

https://arxiv.org/abs/2405.10193

Joint work with Arno Siri-Jégousse (IIMAS - UNAM)

We introduce a new class of measure-valued self-similar Markov processes. By extending the well-known Lamperti transformation for self-similar Markov processes to the infinite-dimensional setting, we generalize the celebrated work of Birkner et al. (2005) in mathematical population genetics. That work characterizes the type-frequency process of stable branching populations in terms of Beta Fleming-Viot processes. We construct a larger class of self-similar populations whose type frequencies are described by a general Lambda Fleming-Viot process.

Our results only scratch the surface of the potential power of the interplay between population genetics and the theory of self-similar Markov processes. This project constitutes a first yet important step in the development of this new research program.

branching random walks with selection

https://arxiv.org/abs/2301.07762

Joint work with Emmanuel Schertzer (Univ. of Vienna)

In this project, we investigated a stochastic model of a biological population under the influence of natural selection. Our findings revealed two intriguing phenomena in this model: 1) contrary to common understanding, increasing the strength of selection can increase the genetic variability within the population, and 2) we identified a novel phase transition in traveling waves, which we term the shift from pulled to semi-pulled waves.

A broader class of neutral population models

https://link.springer.com/article/10.1007/s00285-024-02173-x

Joint work with Arno Siri-Jégousse (IIMAS - UNAM)

In this project, we examined the genealogies of a broad class of neutral population models that includes the well-known Cannings models. Departing from the standard assumption of symmetry in offspring distributions in Cannings models, we introduced a less restrictive condition of non-heritability of reproductive success. This adjustment provides a more precise mathematical framework for studying neutral biological populations. Additionally, our framework enabled us to analyze the genealogy of a new exponential model. Despite its built-in fitness inheritance mechanism, this model fits within our neutrality setting.

Site frequency spectrum of the Bolthausen-Sznitman coalescent

https://alea.impa.br/articles/v18/18-53.pdf

Joint work with Arno Siri-Jégousse (IIMAS - UNAM) and Götz Kersting (Goethe Univ.)

In this project, we characterized the site frequency spectrum (SFS) of the Bolthausen-Sznitman coalescent. The SFS is a statistic that summarizes the genetic diversity present in a sample from a biological population and is frequently used in population genetics to infer the population's evolutionary past. The Bolthausen-Sznitman coalescent is a stochastic model for the genealogy of a population that has recently been proposed as a null model for populations under selective pressure.
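To make the statistic concrete, here is a minimal sketch of how an empirical (unfolded) SFS can be computed from a sample of n aligned sequences: entry i counts the sites at which the derived allele is carried by exactly i of the n sequences, for i = 1, ..., n - 1. The function name and the 0/1 encoding (0 for the ancestral allele, 1 for the derived allele) are assumptions made for illustration. The sketch is unrelated to the asymptotic results of the paper.

```python
import numpy as np


def site_frequency_spectrum(genotypes):
    """Unfolded site frequency spectrum of a 0/1 genotype matrix.

    genotypes: array of shape (n_sequences, n_sites), with 1 marking
    the derived allele. Returns an array of length n_sequences - 1
    whose entry i - 1 is the number of sites with exactly i derived copies.
    """
    genotypes = np.asarray(genotypes, dtype=int)
    n = genotypes.shape[0]
    derived_counts = genotypes.sum(axis=0)
    counts = np.bincount(derived_counts, minlength=n + 1)
    # Drop sites where the derived allele is absent (0) or fixed (n).
    return counts[1:n]


# Four sequences, four sites: two singletons, one doubleton, one tripleton.
sample = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
sfs = site_frequency_spectrum(sample)  # array([2, 1, 1])
```

Comparing an observed spectrum of this kind with the spectrum expected under a given coalescent model is the basis for using the SFS in inference.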

Metassembler: merging and optimizing de novo genome assemblies

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0764-4

https://sourceforge.net/projects/metassembler/

Joint work with Michael Schatz (Johns Hopkins University, CSHL)

In this project, we designed and implemented a software package for the "de novo" assembly of genomes. The main heuristic of this software is to combine the assemblies produced using multiple (possibly different) algorithms into a single superior sequence by identifying and merging the best sequence stretches from each of them.

De novo assemblies of rice genomes

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0506-z

Large collaborative effort together with Michael Schatz (Johns Hopkins University, CSHL)

The aim of this project was to characterize the genome of novel strains of rice. I contributed to the bioinformatics component by using computational tools to identify and compare the coding regions of the newly assembled sequences.

human genomics: barcoding each nucleotide

https://www.pnas.org/doi/full/10.1073/pnas.1112567108

Large collaborative effort together with Rafael Palacios (UNAM)

The aim of this project was to barcode each nucleotide in the human genome based on its genomic context. Our findings revealed that a genomic context of 50 nucleotides is sufficient to uniquely identify 92% of the nucleotides in the human genome. My role in this project involved contributing to discussions that determined the direction of our research.
