My current research interests lie in the application of stochastic processes to modeling the evolution of species. In particular, I am interested in understanding the various factors that drive speciation and extinction events. Recently, I have become deeply fascinated by how environmental and geographical changes affect species evolution.
During my undergraduate years, I studied various subjects in Mathematics and Statistics, ranging from mathematical analysis and applied mathematics to statistical methods. I wrote my Bachelor's thesis on point-set topology under the supervision of Ariyanto, M.Si. and the late Dr. Jafaruddin Hamid. We studied various characteristics of, and relationships between, the separation axioms of topological spaces. Both my studies and research were supported by the Academic Achievement scholarships from the Indonesian Ministry of Research, Technology, and Higher Education, and the Van Deventer-Maas scholarship from Van Deventer-Maas Indonesia.
During my postgraduate studies as a Master's student at the Australian National University, I took several courses in Bioinformatics and Mathematical Population Genetics. I wrote a Master's thesis under the guidance of Assoc. Prof. Conrad Burden. For the thesis, we investigated the elapsed time since the most recent common ancestor of a finite random sample drawn from a present-day population which has evolved according to a Bienaymé-Galton-Watson branching process. Both research and studies for the degree were supported by the Indonesian Ministry of Finance under the LPDP scholarship.
I began working with mathematical models for the evolution of species in 2019 as part of my PhD degree under Prof. Barbara Holland and Assoc. Prof. Małgorzata O'Reilly. In my PhD, I developed mathematical models for species evolution. Specifically, we developed a model where events on species trees depend on species age. We also developed a trait-dependent model where speciation and extinction events are driven by species' inherited traits. Last but not least, we developed an environment-dependent model where geography and changes in the environment drive species evolution. In summary, my PhD research applies stochastic modeling to the study of evolutionary processes.
RECORDED TALKS
3rd Joint Congress on Evolutionary Biology (Virtual) "A diffusion-based approach for simulating forward-in-time state-dependent speciation and extinction dynamics" (June 27th, 2024)
Living Earth Collaborative "Mathematical perspectives on phylogenetic models of lineage diversification" (March 7th, 2024)
11th International Conference on Matrix-Analytic Methods "Stochastic niche-based models for the evolution of species" (June 29 - July 1, 2022)
INFERRING EPIDEMIOLOGICAL PARAMETERS USING PHYLOGEOGRAPHY MODEL OF INFECTIOUS DISEASE TRANSMISSION WITH HOST MOVEMENTS IN SPATIALLY STRUCTURED POPULATIONS
During an outbreak, an infectious disease can spread into previously unaffected populations or locations through host movement, potentially resulting in local outbreaks with their own epidemiological dynamics. However, it is often unclear whether new infections between populations are caused by infectious travelers entering a foreign population and infecting residents, or by foreign residents infecting healthy travelers who then return home with the disease. In this paper, we introduce a phylogeographic model of pathogen spread through visitor dynamics, whereby hosts "visit" other populations for short trips before returning home. To do so, we exploited the stationary properties of an epidemiological compartment model with visitor dynamics to derive an approximating model that is computationally tractable for phylogenetic modeling. We applied our model to empirical data from the SARS-CoV-2 pandemic in Europe. Estimates of host movement–related parameter values under our visitor model suggest that migration models, with trips of indefinite length, may underestimate the magnitude of outbreaks caused by visitors. This study emphasizes the importance of carefully incorporating individual movement dynamics into such models.
LINKING DATA GENERATING PROCESS AND EVOLUTIONARY PATTERNS UNDER STATE-DEPENDENT SPECIATION AND EXTINCTION MODELS
State-dependent speciation and extinction (SSE) models are stochastic branching processes with state-dependent birth (speciation) and death (extinction) rates. The states can be either discrete or continuous, and can represent anything from phenotypic traits to geographical ranges. By modelling anagenesis and cladogenesis explicitly, these models help researchers understand their data by asking whether traits play a role in shaping species diversity through time. However, in recent years these models have entered a phase of intense but overdue scrutiny aimed at understanding what they can and cannot estimate reliably when fitted to real biological datasets. This work helps meet the demand for a more rigorous study of the mathematical properties underlying these complex stochastic processes. Here, we establish a general diffusion-based framework (often used in the field of population genetics) to study this wide and highly influential class of phylogenetic diversification models. We derive explicit relationships between the data-generating process and its expected evolutionary patterns through direct comparisons (via simulation and theory) with a tree-based approach. Our work helps to formalize relationships between evolutionary state patterns, process rates, and mixing times for ClaSSE-type models (SMB article highlight).
PHYLOGENETIC DEEP LEARNING FOR STUDYING BIOGEOGRAPHIC DIVERSIFICATION WITH SPECIES INTERACTIONS
In recent years, as models have become more complex to accommodate biological realism, traditional inference methods based on maximum-likelihood or Bayesian frameworks have become impractical. These methods require an explicit likelihood function for observing a tree. However, deriving a likelihood function is not always trivial, particularly for more complex models. In fact, current SSE models that do have explicit likelihood functions require numerical approximations to compute their likelihood. Supervised learning offers a way out of this problem by learning directly from patterns in data generated under a simulation model. In phylogenetics, recent studies have shown that phylogenetic inference under supervised learning agrees with inference under the traditional methods (paper).
In this work, we attempt to use a deep learning approach to learn patterns generated by a biogeographic diversification model that accounts for local species density. By species density, we refer to the number of different species that co-inhabit the same place at the same time. First, we will develop a general mathematical framework for these SSE models that can account for biological factors (species interaction and local equilibrium capacity) and other possible non-biological factors. These factors drive speciation and extinction rates, which would hopefully be captured in evolutionary patterns of species diversity. In order to use phylogenetic deep learning, we need to be able to generate a large amount of training data under the model. Thus, a fast tree simulator is required.
MATRIX-ANALYTIC METHODS FOR MODELLING LINEAGE DIVERSIFICATION WITH STATE-DEPENDENT RATES
This work provides a bridge between phylogenetic models for lineage diversification and an area of stochastic modelling known as matrix-analytic methods (MAMs). We utilized an important class of MAMs, known in the literature as Markovian binary trees (MBTs), to derive various mathematical properties of trees that follow state-dependent evolutionary processes. A key result of our work shows that most SSE models are special cases of an MBT. We also develop an explicit mathematical expression, in MBT notation, for the likelihood of a reconstructed tree evolving under an SSE model. This likelihood function allows researchers to do parameter inference under the model given biological (or simulated) data. Using MBTs, we can compute a tree balance metric of a phylogenetic tree directly from the model parameters, without the statistical analyses that are normally used (preprint).
AGE-DEPENDENT PHYLOGENETIC MODELS OF LINEAGE DIVERSIFICATION
Phylogenetic models of lineage diversification follow a birth-and-death process in which a birth event is associated with a speciation event (a lineage splits into two distinct lineages) and a death event is associated with an extinction event (a lineage terminates). We model these birth and death events by drawing waiting times from a probability distribution (e.g. the exponential distribution).
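As an illustration of the birth-and-death process described above, the following is a minimal sketch, assuming constant per-lineage rates and exponential waiting times. The function name `simulate_lineage_counts` and its parameters are hypothetical and chosen for illustration only; the sketch tracks only the number of lineages through time, not the tree itself.

```python
import random


def simulate_lineage_counts(birth_rate, death_rate, t_max, seed=None):
    """Track the number of lineages under a constant-rate birth-death process.

    Illustrative assumption: every lineage speciates at rate `birth_rate`
    and goes extinct at rate `death_rate`, independently of the others.
    Returns a list of (time, number_of_lineages) pairs.
    """
    rng = random.Random(seed)
    t, n = 0.0, 1                      # start with a single lineage at time 0
    history = [(t, n)]
    while n > 0:
        total_rate = n * (birth_rate + death_rate)
        if total_rate <= 0.0:
            break                      # no events can occur
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t >= t_max:
            break
        # The event is a speciation with probability birth / (birth + death).
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1                     # speciation: one lineage becomes two
        else:
            n -= 1                     # extinction: one lineage terminates
        history.append((t, n))
    return history


if __name__ == "__main__":
    for time, count in simulate_lineage_counts(1.0, 0.5, 3.0, seed=1):
        print(f"t = {time:.3f}, lineages = {count}")
```

Replacing the exponential draw with another waiting-time distribution is what makes the rates depend on species age, although the simple "total rate" shortcut used here relies on the memoryless property and would no longer apply.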
Phase-type (PH) distributions are a class of distributions in the theory of MAMs that generalize both Erlang and hyperexponential distributions. Under this distribution, we can think of an individual species as progressing through different phases during its lifetime at some rates until it undergoes either a speciation or an extinction event. Thus, the distribution provides a natural path for linking these speciation and extinction events with species age. By adjusting its underlying parameters, we can test the biological hypothesis of whether species age plays a role in increasing or decreasing diversification rates through time. On an empirical dataset, we found that our model under a pure-birth process (assuming no extinction) fits better than other models that follow more common distributions such as the exponential and Weibull (journal article).
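To make the phase picture concrete, here is a small sketch of how one could draw a lifetime from a PH distribution by simulating the underlying phase process. It is not the implementation used in the article; the function name `sample_phase_type` and the parameter names `alpha` (initial phase probabilities, assumed to sum to 1) and `T` (sub-generator matrix of rates between transient phases) are assumptions made for illustration.

```python
import random


def sample_phase_type(alpha, T, rng):
    """Draw one lifetime from a phase-type distribution PH(alpha, T).

    alpha: initial probabilities over the transient phases (assumed to sum to 1).
    T: square sub-generator matrix as nested lists; T[i][j] (i != j) is the rate
       of moving from phase i to phase j, and -T[i][i] is the total rate of
       leaving phase i. Whatever rate is left over leads to absorption, which
       here stands for the event (speciation or extinction) ending the lifetime.
    """
    n = len(alpha)
    phase = rng.choices(range(n), weights=alpha)[0]
    elapsed = 0.0
    while True:
        rate_out = -T[phase][phase]
        elapsed += rng.expovariate(rate_out)       # time spent in this phase
        # Rates to each other phase, followed by the absorption rate.
        weights = [T[phase][j] if j != phase else 0.0 for j in range(n)]
        weights.append(max(rate_out - sum(weights), 0.0))
        nxt = rng.choices(range(n + 1), weights=weights)[0]
        if nxt == n:                               # absorbed: lifetime ends
            return elapsed
        phase = nxt


if __name__ == "__main__":
    # Erlang distribution with 3 phases and rate 2 written as a PH distribution.
    alpha = [1.0, 0.0, 0.0]
    T = [[-2.0, 2.0, 0.0],
         [0.0, -2.0, 2.0],
         [0.0, 0.0, -2.0]]
    rng = random.Random(0)
    draws = [sample_phase_type(alpha, T, rng) for _ in range(10000)]
    print("sample mean:", sum(draws) / len(draws))  # theoretical mean is 3 / 2
```

The Erlang example shows why PH distributions can encode age dependence: a species must pass through several phases before an event can happen, so the event rate changes with the time already spent alive.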
STUDY ON MOST RECENT COMMON ANCESTOR OF A RANDOMLY SAMPLED POPULATION UNDER A BIENAYMÉ-GALTON-WATSON BRANCHING PROCESS
We consider the problem of estimating the elapsed time since the most recent common ancestor of a finite random sample drawn from a population which has evolved through a Bienaymé–Galton–Watson branching process. More specifically, we are interested in the diffusion limit appropriate to a supercritical process in the near-critical limit evolving over a large number of time steps. Our approach differs from earlier analyses in that we assume the only known information is the mean and variance of the number of offspring per parent, the observed total population size at the time of sampling, and the size of the sample. We obtain a formula for the probability that a finite random sample of the population is descended from a single ancestor in the initial population, and derive a confidence interval for the initial population size in terms of the final population size and the time since initiating the process. We also determine a joint likelihood surface from which confidence regions can be determined for simultaneously estimating two parameters: (1) the population size at the time of the most recent common ancestor, and (2) the time elapsed since the existence of the most recent common ancestor (paper).
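For readers unfamiliar with the branching process involved, the sketch below simulates population sizes under a Bienaymé–Galton–Watson process in discrete generations. It only illustrates the forward process, not the diffusion-limit analysis or the likelihood surface from the paper. The function name `simulate_bgw` and the example offspring distribution are assumptions chosen for illustration.

```python
import random


def simulate_bgw(initial_size, offspring_probs, generations, seed=None):
    """Simulate population sizes under a Bienaymé-Galton-Watson process.

    offspring_probs[k] is the probability that a parent leaves k offspring
    (an illustrative finite offspring distribution). Each parent reproduces
    independently, and generations do not overlap.
    Returns the list of population sizes, one entry per generation.
    """
    rng = random.Random(seed)
    offspring_counts = list(range(len(offspring_probs)))
    sizes = [initial_size]
    for _ in range(generations):
        n = sizes[-1]
        if n == 0:
            sizes.append(0)            # extinction is an absorbing state
            continue
        # Draw an independent offspring number for each of the n parents.
        draws = rng.choices(offspring_counts, weights=offspring_probs, k=n)
        sizes.append(sum(draws))
    return sizes


if __name__ == "__main__":
    # Near-critical supercritical example: mean offspring number is
    # 0 * 0.24 + 1 * 0.50 + 2 * 0.26 = 1.02, slightly above 1.
    print(simulate_bgw(100, [0.24, 0.50, 0.26], generations=50, seed=1))
```

In the example, the mean offspring number is just above one, which corresponds to the supercritical, near-critical regime mentioned above.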