I am a Ph.D. student in Computer Science at Johns Hopkins University, advised by Professor Murat Kocaoglu. I transferred from Purdue ECE with my advisor in 2025. My research centers on causal inference and causal discovery.
Find me:
Root Cause Analysis of Failures in Microservices via
Bayesian Root Cause Discovery (ICML 2026 Spotlight)
Modern cloud systems rely on architectures with many interconnected microservices, which enable scalability and flexibility but make troubleshooting failures difficult. Identifying the root cause requires navigating complex dependencies, often beyond the capacity of domain experts. Causal models offer a principled approach to root cause analysis (RCA), but prior methods are typically sample inefficient, as they assume access to the full causal graph or require large numbers of post-failure interventions.
We introduce Bayesian Root Cause Discovery (BRCD), which leverages a partial causal structure (a CPDAG learned during the pre-failure period) and performs Bayesian inference without enumerating all DAGs from each interventional Markov equivalence class ($\mathcal{I}$-MEC) for each root cause candidate. Using a recent uniform DAG sampling framework \citep{wienobst2023polynomial}, BRCD provides the first statistical consistency guarantees for nonparametric RCA, with both identifiability and finite-sample posterior bounds under $\varepsilon$-vanishing approximation. \rev{Empirically, across synthetic benchmarks and three microservice systems (Online Boutique, Sockshop, Petshop), BRCD achieves state-of-the-art top-$l$ accuracy while remaining effective in low-failure-sample regimes and scaling to large graphs.
Towards Completeness in Causal Discovery from Soft Interventions with Known Targets (ICML 2026)
We study causal discovery from soft interventions in the presence of latent confounding. Beyond within-environment conditional independences, soft interventions induce cross-environment invariances that can be encoded using an augmented graph with intervention indicator nodes ($\iaug$). Taking its maximal ancestral graph (MAG) yields the $\imag$, which characterizes the interventional Markov equivalence class. Building on this framework, we show that the FCI-inspired learner (\ifci) by ~\citet{kocaoglu2019characterization} is sound but not complete: it may output circle endpoints that are nevertheless compelled by the interventional equivalence class. To exploit intervention-node semantics, we propose two complementary methods. First, we introduce an enumeration-based completion procedure that is sound and theoretically complete, but whose worst-case cost depends on the number of MAGs compatible with the partial graph learned by \ifci. Second, we derive a set of additional local orientation rules that provably tighten \ifci without increasing asymptotic complexity. Both methods refine prior outputs in the controlled soft-intervention setting with latent variables.
Characterization and Learning of Causal Graphs from Hard Interventions (NeurIPS 2025)
A fundamental challenge in the empirical sciences involves uncovering causal structure through observation and experimentation. Causal discovery entails linking the conditional independence (CI) invariances in observational data to their corresponding graphical constraints via d-separation. In this paper, we consider a general setting where we have access to data from multiple experimental distributions resulting from hard interventions, as well as potentially from an observational distribution. By comparing different interventional distributions, we propose a set of graphical constraints that are fundamentally linked to Pearl's do-calculus within the framework of hard interventions. These graphical constraints associate each graphical structure with a set of interventional distributions that are consistent with the rules of do-calculus. We characterize the interventional equivalence class of causal graphs with latent variables and introduce a graphical representation that can be used to determine whether two causal graphs are interventionally equivalent, i.e., whether they are associated with the same family of hard interventional distributions, where the elements of the family are indistinguishable using the invariances from do-calculus. We also propose a learning algorithm to integrate multiple datasets from hard interventions, introducing new orientation rules. The learning objective is a tuple of augmented graphs which entails a set of causal graphs. We also prove the soundness of the proposed algorithm.
Sample Efficient Bayesian Learning of Causal Graphs from Interventions (NeurIPS 2024)
Causal discovery is a fundamental problem with applications spanning various areas in science and engineering. It is well understood that solely using observational data, one can only orient the causal graph up to its Markov equivalence class, necessitating interventional data to learn the complete causal graph. Most works in the literature design causal discovery policies with perfect interventions, i.e., they have access to infinite interventional samples. This study considers a Bayesian approach for learning causal graphs with limited interventional samples, mirroring real-world scenarios where such samples are usually costly to obtain. By leveraging the recent result on uniform DAG sampling in polynomial time, we can efficiently enumerate all the cut configurations and their corresponding interventional distributions of a target set, and further track their posteriors. Given any number of interventional samples, our proposed algorithm randomly intervenes on a set of target vertices that cut all the edges in the graph and returns a causal graph according to the posterior of each target set. When the number of interventional samples is large enough, we show theoretically that our proposed algorithm will return the true causal graph with high probability. We compare our algorithm against various baseline methods on simulated datasets, demonstrating its superior accuracy measured by the structural Hamming distance between the learned DAG and the ground truth. Additionally, we present a case study showing how this algorithm could be modified to answer more general causal questions without learning the whole graph.
A common workflow for single-cell RNA-sequencing (sc-RNA-seq) data analysis is to orchestrate a three-step pipeline. First, conduct a dimension reduction of the input cell profile matrix; second, cluster the cells in the latent space; and third, extract the “gene panels” that distinguish a certain cluster from others. This workflow has the primary drawback that the three steps are performed independently, neglecting the dependencies among the steps and among the marker genes or gene panels. In our system, Kratos, we alter the threestep workflow to a two-step one, where we jointly optimize the first two steps and add the third (interpretability) step to form an integrated sc-RNA-seq analysis pipeline. We show that the more compact workflow of Kratos extracts marker genes that can better discriminate the target cluster, distilling underlying mechanisms guiding cluster membership. In doing so, Kratos is significantly better than the two SOTA baselines we compare against, specifically 5.62% superior to Global Counterfactual Explanation (GCE) [ICML20], and 3.31% better than Adversarial Clustering Explanation (ACE) [ICML-21], measured by the AUROC of a kernel-SVM classifier. We opensource our code and datasets here: https://github.com/icanfor ce/single-cell-genomics-kratos.
Sequencing technologies are prone to errors, making error correction (EC) necessary for downstream applications. EC tools need to be manually configured for optimal performance. We find that the optimal parameters (e.g., k-mer size) are both tool- and dataset-dependent. Moreover, evaluating the performance (i.e., Alignment-rate or Gain) of a given tool usually relies on a reference genome, but quality reference genomes are not always available. We introduce Lerna for the automated configuration of k-mer-based EC tools. Lerna first creates a language model (LM) of the uncorrected genomic reads, and then, based on this LM, calculates a metric called the perplexity metric to evaluate the corrected reads for different parameter choices. Next, it finds the one that produces the highest alignment rate without using a reference genome. The fundamental intuition of our approach is that the perplexity metric is inversely correlated with the quality of the assembly after error correction. Therefore, Lerna leverages the perplexity metric for automated tuning of k-mer sizes without needing a reference genome.