Parallel Algorithms
Memory-Hierarchy Aware Algorithms
Large‐Scale Data Systems
Algorithm Engineering
Computational Genomics
Parallel and memory-hierarchy aware scalable algorithms and data structures for large‑scale data systems
Scaling Parallel Algorithms to Massive Datasets using Multi-SSD Machines [SPAA 2025]
Haohong Li, Jamshed Khan, and Laxman Dhulipala
It is now possible to build relatively inexpensive multicore servers equipped with dozens of terabytes of local NVMe SSD storage. We describe our first results with designing parallel algorithms for such machines that achieve nearly in-memory performance while gracefully scaling to datasets that are much larger than the main memory. We microbenchmark our multi-SSD machine, measure the performance of different I/O patterns, and describe scaling bottlenecks for multi-SSD machines and how to overcome them by leveraging the modern asynchronous I/O stack. We then describe scalable I/O primitives for the multi-SSD setting that can be used to concisely express several I/O-efficient parallel algorithms, including samplesort, random shuffle, and basic sequence primitives. Our experimental results reveal that there can be very little overhead to well-designed multi-SSD algorithms relative to their in-memory counterparts. [SPAA]
Fast and Scalable Parallel External-Memory Construction of Colored Compacted de Bruijn Graphs with Cuttlefish 3 [pre-print]
Jamshed Khan, Laxman Dhulipala, and Rob Patro
We introduce Cuttlefish 3, a state-of-the-art parallel, external-memory algorithm for constructing (colored) compacted de Bruijn graphs. It introduces novel algorithmic improvements for the graph construction and a sparsification method to vastly reduce the amount of data required to compute the colored graph, surpassing existing approaches in speed and scalability in both colored and uncolored scenarios. [bioRxiv]
Cuttlefish 3 is available open-source in GitHub.
Alevin-fry-atac Enables Rapid and Memory Frugal Mapping of Single-Cell ATAC-seq Data Using Virtual Colors for Accurate Genomic Pseudoalignment [ISMB/ECCB 2025]
Noor Pratap Singh, Jamshed Khan, and Rob Patro
We propose an updated pseudo-alignment scheme, in the tool Alevin-fry-atac, that splits genomic sequences (colors) into sub-segments (virtual colors), treated as distinct colors. While mapping single-cell ATAC-seq reads in parallel, Alevin-fry-atac is 1.78x faster than the next-best method, using 3x less memory. [Bioinformatics]
Alevin-fry-atac is available open-source in GitHub.
Fast, Parallel, and Cache‐friendly Suffix Array Construction [WABI 2023, AMB (2024)]
Jamshed Khan, Tobias Rubel, Erin Molloy, Laxman Dhulipala, and Rob Patro
We present CaPS‐SA, a simple and scalable parallel algorithm for constructing the suffix array and the LCP array. It has excellent memory‐locality and achieves strong performance on modern multicore systems. We show that CaPS‐SA outperforms existing state‐of‐the‐art parallel suffix array and LCP‐array construction algorithms. [WABI, AMB]
CaPS‐SA is available open-source in GitHub.
Fulgor: A Fast and Compact k‐mer Index for Large‐Scale Matching and Color Queries [WABI 2023, AMB (2024)]
Jason Fan, Jamshed Khan, Noor Pratap Singh, Giulio Eramnno Pibiri, and Rob Patro
We present Fulgor, a fast and compact colored de Bruijn graph data structure—built with a dictionary and a compressed inverted index—to determine the subset of references from a given collection that are likely to contain a query nucleotide sequence. We demonstrate that compared to a state‐of‐the‐art index, Fulgor indexes take 2–3.8x less space, is at least 2x fast for queries, and is 2–6x faster to construct. [WABI, AMB]
Fulgor is available under an open-source license in GitHub.
Spectrum Preserving Tilings Enable Sparse and Modular Reference Indexing [RECOMB 2023]
Jason Fan, Jamshed Khan, Giulio Ermanno Pibiri, and Rob Patro
We introduce the spectrum preserving tiling (SPT), a general representation of genomic sequence collections that specifies how a set of tiles repeatedly occur to spell out the indexed sequences. We introduce a class of sampling schemes for SPTs that trade off speed to reduce the size of the tile-to-reference mapping, and implement a practical index with these sampling schemes in the tool pufferfish2. [LNCS]
Pufferfish2 is available open-source in GitHub.
Scalable, Ultra-Fast, and Low-Memory Construction of Compacted de Bruijn Graphs with Cuttlefish 2 [Genome Biology (2022)]
Jamshed Khan, Marek Kokot, Sebastian Deorowicz, and Rob Patro
We present a fast and memory-frugal algorithm for constructing compacted de Bruijn graphs, Cuttlefish 2, applicable both on raw sequencing short-reads and assembled references, that can scale to very large datasets. It builds upon the novel idea of modeling de Bruijn graph vertices as DFA from Cuttlefish—modifying the model and generalizing the algorithm, so as to facilitate all applicable forms of input. [Genome Biology]
Cuttlefish 2 is available open-source in GitHub, and also through conda.
An Incrementally Updatable and Scalable System for Large‐Scale Sequence Search using the Bentley‐Saxe Transformation [Bioinformatics (2022)]
Fatemeh Almodaresi, Jamshed Khan, Sergey Madaminov, Michael Ferdman, Rob Johnson, Prashant Pandey, and Rob Patro
We show how to build a scalable and updatable exact RNA sequence‐search index. Specifically, we extend the Mantis index using the Bentley‐Saxe transformation to support efficient updates. We demonstrate Mantis’ scalability by constructing an index of ≈40K samples from the SRA (Sequence Read Archive) by adding samples one at a time to an initial index of 10K samples. [Bioinformatics]
Dynamic Mantis is available open-source in GitHub.
Cuttlefish: Fast, Parallel, and Low‐Memory Compaction of de Bruijn Graphs from Large‐Scale Genome Collections [ISMB/ECCB 2021]
Jamshed Khan and Rob Patro [Outstanding Student Paper Award]
We introduce a new algorithm, Cuttlefish, to construct the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel approach of modeling de Bruijn graph vertices as finite‐state automata, and constrains these automata’s state‐space to enable tracking their transitioning states with very low memory usage. It is also fast and highly parallelizable. [Bioinformatics]
Cuttlefish is available open-source in GitHub, and also through conda.