minimap2 molecule-to-molecule aligner
We can extract a wealth of information from single molecules, but this data is often stored or processed inconsistently, which makes it slow and complicated to reconcile the different pieces of information that belong to the same molecule. For instance, if you use one version of nanopore basecalling to detect m6A and another for CpG, you end up with two large BAM files of single-molecule data. To interpret this information accurately, each read needs to be aligned back to itself, a process that typically requires hundreds of billions of operations per sequencing run.
To address this challenge, I developed minimap2_r2r, a fork of minimap2 designed to rapidly align different "versions" of a read back onto each other. Its function is straightforward: it aligns a query to a reference only when the two share the same molecular identifier.
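
The identifier-matching idea can be illustrated with a minimal Python sketch. This is not the code in minimap2_r2r, only a toy model of the principle: reads are represented as hypothetical `(read_id, sequence)` tuples rather than BAM records, and the function name `pair_reads_by_id` is an assumption made for illustration. It indexes one set of reads by identifier and pairs each query only with the reference that carries the same identifier, so no query is ever compared against an unrelated molecule.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record type for this sketch: (read_id, sequence).
Read = Tuple[str, str]


def pair_reads_by_id(
    reference_reads: Iterable[Read],
    query_reads: Iterable[Read],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, reference_seq, query_seq) for reads sharing an ID.

    Queries whose identifier is absent from the reference set are skipped,
    so an aligner would only ever see same-molecule pairs.
    """
    # Index the reference reads once; keep the first sequence seen per ID.
    ref_by_id: Dict[str, str] = {}
    for read_id, seq in reference_reads:
        ref_by_id.setdefault(read_id, seq)

    # Each query is looked up by ID instead of being searched against
    # every reference sequence.
    for read_id, query_seq in query_reads:
        ref_seq = ref_by_id.get(read_id)
        if ref_seq is not None:
            yield read_id, ref_seq, query_seq


# Toy data standing in for two basecalled "versions" of the same reads.
m6a_calls = [("read1", "ACGTACGT"), ("read2", "TTGACCA"), ("read3", "GGGCAT")]
cpg_calls = [("read2", "TTGACA"), ("read1", "ACGTACGT"), ("read4", "CCCC")]

pairs = list(pair_reads_by_id(m6a_calls, cpg_calls))
for read_id, ref_seq, query_seq in pairs:
    print(read_id, ref_seq, query_seq)
```

In this toy example, three references and three queries would give nine candidate pairs in an all-against-all comparison, while matching on the identifier leaves only the two same-molecule pairs to align.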