HipMer and MetaHipMer
Extreme Scale De Novo Genome and MetaGenome Assembler
HipMer is the first high-quality end-to-end de novo assembler designed for extreme-scale analysis, built by efficiently parallelizing the Meraculous code. Its high performance rests on several novel algorithmic advances that exploit the efficiency and programmability of the one-sided communication capabilities of Unified Parallel C (UPC): optimized high-frequency k-mer analysis, communication-avoiding de Bruijn graph traversal, advanced I/O optimization, and extensive parallelization across the numerous and complex application phases.
MetaHipMer is the latest version of HipMer and also provides high-quality assembly of metagenomes. As of version 1.0, MetaHipMer is additionally optimized for single genomes, with distinct workflows geared to haploid and diploid assembly of large, complex genomes.
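The two core steps named above, k-mer analysis and de Bruijn graph traversal, can be illustrated with a small single-process sketch. This is not HipMer's implementation: it ignores reverse complements, runs serially rather than over a distributed hash table, and the function names and the `min_count` threshold are hypothetical choices made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_count=2):
    """Keep k-mers seen at least min_count times (assumed error filter)."""
    return {kmer for kmer, n in counts.items() if n >= min_count}


def extensions(kmer, kmers, forward=True):
    """Neighbours of kmer in the de Bruijn graph that are present in kmers."""
    if forward:
        candidates = [kmer[1:] + base for base in "ACGT"]
    else:
        candidates = [base + kmer[:-1] for base in "ACGT"]
    return [c for c in candidates if c in kmers]


def walk_contig(seed, kmers):
    """Extend seed to the right while the path is unambiguous.

    The walk stops at a dead end, at a fork (more than one successor),
    at a merge (the successor has more than one predecessor), or on a cycle.
    """
    contig = seed
    current = seed
    visited = {seed}
    while True:
        successors = extensions(current, kmers, forward=True)
        if len(successors) != 1:
            break
        candidate = successors[0]
        if candidate in visited:
            break
        if len(extensions(candidate, kmers, forward=False)) != 1:
            break
        contig += candidate[-1]
        visited.add(candidate)
        current = candidate
    return contig


if __name__ == "__main__":
    # Two clean copies of a sequence plus one read with a single-base error.
    reads = ["ACGTTGCA", "ACGTTGCA", "ACGTAGCA"]
    kmers = solid_kmers(count_kmers(reads, k=4), min_count=2)
    print(walk_contig("ACGT", kmers))
```

With these three reads, the k-mers that appear only in the erroneous read are filtered out, and the walk from `ACGT` recovers `ACGTTGCA`.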
Primary authors are Evangelos Georganas, Aydin Buluc, Steven Hofmeyr, Leonid Oliker, Eugene Goltsman and Rob Egan, with direction and advice from Kathy Yelick. The original Meraculous was developed by Jarrod Chapman, Isaac Ho, Eugene Goltsman, and Daniel Rokhsar.
The latest release of HipMer (version 1.0, released Dec 2018) is now available. Depending on the runtime configuration, this version can assemble metagenomes as well as single haploid and diploid genomes. Download it from SourceForge, or apply to have your data run for free at NERSC through the HipMer Portal.
This web portal lets anyone submit sequence data for assembly with HipMer on NERSC supercomputers, free of charge. In general, very large genomes and metagenomes, with terabytes of sequence data, can be assembled.
For more information about HipMer, contact Steven Hofmeyr.
HipMCL
Distributed-Memory Protein Clustering using High-Performance MCL
HipMCL is a high-performance parallel algorithm for large-scale network clustering. It parallelizes the popular Markov Cluster (MCL) algorithm, one of the most successful and widely used algorithms for network clustering. MCL is based on random walks and was initially designed to detect families in protein-protein interaction networks. Despite MCL’s efficiency and multi-threading support, scalability remains a bottleneck: it cannot process networks of several hundred million nodes and billions of edges in an affordable running time. HipMCL overcomes this limitation with massively parallel algorithms for all components of MCL. It can be 1000 times faster than the original MCL without any information loss, and it can cluster a network of ~75 million nodes and ~68 billion edges in ~2.4 hours using ~2000 nodes of the Cori supercomputer at NERSC. HipMCL is written in C++ and uses standard OpenMP and MPI libraries for shared- and distributed-memory parallelization.
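For readers unfamiliar with MCL, the random-walk idea can be sketched in a few lines of serial Python. This is a minimal illustration of the textbook expansion and inflation steps on a dense matrix, not HipMCL's code: it omits the pruning and sparse distributed matrix operations a large-scale implementation needs, and the function names, the added self-loops, and the default parameter values are assumptions.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 so it is a probability distribution."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m


def expand(m):
    """Expansion: square the matrix, i.e. take two random-walk steps."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise entries to a power, then renormalize the columns.

    This strengthens strong connections and weakens weak ones.
    """
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Cluster a graph given as a dense 0/1 adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Self-loops are added here, a common convention for MCL inputs.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iterations):
        nxt = inflate(expand(m), inflation)
        change = max(abs(nxt[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = nxt
        if change < tol:
            break
    # Each row that still holds mass is an attractor; the columns with mass
    # in that row are the members of its cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Two disconnected triangles: nodes 0-2 and nodes 3-5.
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(mcl(graph))
```

The dense triple loop in `expand` costs O(n^3) time, which is the step that motivates sparse, distributed matrix multiplication at the network sizes quoted above.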
Primary authors are Ariful Azad and Aydin Buluc, in collaboration with Georgios Pavlopoulos (JGI), Nikos Kyrpides (JGI) and Christos Ouzounis (CERTH).
The first release of HipMCL (1.0.0) is now available. Download it from Bitbucket.
For more information about HipMCL, contact Ariful Azad.
MerBench
Microbenchmarks for Measuring Asynchronous Collective Communication Performance
MerBench is a set of microbenchmarks originally developed to analyze the performance of the primary communication patterns implemented in HipMer, an extreme-scale de novo genome assembler. One key to HipMer’s high performance is its use of the one-sided communication capabilities of Unified Parallel C (UPC) for asynchronous Alltoall and Alltoallv communication. The benchmarks distill these essential communication patterns and their parameters (e.g. message size) for cross-architecture and cross-application network performance analysis.
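The Alltoallv pattern itself, in which every rank sends a differently sized message to every other rank, can be shown with a single-process simulation. The sketch below models only the data movement, not MerBench, UPC, or asynchronous one-sided transfers. The hash-based `owner` routing and all function names are assumptions introduced for illustration.

```python
import zlib


def owner(item, nranks):
    """Pick a destination rank with a deterministic hash (assumed routing)."""
    return zlib.crc32(item.encode("ascii")) % nranks


def pack(items, nranks):
    """Sort one rank's local items into one send buffer per destination."""
    buffers = [[] for _ in range(nranks)]
    for item in items:
        buffers[owner(item, nranks)].append(item)
    return buffers


def alltoallv(send):
    """Simulate the exchange: recv[dst][src] is what src sent to dst.

    send[src][dst] may have any length, which is what distinguishes
    Alltoallv from the fixed-size Alltoall.
    """
    nranks = len(send)
    return [[list(send[src][dst]) for src in range(nranks)]
            for dst in range(nranks)]


def exchange(local_items_per_rank):
    """Pack, exchange, and flatten; also report per-pair message sizes."""
    nranks = len(local_items_per_rank)
    send = [pack(items, nranks) for items in local_items_per_rank]
    counts = [[len(buf) for buf in row] for row in send]  # counts[src][dst]
    recv = alltoallv(send)
    received = [[item for buf in row for item in buf] for row in recv]
    return received, counts


if __name__ == "__main__":
    local = [["ACGT", "CGTT"], ["GTTG", "TTGC", "TGCA"], []]
    received, counts = exchange(local)
    for rank, items in enumerate(received):
        print(rank, items)
    print(counts)
```

The `counts` matrix corresponds to the message-size parameter mentioned above: a microbenchmark would vary these sizes and time the exchange on a real network.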
The primary authors are Evangelos Georganas, Rob Egan, and Marquita Ellis. Evangelos Georganas developed the original version of the microbenchmarks for analyzing HipMer. Rob Egan contributed a number of extensions for usability and cross-platform portability.
For more information about MerBench, contact Marquita Ellis.