07/25/2016: HapCUT2 is now available from GitHub. It implements a likelihood-based approach and also estimates trans-error rates for Hi-C data. More details will be available soon.

11/11/13: A new version of HapCUT (v.0.6) is available for download (64-bit Linux binary compatible with GLIBC 2.3 or greater). HapCUT can now generate haplotypes from fosmid pooled sequencing data and from ligation-based mate-pair sequencing data.

The latest source code for HapCUT is available from GitHub (https://github.com/vibansal/hapcut).

HapCUT is a max-cut based algorithm for haplotype assembly using sequence reads from the two chromosomes of an individual. It can be applied to sequence data generated by next-generation sequencing platforms. HapCUT takes as input the aligned SAM/BAM files for an individual diploid genome and the list of variants (VCF file), and outputs the phased haplotype blocks that can be assembled from the sequence reads.
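To make the haplotype assembly problem concrete, here is a minimal Python sketch on made-up data. It assumes the minimum error correction (MEC) objective commonly used for this problem: find the haplotype pair that needs the fewest allele flips in the reads to make every read consistent with one of the two haplotypes. The exhaustive search below is only a stand-in suitable for a handful of variants; it is not HapCUT's max-cut heuristic, and the fragment encoding and function names are hypothetical.

```python
from itertools import product

# Toy read fragments (made up for illustration): each maps a heterozygous
# variant index to the allele observed on that read (0 = ref, 1 = alt).
FRAGMENTS = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 1},
    {0: 1, 1: 1},
    {2: 1, 3: 0},
    {0: 0, 3: 0},
    {1: 0, 2: 0},  # conflicts with the other reads (simulated sequencing error)
]


def mismatches(fragment, haplotype):
    """Count the alleles in a fragment that disagree with a haplotype."""
    return sum(1 for pos, allele in fragment.items() if haplotype[pos] != allele)


def mec_score(fragments, haplotype):
    """MEC score: each fragment is charged against the closer of the
    haplotype and its complement (the other chromosome copy)."""
    complement = tuple(1 - allele for allele in haplotype)
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement))
        for f in fragments
    )


def best_haplotype(fragments, n_variants):
    """Exhaustive search over haplotypes (illustration only, exponential time).

    The first allele is fixed to 0 because a haplotype and its complement
    describe the same phased pair and have the same MEC score.
    """
    best = None
    for rest in product((0, 1), repeat=n_variants - 1):
        candidate = (0,) + rest
        score = mec_score(fragments, candidate)
        if best is None or score < best[1]:
            best = (candidate, score)
    return best


if __name__ == "__main__":
    haplotype, score = best_haplotype(FRAGMENTS, 4)
    print("haplotype:", haplotype, "MEC score:", score)
```

The returned haplotype and its complement together form one phased block. Real data sets contain far too many variants for enumeration, which is why a heuristic such as the max-cut approach is needed in practice.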

The HapCUT method is described here: Bansal V, Bafna V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics. 24(16):i153-9. 2008 Aug 15. PMID: 18689818.

HapCUT has been applied to phase Craig Venter's genome, which was sequenced using Sanger sequencing technology. Phased haplotypes for this genome are available for download from here. Please see the HuRef ftp site for more information about the variants identified in this genome.

A sample dataset from NA18508 (BAM file for a region on chromosome 20, VCF file, and HapCUT output files) can be downloaded from the attachments (HAPCUT-testdata.tar.gz) and used to test HapCUT.

Vince Buffalo from UC Davis has written a Python package called readphaser that converts the phased HapCUT output into FASTA files of phased/unphased reads. Users of HapCUT may find this useful.

Vikas Bansal,
Sep 18, 2013, 2:04 PM