11/11/13: A new version of HapCUT (v0.6) is available for download (64-bit Linux binary, compatible with GLIBC 2.3 or greater). HapCUT can now generate haplotypes from fosmid pooled sequencing data and from ligation-based mate-pair sequencing data. If you need to compile HapCUT for other platforms, please send an email to vbansal AT scripps.edu.
HapCUT is a max-cut-based algorithm for haplotype assembly using sequence reads from the two chromosomes of an individual. It can be applied to sequence data generated by next-generation sequencing platforms. HapCUT takes as input the aligned reads (SAM/BAM files) for an individual diploid genome and the list of variants (VCF file), and outputs the phased haplotype blocks that can be assembled from the sequence reads.
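To make the haplotype assembly problem concrete, here is a minimal toy sketch in Python. It is not HapCUT's code and does not implement the max-cut procedure. It assumes the minimum error correction (MEC) objective that is commonly used to formulate haplotype assembly, and it searches all haplotypes by brute force, which is feasible only for a handful of variants. The fragment encoding (a dict from variant index to observed allele, 0 or 1) and all function names are hypothetical, chosen only for illustration.

```python
from itertools import product


def mismatches(fragment, haplotype):
    """Count covered variants where the fragment's allele differs from the haplotype."""
    return sum(1 for pos, allele in fragment.items() if haplotype[pos] != allele)


def mec_score(fragments, haplotype):
    """MEC score: each fragment is assigned to whichever of the two
    complementary haplotypes it matches best, and its mismatches are summed."""
    complement = tuple(1 - a for a in haplotype)
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement))
        for f in fragments
    )


def best_haplotype_brute_force(fragments, n_variants):
    """Try every haplotype (toy sizes only) and return one with minimal MEC.

    The first allele is fixed to 0 because a haplotype and its complement
    describe the same pair of chromosomes.
    """
    candidates = ((0,) + rest for rest in product((0, 1), repeat=n_variants - 1))
    return min(candidates, key=lambda h: mec_score(fragments, h))


# Hypothetical fragments over 4 heterozygous variants (illustrative data only).
fragments = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 0},
    {2: 1, 3: 1},
    {0: 1, 1: 1},
    {2: 1, 3: 0},  # conflicts with the third fragment: one error is unavoidable
]

best = best_haplotype_brute_force(fragments, n_variants=4)
print(best, mec_score(fragments, best))
```

In this toy input the third and fifth fragments disagree about the phase of the last two variants, so no haplotype pair explains every read without error. An assembly method has to pick the phasing that leaves the fewest such conflicts, and a practical tool needs a far more efficient search than the exhaustive one shown here.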
The HapCUT method is described here: HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bansal V, Bafna V. Bioinformatics. 2008 Aug 15;24(16):i153-9. PMID: 18689818.
HapCUT has been applied to phase J. Craig Venter's genome, which was sequenced using Sanger sequencing technology. Phased haplotypes for this genome are available for download from here. Please see the HuRef ftp site for more information about the variants identified in this genome.
A sample dataset from NA18508 (BAM file for a region on chromosome 20, VCF file, and HapCUT output files) can be downloaded from the attachments (HAPCUT-testdata.tar.gz) and used to test HapCUT.