HAPLORE: A Program for Haplotype Reconstruction in General Pedigrees without Recombination
Kui Zhang, Fengzhu Sun, Hongyu Zhao
Programs
Our algorithms have been implemented in a C++ program. Pre-compiled executable files are available:
The executable file for the Linux operating system.
The executable file for the Unix operating system.
The executable file for the Windows operating system (Windows 95, NT, 2000, XP).
The manual (PDF format) describing how to use the program can also be downloaded.
Examples
HAPLORE can handle genotype data from pedigrees as well as from unrelated individuals. We tested the program on the following data set with different options, and you can use the same data set to explore the program. Below we list the test data set, the command lines used in the tests, and the corresponding output files. For the detailed format of the input file and the options supported by HAPLORE, please refer to the manual (PDF format) of the program.
A genotype file - Test-Data-Set.dat, which contains the genotype data for 42 individuals at 10 marker loci. These individuals include 38 from 3 families and 4 unrelated individuals.
HAPLORE can infer haplotypes from pedigrees using a set of logic rules under the assumption of no recombinants. After this step, the partial or complete haplotypes of each individual are determined and listed.
The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-001.dat.
The output file Test-Output-001.dat.
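To illustrate what a logic rule of this kind can look like, the sketch below applies one simple single-locus Mendelian rule to a father-mother-child trio: if only one assignment of the child's two alleles to the father and the mother is consistent with the parental genotypes, the child's phase at that locus is determined. This is a minimal sketch under stated assumptions, not HAPLORE's actual rule set. The function names phase_locus and phase_trio are hypothetical, genotypes are assumed to be complete (no missing alleles), and alleles are assumed to be coded as integers.

```python
from typing import List, Optional, Tuple

Genotype = Tuple[int, int]  # unordered pair of alleles at one locus


def phase_locus(father: Genotype, mother: Genotype,
                child: Genotype) -> Optional[Tuple[int, int]]:
    """Return (paternal allele, maternal allele) for the child at one locus,
    or None when the trio genotypes do not determine the phase."""
    a, b = child
    candidates = set()
    # Try both ways of assigning the child's alleles to the two parents.
    for pat, mat in ((a, b), (b, a)):
        if pat in father and mat in mother:
            candidates.add((pat, mat))
    if len(candidates) == 1:
        return candidates.pop()
    return None  # ambiguous, or inconsistent with Mendelian inheritance


def phase_trio(father: List[Genotype], mother: List[Genotype],
               child: List[Genotype]) -> List[Optional[Tuple[int, int]]]:
    """Apply the single-locus rule independently at every locus."""
    return [phase_locus(f, m, c) for f, m, c in zip(father, mother, child)]


if __name__ == "__main__":
    father = [(1, 1), (1, 2)]
    mother = [(1, 2), (2, 2)]
    child = [(1, 2), (1, 2)]
    print(phase_trio(father, mother, child))
```

Loci that come back as None are the ones left undetermined by this rule, which corresponds to the partial haplotypes mentioned above.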
HAPLORE can list all compatible haplotype configurations without recombinants using the haplotype elimination algorithm. After this step, all compatible haplotype pairs for each individual are determined and listed.
The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-002.dat -h1.
The output file Test-Output-002.dat.
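The idea of listing compatible haplotype pairs and then eliminating those that cannot occur without recombination can be sketched for a single trio as follows. The code enumerates every haplotype pair consistent with an individual's unphased genotypes, then keeps only those child pairs in which one haplotype could have been passed intact from the father and the other intact from the mother. This is an illustrative simplification, not the elimination algorithm implemented in HAPLORE: the function names are hypothetical, missing genotypes are not handled, and only one nuclear trio is considered rather than a general pedigree.

```python
from itertools import product


def compatible_pairs(genotypes):
    """Return the set of unordered haplotype pairs consistent with a list
    of unphased genotypes, one (allele, allele) tuple per locus."""
    pairs = set()
    # At each locus, either allele may sit on the first haplotype.
    for choice in product(*[((a, b), (b, a)) for a, b in genotypes]):
        h1 = tuple(c[0] for c in choice)
        h2 = tuple(c[1] for c in choice)
        pairs.add(tuple(sorted((h1, h2))))
    return pairs


def eliminate_child_pairs(father, mother, child):
    """Keep only the child's haplotype pairs that can be formed from one
    unrecombined paternal haplotype and one unrecombined maternal one."""
    f_haps = {h for pair in compatible_pairs(father) for h in pair}
    m_haps = {h for pair in compatible_pairs(mother) for h in pair}
    kept = set()
    for h1, h2 in compatible_pairs(child):
        if (h1 in f_haps and h2 in m_haps) or (h2 in f_haps and h1 in m_haps):
            kept.add((h1, h2))
    return kept


if __name__ == "__main__":
    father = [(1, 1), (1, 1)]
    mother = [(1, 2), (1, 2)]
    child = [(1, 2), (1, 2)]
    print(compatible_pairs(child))
    print(eliminate_child_pairs(father, mother, child))
```

In this example the child is heterozygous at both loci and so has two compatible pairs on its own, but only one of them survives once the father's genotypes are taken into account.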
HAPLORE can try to find a minimum number of haplotypes that can resolve all individuals. The program uses all haplotypes from those individuals who have a determined haplotype pair. It often fails if there are too few determined haplotypes. If it does not fail, the program outputs all compatible haplotype pairs for each individual.
The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-003.dat -m1.
HAPLORE can estimate haplotype frequencies and list all compatible haplotype configurations without recombinants with their posterior probabilities using the EM algorithm and the partition-ligation technique. The program provides several options to control the process. After this step, all compatible haplotype configurations with their posterior probabilities are listed.
The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-004.dat -e1.
The output file Test-Output-004.dat.
The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-005.dat -e1 -l1.
The output file Test-Output-005.dat.
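For readers unfamiliar with EM-based haplotype frequency estimation, the following sketch shows the basic iteration for unrelated individuals only. The E-step computes the posterior probability of each compatible haplotype pair given the current frequencies, and the M-step re-estimates the frequencies from the expected haplotype counts. It is a minimal sketch under simplifying assumptions and does not reproduce HAPLORE's implementation: it ignores pedigree structure, omits the partition-ligation technique, assumes random mating and complete genotypes, and uses hypothetical function names.

```python
from itertools import product


def compatible_pairs(genotypes):
    """Sorted list of unordered haplotype pairs consistent with a list of
    unphased genotypes, one (allele, allele) tuple per locus."""
    pairs = set()
    for choice in product(*[((a, b), (b, a)) for a, b in genotypes]):
        h1 = tuple(c[0] for c in choice)
        h2 = tuple(c[1] for c in choice)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(individuals, n_iter=50):
    """Estimate haplotype frequencies for unrelated individuals by EM.

    Returns (freqs, posteriors): freqs maps haplotype -> frequency, and
    posteriors holds, per individual, a dict of haplotype pair -> posterior
    probability as computed in the final E-step."""
    pair_lists = [compatible_pairs(g) for g in individuals]
    haplotypes = sorted({h for pairs in pair_lists for p in pairs for h in p})
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior of each compatible pair under current freqs.
        posteriors = []
        for pairs in pair_lists:
            weights = [freqs[h1] * freqs[h2] * (1 if h1 == h2 else 2)
                       for h1, h2 in pairs]
            total = sum(weights)
            posteriors.append({p: w / total for p, w in zip(pairs, weights)})
        # M-step: expected haplotype counts divided by 2N chromosomes.
        counts = {h: 0.0 for h in haplotypes}
        for post in posteriors:
            for (h1, h2), prob in post.items():
                counts[h1] += prob
                counts[h2] += prob
        freqs = {h: c / (2 * len(individuals)) for h, c in counts.items()}
    return freqs, posteriors


if __name__ == "__main__":
    data = [
        [(1, 1), (1, 1)],  # homozygous: haplotype 1-1 twice
        [(2, 2), (2, 2)],  # homozygous: haplotype 2-2 twice
        [(1, 2), (1, 2)],  # double heterozygote: phase ambiguous
    ]
    freqs, posteriors = em_haplotype_frequencies(data)
    print(freqs)
    print(posteriors[2])
```

In this toy data set the double heterozygote is the only ambiguous individual, and the iteration shifts its posterior probability toward the pair built from the two haplotypes that the homozygous individuals also carry.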
Program History
April 20, 2006: a new routine is added to find the haplotype configuration with the maximum posterior probability among all compatible haplotype configurations.
April 12, 2006: the program is modified to allow the output of all compatible haplotype configurations with their corresponding posterior probabilities and the single haplotype configuration with the maximum posterior probability.
April 10, 2006: new routines are added for the likelihood function calculation. The likelihood can be calculated based on the Elston-Stewart algorithm and the array transformation technique, which potentially allows the program to handle larger pedigrees.
December 1, 2005: the program is modified to handle data with alleles that are not coded by consecutive integers.
January 1, 2005: the paper is published and the program is formally released.
May 15, 2002: the program is created and tested.
References
Kui Zhang, Fengzhu Sun, Hongyu Zhao. 2005. HAPLORE: A Program for Haplotype Reconstruction in General Pedigrees without Recombination. Bioinformatics 21: 90-103.
Kui Zhang, Hongyu Zhao. 2006. A Comparison of Several Methods for Haplotype Frequency Estimation and Haplotype Reconstruction for Tightly Linked Markers from General Pedigrees. Genetic Epidemiology 30: 423-437.
We plan to update the program regularly. You are welcome to suggest features that you would like us to implement in it. We would greatly appreciate it if you could report any bugs you find when using the program. Our contact information is:
Kui Zhang, Ph.D.
Department of Mathematical Sciences
Michigan Technological University
1400 Townsend Drive
Houghton, Michigan 49931
Phone: 906-487-2918
Fax: 906-487-3133
Email: kuiz@mtu.edu
or
Hongyu Zhao, Ph.D.
Department of Epidemiology and Public Health
Yale University School of Medicine
60 College Street
New Haven, CT 06520-8034
(203) 785-6271 (phone)
(203) 785-6912 (fax)
E-Mail: hongyu.zhao@yale.edu
Created Date: March 20, 2004
Last Updated Date: Sep 17, 2015