Post date: Jun 03, 2020 3:37:52 PM
Phasing is the assignment of alleles to the paternal or maternal chromosome: which chromosome carries which allele. It is easy if both chromosomes have the same version of the gene (AA, aa). It gets tricky for heterozygotes (Aa): we detect a mutation, but which chromosome is it on?
This matters after recombination: crossovers can make segments of the two chromosomes swap places (ABC and abc become ABc and abC).
Phasing produces haplotypes: series of alleles assigned to each parental chromosome.
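To make the ABC/abc example concrete, here is a minimal Python sketch. The function names `crossover` and `unphased` are made up for illustration, and a single crossover point is assumed. It shows that the unphased genotypes are identical before and after the crossover, which is why genotypes alone cannot tell us the phase.

```python
def crossover(hap1, hap2, point):
    """Swap everything after `point` between two haplotypes (single crossover)."""
    return hap1[:point] + hap2[point:], hap2[:point] + hap1[point:]


def unphased(hap1, hap2):
    """Genotype at each locus, ignoring which chromosome carries which allele."""
    return ["".join(sorted(pair)) for pair in zip(hap1, hap2)]


before = ("ABC", "abc")
after = crossover(*before, point=2)

print(after)               # ('ABc', 'abC')
print(unphased(*before))   # ['Aa', 'Bb', 'Cc']
print(unphased(*after))    # ['Aa', 'Bb', 'Cc'], same genotypes, different haplotypes
```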
In our case,
- we have no information about the chromosomes themselves: we have a "reference genome" that is made of a succession of contigs (parts of chromosomes).
- we have a triploid organism.
Thus making haplotypes is not possible yet, and not so relevant.
Furthermore, we are analyzing the data by looking at the probability that the genotype at each site is homozygous or heterozygous. We can keep the uncertainty (a probability between 0 and 1) or make it binary (0 or 1, plus NA for intermediate probability values). We are basically making haplotypes of 0s and 1s.
Example (N stands for NA): individual A is 011100N1010
individual B is 00N00001000
individual C is 01N00100100
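The binary coding described above could be sketched as follows. This is only an illustration under stated assumptions: the cutoffs `low=0.1` and `high=0.9` are arbitrary, the function name `binarize` is hypothetical, and the probabilities are made-up values chosen to reproduce the string of individual A, not real data.

```python
import math


def binarize(probs, low=0.1, high=0.9):
    """Turn per-site probabilities into a string of 0, 1 and N.

    low/high are assumed cutoffs; anything in between (or missing) becomes N.
    """
    out = []
    for p in probs:
        if p is None or math.isnan(p):
            out.append("N")   # missing value
        elif p <= low:
            out.append("0")
        elif p >= high:
            out.append("1")
        else:
            out.append("N")   # intermediate probability: too uncertain to call
    return "".join(out)


# Made-up probabilities, for illustration only
probs_a = [0.02, 0.97, 0.95, 0.99, 0.05, 0.01, 0.55, 0.93, 0.03, 0.98, 0.04]
print(binarize(probs_a))  # 011100N1010
```

Tightening the cutoffs gives more N sites but fewer wrong calls; loosening them does the opposite.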