Post date: Sep 04, 2013 9:04:39 PM
I have resumed work on combining Timema genome scaffolds into linkage groups. First I moved the draft genome assembly and the mapping family data to the dorc computer cluster (data/timema/draft_genome/ and data/timema/timema_mappingfams/). I have a series of scripts that generate maximum-likelihood estimates (MLEs) of the parent genotypes, subset these to the high-confidence calls (current results use Pr(MLE genotype = genotype) >= 0.8), and combine them with offspring heterozygote probabilities. Here are the script calls I ran back on 18vi13:
perl ../scripts/getParentGentoypes.pl 114 115 fam25266x25267.vcf
## identified 17384 loci, 114 and 115 are the parent index numbers
perl ../scripts/thresholdParentGenotypes.pl 0.8 parentGenotypes.txt
perl ../scripts/calculateOffspringGenotypeProbs.pl 114 115 sub_parentGenotypes.txt fam25266x25267.vcf
perl ../scripts/splitP0P1.pl offspringGenotypes.txt
This produced 9185 recombination-informative sites for P0 (in P0_offspringGenotypes.txt) and 8199 for P1. I ran the same set of scripts for the other families: 25268x25269 = 8199 (1568 for P0, 1998 for P1); 25270x25271 = 175 (115 for P0 and 60 for P1).
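Since the same four calls are repeated for every family, they could be wrapped in a small shell function. This is only a sketch: the script names, argument order, and intermediate file names are copied from the calls above, while the run_family and run names, the SCRIPTS variable, and the DRY_RUN switch are my own assumptions. It also assumes the scripts always write the same fixed output names, so each family would need its own working directory.

```shell
#!/bin/sh
# Sketch: run the four-step genotype pipeline for one mapping family.
# Assumptions (not from the original scripts): the helper names, the
# SCRIPTS variable, and the DRY_RUN switch.
set -eu

SCRIPTS="${SCRIPTS:-../scripts}"
# Default to a dry run so nothing is overwritten by accident;
# set DRY_RUN=0 in the environment to actually execute the calls.
DRY_RUN="${DRY_RUN:-1}"

# Print the command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Usage: run_family <parent0_index> <parent1_index> <family.vcf> [threshold]
run_family() {
    p0="$1"
    p1="$2"
    vcf="$3"
    thresh="${4:-0.8}"
    run perl "$SCRIPTS/getParentGentoypes.pl" "$p0" "$p1" "$vcf"
    run perl "$SCRIPTS/thresholdParentGenotypes.pl" "$thresh" parentGenotypes.txt
    run perl "$SCRIPTS/calculateOffspringGenotypeProbs.pl" "$p0" "$p1" sub_parentGenotypes.txt "$vcf"
    run perl "$SCRIPTS/splitP0P1.pl" offspringGenotypes.txt
}

# First family, parent index numbers 114 and 115 as in the calls above.
run_family 114 115 fam25266x25267.vcf
```

With the default dry run this only echoes the four commands, which is a cheap way to check the arguments before running a family for real.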
Now I am using these data to estimate recombination rates with the recomb software, which is now on both my computer and the dorc cluster. Right now I am running the recombination estimation analysis for fam25266x25267 on the dorc cluster (results in scratch). The outfiles give the scaffold number and position of each SNP pair, followed by Pr(data | Mendelian expectations) and the recombination rate from each Monte Carlo replicate.
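To get a first look at an outfile, the Monte Carlo replicates for each SNP pair could be collapsed to a mean recombination rate. The sketch below assumes a layout that I have not confirmed against the recomb output: whitespace-separated columns with scaffold and position for the first SNP, scaffold and position for the second SNP, then Pr(data | Mendelian expectations), then one recombination rate per replicate. The function name summarize_recomb is hypothetical, and the column offsets would need adjusting if the real layout differs.

```shell
#!/bin/sh
# Sketch: mean recombination rate per SNP pair across Monte Carlo replicates.
# ASSUMED column layout (not confirmed against the recomb outfiles):
#   1: scaffold of SNP 1   2: position of SNP 1
#   3: scaffold of SNP 2   4: position of SNP 2
#   5: Pr(data | Mendelian expectations)
#   6..NF: recombination rate from each Monte Carlo replicate

# Usage: summarize_recomb <outfile>
# Prints the five leading columns followed by the replicate mean (tab-separated).
summarize_recomb() {
    awk '
    NF > 5 {
        sum = 0
        for (i = 6; i <= NF; i++) sum += $i
        printf "%s\t%s\t%s\t%s\t%s\t%.4f\n", $1, $2, $3, $4, $5, sum / (NF - 5)
    }' "$1"
}

# Example call (file name is a placeholder):
# summarize_recomb recomb_outfile.txt > recomb_means.txt
```

Lines with no replicate columns are skipped rather than reported with a rate of zero, so a truncated outfile does not silently produce spurious estimates.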