Post date: Feb 04, 2017 7:43:13 PM
The key goal here is to compare the orientation and ordering of scaffolds on LG8 for the dark vs. green genome so that we can better understand the nature and boundaries of the green-dark inversion. I took a first shot at this using recombination from green vs. brown parents, but with everything aligned to the brown genome (details here). Now I plan to order and orient scaffolds in brown and green separately, using the appropriate genome and recombination rate estimates for each, and then to compare the two based on the mugsy comparative alignment.
1. Linkage map for the green genome (I only really care about LG8, but plan to generate the whole thing).
I copied the offspring genotype files to /uufs/chpc.utah.edu/common/home/u6000989/data/timema/timema_mappingfams/mapdata/lm2map_green and renamed them with family ids. I then combined them with a Perl script:
perl ../scripts/makeMultiFamLinkage.pl offspringGenotypesfamA.txt offspringGenotypesfamB.txt offspringGenotypesfamC.txt
This generates the combinedSnpList.txt file and the infile lm_tcrist.txt, which includes all of the individuals and 19,065 SNPs in total.
SNPs were then filtered using lepmap2:
java -cp ~/source/lepmap2/bin/ Filtering data=lm_tcrist.txt epsilon=0.01 dataTolerance=0.005 missingLimit=10 keepAlleles=1 > data_tcristlm.txt
***
Filtering markers based on segregation distortion
Removed 13107 markers from family famA
( from which 6656 due to missing parent genotypes, run ParentCall to fix the parents )
Removed 15096 markers from family famB
( from which 14059 due to missing parent genotypes, run ParentCall to fix the parents )
Removed 16954 markers from family famC
( from which 16017 due to missing parent genotypes, run ParentCall to fix the parents )
Maternally informative markers = 5640 (of 19065)
Paternally informative markers = 6098
Maternally or paternally informative markers = 11217
***
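As a quick sanity check on the filtering log above, the counts can be turned into per-family retained-marker numbers and an overlap estimate. This is only a sketch under two assumptions that the log does not state explicitly: that each "Removed" count is out of the 19,065 input SNPs for that family, and that "maternally or paternally informative" is the union of the two sets (possibly across different families), so the overlap follows from inclusion-exclusion.

```python
# Counts copied from the Lep-MAP2 Filtering log above.
TOTAL_SNPS = 19065
removed = {"famA": 13107, "famB": 15096, "famC": 16954}
removed_missing_parent = {"famA": 6656, "famB": 14059, "famC": 16017}
maternal, paternal, either = 5640, 6098, 11217

# Assumption: each "Removed" count is out of all TOTAL_SNPS markers.
retained = {fam: TOTAL_SNPS - n for fam, n in removed.items()}

# Assumption: "either" is the union of the maternal and paternal sets,
# so the overlap is maternal + paternal - either (inclusion-exclusion).
both = maternal + paternal - either

for fam in sorted(removed):
    frac = removed_missing_parent[fam] / removed[fam]
    print(f"{fam}: retained {retained[fam]} markers; "
          f"{frac:.0%} of removals were due to missing parent genotypes")
print(f"markers informative through both parents: {both}")
```

Under these assumptions, families B and C lose most of their markers to missing parent genotypes rather than to segregation distortion, which is what the log's ParentCall hint points at.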
Remove
SNPs were then grouped into linkage groups (LGs) using the same LOD limit (4) as for the dark morph genome and a similar minimum LG size (I had to drop it from 50 to 40).
java -cp ~/source/lepmap2/bin/ SeparateChromosomes data=data_tcristlm.txt lodLimit=4 sizeLimit=40 > lgs_tcrist.txt
java -cp ~/source/lepmap2/bin/ JoinSingles lgs_tcrist.txt data=data_tcristlm.txt lodLimit=3 lodDifference=2 > lgs_tcrist_js.txt
joined 676 single markers, non-conflicting LOD score > 3.490038550540774
Number of LGs = 12, markers in LGs = 8696, singles = 10369
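The summary line can be cross-checked directly from the map file. The sketch below assumes that the JoinSingles output has one LG number per marker line, with 0 marking an unassigned single and lines starting with # being comments. That layout is my assumption about lgs_tcrist_js.txt, and the function name summarize_lgs is made up for illustration.

```python
from collections import Counter

def summarize_lgs(lines):
    """Count markers per LG; LG 0 is treated as a 'single' (assumed convention)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        counts[int(line.split()[0])] += 1
    singles = counts.pop(0, 0)  # remove the unassigned markers from the LG tally
    return {
        "n_lgs": len(counts),
        "in_lgs": sum(counts.values()),
        "singles": singles,
        "per_lg": dict(counts),
    }

# Toy input for illustration only, not real data.
demo = ["#java SeparateChromosomes ...", "1", "1", "0", "2", "0", "1"]
summary = summarize_lgs(demo)
print(summary)
```

On the real file this would be called as summarize_lgs(open("lgs_tcrist_js.txt")). If the format assumption holds, in_lgs plus singles should equal the 19,065 input SNPs, which matches the reported 8696 + 10369.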
Assign (or unassign) entire scaffolds to LGs; first combine the map and scaffold data.
map<-read.table("lgs_tcrist_js.txt",header=FALSE)
snps<-read.table("combinedSnpList.txt",header=FALSE)
o<-cbind(map,snps)
colnames(o)<-c("flg","scaf","pos")
write.table(o,"tcrLgs.txt",row.names=FALSE,col.names=TRUE,quote=FALSE)
Run the Perl script, which uses these key parameters:
## key params ##
## decide on LG for scaf
## rules:
## $nsnp SNPs supporting LG with $prct of SNPs in agreement
## $min SNPs total
## $alt is maximum % for an alternative LG
$prct = 0.1;
$nsnp = 2;
$min = $nsnp;
$alt = 0.5;
#############
perl assignScafFilter.pl tcrLgs.txt
assigned LGs for 352 scaffolds (92.5% of SNPs), failed to assign LGs for 434 scaffolds
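The assignment rules in the parameter comments can be sketched as below. assignScafFilter.pl itself is not shown here, so this is one possible reading of those comments, not the script's actual logic. In particular I assume that SNPs with LG 0 are unassigned singles, that all fractions are taken over every SNP on the scaffold, and that the "alternative LG" is the second-best supported one. The function name and the toy records are hypothetical.

```python
from collections import Counter, defaultdict

def assign_scaffolds(records, prct=0.1, nsnp=2, min_total=2, alt=0.5):
    """records: iterable of (lg, scaffold, pos); lg 0 means an unassigned SNP.

    Defaults mirror $prct, $nsnp, $min and $alt above.
    Returns {scaffold: lg, or None if no LG could be assigned}.
    """
    by_scaf = defaultdict(Counter)
    for lg, scaf, _pos in records:
        by_scaf[scaf][lg] += 1

    assigned = {}
    for scaf, counts in by_scaf.items():
        total = sum(counts.values())
        # Rank candidate LGs by support, ignoring unassigned SNPs (LG 0).
        ranked = Counter({lg: n for lg, n in counts.items() if lg != 0}).most_common()
        assigned[scaf] = None
        if total < min_total or not ranked:
            continue
        best_lg, best_n = ranked[0]
        alt_n = ranked[1][1] if len(ranked) > 1 else 0
        if best_n >= nsnp and best_n / total >= prct and alt_n / total <= alt:
            assigned[scaf] = best_lg
    return assigned

# Toy records for illustration only, not real data.
records = [
    (8, "scafA", 100), (8, "scafA", 250), (0, "scafA", 900),
    (3, "scafB", 10),
    (0, "scafC", 5), (0, "scafC", 50),
    (2, "scafD", 1), (2, "scafD", 2), (2, "scafD", 3), (8, "scafD", 4), (8, "scafD", 5),
]
result = assign_scaffolds(records)
strict = assign_scaffolds(records, alt=0.3)  # alt=0.3 is a hypothetical stricter value
print(result)
print(strict)
```

Under this reading the settings above are permissive: a runner-up LG can never hold more than half of a scaffold's SNPs, so alt = 0.5 rarely rejects anything, whereas a lower value such as the hypothetical 0.3 would unassign scaffolds with mixed support.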
tail -n +2 mod_tcrLgs.txt | cut -f 1 -d " " > mod_tcrLgsSnp.txt
Run OrderMarkers, one chromosome at a time. This is done three times: once with all three families, and once for each parent using family A only.
perl wrap_qsub_slurm_orderScafs.pl
cd /uufs/chpc.utah.edu/common/home/u6000989/data/timema/timema_mappingfams/mapdata/lm2map_green/
java -cp ~/source/lepmap2/bin/ OrderMarkers map=mod_tcrLgsSnp.txt data=data_tcristlm.txt minError=0.01 chromosome=12 numThreads=4 initRecombination=0.05 0.05 learnRecombinationParameters=1 1 > order_LG12tcr.txt
perl wrap_qsub_slurm_orderScafsBrown.pl
cd /uufs/chpc.utah.edu/common/home/u6000989/data/timema/timema_mappingfams/mapdata/lm2map_green/
java -cp ~/source/lepmap2/bin/ OrderMarkers map=mod_tcrLgsSnp.txt data=data_tcristlm.txt minError=0.01 chromosome=12 numThreads=4 informativeMask=2 families=famA initRecombination=0.05 0.05 learnRecombinationParameters=1 1 > order2_LGbrwn12tcr.txt
perl wrap_qsub_slurm_orderScafsGreen.pl
cd /uufs/chpc.utah.edu/common/home/u6000989/data/timema/timema_mappingfams/mapdata/lm2map_green/
java -cp ~/source/lepmap2/bin/ OrderMarkers map=mod_tcrLgsSnp.txt data=data_tcristlm.txt minError=0.01 chromosome=12 numThreads=4 informativeMask=1 families=famA initRecombination=0.05 0.05 learnRecombinationParameters=1 1 > order_LGgrn12tcr.txt
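The three wrapper scripts are not shown, so the following is only a sketch of the command lines they presumably submit, assuming each one loops over LGs 1 to 12 with the options shown above for chromosome 12. The run labels, the function name and the looping itself are my assumptions; the options and output-name patterns are copied from the three commands above. The code only builds the command strings and does not run or submit anything.

```python
LEPMAP_BIN = "~/source/lepmap2/bin/"
N_LGS = 12  # from "Number of LGs = 12" in the JoinSingles output

# For each run: (extra OrderMarkers options, output-name pattern),
# both taken from the chromosome-12 examples above.
RUNS = {
    "all_families": ("", "order_LG{lg}tcr.txt"),
    "brown": ("informativeMask=2 families=famA", "order2_LGbrwn{lg}tcr.txt"),
    "green": ("informativeMask=1 families=famA", "order_LGgrn{lg}tcr.txt"),
}

def order_markers_cmd(lg, extra, out_pattern):
    """Build one OrderMarkers command line for linkage group `lg`."""
    parts = [
        f"java -cp {LEPMAP_BIN} OrderMarkers",
        "map=mod_tcrLgsSnp.txt data=data_tcristlm.txt",
        f"minError=0.01 chromosome={lg} numThreads=4",
        extra,
        "initRecombination=0.05 0.05 learnRecombinationParameters=1 1",
    ]
    cmd = " ".join(p for p in parts if p)  # drop the empty 'extra' for the all-family run
    return f"{cmd} > {out_pattern.format(lg=lg)}"

commands = [
    order_markers_cmd(lg, extra, pattern)
    for extra, pattern in RUNS.values()
    for lg in range(1, N_LGS + 1)
]
print(len(commands))
print(commands[-1])
```

For LG8, the linkage group of interest here, the relevant entries are the ones with chromosome=8 in each of the three runs.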