Post date: Sep 07, 2018 9:0:30 PM
I ran a comparative alignment of our current Dovetail (Chicago and Hi-C) L. melissa genome and our older L. melissa genome, which was used for the (unpublished) linkage map.
Initially, I tried to use mugsy for the comparative alignment, but nothing aligned. I suspect this is because the actual sequence data that went into both assemblies are identical, which may be triggering some odd error. Instead, I used mummer directly (the basis for mugsy) to simply identify MUMs (maximal unique matches) between the genomes.
From /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/dovetail_melissa_genome/mummer_aln, I ran:
~/source/mugsy_x86-64-v1r2.3/MUMmer3.20/mummer -mum -b /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/dovetail_melissa_genome/download/HiC_HiCRise_GLtS4/melissa_blue_21Nov2017_GLtS4/mod_melissa_blue_21Nov2017_GLtS4.fasta /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/melissa_genome/Lmelissa1FinalAssembly.fasta > mums.txt
This generates mums.txt, which contains the matches. For each scaffold in the old genome (header lines beginning with >), it lists all of the matches, one per line: the matching scaffold in the new genome, followed by the start position in each genome and the length of the match (last column) (see the manual here).
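As a rough illustration of that layout, here is a minimal Python sketch of a reader for mums.txt. It is not the script used for this analysis. It assumes that each match line has four whitespace-separated columns (new-genome scaffold, start in the new genome, start in the old genome, match length), and that, because of -b, reverse-complement matches sit under a header ending in "Reverse". The names Mum and parse_mums are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Mum(NamedTuple):
    old_scaffold: str  # query sequence, taken from the ">" header line
    new_scaffold: str  # reference sequence named on the match line
    new_start: int     # start position in the new genome (assumed column 2)
    old_start: int     # start position in the old genome (assumed column 3)
    length: int        # match length (last column)
    reverse: bool      # True if listed under a "> name Reverse" header


def parse_mums(lines: Iterable[str]) -> Iterator[Mum]:
    """Yield one Mum per match line; layout is assumed as described above."""
    old_scaffold = None
    reverse = False
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith(">"):
            # Header may be "> name", ">name", or "> name Reverse".
            if fields[0] == ">":
                name_fields = fields[1:]
            else:
                name_fields = [fields[0][1:]] + fields[1:]
            if not name_fields:
                old_scaffold = None
                continue
            old_scaffold = name_fields[0]
            reverse = len(name_fields) > 1 and name_fields[-1] == "Reverse"
            continue
        if old_scaffold is None or len(fields) != 4:
            # Skip anything that does not look like a four-column match line.
            continue
        new_scaffold, new_start, old_start, length = fields
        yield Mum(old_scaffold, new_scaffold, int(new_start),
                  int(old_start), int(length), reverse)
```

To use it on the real output, one would open mums.txt and pass the file handle straight in, e.g. `for mum in parse_mums(open("mums.txt")): ...`.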
I then ran my own script, extractMatches.pl, to match new scaffolds to old linkage groups (LGs). Here is what I have so far:
NewScaf,LG,OneOfSam's
1628,1,*
11,2,*
1646,3,*
1648,4,*
1636,5,*
228,6,*
1642,7,*
1645,8,*
1641,9,*
1639,10,*
4,11,*
833,12,*
1632,13,*
1644,14?,*
1640,15,*
503,16,*
309,17,*
1647,18,*
1638,19,*
588,20?,*
1095(1633),21?,*=1095
1627,22,*
1631,Z,*
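For readers who want to try a similar summary, the following is one possible way to assign new scaffolds to LGs. It is a sketch only and not necessarily what extractMatches.pl does, since that script is not shown here. It assumes you already have (old scaffold, new scaffold, match length) tuples from mums.txt and a lookup from old scaffolds to LGs taken from the linkage map. The function name and both inputs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def best_new_scaffold_per_lg(
    matches: Iterable[Tuple[str, str, int]],
    old_to_lg: Dict[str, str],
) -> Dict[str, Tuple[str, int]]:
    """For each LG, return the new scaffold with the most matched bases.

    matches   : (old_scaffold, new_scaffold, match_length) tuples
    old_to_lg : hypothetical lookup, old scaffold name -> linkage group
    """
    # totals[lg][new_scaffold] = summed length of all matches
    totals = defaultdict(lambda: defaultdict(int))
    for old_scaffold, new_scaffold, length in matches:
        lg = old_to_lg.get(old_scaffold)
        if lg is None:
            # Old scaffold was not placed on the linkage map; ignore it.
            continue
        totals[lg][new_scaffold] += length

    best = {}
    for lg, per_scaffold in totals.items():
        # Pick the new scaffold with the largest summed match length.
        scaffold, bases = max(per_scaffold.items(), key=lambda kv: kv[1])
        best[lg] = (scaffold, bases)
    return best
```

Summing match lengths is just one reasonable scoring choice. Counting matches, or comparing the top two totals to flag close calls, would work along the same lines.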