Post date: Nov 05, 2014 4:22:43 PM
I used nucmer to generate synteny maps for L. anna vs. L. melissa (old), L. anna vs. L. sierra, and L. melissa (old) vs. L. sierra. These are the key comparisons for the ancestry tract length project.
I then filtered all of the delta files to retain only one-to-one mappings:
~/Source/MUMmer3.23/delta-filter -1 mumLanna_Lmel1k.delta > one2one_mumLanna_Lmel1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLmel1_Lwarner1k.delta > one2one_mumLmel1_Lwarner1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLmel_Lsierra1k.delta > one2one_mumLmel_Lsierra1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLanna_Lsierra1k.delta > one2one_mumLanna_Lsierra1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLmelissa_Lwarner1k.delta > one2one_mumLmelissa_Lwarner1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLsierra_Lwarner1k.delta > one2one_mumLsierra_Lwarner1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLanna_Lwarner1k.delta > one2one_mumLanna_Lwarner1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLmel_Lmel1k.delta > one2one_mumLmel_Lmel1k.delta
~/Source/MUMmer3.23/delta-filter -1 mumLsierra_Lwarner.delta > one2one_mumLsierra_Lwarner.delta
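Because each output name above is typed by hand, it is easy for an output to end up with a name that does not match its input. The same filtering step could be written as a loop that derives the output name from the input name. This is only a sketch: it assumes all nucmer delta files sit in the current directory and match `mum*.delta`, and the `DELTA_FILTER` variable and `one2one_name` helper are names introduced here for illustration, not part of MUMmer.

```shell
#!/usr/bin/env bash
# Path to delta-filter as used above; override via the environment if needed.
DELTA_FILTER="${DELTA_FILTER:-$HOME/Source/MUMmer3.23/delta-filter}"

# Hypothetical helper: build the output name by prefixing the input file name.
one2one_name() {
    printf 'one2one_%s\n' "$(basename "$1")"
}

# Apply the one-to-one filter (-1) to every delta file in this directory.
for delta in mum*.delta; do
    [ -e "$delta" ] || continue   # skip if the glob matched nothing
    "$DELTA_FILTER" -1 "$delta" > "$(one2one_name "$delta")"
done
```

With this approach an input and its filtered output always share the same base name, so the later mummerplot calls can rely on the `one2one_` prefix.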
Then I generated dot plots:
~/Source/MUMmer3.23/mummerplot one2one_mumLanna_Lmel1k.delta -R ../../melissa_genome/final.assembly.fasta -Q ../Lanna/DATA/RUN/ASSEMBLIES/assem18sept14/final.assembly.fasta --filter --layout -p plot_one2one_mumLanna_Lmel1k.delta
~/Source/MUMmer3.23/mummerplot one2one_mumLmel1_Lwarner1k.delta -R ../Lwarner/DATA/RUN/ASSEMBLIES/assem12oct14/final.assembly.fasta -Q ../../melissa_genome/final.assembly.fasta --filter --layout -p plot_one2one_mumLmel1_Lwarner1k.delta
~/Source/MUMmer3.23/mummerplot one2one_mumLmel_Lsierra1k.delta -R ../Lsierra/DATA/RUN/ASSEMBLIES/assem15sept14/final.assembly.fasta -Q ../../melissa_genome/final.assembly.fasta --filter --layout -p plot_one2one_mumLmel_Lsierra1k.delta
~/Source/MUMmer3.23/mummerplot one2one_mumLanna_Lsierra1k.delta -R ../Lsierra/DATA/RUN/ASSEMBLIES/assem15sept14/final.assembly.fasta -Q ../Lanna/DATA/RUN/ASSEMBLIES/assem18sept14/final.assembly.fasta --filter --layout -p plot_one2one_mumLanna_Lsierra1k.delta
~/Source/MUMmer3.23/mummerplot one2one_mumLanna_Lwarner1k.delta -R ../Lwarner/DATA/RUN/ASSEMBLIES/assem12oct14/final.assembly.fasta -Q ../Lanna/DATA/RUN/ASSEMBLIES/assem18sept14/final.assembly.fasta --filter --layout -p plot_one2one_mumLanna_Lwarner1k.delta
Here are my initial impressions based on the dot plots.
The genome assemblies contain mostly the same sequence (i.e., most scaffolds have at least one alignment on the dot plots). Better assemblies appear to include what is in the worse assemblies, plus some more. Assemblies of similar quality have a few unique scaffolds, but not too many (probably fewer than 15 to 20%).
There are a few inversions (blue on the plots), but not very many. We mostly have conserved order.
There are apparent duplications, insertions and deletions, but you have to zoom in on the plots to see these. So, I don't have a great feel for how common these are.
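These impressions could be backed with rough numbers from the filtered delta files. The sketch below defines two small helpers that read tab-delimited output from MUMmer's `show-coords -T -H`. It assumes the default column layout for that mode (S1, E1, S2, E2, LEN1, LEN2, %IDY, reference name, query name) and that reverse-strand matches are reported with S2 greater than E2. The function names `strand_summary` and `scaffolds_hit` are made up for this example. Note that a reverse-strand alignment is only a proxy for an inversion, since a scaffold that was simply assembled in the opposite orientation is counted too.

```shell
#!/usr/bin/env bash

# Count forward vs. reverse alignments and sum their reference lengths.
# Reads `show-coords -T -H` output on stdin (assumed columns described above).
strand_summary() {
    awk -F'\t' '
        { if (($3 + 0) > ($4 + 0)) { rev++; revlen += $5 }
          else                     { fwd++; fwdlen += $5 } }
        END { printf "forward\t%d\t%d\nreverse\t%d\t%d\n", fwd, fwdlen, rev, revlen }
    '
}

# Report how many sequences in a FASTA file appear in at least one alignment.
# $1: FASTA file; $2: column holding the sequence name (8 = reference,
# 9 = query under the assumed layout). Coordinates are read from stdin.
scaffolds_hit() {
    total=$(grep -c '^>' "$1")
    hit=$(cut -f "$2" | sort -u | awk 'END { print NR }')
    echo "$hit of $total scaffolds have at least one alignment"
}
```

For example, `~/Source/MUMmer3.23/show-coords -T -H one2one_mumLanna_Lmel1k.delta | strand_summary` would print one line per strand with the alignment count and total aligned length, and piping the same output into `scaffolds_hit` with the matching assembly FASTA and column number would give the share of scaffolds that appear on the plot at all.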
The L. idas nucmer run seems to have stalled. I need to try this again to see if the L. idas genome (the biggest assembly) has different properties.
P.S. The L. idas x L. melissa plot looks more or less like the others, but the L. idas x L. warner plot has a large X like the one I initially saw on other plots made with less stringent (?) options. However, the second diagonal (which makes the X) is much less dense with points. Not sure what to think.