Post date: Mar 02, 2017 6:11:22 PM
We tried to formalize the boundaries of the color locus and stripe locus (inversion) based on the Dovetail genomes and mapping results. Here is what we know and how we know it.
1. Mapping results give the following order for scaffolds on the brown genome:
2938 2568 1283 1989 693 2683 843 2560 2963 702.1 128 1845 1465 2625 3603
My focus is on 2963->702.1->128->1845.
2. I used correlations between SNP physical position and cM to orient these scaffolds. A positive correlation defines a 'forward' orientation, whereas a negative correlation defines a 'reverse' orientation. Here are the results:
2963 = rev
702.1 = ? (non-significant and noisy)
128 = for
1845 = for
However, because green scaffold 1575 spans (connects) 2963 to one arm of 702.1, we know that 702.1 is in reverse orientation. So we really have:
2963 (rev) -> 702.1 (rev) -> 128 (for) -> 1845 (for)
This means that 2963 and 702.1 should be reversed in their orientation for plotting results.
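The orientation calls in step 2 can be sketched roughly as follows. This is a minimal illustration, not the script actually used: the function names (pearson_r, call_orientation, flip_position) are hypothetical, the min_abs_r cutoff is an assumed stand-in for whatever significance test was applied, and flip_position assumes 1-based coordinates.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no orientation signal
    return sxy / sqrt(sxx * syy)


def call_orientation(bp, cm, min_abs_r=0.5):
    """Return 'for', 'rev', or '?' for one scaffold.

    bp: SNP physical positions on the scaffold
    cm: map positions (cM) of the same SNPs
    min_abs_r: assumed cutoff below which the call is left ambiguous
    """
    r = pearson_r(bp, cm)
    if abs(r) < min_abs_r:
        return "?"
    return "for" if r > 0 else "rev"


def flip_position(pos, scaffold_length):
    """Convert a 1-based position to its coordinate on the reversed scaffold."""
    return scaffold_length - pos + 1
```

Scaffolds called 'rev' (here 2963 and 702.1) would have their SNP positions passed through flip_position before plotting, so that position increases with cM along the whole linkage group.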
3. Next, I matched green to brown scaffolds and tried to orient the green scaffolds as I did the brown ones (based on the correlation, positive or negative, between physical and cM position within scaffolds, but for the green parent). Here is what we have:
Once flipped, brown 2963 aligns first to green 4792 (for) and then to 1575 (rev). This covers all (essentially) of 2963.
1575 (rev) continues on across 14 to 4 Mbp on brown 702.1 (which has been flipped, hence starting at 14 Mbp). This is followed by many small scaffolds with ambiguous orientation and many smaller inversions. The most substantial of these are given below with start and stop locations in Mbp (these are green scaffold numbers):
353 (0.1,0.1) (?)
460 (0.2,0.4) (r)
2758 (0.4,0.4) (?)
1919 (0.4,0.5) (f)
842 (0.5,0.7) (r)
3655 (0.8,0.9) (r?)
2860 (0.9,1.0) (r)
2942 (1.0,1.1) (?)
4004 (1.1,1.2) (f?)
498 (1.2,1.3) (?)
505 (1.3,1.5) (f)
5137 (1.5,1.6) (?)
2647 (1.8,1.9) (?)
3746 (2.6,3.5) (f)
3955 (3.3,3.4) (f,r)
1643 (3.4,3.6) (r)
2456 (3.6,4.1) (r)
The same pattern picks up on scaffold 128, with many green scaffolds with ambiguous and alternating orientations. This continues to about 5 Mbp on 128. We then have nothing until 6.4 Mbp (that is, nothing in the indel). After that we have a single green scaffold that spans the rest of brown 128 (from 6.4 to 15.8 Mbp). It is in forward orientation and colinear with the rest of 128.
Finally, brown scaffold 1845 coincides perfectly with, and is colinear with, green 1428 (rev).
Here are a few notes on the order and orientation of the green scaffolds in general, but this isn't very useful:
## green order, lg 5 = lg 8
6 1083 1104 3982 4995 5647 1774 1501 827 5643 998 2106 4792 3030 4380 1575 1643 505 1277 1004 1428 1384 5135 2154 130 3746 842 4214 1014 460 552 2456 292 3955 3655
## focus on big matching ones: 1428 130 4214 1575
1428 = rev (?)
130 = for (?)
4214 = for
1575 = for
4. So, what does it mean? My first thought was that we had nailed the inversion, given the switches in orientation on 702.1 and 128 near the big scaffolds. But as I looked at this more, I realized that the whole inversion was riddled with apparent structural variation. It looked like the whole region had been smashed into many pieces and put back together haphazardly. I now suspect that our green bug was a green/dark heterozygote, making this region a nightmare to assemble (because of the inversion). Thus, I think we can define the inversion boundaries and can be almost certain that it is an inversion, but the picture isn't quite as clear as it could be because of the heterozygote.
5. Based on this, we will define two loci.
mel is the color locus responsible for green vs. brown. Based on the GWA results and the comparative mapping, we will define it as the region between the point where the alignment between green 1575 and brown 702.1 ends (left edge) and the indel on 128 (right edge). Specifically, this is:
702.1 4,139,489 to 1 and 128 1 to 6,414,835
stripy is the stripe locus. We know a bit less about it now, but the GWA signal is strong on 128 outside of mel. We thus define stripy as all of brown scaffold 128 not part of mel. That is,
128 6,414,836 to 15,784,042
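For downstream work (for example, labelling GWA SNPs by locus), the two definitions above can be written down as a small lookup. This is only a sketch: LOCI and locus_of are hypothetical names, scaffold IDs are treated as strings, and the intervals are assumed to be 1-based, inclusive, and in the original (unflipped) coordinates of each brown scaffold.

```python
# Locus boundaries as given above, stored as (scaffold, start, end).
# Assumption: 1-based inclusive positions in each scaffold's original orientation,
# so "702.1 4,139,489 to 1" is stored as the interval 1..4,139,489.
LOCI = {
    "mel": [("702.1", 1, 4_139_489), ("128", 1, 6_414_835)],
    "stripy": [("128", 6_414_836, 15_784_042)],
}


def locus_of(scaffold, pos):
    """Return 'mel', 'stripy', or None for a SNP on a brown scaffold."""
    for name, intervals in LOCI.items():
        for scaf, start, end in intervals:
            if scaffold == scaf and start <= pos <= end:
                return name
    return None
```

A SNP at position 6,414,835 on scaffold 128 falls in mel, while the next base (6,414,836) is the first position of stripy; positions on 702.1 beyond 4,139,489 belong to neither locus.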
Here is a cartoon figure of the region and loci.