Post date: Mar 30, 2017 2:40:21 AM
Sam gave me a list of the 58 top parallel SNPs from the BayPass L. melissa analysis. This list and the full set of 206,028 SNPs are linked in /uufs/chpc.utah.edu/common/home/u6000989/projects/lmelissa_hostAdaptation/lginfo/ as top58snps_scafPosBayesfactors and bfmeansWithScafPosition_all, respectively. The linkage map is linked in the same directory.
I ran getLgCount.pl on both files to create new files (prefixed lg_) in which the LG number (or NA) is prepended to each SNP. I then summarized these in tables with cut, sort, and uniq. Here are the counts by LG:
cut -f 1 -d " " lg_bfmeansWithScafPosition_all | sort | uniq -c
7880 1
4128 10
5382 11
4670 12
3750 13
3438 14
3638 15
4447 16
3800 17
3083 18
2663 19
7470 2
2506 20
1304 21
950 22
8636 23
5973 3
5810 4
5678 5
5978 6
5111 7
5275 8
4239 9
100219 NA
cut -f 1 -d " " lg_top58snps_scafPosBayesfactors | sort | uniq -c
3 1
3 10
2 11
1 15
1 17
4 18
5 2
1 21
6 23
1 3
1 4
2 5
1 6
1 8
1 9
25 NA
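Based on this, 8636/206028 (0.04191663) of all SNPs are on LG 23 (the Z), whereas 6/58 (0.1034483) of the top 58 are on the Z. This is a significant, roughly 2.5-fold enrichment (note that dbinom gives the probability of exactly 6 of 58, not a tail probability):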
p<-8636/206028 # proportion of all SNPs on LG 23 (the Z)
dbinom(x=6,size=58,prob=p)
#[1] 0.0236854
0.1034483/p
#[1] 2.467953
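The dbinom call above gives the probability of exactly 6 of 58 top SNPs falling on the Z. A stricter check is the upper-tail probability P(X >= 6). What follows is a minimal shell sketch, not part of the original analysis, that recomputes p, the fold enrichment, and that tail probability with awk from the counts reported above. The function name binom_tail is a hypothetical helper introduced here for illustration.

```shell
# binom_tail X N P: upper-tail binomial probability P(X' >= X) for N trials
# with success probability P (hypothetical helper, not from the original notes).
binom_tail() {
  awk -v x="$1" -v n="$2" -v p="$3" 'BEGIN {
    q = 1 - p
    term = exp(n * log(q))                 # P(X = 0)
    tail = 0
    for (k = 0; k <= n; k++) {
      if (k >= x) tail += term             # accumulate P(X = k) for k >= x
      term *= (n - k) / (k + 1) * p / q    # step from P(X = k) to P(X = k+1)
    }
    printf "%.6f\n", tail
  }'
}

# Counts taken from the tables above: 8636 of 206028 SNPs on LG 23, 6 of 58 top SNPs.
p=$(awk 'BEGIN { printf "%.8f", 8636 / 206028 }')
fold=$(awk -v p="$p" 'BEGIN { printf "%.3f", (6 / 58) / p }')
tail_prob=$(binom_tail 6 58 "$p")
echo "p=$p fold=$fold tail=$tail_prob"
```

The tail probability sums the point probabilities from 6 through 58, so it is always at least as large as the dbinom value and is the number to compare against a 0.05 threshold.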
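Here is code for a plot of the number of parallel SNPs against the total number of SNPs per LG. LGs with a binomial probability below 0.05 are labeled in red. We actually see an excess of SNPs linked to LG 2, LG 18, and the Z.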
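# cnt (built beforehand): col 1 = no. of top-58 SNPs, col 2 = no. of all SNPs, col 3 = LG label
# rows follow the sort order of the tables above (row 16 = LG 23, NA last)
pdf("sexLinkage.pdf",width=5.5,height=5.5)
par(mar=c(5.5,5.5,0.5,0.5))
plot(cnt[1:23,2],cnt[1:23,1],type='n',xlab="No. of SNPs",ylab="No. parallel SNPs",cex.lab=1.4,cex.axis=1.1)
cs<-rep("darkgray",23)
# red = LGs whose top-SNP count has binomial probability < 0.05 given that LG's own share of all SNPs
cs[which(dbinom(x=cnt[1:23,1],size=58,prob=cnt[1:23,2]/206028) <0.05)]<-"red"
cnt[16,3]<-"Z" # label LG 23 as the Z
text(cnt[1:23,2],cnt[1:23,1],cnt[1:23,3],col=cs)
o<-lm(cnt[1:23,1]~cnt[1:23,2]) # least-squares line through the 23 LGs
abline(o$coefficients)
dev.off()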
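The plotting code reads from a table cnt that these notes do not construct. One way to assemble it from the two uniq -c tables is sketched below. This is an assumption about the layout (parallel count, total count, LG label, with rows in the order of the full-set table), inferred from how the plot indexes cnt. The function name merge_counts and the file names in the usage comment are hypothetical.

```shell
# merge_counts TOP ALL: both inputs are `sort | uniq -c` output ("count LG").
# Prints "parallel_count total_count LG" for every LG in ALL, in ALL's order,
# with 0 where an LG has no SNP in TOP. Assumes TOP is not empty.
merge_counts() {
  awk 'NR == FNR { top[$2] = $1; next }      # first file: remember top counts by LG
       { n = ($2 in top) ? top[$2] : 0       # second file: look up, default to 0
         print n, $1, $2 }' "$1" "$2"
}

# Hypothetical usage (file names are placeholders):
# cut -f 1 -d " " lg_top58snps_scafPosBayesfactors | sort | uniq -c > top_counts.txt
# cut -f 1 -d " " lg_bfmeansWithScafPosition_all | sort | uniq -c > all_counts.txt
# merge_counts top_counts.txt all_counts.txt > cnt.txt
```

Because the rows keep the lexical sort order of the full-set table, LG 23 lands in row 16 and NA in row 24, which matches the cnt[16,3] and cnt[1:23, ] indexing used in the plot.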