Post date: Nov 23, 2015 5:19:39 AM
I tested for elevated LD between relative to within morphs for the elevated Fst region in FHA and N1 as defined specifically by the HMM. Patrik, we discussed connecting the two HMM regions and I think that makes sense for other analyses, but here it seemed better just to use the HMM results directly as both LD and HMM apply to the same 20kb windows (i.e., connecting the two regions seems like an unnecessary complication). Specifically, for FHA and N1 I calculated:
mean(D0_M.GS - (D0_M + D0_G.S)/2) for the subset of windows assigned to the high HMM state; note that my notation follows Doro's column headings, and that for now I have only considered D0 and have focused on melanistic vs. green (striped or unstriped). I then ran a permutation test that essentially slides the HMM regions around (while keeping the structure of having two regions) to ask whether mean(D0_M.GS - (D0_M + D0_G.S)/2) for the actual high HMM regions was greater than expected for random regions on LG8. Here are the results:
FHA: obs = 0.0591, null = 0.00723, 95th quantile = 0.0390, p = 0.012
N1: obs = 0.0311, null = 0.00236, 95th quantile = 0.0251, p = 0.020
Thus, we have evidence of elevated LD between vs. within morphs in the high Fst HMM regions. Given Doro's results based directly on Fst (not the HMM) this is not surprising, but it is nice and clean. I can of course run additional contrasts etc., but I wanted to keep things simple as a first pass and focus on what I think we are most interested in.
Here is the R code from /labs/evolution/projects/timema_speciation/tcristinae/fstdoro/testLd.R
load("hmm2s_FHA_RxGISpar7.rwsp")
fstfha<-fit[[1]]
ldfha<-read.table("BurrowD_binnedSNPs_morphs_FHA.txt",header=TRUE)
L<-dim(ldfha)[1]
fstld<-rep(NA,L)
for(i in 1:L){
x<-which(nei[,1]==8 & nei[,2]==ldfha[i,2] & nei[,4]==ldfha[i,4])
if(length(x)==1){
fstld[i]<-fstfha[x]
}
}
x<-which(fstld==1)
obs<-mean(ldfha$D0_M.GS[x]-(ldfha$D0_M[x] + ldfha$D0_G.S[x])/2,na.rm=TRUE)
null<-rep(NA,1000)
for(i in 1:1000){
st<-sample(1:L,1)
y<-c(fstld[st:L],fstld[-c(st:L)])
x<-which(y==1)
null[i]<-mean(ldfha$D0_M.GS[x]-(ldfha$D0_M[x] + ldfha$D0_G.S[x])/2,na.rm=TRUE)
}
load("hmm2s_N1_RxGISpar12.rwsp")
fstN1<-fit[[1]]
ldN1<-read.table("BurrowD_binnedSNPs_morphs_N1.txt",header=TRUE)
L<-dim(ldN1)[1]
fstld<-rep(NA,L)
for(i in 1:L){
x<-which(nei[,1]==8 & nei[,2]==ldfha[i,2] & nei[,4]==ldN1[i,4])
if(length(x)==1){
fstld[i]<-fstN1[x]
}
}
x<-which(fstld==1)
obs<-mean(ldN1$D0_M.GS[x]-(ldN1$D0_M[x] + ldN1$D0_G.S[x])/2,na.rm=TRUE)
null<-rep(NA,1000)
for(i in 1:1000){
st<-sample(1:L,1)
y<-c(fstld[st:L],fstld[-c(st:L)])
x<-which(y==1)
null[i]<-mean(ldN1$D0_M.GS[x]-(ldN1$D0_M[x] + ldN1$D0_G.S[x])/2,na.rm=TRUE)
}