association between Fst and GWA effects, whole genomes

Post date: Jan 09, 2015 8:29:9 PM

Several weeks ago I started testing for an association between Fst (between different hosts) and the difference in model averaged effects. The input files I made are in /home/A01963476/projects/timema_wgexperiment/fstAnalyses/. We had found such correlations (though very weak) in the GBS data, and interestingly the strength of the correlation was associated with gene flow between pops. Here is the summary I sent Patrik:

Only about 10% of the SNPs are shared between the wild and experimental populations, meaning our correlations between Fst and pips or betas (raw or model-averaged) are based on about 370,000 SNPs. While plenty of SNPs, I was a bit surprised that the overlap wasn't higher. This argues strongly for calling variants in a combined analyses when making comparisons across data sets (it might be worth doing here?).

At a first pass, we effectively have no correlations between the Fsts (including for the allopatric population pair) and beta, model-averaged betas, pips, or the absolute difference in model-averaged betas. So, not much here.

But, perhaps one shouldn't expect an overall correlation, rather we should just expect a correlation for those SNPs with the greatest model-averaged effects. Now I will test for that.

Still nothing. No correlations between Fst (any population pair) and the difference in betas or model-averaged betas between treatments. Nor are their correlations for the top 1%, 0.1% or 0.01% of loci (based on betas or model-averaged betas). Nor are the same loci found disproporinately in the top 1% (or 0.1 or 0.01%) for mean Fst (or Fst for any population pair) and betas (or model-averaged betas). This one is done (with the caveat that this might be worth revisiting with the set of SNPs called together)! Here is the R code.

## testing for an association between Fst and model-averaged effects

nl<-371457

np<-14

## read in data

## lg ord scaf pos fstHV fstMR1 fstR12 fstLaPrc betaA mbA pipA betaA mbA pipA

parm<-matrix(scan("genomeFstPimass.txt",n=nl*np,sep=" "),nrow=nl,ncol=np,byrow=T)

## dif in betas

dmb<-abs(parm[,10]-parm[,13])

## no correlation between dmb and fst for any population when considering all loci

## top loci

q<-quantile(dmb,probs=c(0.99,0.999,0.9999),na.rm=T) ## top 1%, 0.1% 0.01%

## top 1%

x<-which(dmb >= q[1]) ## 3330 loci

cor.test(dmb[x],parm[x,5]) ## no cor. for any

## top 0.1%

x<-which(dmb >= q[2]) ## 333 loci

cor.test(dmb[x],parm[x,6])

#t = -2.0326, df = 331, p-value = 0.04289

#alternative hypothesis: true correlation is not equal to 0

#95 percent confidence interval:

# -0.215932459 -0.003600727

#sample estimates:

# cor

#-0.1110335

## top 0.01%

x<-which(dmb >= q[2]) ## 34 loci

cor.test(dmb[x],parm[x,5])## none

## is their an enrichment for top loci in both sets

mnfst<-apply(parm[,5:8],1,mean,na.rm=T) ## mean fst

qf<-quantile(mnfst,probs=c(0.99,0.999,0.9999),na.rm=T)

## top 1% in each

x<-which(dmb >= q[1] & mnfst >= qf[1])## 33 loci

x<-which(dmb >= q[2] & mnfst >= qf[2])## none

## what about model-avg betas

dmb<-abs(parm[,11]-parm[,14])

## no correlations

## 42 loci in top 1% in each

## permutation test

num<-numeric(1000)

for(i in 1:1000){

pmnfst<-sample(mnfst,length(mnfst),replace=F)

x<-which(dmb >= q[1] & pmnfst >= qf[1])

num[i]<-length(x)

}

mean(num >= 42)

## 0.077

## fst for individual pop pairs

#######################

qf<-quantile(parm[,5],probs=c(0.99,0.999,0.9999),na.rm=T)

x<-which(dmb >= q[1] & parm[,5] >= qf[1])

obs<-length(x)

num<-numeric(1000)

for(i in 1:1000){

pmnfst<-sample(parm[,5],length(parm[,5]),replace=F)

x<-which(dmb >= q[1] & pmnfst >= qf[1])

num[i]<-length(x)

}

mean(num >= obs)

## p = 0.892

#######################

qf<-quantile(parm[,6],probs=c(0.99,0.999,0.9999),na.rm=T)

x<-which(dmb >= q[1] & parm[,6] >= qf[1])

obs<-length(x)

num<-numeric(1000)

for(i in 1:1000){

pmnfst<-sample(parm[,6],length(parm[,6]),replace=F)

x<-which(dmb >= q[1] & pmnfst >= qf[1])

num[i]<-length(x)

}

mean(num >= obs)

## p = 0.236

#######################

qf<-quantile(parm[,7],probs=c(0.99,0.999,0.9999),na.rm=T)

x<-which(dmb >= q[1] & parm[,7] >= qf[1])

obs<-length(x)

num<-numeric(1000)

for(i in 1:1000){

pmnfst<-sample(parm[,7],length(parm[,7]),replace=F)

x<-which(dmb >= q[1] & pmnfst >= qf[1])

num[i]<-length(x)

}

mean(num >= obs)

## p = 0.394

#######################

qf<-quantile(parm[,8],probs=c(0.99,0.999,0.9999),na.rm=T)

x<-which(dmb >= q[1] & parm[,8] >= qf[1])

obs<-length(x)

num<-numeric(1000)

for(i in 1:1000){

pmnfst<-sample(parm[,8],length(parm[,8]),replace=F)

x<-which(dmb >= q[1] & pmnfst >= qf[1])

num[i]<-length(x)

}

mean(num >= obs)

## p = 0.166

Page updated

Google Sites

Report abuse