Selection for Ecology Letters (FHA 2011) bugs with six genotype model

Post date: Mar 13, 2017 3:11:27 AM

I inferred selection (fitness values) for the Ecology Letters experiment using the same six genotype assignments used for FHA (these are the same bugs, and having the 2013 bugs gives us 'phenotypes'). The data and results are in /uufs/chpc.utah.edu/common/home/u6000989/projects/timema_fluct/genomic_change_dark_morph/sel_fha/.

First, given the nature of the data (survival for individuals), I used a straight Bayesian approach to infer independent fitness values for each genotype across all bugs and for A and C separately. Specifically, I assumed that y ~ binom(p, n) for each genotype where y is the number of survivors, p is absolute fitness and n is the number of individuals. I placed a Jeffery's beta prior on p (giving a beta posterior). This might be asking too much of the data, and in general when I finish this round of analyses I want to think about whether we need to constrain things more in terms of fitness values (I like not doing this much given the complexity we are seeing, but I think it is also pushing us into marginal significance land).

Quantitative summaries of the results follow. This includes genotype frequencies, absolute fitness values, relative fitness values (relative to m/s for consistency, but one could argue this isn't the most relevant reference point in the experiment), and matrixes giving the posterior probability that the row genotype has a higher fitness than the column genotype for all pairs of genotypes. All of these things are given for all bugs, A and C, and I also give the difference in fitness values for A vs C (A-C). Plots in /uufs/chpc.utah.edu/common/home/u6000989/projects/timema_fluct/genomic_change_dark_morph/sel_fha/ the posterior distribution for absolute and relative (rel) fitness values for all, A and C (see ecol_*pdf).

Here are the key take home points/things to think about (some have been made already):

1. We have an excess of hets. at the beginning, but the frequency of m/s doesn't go up (or change) much during the experiment. m/g actually takes quite a nose dive on A (and thus all). Thus, what we are seeing here is decidedly different from the change between 2011 and 2013 at FHA. This almost certainly reflects differences in selection acting at different points in time/development. I agree that the early selection (favoring m/s) could still be ecological, just not predation. Do birds even eat the really young ones much?

2. General trends make sense, but there is lots of uncertainty. Point estimates show s/s and then m/s doing best on A with m/g (and probably g/g) really sucking. On C, g/s does best, m/s is almost identical to what it was on A, and s/s does worse than on A, but not horrible. In general (see matrix of pp differences in fitness), there is more evidence of variation in fitness among genotypes on A than C.

3. We come the closest too (or hit) 'significance' in terms of s/s>m/g and m/s > m/g on A. We are close on s/s > m/g overall, but mostly because of A.

Quantitative summaries.

Genotype frequencies

all surv A A-surv C C-surv

m/g 0.050 0.02857143 0.052 0.01449275 0.048 0.04225352

s/s 0.286 0.32142857 0.308 0.37681159 0.264 0.26760563

g/g 0.006 0.00000000 0.008 0.00000000 0.004 0.00000000

m/m 0.042 0.02857143 0.056 0.04347826 0.028 0.01408451

m/s 0.500 0.50714286 0.480 0.49275362 0.520 0.52112676

g/s 0.116 0.11428571 0.096 0.07246377 0.136 0.15492958

Absolute fitnesses

All

m/g s/s g/g m/m m/s g/s

50% 0.16654056 0.3147438 0.0659015457 0.19467830 0.2844256 0.2766948

5% 0.06950786 0.2531245 0.0006376711 0.08106451 0.2391952 0.1887122

95% 0.30796153 0.3807947 0.4433619239 0.35461027 0.3323201 0.3778218

2.5% 0.05710492 0.2388039 0.0001711069 0.06540925 0.2318952 0.1738823

97.5% 0.33762003 0.3927150 0.5287999141 0.38670641 0.3417446 0.4009285

m/g s/s g/g m/m m/s g/s

50% 0.086472574 0.3371244 0.0950873728 0.21844533 0.2837109 0.21381790

5% 0.014185624 0.2524374 0.0008077931 0.08005474 0.2192916 0.09975317

95% 0.265923276 0.4294884 0.5862767836 0.42773102 0.3552697 0.36315438

2.5% 0.008619063 0.2380591 0.0002144854 0.06238289 0.2075435 0.08603833

97.5% 0.305427919 0.4462016 0.6778319677 0.47284885 0.3685507 0.39733090

m/g s/s g/g m/m m/s g/s

50% 0.25661282 0.2888618 0.1673171237 0.16549955 0.2847834 0.3271446

5% 0.09594155 0.2052188 0.0015132652 0.02636007 0.2239466 0.2033485

95% 0.48754164 0.3836363 0.7763974053 0.43872768 0.3532794 0.4622906

2.5% 0.07558956 0.1889371 0.0004565134 0.01700831 0.2117232 0.1801812

97.5% 0.53363060 0.4056299 0.8605433871 0.49978970 0.3658921 0.4886025

Difference in fitness between A and C (A-C)

m/g s/s g/g m/m m/s g/s

50% -0.16153781 0.04895248 -0.04384719 0.04936913 -0.0005371716 -0.11168419

5% -0.40773575 -0.07771347 -0.66561068 -0.24784744 -0.0962984783 -0.29132973

95% 0.06984025 0.17205473 0.43267651 0.30685578 0.0899400494 0.08235759

2.5% -0.45684232 -0.10118745 -0.76304686 -0.32869565 -0.1136915641 -0.32071808

97.5% 0.11326533 0.19568135 0.53520662 0.35934254 0.1091567399 0.11679490

Relative fitness (relative to m/s)

All

m/g s/s g/g m/m m/s g/s

50% 0.5871749 1.1101957 0.2335931921 0.6787582 1 0.9743156

5% 0.2423807 0.8480624 0.0022379966 0.2879811 1 0.6463019

95% 1.1138541 1.4266511 1.5555498230 1.2907594 1 1.3936069

2.5% 0.1991952 0.8013052 0.0005693897 0.2263354 1 0.5949102

97.5% 1.2333549 1.4944609 1.9022961700 1.4243067 1 1.4738585

m/g s/s g/g m/m m/s g/s

50% 0.30690205 1.1838164 0.3365187457 0.7691940 1 0.7586085

5% 0.04919161 0.8323440 0.0028457401 0.2812096 1 0.3463115

95% 0.96271987 1.7006059 2.1175310248 1.5685744 1 1.3535032

2.5% 0.03010222 0.7679442 0.0007915034 0.2171975 1 0.2944043

97.5% 1.12088184 1.8029492 2.4926849176 1.7505827 1 1.5107698

m/g s/s g/g m/m m/s g/s

50% 0.9016287 1.0111077 0.589305052 0.58154883 1 1.1459700

5% 0.3310450 0.6741726 0.005345812 0.09252174 1 0.6853189

95% 1.7763973 1.4734183 2.765228014 1.61407003 1 1.7502289

2.5% 0.2660189 0.6178542 0.001594921 0.05887519 1 0.6148360

97.5% 1.9564519 1.5770885 3.113405114 1.84125363 1 1.8868777

Posterior probs. that row genotype has higher fitness than column genotype

All

m/g s/s g/g m/m m/s g/s

m/g NA 0.0538 0.7246 0.4052 0.0918 0.1344

s/s 0.9462 NA 0.8910 0.8804 0.7430 0.7060

g/g 0.2754 0.1090 NA 0.2442 0.1320 0.1462

m/m 0.5948 0.1196 0.7558 NA 0.1794 0.2216

m/s 0.9082 0.2570 0.8680 0.8206 NA 0.5438

g/s 0.8656 0.2940 0.8538 0.7784 0.4562 NA

m/g s/s g/g m/m m/s g/s

m/g NA 0.0204 0.4756 0.1582 0.0430 0.1508

s/s 0.9796 NA 0.8094 0.8220 0.7850 0.8828

g/g 0.5244 0.1906 NA 0.3166 0.2358 0.3172

m/m 0.8418 0.1780 0.6834 NA 0.2990 0.5252

m/s 0.9570 0.2150 0.7642 0.7010 NA 0.7684

g/s 0.8492 0.1172 0.6828 0.4748 0.2316 NA

m/g s/s g/g m/m m/s g/s

m/g NA 0.4008 0.6066 0.6900 0.4096 0.3258

s/s 0.5992 NA 0.6422 0.7772 0.5198 0.3476

g/g 0.3934 0.3578 NA 0.5044 0.3628 0.3230

m/m 0.3100 0.2228 0.4956 NA 0.2316 0.1762

m/s 0.5904 0.4802 0.6372 0.7684 NA 0.3216

g/s 0.6742 0.6524 0.6770 0.8238 0.6784 NA

R code (fhaEcolS.R).

load("fhaAbc6g.rdat")

## DO NOTE RERUN, from workspace

g0x<-g0[c(mel,stripy),]

g1x<-g1[c(mel,stripy),]

pcall<-prcomp(rbind(t(g0x),t(g1x)),center=TRUE,scale=FALSE)

oall<-kmeans(pcall$x[,1:2],centers=6,nstart=100,iter.max=200)

pdf("pca6.pdf",width=5,height=5)

plot(pcall$x[,1],pcall$x[,2],type="n")

text(pcall$x[,1],pcall$x[,2],oall$cluster,col=c(rep("gray",500),c("brown","green","black")[ph]))

dev.off()

## groups based on pca

## 1 = m/g, 2 = s/s, 3 = g/g, 4 = m/m, 5 = m/s, 6 = g/s

gen0<-oall$cluster[1:500]

gfr0<-table(gen0)/500

m0<-(sum(gen0==4) + 0.5 * sum(gen0==1 | gen0==5))/(length(gen0))

s0<-(sum(gen0==2) + 0.5 * sum(gen0==5 | gen0==6))/(length(gen0))

r0<-(sum(gen0==3) + 0.5 * sum(gen0==1 | gen0==6))/(length(gen0))

p0<-c(m0,s0,r0)

## survival data

surv<-scan("SurvivalGbs.txt")

genS<-gen0[surv==1]

gfrS<-table(genS)/140

gfrS<-c(gfrS[1:2],0,gfrS[3:5])

## host data

host<-read.table("ecolGbsHost.txt",header=F)

genA<-gen0[host[,1]=='A']

genC<-gen0[host[,1]=='C']

genAS<-gen0[host[,1]=='A' & surv==1]

genCS<-gen0[host[,1]=='C' & surv==1]

gfrA<-table(genA)/250

gfrC<-table(genC)/250

gfrAS<-table(genAS)/69

gfrAS<-c(gfrAS[1:2],0,gfrAS[3:5])

gfrCS<-table(genCS)/71

gfrCS<-c(gfrCS[1:2],0,gfrCS[3:5])

sumS<-cbind(gfr0,gfrS,gfrA,gfrAS,gfrC,gfrCS)

rownames(sumS)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

colnames(sumS)<-c("all","surv","A","A-surv","C","C-surv")

## posteriors under binom-beta

cntsAll<-table(gen0)

survAll<-table(genS)

survAll<-c(survAll[1:2],0,survAll[3:5])

probAll<-matrix(NA,nrow=5000,ncol=6)

for(i in 1:6){

a<-0.5 + survAll[i]

b<-0.5 + cntsAll[i]-survAll[i]

probAll[,i]<-rbeta(5000,a,b)

}

wall<-apply(probAll,2,quantile,probs=c(0.5,0.05,0.95,0.025,0.975))

colnames(wall)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

## density plots

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecol_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probAll[,2],from=0,to=1),xlim=c(0,1),ylim=c(0,15),col=cs[2],lty=ll[2],xlab="absolute fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,5,6)){

lines(density(probAll[,i],from=0,to=1),col=cs[i],lty=ll[i])

}

legend(0.7,15,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

## posteriors under binom-beta

cntsA<-table(genA)

survA<-table(genAS)

survA<-c(survA[1:2],0,survA[3:5])

probA<-matrix(NA,nrow=5000,ncol=6)

for(i in 1:6){

a<-0.5 + survA[i]

b<-0.5 + cntsA[i]-survA[i]

probA[,i]<-rbeta(5000,a,b)

}

wA<-apply(probA,2,quantile,probs=c(0.5,0.05,0.95,0.025,0.975))

colnames(wA)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

## density plots

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecolA_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probA[,2],from=0,to=1),xlim=c(0,1),ylim=c(0,10),col=cs[2],lty=ll[2],xlab="absolute fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,5,6)){

lines(density(probA[,i],from=0,to=1),col=cs[i],lty=ll[i])

}

legend(0.7,10,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

## posteriors under binom-beta

cntsC<-table(genC)

survC<-table(genCS)

survC<-c(survC[1:2],0,survC[3:5])

probC<-matrix(NA,nrow=5000,ncol=6)

for(i in 1:6){

a<-0.5 + survC[i]

b<-0.5 + cntsC[i]-survC[i]

probC[,i]<-rbeta(5000,a,b)

}

wC<-apply(probC,2,quantile,probs=c(0.5,0.05,0.95,0.025,0.975))

colnames(wC)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

## density plots

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecolC_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probC[,2],from=0,to=1),xlim=c(0,1),ylim=c(0,10),col=cs[2],lty=ll[2],xlab="absolute fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,5,6)){

lines(density(probC[,i],from=0,to=1),col=cs[i],lty=ll[i])

}

legend(0.7,10,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

## density plots

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecol_fitnessDif.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probA[,2]-probC[,2],from=-.5,to=.5),xlim=c(-.5,.5),ylim=c(0,10),col=cs[2],lty=ll[2],xlab="fitness dif (A-C)",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,5,6)){

lines(density(probA[,i]-probC[,i],from=-.5,to=.5),col=cs[i],lty=ll[i])

}

legend(-0.5,10,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

## relative fitness density plots

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecolA_rel_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probA[,2]/probA[,5],from=0,to=2),xlim=c(0,2),ylim=c(0,2),col=cs[2],lty=ll[2],xlab="relative fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,6)){

lines(density(probA[,i]/probA[,5],from=0,to=2),col=cs[i],lty=ll[i])

}

abline(v=1,col=cs[5],lty=ll[5],lwd=2)

legend(1.5,2,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecolC_rel_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probC[,2]/probC[,5],from=0,to=2),xlim=c(0,2),ylim=c(0,2),col=cs[2],lty=ll[2],xlab="relative fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,6)){

lines(density(probC[,i]/probC[,5],from=0,to=2),col=cs[i],lty=ll[i])

}

abline(v=1,col=cs[5],lty=ll[5],lwd=2)

legend(1.5,2,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

cs<-c("green","darkgray","green","brown","black","darkgray")

ll<-c(2,1,1,1,2,2)

pdf("ecol_rel_fitness.pdf",width=6,height=6)

par(mar=c(5,5,0.5,0.5))

plot(density(probAll[,2]/probAll[,5],from=0,to=2),xlim=c(0,2),ylim=c(0,3),col=cs[2],lty=ll[2],xlab="absolute fitness",cex.lab=1.4,cex.axis=1.1,main="")

for(i in c(1,3,4,6)){

lines(density(probAll[,i]/probAll[,5],from=0,to=2),col=cs[i],lty=ll[i])

}

abline(v=1,col=cs[5],lty=ll[5],lwd=2)

legend(1.5,3,c("m/g","s/s","g/g","m/m","m/s","g/s"),col=cs,lty=ll)

dev.off()

## abs fitness pp matrix

ppmat<-matrix(NA,nrow=6,ncol=6)

for(i in 1:6){for(j in 1:6){

if(i != j){

ppmat[i,j]<-mean(probAll[,i] > probAll[,j])

}

}}

ppmatA<-matrix(NA,nrow=6,ncol=6)

for(i in 1:6){for(j in 1:6){

if(i != j){

ppmatA[i,j]<-mean(probA[,i] > probA[,j])

}

}}

ppmatC<-matrix(NA,nrow=6,ncol=6)

for(i in 1:6){for(j in 1:6){

if(i != j){

ppmatC[i,j]<-mean(probC[,i] > probC[,j])

}

}}

rownames(ppmat)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

colnames(ppmat)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

rownames(ppmatA)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

colnames(ppmatA)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

rownames(ppmatC)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

colnames(ppmatC)<-c("m/g","s/s","g/g","m/m","m/s","g/s")

save(list=ls(),file="ecolS.rdat")

Page updated

Google Sites

Report abuse