polygenic modelling continued

Post date: Nov 15, 2013 8:54:0 PM

I re-ran the BSLMM analysis in GEMMA with two longer chains, each with 2 million steps and a 500 thousand step burn-in:

gemma -g geno_glaTrtAc.txt -p pheno_glaTraAc.txt -bslmm 1 -n 2 -o surv_glaTrtAc1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_glaTrtAc.txt -p pheno_glaTraAc.txt -bslmm 1 -n 1 -o wgt_glaTrtAc1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_glaTrtMs.txt -p pheno_glaTraMs.txt -bslmm 1 -n 2 -o surv_glaTrtMs1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_glaTrtMs.txt -p pheno_glaTraMs.txt -bslmm 1 -n 1 -o wgt_glaTrtMs1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_slaTrtAc.txt -p pheno_slaTraAc.txt -bslmm 1 -n 2 -o surv_slaTrtAc1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_slaTrtAc.txt -p pheno_slaTraAc.txt -bslmm 1 -n 1 -o wgt_slaTrtAc1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_slaTrtMs.txt -p pheno_slaTraMs.txt -bslmm 1 -n 2 -o surv_slaTrtMs1 -rpace 20 -w 500000 -s 2000000

gemma -g geno_slaTrtMs.txt -p pheno_slaTraMs.txt -bslmm 1 -n 1 -o wgt_slaTrtMs1 -rpace 20 -w 500000 -s 2000000
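The eight commands above differ only in population, host plant, and trait, so they can be generated rather than typed. The sketch below rebuilds the same command lines for one chain. It assumes the trailing 1 in the -o prefix is a chain index, so that a second chain would end in 2. That naming, and the function name gemma_commands, are hypothetical and not taken from the original runs.

```python
from itertools import product


def gemma_commands(chain=1, burnin=500_000, steps=2_000_000, rpace=20):
    """Build the eight BSLMM command lines for one chain.

    Assumption: the trailing digit of the -o prefix is the chain index.
    """
    # Phenotype column passed to -n: survival is column 2, weight is column 1.
    traits = (("surv", 2), ("wgt", 1))
    groups = product(("gla", "sla"), ("Ac", "Ms"))
    commands = []
    for (pop, plant), (trait, column) in product(groups, traits):
        commands.append(
            f"gemma -g geno_{pop}Trt{plant}.txt -p pheno_{pop}Tra{plant}.txt "
            f"-bslmm 1 -n {column} -o {trait}_{pop}Trt{plant}{chain} "
            f"-rpace {rpace} -w {burnin} -s {steps}"
        )
    return commands


if __name__ == "__main__":
    for command in gemma_commands(chain=1):
        print(command)
```

Printing the commands first, instead of launching them directly, makes it easy to check them against the list above before submitting anything.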

The results are in an output directory. My initial assessment is that results from the two chains are similar, but not quite as similar as I would like. I therefore want to run longer chains, and probably three of them. At this point I will move to the dorc cluster for the analysis, so I am moving the melGemma directory from Analyses to projects.
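Agreement between chains can be made more quantitative with the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance for a single parameter. The sketch below is one standard diagnostic and not the assessment used above. It assumes each chain is a plain list of post-burn-in samples of equal length, and the function name is illustrative.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: list of equal-length lists of posterior samples, one per chain.
    Values near 1 suggest the chains agree; larger values suggest they do not.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance([mean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5
```

Applied to, say, the PVE samples from each chain, a value well above 1 would support running longer chains.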

I wrote a script to summarize the genetic architecture results (summarizeGemma.pl). It combines the results from the two chains and reports posterior summaries (median and 5% and 95% quantiles) for each parameter. These preliminary results are in the file performGenArch.txt and are also pasted below. I also have an R script to plot the results (plotArch.R).
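The kind of summary described here can be sketched as follows. This is not summarizeGemma.pl. It assumes each chain's hyperparameter samples are in GEMMA's whitespace-delimited .hyp.txt output with a header naming the columns (h, pve, rho, pge, pi, n_gamma), and it uses linear interpolation between order statistics for the quantiles, which may differ from what the Perl script does. The function names are illustrative.

```python
def read_hyp(lines):
    """Parse a .hyp.txt-style table into {column name: list of floats}."""
    lines = iter(lines)
    header = next(lines).split()
    columns = {name: [] for name in header}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, value in zip(header, fields):
            columns[name].append(float(value))
    return columns


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize(chains, parameters=("pve", "pge", "pi", "n_gamma")):
    """Pool samples across chains; return {parameter: (median, q05, q95)}."""
    summary = {}
    for name in parameters:
        pooled = [value for chain in chains for value in chain[name]]
        summary[name] = tuple(quantile(pooled, q) for q in (0.5, 0.05, 0.95))
    return summary
```

An open file can be passed straight to read_hyp, since a file handle iterates over its lines. The chains are pooled with equal weight, which is only reasonable when they have the same length and have converged to the same distribution.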

character population plant parameter median         q05             q95
surv      gla        Ac    pve       0.1666624      0.0334592785    0.38894389
surv      gla        Ac    pge       0.273896       0               0.900697805
surv      gla        Ac    pi        0.00025573875  1.64558555e-05  0.0022379911
surv      gla        Ac    n_gamma   21             0               182
surv      gla        Ms    pve       0.3188502      0.104449215     0.572692565
surv      gla        Ms    pge       0.2502141      0               0.884261355
surv      gla        Ms    pi        0.0004768617   1.74836385e-05  0.00309096435
surv      gla        Ms    n_gamma   38             0               250
surv      sla        Ac    pve       0.8622349      0.7978201       0.9194744
surv      sla        Ac    pge       0.9904566      0.9599446       0.999487
surv      sla        Ac    pi        0.0002477627   0.0001789577    0.000338773
surv      sla        Ac    n_gamma   24             18              29
surv      sla        Ms    pve       0.10798045     0.0083019483    0.3405008
surv      sla        Ms    pge       0.5019325      0               0.953210745
surv      sla        Ms    pi        0.0001244681   1.5846332e-05   0.0018166082
surv      sla        Ms    n_gamma   10             0               143
wgt       gla        Ac    pve       0.37072775     0.086202737     0.649956325
wgt       gla        Ac    pge       0.42366245     0.00580980245   0.94973699
wgt       gla        Ac    pi        0.0003864875   1.9176272e-05   0.0024834994
wgt       gla        Ac    n_gamma   30             1               194
wgt       gla        Ms    pve       0.28257125     0.0251571375    0.720133975
wgt       gla        Ms    pge       0.51652515     0.00627870765   0.960069295
wgt       gla        Ms    pi        0.0002556043   1.8149086e-05   0.0023544626
wgt       gla        Ms    n_gamma   20             1               181
wgt       sla        Ac    pve       0.11865425     0.026708081     0.270771055
wgt       sla        Ac    pge       0.26789645     0               0.8946898
wgt       sla        Ac    pi        8.953675e-05   1.53336575e-05  0.00256071709999999
wgt       sla        Ac    n_gamma   6              0               197
wgt       sla        Ms    pve       0.44982005     0.033359743     0.9806428
wgt       sla        Ms    pge       0.31036995     0               0.9084853
wgt       sla        Ms    pi        0.0001858147   1.72807035e-05  0.00296390185
wgt       sla        Ms    n_gamma   14             0               223

As with previous attempts to understand trait genetics, there is considerable uncertainty in the parameter estimates, particularly for adult weight, which has a smaller sample size (dead larvae do not have adult weights). Nonetheless, there are a couple of interesting patterns. Most notably:

1. We explain a greater proportion of variance in survival in larvae reared on their natal host plant (GLA on Ms and SLA on Ac). This is particularly true for SLA on Ac (posterior median PVE = 0.86, versus 0.11 for SLA on Ms). The difference for GLA is smaller (0.32 on Ms versus 0.17 on Ac) and the intervals overlap widely. This is counter to my original expectation that the narrow-sense heritability of fitness components should be higher on a novel resource. In other words, consistent selection in an environment should remove variants that lower fitness in that environment. That said, these individuals were reared in the lab, so perhaps this argument doesn't apply.

2. Almost all of the variation in survival for SLA on Ac that is explained by genetic variation is explained by individual SNV effects (posterior median PGE = 0.99). In fact, there are around 15-20 SNVs with very high posterior inclusion probabilities (> 0.5, several > 0.9). I have looked at some of these individual loci. They tend to have relatively low minor allele frequencies, such that there are only a handful of heterozygotes and most individuals are homozygous for the reference allele (the more common allele in these cases). In this population x treatment combination most individuals survived, but those that died are more likely to be heterozygous at these loci (that is, the rare heterozygotes are found in the few individuals that did not make it). This is interesting and actually consistent with expectations if selection acts against such variants (but not strongly enough to get rid of them completely... perhaps the butterflies are just too drifty). I need to look at this more, but it is certainly interesting.
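Picking out the high-PIP SNVs can be sketched as below. It assumes the per-SNV output is GEMMA's .param.txt format, with a header that includes an rs column (SNV identifier) and a gamma column (posterior inclusion probability), and that both chains report the same SNVs. Averaging gamma across chains of equal length is equivalent to pooling their samples. The function names and the 0.5 cutoff are illustrative.

```python
def read_pip(lines):
    """Parse a .param.txt-style table into {SNV id: inclusion probability}."""
    lines = iter(lines)
    header = next(lines).split()
    rs_col, gamma_col = header.index("rs"), header.index("gamma")
    pip = {}
    for line in lines:
        fields = line.split()
        if fields:
            pip[fields[rs_col]] = float(fields[gamma_col])
    return pip


def high_pip_snvs(chain_pips, threshold=0.5):
    """Average PIPs across chains and keep SNVs above the threshold.

    Assumes every chain reports the same set of SNVs.
    """
    averaged = {
        snv: sum(pip[snv] for pip in chain_pips) / len(chain_pips)
        for snv in chain_pips[0]
    }
    return {snv: value for snv, value in averaged.items() if value > threshold}
```

The resulting SNV list is the natural starting point for tabulating genotype against survival at each locus.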
