Post date: Nov 25, 2014 4:44:6 PM
About 100 piMASS runs with modal and mean genotypes have finished. Half of these are labelled as including only SNPs with MAF > 5%, but this isn't true (the -exclude-maf option doesn't do anything). The results are in /home/A01963476/projects/timema_wgexperiment/gemma/output_pimass and /home/A01963476/projects/timema_wgexperiment/gemma/results/. Here is the summary that I sent to Patrik:
So, I ended up setting up 100 chains for the piMASS analysis. A few of these crapped out, so we have ~90 chains each for the Adenostoma and Ceanothus analyses. I then split each set of chains in two (i.e. two sets of about 45 chains) to see how consistent the results are. In short, we are doing a pretty reasonable job of getting the same posterior for PVE and no. of SNPs, but not such a good job for the PIPs and regression coefficients. The correlation between the two sets of chains in the posterior probabilities for PVE and no. of SNPs is > 0.99 (posteriors and correlations for posteriors attached; the posteriors are shown as histograms because I noticed the violin plots were over-smoothing things). On the other hand, the correlations for PIPs (~0.2), betas (~0.7) and model-averaged betas (i.e. PIPs x betas; 0.07) are much lower. Perhaps this shouldn't be surprising: you need considerably fewer MCMC iterations to infer the higher-level parameters than to estimate betas for 8 million SNPs.
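For reference, here is a minimal sketch of the split-half consistency check described above. It is not the script I actually ran. It assumes the per-chain estimates (e.g. PIPs or betas) have already been read into a NumPy array with one row per chain and one column per SNP; the function name, the random split and the simulated input are all hypothetical and only there for illustration.

```python
import numpy as np


def split_half_correlation(chain_values, seed=0):
    """Correlate per-SNP estimates averaged over two halves of a set of chains.

    chain_values: array-like of shape (n_chains, n_snps) holding one
    per-chain estimate per SNP (e.g. PIPs, betas, or PIP x beta).
    Assumption: chains are assigned to the two halves at random.
    """
    chain_values = np.asarray(chain_values, dtype=float)
    n_chains = chain_values.shape[0]
    if n_chains < 2:
        raise ValueError("need at least two chains to split")

    # Randomly assign chains to two halves (about 45 each for ~90 chains).
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_chains)
    half = n_chains // 2

    # Average each half across chains, giving one value per SNP per half.
    mean_a = chain_values[order[:half]].mean(axis=0)
    mean_b = chain_values[order[half:]].mean(axis=0)

    # Pearson correlation between the two halves.
    return np.corrcoef(mean_a, mean_b)[0, 1]


if __name__ == "__main__":
    # Simulated stand-in data (NOT real piMASS output): 90 chains, 1000 SNPs.
    rng = np.random.default_rng(42)
    fake_pips = rng.random((90, 1000))
    fake_betas = rng.normal(size=(90, 1000))

    print("PIPs:", split_half_correlation(fake_pips))
    print("betas:", split_half_correlation(fake_betas))
    # Model-averaged betas, i.e. PIPs x betas, computed per chain.
    print("PIPs x betas:", split_half_correlation(fake_pips * fake_betas))
```

The same function can be applied to each quantity in turn (PIPs, betas, PIPs x betas), which makes it easy to see which parameters the chains agree on and which need more MCMC iterations.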