Determine Proper Hyperparameters
Last update: 22 November 2020
$V_{\mathrm{SNP}} = V_P h^2 / \sum 2pq = V_P h^2 / (m\bar{H})$, where $V_{\mathrm{SNP}}$ is the variance explained by one SNP, $V_P$ is the phenotypic variance, $h^2$ is the heritability, $p$ ($q$) is the frequency of one allele (the other allele), $m$ is the number of SNPs, and $\bar{H}$ is the average heterozygosity (usually a number around 0.3). Using this formula, we can get a rough estimate of the variance explained by one SNP.
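As a rough illustration of the formula above, the following Python sketch computes the per-SNP variance either from the number of SNPs and an assumed average heterozygosity, or directly from a list of allele frequencies. The function names and the example numbers are hypothetical and are not part of the software.

```python
def variance_per_snp(v_p, h2, m, mean_het=0.3):
    """Rough variance explained by one SNP: V_P * h2 / (m * mean_het).

    v_p      : phenotypic variance
    h2       : heritability
    m        : number of SNPs
    mean_het : average heterozygosity (0.3 is only a typical value)
    """
    if m <= 0 or mean_het <= 0:
        raise ValueError("m and mean_het must be positive")
    return v_p * h2 / (m * mean_het)


def variance_per_snp_from_freqs(v_p, h2, allele_freqs):
    """Same estimate, using the sum of 2pq over all SNPs as the denominator."""
    total_het = sum(2.0 * p * (1.0 - p) for p in allele_freqs)
    if total_het <= 0:
        raise ValueError("all SNPs are monomorphic")
    return v_p * h2 / total_het


# Hypothetical example: V_P = 1, h2 = 0.5, 10,000 SNPs
v_snp = variance_per_snp(1.0, 0.5, 10000)
print(v_snp)
```

When allele frequencies are available, the second function avoids having to guess the average heterozygosity.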
The shape parameter of the inverse-gamma prior usually has little effect on prediction accuracy, so it can always be given a small value. Model fitting, however, may be sensitive to the scale parameter. In practice, set the scale parameter to $b = V_{\mathrm{SNP}}$ as a starting point, and then use cross-validation to search for the optimal scale within a small range around that value.
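One way to lay out the search is a short, log-spaced list of candidate scale values centered on the starting point. The sketch below only generates the candidates; each one would then be evaluated by cross-validation with the software. The function name and the default width of the range (a factor of 4 on either side) are assumptions chosen for illustration, not recommendations from the software.

```python
def scale_candidates(b, span=4.0, n=5):
    """Return n log-spaced candidate scales from b/span to b*span.

    With an odd n, the middle candidate is the starting value b itself.
    """
    if b <= 0:
        raise ValueError("b must be positive")
    if span <= 1 or n < 2:
        raise ValueError("span must exceed 1 and n must be at least 2")
    # Exponents run evenly from -1 to +1, so values run from b/span to b*span.
    return [b * span ** (-1.0 + 2.0 * i / (n - 1)) for i in range(n)]


# Hypothetical starting point b = V_SNP = 1e-4
for scale in scale_candidates(1e-4):
    print(scale)
```

The candidate with the best cross-validated prediction accuracy is then used as the scale parameter.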
When hyper-parameter optimization is turned on with the --hyper_opt option, the rate parameter of the exponential hyper-prior must also be set (with --snp_hyper_exp_rate). Its optimal value can be determined in the same way as the scale parameter of the inverse-gamma prior.
For the half-Cauchy prior, we would like to set the squared scale parameter to a value slightly larger than $V_{\mathrm{SNP}}$, which results in a weakly informative prior. Because there may be some QTLs of relatively large effect, setting the squared scale to $A^2 = 10 V_{\mathrm{SNP}}$ or even $A^2 = 100 V_{\mathrm{SNP}}$ is usually a safer choice. In fact, the half-Cauchy model can still work well even when an unreasonably large scale is used (approaching a non-informative prior for the variance parameter). This robustness is a major advantage of the half-Cauchy prior over the inverse-gamma prior.
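The arithmetic for the half-Cauchy scale can be sketched as follows. The function name is hypothetical, and whether the software expects the scale or the squared scale as input is not stated here, so the sketch returns both.

```python
import math


def half_cauchy_scale(v_snp, factor=10.0):
    """Return (squared scale, scale) with squared scale = factor * v_snp."""
    if v_snp <= 0 or factor <= 0:
        raise ValueError("v_snp and factor must be positive")
    a_squared = factor * v_snp
    return a_squared, math.sqrt(a_squared)


# Hypothetical per-SNP variance of 1e-4 with the more conservative factor of 100
a_squared, a = half_cauchy_scale(1e-4, factor=100.0)
print(a_squared, a)
```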
Generally, it is not necessary to turn on hyper-parameter optimization for the half-Cauchy model.