In SSGP, whole genome markers can be split into groups in a user-defined manner, and each group of markers is given a common effect variance.
Additionally, each marker has a pre-specified weight, and the weighting rule can be assigned flexibly, e.g., based on minor allele frequency (MAF) or linkage disequilibrium pattern.
Define two SNP sets based on gene annotation, i.e., genic SNPs and intergenic SNPs
Divide whole-genome SNPs into SNP sets based on P-values from single-SNP GWAS
Divide SNPs based on pathways
Give more weight to low-MAF SNPs by specifying Beta(1,10), as low-MAF QTLs may have larger effects
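The grouping and weighting examples above can be sketched in a few lines of Python. This is only an illustration of the idea, not SSGP's input format or code: the function names (`beta_pdf`, `maf_weights`, `group_by_pvalue`) and the P-value cutoffs are hypothetical, and the sketch assumes that "Beta(1,10)" means evaluating the Beta(1,10) density at each SNP's MAF.

```python
import math
from bisect import bisect_right


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def maf_weights(mafs, a=1.0, b=10.0):
    """Hypothetical helper: one weight per SNP from the Beta(a, b) density at its MAF.

    Assumes every MAF lies strictly between 0 and 1 (monomorphic SNPs removed).
    """
    return [beta_pdf(m, a, b) for m in mafs]


def group_by_pvalue(pvalues, cutoffs=(1e-5, 1e-3)):
    """Hypothetical helper: group index per SNP from ascending P-value cutoffs.

    Group 0 holds the most significant SNPs (P below the first cutoff);
    the last group holds SNPs with P at or above the final cutoff.
    The default cutoffs are illustrative only.
    """
    return [bisect_right(cutoffs, p) for p in pvalues]


if __name__ == "__main__":
    mafs = [0.01, 0.10, 0.50]
    pvalues = [1e-6, 1e-4, 0.5]
    for maf, w, g in zip(mafs, maf_weights(mafs), group_by_pvalue(pvalues)):
        print(f"MAF={maf:.2f}  weight={w:.4f}  group={g}")
```

With these parameters the density decreases monotonically in MAF, so rarer variants receive larger weights; whether and how the weights are rescaled inside SSGP is not specified here.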
Besides the inverse-gamma distribution, which has been widely used as a prior for variances, SSGP can also use a half-Cauchy prior, which usually works better when the hyperparameter values are poorly chosen.
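For reference, the two prior families can be written in their standard textbook forms. The symbols below (shape \alpha, scale \beta, and scale A) are generic notation chosen for this illustration; the exact parameterization used inside SSGP may differ.

```latex
% Inverse-gamma prior on an effect variance \sigma^2 (shape \alpha, scale \beta)
p(\sigma^2 \mid \alpha, \beta)
  = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
    (\sigma^2)^{-\alpha-1} \exp\!\left(-\frac{\beta}{\sigma^2}\right),
  \qquad \sigma^2 > 0

% Half-Cauchy prior on the corresponding standard deviation \sigma (scale A)
p(\sigma \mid A)
  = \frac{2}{\pi A \left(1 + \sigma^2 / A^2\right)},
  \qquad \sigma > 0
```

The half-Cauchy density has a heavy polynomial tail and stays finite as \sigma approaches zero, which is one commonly cited reason it tends to be less sensitive to the choice of its scale than the inverse-gamma is to its shape and scale.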
Parameter-expanded variational Bayesian method, which is orders of magnitude faster than MCMC
Runs in seconds for 2k individuals and 60k SNPs on a PC
Feasible for analysis of large data sets, e.g., a few hours for 20k individuals and 760k SNPs with 20 threads
SSGP can achieve prediction accuracy similar to or higher than that of the most commonly reported methods (the Bayesian alphabet), even when little external biological information is used. (Part of the results were presented at the PAG XXV conference; more results are yet to be published.)
Jicai Jiang developed the software.
If you have any questions or bug reports, please send an email to Jicai Jiang at jicai_jiang@ncsu.edu
Jiang, J., O'Connell, J.R., VanRaden, P.M., Ma, L. 2017. A Fast and Flexible Method for Improving Genomic Prediction with Biological Information. Plant and Animal Genome Conference Proceedings. San Diego, CA, Jan. 14-18, P0493.
Last update: 22 November 2020