GSA-SNP2 is the successor of GSA-SNP (Nam et al. 2010, NAR web server issue). It accepts human GWAS summary data (rs numbers and p-values) or gene-wise p-values (for example, obtained from VEGAS or GATES) and outputs pathway gene sets ‘enriched’ with genes associated with the given phenotype.
Nucleic Acids Research article: "Efficient pathway enrichment and network analysis of GWAS summary data using GSA-SNP2"
Sourceforge site: https://sourceforge.net/projects/gsasnp2
Two features of GSA-SNP2:
A) Gene scores are ‘adjusted’ for the number of SNPs assigned to each gene using a monotone cubic spline trend curve.
B) Adjacent genes with high inter-gene correlations within each pathway are removed.
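Feature A can be illustrated with a minimal Python sketch. It is not the GSA-SNP2 implementation. It assumes that the raw gene score is -log10 of the best SNP p-value in the gene, that the trend curve is a monotonicity-preserving cubic Hermite interpolant (harmonic-mean tangents) through a few knots, and that the adjusted score is the raw score minus the trend value at the gene's SNP count. The knots, gene names and function names are hypothetical.

```python
from bisect import bisect_right

def monotone_cubic(xs, ys):
    """Monotonicity-preserving cubic Hermite interpolant through (xs, ys).

    xs must be strictly increasing and contain at least two knots.
    Tangents are harmonic means of neighbouring secant slopes, which
    keeps the curve monotone wherever the knots are monotone.
    """
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]           # knot spacing
    d = [(ys[i + 1] - ys[i]) / h[i] for i in range(n - 1)]  # secant slopes
    m = [d[0]]
    for i in range(1, n - 1):
        if d[i - 1] * d[i] <= 0:
            m.append(0.0)                                   # flat at local extrema
        else:
            m.append(2 * d[i - 1] * d[i] / (d[i - 1] + d[i]))
    m.append(d[-1])

    def curve(x):
        x = min(max(x, xs[0]), xs[-1])                      # clamp outside the knots
        i = min(bisect_right(xs, x) - 1, n - 2)             # segment index
        t = (x - xs[i]) / h[i]
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return h00 * ys[i] + h10 * h[i] * m[i] + h01 * ys[i + 1] + h11 * h[i] * m[i + 1]

    return curve

def adjust_scores(raw_scores, snp_counts, trend):
    """Adjusted score = raw score minus the trend value at the gene's SNP count."""
    return {g: raw_scores[g] - trend(snp_counts[g]) for g in raw_scores}

# Hypothetical knots: typical raw score at 1, 10 and 100 SNPs per gene.
trend = monotone_cubic([1, 10, 100], [1.0, 2.0, 2.5])

# Toy genes (assumed values): same raw score, different SNP counts.
raw = {"GENE_A": 3.0, "GENE_B": 3.0, "GENE_C": 3.0}
counts = {"GENE_A": 1, "GENE_B": 10, "GENE_C": 100}
adjusted = adjust_scores(raw, counts, trend)
print(adjusted)
```

In this toy setting a gene with many SNPs is penalised more than a gene with few SNPs for the same raw score, which is the intent of the adjustment.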
Why competitive pathway analysis for GWAS data?
Competitive methods directly target pathway-level aberrations by testing the ‘enrichment’ of associated genes within each pathway set. Self-contained methods, in contrast, test the ‘existence’ of associated gene(s) therein. Self-contained methods are generally highly sensitive and are therefore useful for finding novel pathways. However, genes typically have multiple functions, and the mere existence of associated gene(s) does not necessarily imply a pathway-level aberration. The two approaches are therefore both useful and complementary to each other. Unfortunately, many competitive methods for GWAS data have so far exhibited low power and have been sensitive to the choice of free parameters.
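The difference between the two kinds of test can be shown with a toy Python sketch. It makes simplifying assumptions that are not taken from GSA-SNP2. The competitive test is a z-test of the mean pathway gene score against all genes, and the self-contained test is a Bonferroni-corrected minimum gene p-value. All gene p-values are made up for illustration.

```python
import math
from statistics import mean, pstdev

def competitive_z(pathway_scores, all_scores):
    """Competitive test: is the mean pathway score higher than the genome-wide mean?"""
    k = len(pathway_scores)
    z = (mean(pathway_scores) - mean(all_scores)) / (pstdev(all_scores) / math.sqrt(k))
    p = 0.5 * math.erfc(z / math.sqrt(2))   # upper-tail normal p-value
    return z, p

def self_contained_minp(pathway_pvalues):
    """Self-contained test: is at least one gene in the pathway associated?"""
    return min(1.0, len(pathway_pvalues) * min(pathway_pvalues))

def score(p):
    return -math.log10(p)

# Toy background: 200 genes with evenly spread (null-like) p-values.
background = [(i + 0.5) / 200 for i in range(200)]

# Pathway A: one associated gene among nine unassociated ones.
pathway_a = [1e-4] + [0.7] * 9
# Pathway B: all ten genes moderately associated.
pathway_b = [1e-3] * 10

results = {}
for name, pvals in (("A", pathway_a), ("B", pathway_b)):
    all_scores = [score(p) for p in background + pvals]
    _, comp_p = competitive_z([score(p) for p in pvals], all_scores)
    results[name] = (self_contained_minp(pvals), comp_p)
    print(name, "self-contained p =", results[name][0], "competitive p =", comp_p)
```

Under these assumptions the self-contained test flags both pathways, while the competitive test flags only pathway B, where the association is spread over the whole gene set.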
Comparison with other competitive methods
The performance of GSA-SNP2 was compared with that of five existing competitive methods (GSA-SNP, iGSEA4GWAS, MAGENTA, INRICH and GOWINDA) and one self-contained method (sARTP). GSA-SNP2 was slightly more liberal in false positive control than the others, but exhibited high power and the best ability to discriminate the gold standard positive pathways.
A) Comparison of type I error control: twenty null genotype data sets were generated using 1000 Genomes (European) data and the GWAsimulator tool, and the corresponding p-values were input to each program. GSA-SNP2 exhibited greatly improved type I error control compared to GSA-SNP.
B) Power comparison: DIAGRAM consortium GWAS p-values (European) were used to compare statistical power. 16 curated T2D-related pathways (Morris et al. Nat. Genetics 2012) as well as pathway terms containing ‘diabetes’ were regarded as gold standard (GS) positive pathways. GSA-SNP2 exhibited high power and the best ranks of GS pathways. See the results here.
GSA-SNP2 also provides global networks of associated genes. Core sub-networks in each of the 122 significant pathways (FDR<=0.25) of the DIAGRAM data are aggregated into a global network (STRING data, adjusted gene p-value<=0.001). This global structure cannot be represented by any single pathway (at most five associated genes are included in a single pathway). PPARG and TNF appear as hub proteins.
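The aggregation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the GSA-SNP2 code. Interactions are given as gene pairs, a core sub-network is the set of interactions among pathway genes that pass the gene p-value cutoff, and the global network is the union of all core sub-networks. The gene names, p-values, pathways and function names are hypothetical toy data.

```python
from collections import Counter

def core_subnetwork(pathway_genes, gene_pvalues, interactions, cutoff=0.001):
    """Interactions among pathway genes whose adjusted p-value passes the cutoff."""
    keep = {g for g in pathway_genes if gene_pvalues.get(g, 1.0) <= cutoff}
    return {frozenset((a, b)) for a, b in interactions
            if a != b and a in keep and b in keep}

def global_network(pathways, gene_pvalues, interactions, cutoff=0.001):
    """Union of the core sub-networks of all (significant) pathways."""
    edges = set()
    for genes in pathways.values():
        edges |= core_subnetwork(genes, gene_pvalues, interactions, cutoff)
    return edges

def degrees(edges):
    """Number of interaction partners per gene in the network."""
    deg = Counter()
    for edge in edges:
        for gene in edge:
            deg[gene] += 1
    return deg

# Toy inputs (all values made up for illustration).
gene_pvalues = {"G1": 1e-5, "G2": 5e-4, "G3": 2e-4, "G4": 0.2, "G5": 1e-4}
interactions = [("G1", "G2"), ("G1", "G3"), ("G1", "G5"), ("G2", "G4"), ("G3", "G5")]
pathways = {
    "P1": {"G1", "G2", "G4"},
    "P2": {"G1", "G3", "G5"},
    "P3": {"G1", "G2", "G5"},
}

network = global_network(pathways, gene_pvalues, interactions)
deg = degrees(network)
hub, hub_degree = deg.most_common(1)[0]
print(len(network), "edges; hub:", hub, "with degree", hub_degree)
```

In the toy data no single pathway contains all edges of the aggregated network, and the gene shared by several pathways emerges as the hub.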
Contact: hainct@unist.ac.kr (Dr. HCT Nguyen), dougnam@unist.ac.kr (Dr. Dougu Nam)
Any feedback or comments are greatly appreciated!!
January-2017