i-GSEA4GWAS Home

Method: i-GSEA

The i-GSEA4GWAS web server implements i-GSEA (improved gene set enrichment analysis) to help researchers explore GWAS data efficiently. i-GSEA is an implementation and extension of the original GSEA for GWAS. Its key steps are the same as those of GSEA, with two differences: 1) i-GSEA uses SNP label permutation instead of phenotype label permutation, so that it can work directly with GWAS SNP P-values and correct for gene and gene set variation; 2) i-GSEA multiplies the enrichment score (ES) by a significance proportion ratio factor to obtain the significance proportion based enrichment score (SPES), as described in detail below.
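To illustrate what SNP label permutation means in practice, here is a minimal Python sketch. It assumes that a permutation shuffles the observed P-values among SNPs while the SNP-to-gene mapping stays fixed; the function name `permute_snp_labels` and the dictionary layout are hypothetical and are not taken from the i-GSEA4GWAS implementation.

```python
import random


def permute_snp_labels(snp_pvalues, rng):
    """Return a copy of snp_pvalues with the P-values shuffled among SNPs.

    snp_pvalues: dict mapping SNP id -> P-value (hypothetical layout).
    rng: a random.Random instance, passed in so runs are reproducible.
    The SNP ids (and therefore the SNP-to-gene mapping) are left unchanged;
    only the assignment of P-values to SNPs is randomized.
    """
    snps = list(snp_pvalues)
    pvals = list(snp_pvalues.values())
    rng.shuffle(pvals)
    return dict(zip(snps, pvals))


observed = {"rs1": 1e-6, "rs2": 1e-2, "rs3": 1e-4, "rs4": 0.1}
permuted = permute_snp_labels(observed, random.Random(0))
```

Because only SNP-level P-values are needed, a permutation of this kind can be run without access to individual genotype or phenotype data.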

Briefly, following the classical GSEA for gene expression studies [1] and GSEA for GWAS [2], each gene is represented by the maximum -log(P-value) or statistic of all the SNPs mapped to it (t). The N genes present in the GWAS are then ranked by decreasing t: t_(1), t_(2), ..., t_(i), ..., t_(N). For each given gene set S with set size N_s, the enrichment score ES(S), with parameter w = 1, is calculated as:

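$$ES(S) = \max_{1 \le j \le N} \left\{ \sum_{G_{j^*} \in S,\ j^* \le j} \frac{|t_{(j^*)}|^{w}}{N_R} - \sum_{G_{j^*} \notin S,\ j^* \le j} \frac{1}{N - N_s} \right\}, \qquad N_R = \sum_{G_{j^*} \in S} |t_{(j^*)}|^{w} \tag{1}$$

Here G_{j*} denotes the gene at rank j* in the ordered list.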
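The gene-level statistic and the running-sum enrichment score can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the server's code: the logarithm is taken as base 10, the running sum follows the weighted Kolmogorov-Smirnov-like form used in classical GSEA [1, 2], and the names `gene_statistics`, `enrichment_score` and `snp_to_genes` are hypothetical.

```python
import math


def gene_statistics(snp_pvalues, snp_to_genes):
    """Represent each gene by the maximum -log10(P) over its mapped SNPs.

    snp_pvalues: dict SNP id -> P-value.
    snp_to_genes: dict SNP id -> iterable of gene names (assumed mapping).
    """
    t = {}
    for snp, p in snp_pvalues.items():
        score = -math.log10(p)
        for gene in snp_to_genes.get(snp, ()):
            t[gene] = max(t.get(gene, 0.0), score)
    return t


def enrichment_score(t, gene_set, w=1.0):
    """Maximum of the running sum over genes ranked by decreasing t."""
    ranked = sorted(t, key=t.get, reverse=True)
    hits = [g for g in ranked if g in gene_set]
    n_miss = len(ranked) - len(hits)
    n_r = sum(abs(t[g]) ** w for g in hits)  # normalizer for the hit steps
    if not hits or n_miss == 0 or n_r == 0:
        raise ValueError("gene set must overlap the ranked genes only partially")
    running, best = 0.0, float("-inf")
    for g in ranked:
        if g in gene_set:
            running += abs(t[g]) ** w / n_r   # step up, weighted by t
        else:
            running -= 1.0 / n_miss           # step down, uniform penalty
        best = max(best, running)
    return best


snp_pvalues = {"rs1": 1e-6, "rs2": 1e-2, "rs3": 1e-4, "rs4": 0.1, "rs5": 0.5}
snp_to_genes = {"rs1": ["g1"], "rs2": ["g1"], "rs3": ["g2"],
                "rs4": ["g3"], "rs5": ["g4"]}
t = gene_statistics(snp_pvalues, snp_to_genes)
es_top = enrichment_score(t, {"g1", "g2"})
es_bottom = enrichment_score(t, {"g3", "g4"})
```

In this toy data set, a gene set made of the two top-ranked genes gets a much higher score than one made of the two bottom-ranked genes, which is the behavior the running sum is designed to capture.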

ES(S) emphasizes the cumulative significance of the top genes in S. A high ES(S) indicates that the association signal in S is concentrated at the top of the ranked gene list. The key step follows: genes mapped with at least one of the top 5% of all SNPs are considered significant, which defines a significance cutoff t₀. Instead of ES(S), a significance proportion based enrichment score, SPES(S), is used, expressed as:

$$SPES(S) = \frac{m / n}{M / N} \times ES(S) \tag{2}$$

where m is the number of significant genes (t > t₀) in gene set S, n is the number of all genes in gene set S, M is the number of genes with t > t₀ in the GWAS, and N is the number of all genes in the GWAS. SPES emphasizes the proportion of significant genes in gene set S, which avoids high scores driven by a very few genes with extremely high significance. The subsequent steps (SNP label permutation, normalization, and calculation of the gene set P-value and FDR) follow the classical GSEA for GWAS [2].
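The significance proportion ratio factor built from m, n, M and N can be sketched in Python as follows. This is a minimal illustration that assumes the cutoff t₀ is already known and that n counts only the genes of S present in the GWAS; the function name `spes` and its arguments are hypothetical.

```python
def spes(es, t, gene_set, t0):
    """Scale an enrichment score by the significance proportion ratio.

    es: enrichment score ES(S) computed beforehand.
    t: dict gene -> gene-level statistic for all N genes in the GWAS.
    gene_set: set of gene names in S.
    t0: significance cutoff on t (assumed to be given).
    """
    in_set = [g for g in t if g in gene_set]
    n = len(in_set)                              # genes of S in the GWAS
    m = sum(1 for g in in_set if t[g] > t0)      # significant genes in S
    N = len(t)                                   # all genes in the GWAS
    M = sum(1 for g in t if t[g] > t0)           # all significant genes
    if n == 0 or M == 0:
        raise ValueError("need at least one gene in S and one significant gene")
    return es * (m / n) / (M / N)


t = {"g1": 6.0, "g2": 4.0, "g3": 1.0, "g4": 0.3}
# Both genes in S are significant: ratio = (2/2) / (2/4) = 2
boosted = spes(0.8, t, {"g1", "g2"}, t0=3.0)
# One extreme gene plus one non-significant gene: ratio = (1/2) / (2/4) = 1
unchanged = spes(0.8, t, {"g1", "g3"}, t0=3.0)
```

The second call shows the intended effect: a gene set whose score rests on a single highly significant gene receives no boost, while a set in which every gene passes the cutoff is scaled up.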

As an application, we performed i-GSEA on the P-value data of the GWAS for host control of HIV-1 [3], which has a follow-up study containing the replication of top SNPs and a classical GSEA analysis with phenotype label permutation [4]. By searching the database of canonical pathways, i-GSEA identified 5 pathways, three of which were confirmed by our publication [4] and two of which are supported by references [5-7]; classical GSEA obtained only two pathways, both of which were confirmed by our publication. By searching the database of canonical pathways + GO terms, i-GSEA identified 4 pathways/GO terms, whereas classical GSEA yielded no findings. This result indicates that i-GSEA has improved sensitivity (Table 1).

Table 1. The significant pathways/GO terms (FDR < 0.05) obtained by GSEA and i-GSEA, respectively.

The pathways/GO terms in dark are confirmed by [4].
* The two pathways are supported by references [5-7].
¹ By searching canonical pathways.
² By searching the combination of canonical pathways and GO terms.

References
[1] Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, and Mesirov JP, Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. 2005. Proc Natl Acad Sci U S A 102 (43) 15545-15550.
[2] Wang K, Li M, and Bucan M, Pathway-Based Approaches for Analysis of Genomewide Association Studies. 2007. Am J Hum Genet 81 (6) 1278-1283.
[3] Fellay J, et al., A whole-genome association study of major determinants for host control of HIV-1. 2007. Science 317 (5840) 944-947. (Originally published in Science Express on 19 July 2007)
[4] Fellay J, et al., Common Genetic Variation and the Control of HIV-1 in Humans. 2009. PLoS Genet 5(12) e1000791.
[5] Brierley I and Dos Ramos FJ, Programmed ribosomal frameshifting in HIV-1 and the SARS-CoV. 2006. Virus Res 119 (1) 29-42.
[6] Manninen A and Saksela K, HIV-1 Nef interacts with inositol trisphosphate receptor to activate calcium signaling in T cells. 2002. J Exp Med 195 (8) 1023-1032.
[7] Mayne M et al., Release of calcium from inositol 1,4,5-trisphosphate receptor-regulated stores by HIV-1 Tat regulates TNF-alpha production in human macrophages. 2000. J Immunol 164 (12) 6538-6542.
