http://software.broadinstitute.org/gsea/downloads.jsp
1. Download the command-line version
2. Download the test data
http://software.broadinstitute.org/gsea/datasets.jsp
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:25 c2.cp.kegg.v5.1.entrez.gmt
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:25 c2.cp.kegg.v5.1.orig.gmt
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:23 c2.cp.kegg.v5.1.symbols.gmt
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 15:52 gsea2-2.2.2.jar
-rw-rw-r--. 1 zhanghl zhanghl 735526 Jul 4 16:21 HG_U95Av2.chip
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:01 P53.cls
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:00 P53_collapsed_symbols.gct
-rw-rw-r--. 1 zhanghl zhanghl 5073 Jul 4 16:00 P53_hgu95av2.gct
3. Supporting files
Chip annotation: ftp://ftp.broadinstitute.org/pub/gsea/annotations/
Molecular Signatures Database (MSigDB): http://software.broadinstitute.org/gsea/downloads.jsp
4. Run
Syntax: java -cp full-path/gsea2.jar -Xmx512m gsea-tool parameters
java -cp /home/zhanghl/gsea/gsea2-2.2.2.jar \
  -Xmx1024m xtools.gsea.Gsea \
  -gmx c2.cp.kegg.v5.1.symbols.gmt \
  -res P53_hgu95av2.gct \
  -cls P53.cls \
  -chip ./HG_U95Av2.chip \
  -out result \
  -rpt_label p53
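When the same run has to be repeated for several gene set files, it can help to assemble the command from a script. The following is only a sketch: the helper name build_gsea_command and its argument names are made up for illustration, and it uses nothing beyond the flags that appear in the command above.

```python
import shlex


def build_gsea_command(jar, gmx, res, cls, chip, out, label, heap_mb=1024):
    """Assemble the java argument list for one GSEA command-line run.

    Hypothetical helper: it only mirrors the flags used in this post.
    """
    return [
        "java", "-cp", jar,
        f"-Xmx{heap_mb}m", "xtools.gsea.Gsea",
        "-gmx", gmx,          # gene set file
        "-res", res,          # expression dataset
        "-cls", cls,          # phenotype labels
        "-chip", chip,        # chip annotation
        "-out", out,          # output directory
        "-rpt_label", label,  # report label
    ]


cmd = build_gsea_command(
    jar="/home/zhanghl/gsea/gsea2-2.2.2.jar",
    gmx="c2.cp.kegg.v5.1.symbols.gmt",
    res="P53_hgu95av2.gct",
    cls="P53.cls",
    chip="./HG_U95Av2.chip",
    out="result",
    label="p53",
)

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

To actually launch the run, the list can be passed to subprocess.run(cmd, check=True); passing a list rather than one string avoids shell quoting problems with file paths.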
Code from the official website:
http://software.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html?_Preparing_Data_Files
Interpreting GSEA Results
Enrichment Score (ES)
The primary result of the gene set enrichment analysis is the enrichment score (ES), which reflects the degree to which a gene set is overrepresented at the top or bottom of a ranked list of genes. GSEA calculates the ES by walking down the ranked list of genes, increasing a running-sum statistic when a gene is in the gene set and decreasing it when it is not. The magnitude of the increment depends on the correlation of the gene with the phenotype. The ES is the maximum deviation from zero encountered in walking the list. A positive ES indicates gene set enrichment at the top of the ranked list; a negative ES indicates gene set enrichment at the bottom of the ranked list.
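The walk described above can be sketched in a few lines of Python. This is a toy illustration, not the GSEA implementation: it assumes the weighting from the published GSEA method (hit increments proportional to |metric|^p with p = 1, miss decrements of 1 / number of non-members), which the text above does not spell out, and the gene names and scores are invented. It also leaves out the permutation-based normalization and significance testing.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return (ES, running_sum) for one gene set.

    ranked_genes: gene names sorted by ranking metric, highest first.
    scores: ranking-metric values in the same order.
    p: weight exponent for hit increments (assumed default of 1).
    """
    members = set(gene_set)
    hits = [g in members for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    # Normalizer so that all hit increments together sum to 1.
    hit_norm = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("need weighted hits and at least one non-member")

    running, total = [], 0.0
    for s, h in zip(scores, hits):
        # Increase on a hit (weighted), decrease on a miss (uniform).
        total += abs(s) ** p / hit_norm if h else -1.0 / n_miss
        running.append(total)

    # ES is the running-sum value with the largest deviation from zero.
    es = max(running, key=abs)
    return es, running


genes = ["A", "B", "C", "D", "E", "F"]   # invented toy data
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top, _ = enrichment_score(genes, scores, {"A", "B"})
es_bottom, _ = enrichment_score(genes, scores, {"E", "F"})
print(es_top, es_bottom)
```

In this toy list the set {A, B} sits entirely at the top and gets a positive ES, while {E, F} sits at the bottom and gets a negative ES of the same magnitude.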
In the analysis results, the enrichment plot provides a graphical view of the enrichment score for a gene set:
● The top portion of the plot shows the running ES for the gene set as the analysis walks down the ranked list. The score at the peak of the plot (the score furthest from 0.0) is the ES for the gene set. Gene sets with a distinct peak at the beginning (such as the one shown here) or end of the ranked list are generally the most interesting.
● The middle portion of the plot shows where the members of the gene set appear in the ranked list of genes.
The leading edge subset of a gene set is the subset of members that contribute most to the ES. For a positive ES (such as the one shown here), the leading edge subset is the set of members that appear in the ranked list prior to the peak score. For a negative ES, it is the set of members that appear subsequent to the peak score.
● The bottom portion of the plot shows the value of the ranking metric as you move down the list of ranked genes. The ranking metric measures a gene’s correlation with a phenotype. The value of the ranking metric goes from positive to negative as you move down the ranked list. A positive value indicates correlation with the first phenotype and a negative value indicates correlation with the second phenotype. For continuous phenotypes (time series or gene of interest), a positive value indicates correlation with the phenotype profile and a negative value indicates no correlation or inverse correlation with the profile.
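The leading edge rule described in the list above can be written down directly once a running sum is available. The sketch below is only an illustration: the function name, the gene names, and the running-sum values are all made up, and the gene at the peak position is counted as part of the leading edge for a positive ES (an assumption about how "prior to the peak" is read).

```python
def leading_edge(ranked_genes, gene_set, running):
    """Return the gene set members that make up the leading edge.

    running: running-sum values, one per position in ranked_genes.
    """
    members = set(gene_set)
    # Position of the largest deviation from zero, i.e. where the ES occurs.
    peak = max(range(len(running)), key=lambda i: abs(running[i]))
    if running[peak] >= 0:
        window = ranked_genes[: peak + 1]   # up to and including the peak
    else:
        window = ranked_genes[peak:]        # from the peak to the end
    return [g for g in window if g in members]


genes = ["A", "B", "C", "D", "E", "F"]      # invented toy ranking

# Hypothetical running sums for a set enriched at the top of the list.
pos_running = [0.43, 0.71, 0.38, 0.05, 0.33, 0.0]
print(leading_edge(genes, {"A", "B", "E"}, pos_running))

# Hypothetical running sums for a set enriched at the bottom of the list.
neg_running = [-0.33, -0.05, -0.38, -0.71, -0.43, 0.0]
print(leading_edge(genes, {"B", "E", "F"}, neg_running))
```

In the first case only A and B fall before the peak, so E is a member of the set but not of its leading edge; in the second case the leading edge is E and F.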
Note: By default, the ranking metric is the signal-to-noise ratio. To have GSEA rank the genes based on a different metric, use the Metric for ranking genes parameter of the Run GSEA page. To have GSEA analyze a ranked list of genes that you have created, use the GSEAPreranked page.
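For orientation, the signal-to-noise ratio is commonly written as the difference of the two class means divided by the sum of the two class standard deviations. The sketch below ranks a few invented genes by that basic formula. It is an assumption-laden illustration: it uses the sample standard deviation and omits any adjustment GSEA itself applies to very small standard deviations, so the values need not match GSEA output exactly.

```python
from statistics import mean, stdev


def signal_to_noise(values_a, values_b):
    """Basic signal-to-noise ratio between two phenotype classes."""
    noise = stdev(values_a) + stdev(values_b)
    if noise == 0:
        raise ValueError("both classes have zero variance")
    return (mean(values_a) - mean(values_b)) / noise


# Invented expression values: gene -> (class A samples, class B samples).
expression = {
    "G1": ([10.0, 12.0, 11.0], [5.0, 6.0, 4.0]),
    "G2": ([5.0, 5.5, 4.5], [5.0, 6.0, 4.0]),
    "G3": ([2.0, 3.0, 1.0], [8.0, 9.0, 10.0]),
}

metric = {g: signal_to_noise(a, b) for g, (a, b) in expression.items()}

# Ranked list: most correlated with class A first, class B last.
ranked = sorted(metric, key=metric.get, reverse=True)
print(ranked)
```

A gene with a large positive value (G1 here) lands at the top of the ranked list, and one with a large negative value (G3) lands at the bottom, which is the ordering the enrichment walk then operates on.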