Journal Entry ‐ GSEA - bcb420-2024/Anna_Lai GitHub Wiki

GSEA Analysis

Date: March 19, 2024

Steps

  1. Download the file MesenvsImmuno_RNASeq_ranks.rnk from GitHub.
  2. Open the official GSEA website and download the software for analysis. The version downloaded was GSEA_Win_4.3.3.
  3. Open the software, go to Run GSEAPreranked, expand the basic fields, and set the max size to 200 and the min size to 15.
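The same settings could also be written down as a command-line argument list, which makes the run easier to repeat. This is only a sketch, not what was done in this entry (the GUI was used). The flag names (`-rnk`, `-gmx`, `-set_max`, `-set_min`, `-nperm`, `-rpt_label`, `-out`) are the ones I believe the GSEA command-line tool accepts and should be checked against the GSEA documentation. The gene set file name `genesets.gmt`, the output folder and the report label are hypothetical placeholders. The collapse mode and chip platform are left out.

```python
def build_preranked_args(rnk_file, gmt_file, out_dir, label="MesenvsImmuno"):
    """Return a GSEAPreranked argument list using the sizes from the steps above.

    The flag names are assumptions based on the GSEA command-line tool;
    verify them against the official documentation before use.
    """
    return [
        "GSEAPreranked",
        "-rnk", str(rnk_file),    # ranked gene list
        "-gmx", str(gmt_file),    # gene set file (hypothetical name in the demo)
        "-set_max", "200",        # max gene set size used in this entry
        "-set_min", "15",         # min gene set size used in this entry
        "-nperm", "1000",         # default number of permutations
        "-rpt_label", str(label), # hypothetical report label
        "-out", str(out_dir),     # hypothetical output folder
    ]


args = build_preranked_args(
    "MesenvsImmuno_RNASeq_ranks.rnk", "genesets.gmt", "gsea_out"
)
print(" ".join(args))
```

The list would be passed to the launcher script that ships with GSEA; the exact launcher name depends on the platform and is not shown here.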

Note: the Bader Lab gene set file was recently updated.

Parameters used:

Errors encountered

After pruning, none of the gene sets passed size thresholds.
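To make sense of this message, it helps to sketch what the pruning step does: each gene set is first restricted to the genes present in the ranked list, and only then checked against the min and max size. If the identifiers in the rank file do not match those in the gene set file, every set shrinks to zero genes and nothing passes. The code below is a minimal illustration under that assumption, not GSEA's actual implementation, and a mismatch is only one possible cause of the error. All gene and set names are made up.

```python
def prune_gene_sets(gene_sets, ranked_genes, min_size=15, max_size=200):
    """Keep gene sets whose overlap with the ranked list is within the size limits."""
    ranked = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        present = set(genes) & ranked  # genes that actually appear in the rank file
        if min_size <= len(present) <= max_size:
            kept[name] = present
    return kept


# Hypothetical toy data with small thresholds so the effect is visible.
gene_sets = {
    "SET_A": ["G1", "G2", "G3"],
    "SET_B": ["G1"],
    "SET_C": ["X1", "X2", "X3"],
}

matching_ids = ["G1", "G2", "G3", "G4"]
print(prune_gene_sets(gene_sets, matching_ids, min_size=2, max_size=3))

# Identifiers of a different type: no overlap, so no gene set survives.
mismatched_ids = ["ENSG1", "ENSG2", "ENSG3"]
print(prune_gene_sets(gene_sets, mismatched_ids, min_size=2, max_size=3))
```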

<Error Details>

---- Full Error Message ----
Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only m ...

---- Stack Trace ----
# of exceptions: 1
------Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only mode.------
xtools.api.param.BadParamException: Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only mode.
	at org.gsea_msigdb.gsea/edu.mit.broad.genome.alg.DatasetGenerators.collapse(DatasetGenerators.java:194)
	at org.gsea_msigdb.gsea/xtools.gsea.GseaPreranked.getRankedList(GseaPreranked.java:175)
	at org.gsea_msigdb.gsea/xtools.gsea.GseaPreranked.execute(GseaPreranked.java:97)
	at org.gsea_msigdb.gsea/edu.mit.broad.xbench.tui.TaskManager$ToolRunnable.run(TaskManager.java:391)
	at java.base/java.lang.Thread.run(Unknown Source)
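The error above says that more than one row in the rank file maps to the same symbol (here RRM2). A quick way to check a rank list for this before loading it into GSEA is sketched below. The rule for resolving duplicates (keep the row with the largest absolute score) is my own assumption for illustration, not something prescribed by GSEA or recorded in this entry, and the scores in the example are made up.

```python
from collections import Counter


def find_duplicate_symbols(rows):
    """Return the symbols that appear on more than one row, sorted by name."""
    counts = Counter(symbol for symbol, _ in rows)
    return sorted(symbol for symbol, n in counts.items() if n > 1)


def keep_strongest(rows):
    """Keep one row per symbol: the one with the largest absolute score (assumed rule)."""
    best = {}
    for symbol, score in rows:
        if symbol not in best or abs(score) > abs(best[symbol]):
            best[symbol] = score
    # Return rows sorted from highest to lowest score, as in a .rnk file.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows of (gene symbol, rank score).
rows = [("RRM2", 2.5), ("AEBP1", 33.19159), ("RRM2", -1.0)]

print(find_duplicate_symbols(rows))  # ['RRM2']
print(keep_strongest(rows))          # [('AEBP1', 33.19159), ('RRM2', 2.5)]
```

In practice the rows would be read from the tab-separated .rnk file; whether duplicates should be dropped, averaged or kept by strongest score depends on how the ranks were generated.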

Final Result

PDF Report

Questions and Answers

  1. Explain the reasons for using each of the above parameters. The max gene set size was set to 200 to limit the results to a reasonable range for inspection. The number of permutations was left at the default of 1000. The chip platform was chosen because the gene name was used as the identifier in the rank file.
  2. What is the top gene set returned for the Mesenchymal subtype?
  • What are its p-value, ES, NES and FDR? Enrichment Score (ES): 0.8634064, Normalized Enrichment Score (NES): 2.5887244
  • How many genes are in its leading edge? 3296 / 5931 gene sets
  • What is the top gene associated with this gene set? The top two gene sets are: HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION, EXTRACELLULAR MATRIX ORGANIZATION (GOBP)
  • 1500 gene sets are significant at FDR < 25%
  • 862 gene sets are significantly enriched at nominal pvalue < 1%
  • 1204 gene sets are significantly enriched at nominal pvalue < 5%
  3. What is the top gene set returned for the Immunoreactive subtype?
  • What are its p-value, ES, NES and FDR?
  • How many genes are in its leading edge? 2635 / 5931 gene sets
  • What is the top gene associated with this gene set? The top two gene sets are: HALLMARK_INTERFERON_ALPHA_RESPONSE, HALLMARK_INTERFERON_GAMMA_RESPONSE
  • 1475 gene sets are significantly enriched at FDR < 25%
  • 811 gene sets are significantly enriched at nominal pvalue < 1%
  • 1173 gene sets are significantly enriched at nominal pvalue < 5%

| NAME | TITLE | SCORE |
| --- | --- | --- |
| IGDCC3 | immunoglobulin superfamily DCC subclass member 3 [Source:HGNC Symbol;Acc:HGNC:9700] | 36.329580 |
| ANTXR1 | ANTXR cell adhesion molecule 1 [Source:HGNC Symbol;Acc:HGNC:21014] | 35.479492 |
| AEBP1 | AE binding protein 1 [Source:HGNC Symbol;Acc:HGNC:303] | 33.191590 |
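The significance counts in the answers above (FDR < 25%, nominal p-value < 1% and < 5%) are read directly from the GSEA report. As a cross-check, they could be recomputed from a tab-separated report table roughly as follows. This is a sketch that assumes the table has columns named `NOM p-val` and `FDR q-val`, which is how I remember GSEA labelling them; the rows in the example are invented and are not results from this analysis.

```python
import csv
import io


def count_significant(report_text, fdr_cutoff=0.25, p_cutoffs=(0.01, 0.05)):
    """Count gene sets below the FDR cutoff and below each nominal p-value cutoff."""
    rows = list(csv.DictReader(io.StringIO(report_text), delimiter="\t"))
    counts = {"fdr": sum(float(r["FDR q-val"]) < fdr_cutoff for r in rows)}
    for cutoff in p_cutoffs:
        counts[f"p<{cutoff}"] = sum(float(r["NOM p-val"]) < cutoff for r in rows)
    return counts


# Hypothetical three-row report, for illustration only.
example_report = (
    "NAME\tNES\tNOM p-val\tFDR q-val\n"
    "SET_A\t2.1\t0.000\t0.001\n"
    "SET_B\t1.4\t0.030\t0.200\n"
    "SET_C\t0.9\t0.400\t0.600\n"
)

print(count_significant(example_report))
# {'fdr': 2, 'p<0.01': 1, 'p<0.05': 2}
```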

References (if any)

https://baderlab.github.io/Cytoscape_workflows/EnrichmentMapPipeline/Protocol2_createEM.html
https://baderlab.github.io/CBW_Pathways_2023/gsea_mod3.html