Entry 17: GSEA Assignment - bcb420-2025/Izumi_Ando GitHub Wiki

⏰ Expected vs Actual time taken - 1 hr vs 2 hrs

For this assignment, I downloaded the GSEA v4.4.0 Java app for macOS (Apple silicon) and used it to run the GSEAPreranked analysis.

Parameter / Input	Selection
Ranked List	provided mesenchymal vs immuno rank
Geneset *	Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2025_symbol.gmt
Max Geneset Size	200
Min Geneset Size	15
Number of permutations	1000 (Default)
Collapse / Remap to symbols	No_Collapse
Enrichment Statistic	weighted (default)
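
I ran everything through the GUI, but for reproducibility the same settings could probably be passed to GSEA's command-line launcher. The sketch below only builds the argument list and does not run anything. The flag names are my recollection of the GSEA 4.x command-line options and should be checked against the launcher's own help output; the launcher path, rank file name, report label and output folder are placeholders, not the files I actually used.

```python
def build_gsea_preranked_args(gsea_cli, rnk_file, gmt_file, out_dir, label):
    """Build a GSEAPreranked argument list mirroring the parameter table.

    Flag names are assumed from the GSEA 4.x CLI; verify before use.
    """
    return [
        gsea_cli, "GSEAPreranked",
        "-rnk", rnk_file,               # ranked list
        "-gmx", gmt_file,               # gene set file
        "-collapse", "No_Collapse",     # ranked list already uses gene symbols
        "-nperm", "1000",               # number of permutations (default)
        "-scoring_scheme", "weighted",  # enrichment statistic (default)
        "-set_max", "200",              # max gene set size
        "-set_min", "15",               # min gene set size
        "-rpt_label", label,
        "-out", out_dir,
    ]


args = build_gsea_preranked_args(
    gsea_cli="gsea-cli.sh",                  # placeholder launcher path
    rnk_file="mesenchymal_vs_immuno.rnk",    # hypothetical file name
    gmt_file="Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2025_symbol.gmt",
    out_dir="gsea_output",                   # hypothetical output folder
    label="mesenchymal_vs_immuno",           # hypothetical report label
)
print(" ".join(args))
# To actually launch GSEA, something like subprocess.run(args, check=True)
# could be used once the paths point at real files.
```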

1. Explain the reasons for using each of the above parameters. (geneset, max & min geneset size, permutations)

As for the geneset, we selected the file with gene symbols so that its identifiers match the rank list, and the one without GO IEA (Inferred from Electronic Annotation) annotations, because the remaining annotations are human curated and therefore more reliable. I used the March 01, 2025 version because the Jan 04, 2025 version yielded no results.
As for the max gene set size, 200 was selected because we noted in the g:Profiler assignment that larger gene sets tended to be less specific.
As for the min gene set size, 15 was selected because smaller gene sets may be too specific, and including them may increase compute time.
As for number of permutations, the default value 1000 was selected. As it already took some time, I did not consider trying it with a larger number.
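
To illustrate what the size limits do, here is a minimal sketch of the filter. It assumes, as I understand GSEA's behaviour, that a gene set's size is counted only over genes that also appear in the ranked list. The gene set lines and gene names are made up, and the toy call uses limits of 3 and 5 instead of 15 and 200 so the example stays small.

```python
def parse_gmt(lines):
    """Parse GMT lines of the form: name <tab> description <tab> gene1 <tab> gene2 ..."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        name, genes = fields[0], fields[2:]
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


def filter_by_size(gene_sets, ranked_genes, min_size=15, max_size=200):
    """Keep gene sets whose overlap with the ranked list is within the size limits."""
    ranked = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        present = genes & ranked  # only genes found in the ranked list count
        if min_size <= len(present) <= max_size:
            kept[name] = present
    return kept


# Toy data (invented names).
toy_gmt = [
    "SET_SMALL\tdesc\tA\tB",
    "SET_OK\tdesc\tA\tB\tC\tD",
    "SET_BIG\tdesc\tA\tB\tC\tD\tE\tF\tG",
]
toy_ranked = ["A", "B", "C", "D", "E", "F", "G", "H"]

kept = filter_by_size(parse_gmt(toy_gmt), toy_ranked, min_size=3, max_size=5)
print(sorted(kept))  # ['SET_OK']
```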

The analysis took a few minutes to run, and the results were displayed in an HTML report.

Top Geneset	HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION%MSIGDBHALLMARK%HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
Associated pvalue	0.000
Associated ES	0.86
Associated NES	2.57
Associated FDR	0.000
Number of genes in leading edge (# core genes)	81
Top gene	FBN1

Top Geneset	HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_ALPHA_RESPONSE
Associated pvalue	0.000
Associated ES	-0.86
Associated NES	-2.92
Associated FDR	0.000
Number of genes in leading edge (# core genes)	58
Top gene	PROCR
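
To make the ES, NES and p-value rows more concrete, here is a toy sketch of a weighted running-sum enrichment score with gene set permutation, the permutation type that GSEAPreranked uses. It is a simplified illustration on made-up scores, not GSEA's actual implementation, and the function names are hypothetical. With 1000 permutations the smallest non-zero p-value is 0.001, so a reported 0.000 means that none of the permuted gene sets scored as extremely as the observed one.

```python
import random


def enrichment_score(ranked, gene_set, weight=1.0):
    """Weighted running-sum ES. `ranked` is a list of (gene, score), sorted descending.

    Assumes at least one hit with a non-zero score and at least one miss.
    """
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    hit_total = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    running, best = 0.0, 0.0
    for w, (g, _) in zip(hit_weights, ranked):
        if g in gene_set:
            running += w / hit_total   # step up, weighted by the rank score
        else:
            running -= 1.0 / n_miss    # step down by a constant amount
        if abs(running) > abs(best):
            best = running             # ES is the largest deviation from zero
    return best


def permutation_test(ranked, gene_set, n_perm=1000, seed=0):
    """Return (ES, NES, p) using random gene sets of the same size as the null."""
    rng = random.Random(seed)
    genes = [g for g, _ in ranked]
    observed = enrichment_score(ranked, gene_set)
    null = [
        enrichment_score(ranked, set(rng.sample(genes, len(gene_set))))
        for _ in range(n_perm)
    ]
    # Compare only against null scores with the same sign as the observed ES.
    same_sign = [x for x in null if (x >= 0) == (observed >= 0)]
    if not same_sign:
        return observed, float("nan"), float("nan")
    p = sum(1 for x in same_sign if abs(x) >= abs(observed)) / len(same_sign)
    nes = observed / (sum(abs(x) for x in same_sign) / len(same_sign))
    return observed, nes, p


# Toy ranked list: 100 invented genes with scores from 50 down to -49.
toy_ranked = [(f"g{i}", 50.0 - i) for i in range(100)]
toy_set = {f"g{i}" for i in range(10)}  # the 10 top-ranked genes

es, nes, p = permutation_test(toy_ranked, toy_set)
print(f"ES={es:.2f} NES={nes:.2f} p={p:.3f}")
```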

The number of genes in the leading edge is NOT listed under the "size" column in the results HTML; it is the number of genes labelled "Yes" under the core enrichment column on the individual geneset page (screenshot below).
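
The same count could also be obtained programmatically if the per-geneset table is available as a tab-separated file. The sketch below assumes such a table has a `CORE ENRICHMENT` column with `Yes`/`No` values, matching the column on the geneset page; the rows and the other column names here are invented for illustration.

```python
import csv
import io


def count_core_genes(tsv_text, column="CORE ENRICHMENT"):
    """Count rows whose core enrichment column is 'Yes' (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return sum(
        1 for row in reader
        if (row.get(column) or "").strip().lower() == "yes"
    )


# Invented example rows; a real table would have one row per gene in the set.
toy_table = (
    "SYMBOL\tRANK IN GENE LIST\tCORE ENRICHMENT\n"
    "GENE_A\t3\tYes\n"
    "GENE_B\t10\tYes\n"
    "GENE_C\t250\tNo\n"
)

print(count_core_genes(toy_table))  # 2
```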