Journal Entry ‐ G:Profiler - bcb420-2024/Anna_Lai GitHub Wiki
G:Profiler
Date March 05, 2024
Steps
- Go to g:Profiler and enter the given gene set. Parameters used:
- Data sources: Reactome, GO Biological Process, and WikiPathways
- Significance threshold: Benjamini-Hochberg FDR
- Observe the result and answer the questions.
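The same query can also be submitted programmatically. The following is a minimal sketch, not the method used for this entry (the web interface was used). It assumes the public g:GOSt REST endpoint, the human organism code `hsapiens`, and the default threshold of 0.05; the function names `build_gost_payload` and `run_gost` are hypothetical.

```python
import json
import urllib.request

# Public g:GOSt endpoint (assumption: unchanged since the version cited below).
GOST_URL = "https://biit.cs.ut.ee/gprofiler/api/gost/profile/"


def build_gost_payload(genes, organism="hsapiens"):
    """Build a request body that mirrors the parameters listed in the steps."""
    return {
        "organism": organism,                    # assumption: human gene list
        "query": list(genes),
        "sources": ["GO:BP", "REAC", "WP"],      # GO BP, Reactome, WikiPathways
        "significance_threshold_method": "fdr",  # Benjamini-Hochberg FDR
        "user_threshold": 0.05,                  # assumption: default cutoff
    }


def run_gost(genes):
    """POST the query to g:Profiler and return the enriched terms (needs network)."""
    body = json.dumps(build_gost_payload(genes)).encode("utf-8")
    request = urllib.request.Request(
        GOST_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]
```

Calling `run_gost(gene_list)` with the assignment's gene list should return one dictionary per enriched term, although the exact terms may differ with the g:Profiler data version.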
Answers
What is the top term returned in each data source?
- GO Biological Process returned immune system process as the top term.
- Reactome returned Immune System as the top term.
- WikiPathways returned TYROBP causal network in microglia as the top term.
How many genes are in each of the above genesets returned?
- GO Biological Process: 287 of the 425 query genes found in this data source are in the top term.
- Reactome: 218 of the 332 query genes found in this data source are in the top term.
- WikiPathways: 30 of the 294 query genes found in this data source are in the top term.
How many genes from our query are found in the above genesets?
- GO Biological Process: There are 21031 genes in the data source in total.
- Reactome: There are 10842 genes in the data source in total.
- WikiPathways: There are 8286 genes in the data source in total.
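The counts in the two answers above are easier to keep apart when tied to the fields of a g:Profiler result row. The sketch below assumes that the reported numbers correspond to the `intersection_size`, `query_size`, and `effective_domain_size` fields; that mapping is my reading of the output, and `describe_term` is a hypothetical helper.

```python
def describe_term(row):
    """Summarise how one enriched term overlaps the query."""
    return {
        "term": row["name"],
        "query_genes_in_term": row["intersection_size"],
        "query_genes_recognised": row["query_size"],
        "genes_in_term": row.get("term_size"),  # not recorded in this entry
        "genes_in_source": row["effective_domain_size"],
        "fraction_of_query": row["intersection_size"] / row["query_size"],
    }


# GO Biological Process top term, using the counts recorded in this entry.
go_bp_top = {
    "name": "immune system process",
    "intersection_size": 287,
    "query_size": 425,
    "effective_domain_size": 21031,
}
summary = describe_term(go_bp_top)
print(summary["fraction_of_query"])
```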
Change g:profiler settings so that you limit the size of the returned genesets. Make sure the returned genesets are between 5 and 200 genes in size. Did that change the results?
After limiting the term size to between 5 and 200 genes:
- GO Biological Process: The top term is now antigen processing and presentation.
- Reactome: The top term is now Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.
- WikiPathways: The top term is still TYROBP causal network in microglia.
The query contains 485 genes, so it seems that no single data source recognizes all of the input genes.
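The term size limit can also be applied to downloaded results. This is a minimal sketch that assumes each result row is a dictionary with the `source`, `name`, `term_size`, and `p_value` fields; `top_terms` is a hypothetical helper, the example rows are made up, and filtering after the fact may not match the web interface exactly.

```python
def top_terms(rows, min_size=5, max_size=200):
    """Return the most significant term per source within the size limits."""
    best = {}
    for row in rows:
        # Skip terms that are too small or too broad.
        if not (min_size <= row["term_size"] <= max_size):
            continue
        current = best.get(row["source"])
        if current is None or row["p_value"] < current["p_value"]:
            best[row["source"]] = row
    return {source: row["name"] for source, row in best.items()}


# Made-up rows for illustration only, not real g:Profiler output.
toy_rows = [
    {"source": "GO:BP", "name": "broad term", "term_size": 2500, "p_value": 1e-40},
    {"source": "GO:BP", "name": "narrow term", "term_size": 120, "p_value": 1e-12},
    {"source": "WP", "name": "small pathway", "term_size": 60, "p_value": 1e-20},
]
print(top_terms(toy_rows))
```

With the toy rows, the broad term is dropped even though it has the smallest p-value, which mirrors how the top GO term changed once the limit was applied.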
Which of the 4 ovarian cancer expression subtypes do you think this list represents?
Tothill et al. describe four subtypes: (1) an immunoreactive expression subtype associated with infiltration of immune cells, (2) a low stromal expression subtype with high levels of circulating CA125, (3) a poor prognosis subtype displaying strong stromal response, correlating with extensive desmoplasia, and (4) a mesenchymal subtype with high expression of N/P-cadherins.
I believe this set of genes represents the first type, the immunoreactive expression subtype of ovarian cancer.
Bonus: The top gene returned for this comparison is TFEC (Ensembl gene ID: ENSG00000105967). Is it found annotated in any of the pathways returned by g:Profiler for our query? What terms is it associated with in g:Profiler?
Yes, TFEC was found in our query. In GO Biological Process, it is associated with the GO terms immune system process, response to stimulus, response to stress, cellular response to stimulus, positive regulation of biological process, and regulation of leukocyte mediated cytotoxicity. No annotated term was found for TFEC in the other two data sources.
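The lookup for a single gene can be sketched in code as well. This assumes each result row carries an `intersections` list of the query genes annotated to that term, which I believe is what the `gprofiler-official` Python package returns when evidence codes are requested; `terms_for_gene` is a hypothetical helper, and `GENE_A` and `GENE_B` are placeholder names.

```python
def terms_for_gene(rows, gene):
    """Group term names by source for terms whose query overlap includes `gene`."""
    found = {}
    for row in rows:
        if gene in row.get("intersections", []):
            found.setdefault(row["source"], []).append(row["name"])
    return found


# Toy rows for illustration; the gene lists are placeholders, not real output.
toy_rows = [
    {"source": "GO:BP", "name": "immune system process",
     "intersections": ["TFEC", "GENE_A"]},
    {"source": "GO:BP", "name": "response to stress",
     "intersections": ["TFEC", "GENE_B"]},
    {"source": "REAC", "name": "Immune System",
     "intersections": ["GENE_A", "GENE_B"]},
]
print(terms_for_gene(toy_rows, "TFEC"))
```

A source that is missing from the returned dictionary has no enriched term containing the gene, which is how the absence of TFEC from Reactome and WikiPathways would show up.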
References and citations (if any)
g:Profiler. (Version e111_eg58_p18_30541362). Retrieved from https://biit.cs.ut.ee/gprofiler/gost.
Tothill RW, Tinker AV, George J, Brown R, Fox SB, Lade S, Johnson DS, Trivett MK, Etemadmoghadam D, Locandro B, Traficante N, Fereday S, Hung JA, Chiew YE, Haviv I; Australian Ovarian Cancer Study Group; Gertig D, DeFazio A, Bowtell DD. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome. Clin Cancer Res. 2008 Aug 15;14(16):5198-208. doi: 10.1158/1078-0432.CCR-08-0196. PMID: 18698038.