Entry 14: G:Profiler Assignment - bcb420-2025/Izumi_Ando GitHub Wiki
🧰 Tool : https://biit.cs.ut.ee/gprofiler/gost
👀 First 3 questions are answered in the table.
1. What is the top term returned in each data source?
2. How many genes are in each of the above genesets returned? (hint, in the Detailed results tab of g:profiler results if you click on the arrows next to the stats heading you will be able to see the number of genes in a term, number of genes in your query and number of genes in your query that are also in your term)
3. How many genes from our query are found in the above genesets?
| | GO:BP | Reactome | Wiki Pathways |
|---|---|---|---|
| Term | immune response (GO:0006955) | immune system (REAC:R-HSA-168256) | allograft rejection (WP:WP2328) |
| Number of genes in geneset | 2008 | 2079 | 88 |
| Number of genes from query in geneset | 258 | 222 | 32 |
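As a rough way to compare the three top terms, the counts in the table can be turned into the fraction of each geneset that is covered by the query. This is only a minimal sketch that reuses the numbers above; the names `top_terms` and `coverage` are illustrative and are not part of g:Profiler.

```python
# Counts copied from the table above:
# term -> (genes in geneset, genes from query in geneset)
top_terms = {
    "immune response (GO:0006955)": (2008, 258),
    "immune system (REAC:R-HSA-168256)": (2079, 222),
    "allograft rejection (WP:WP2328)": (88, 32),
}


def coverage(term_size, overlap):
    """Fraction of the geneset's genes that also appear in the query."""
    return overlap / term_size


for name, (term_size, overlap) in top_terms.items():
    print(f"{name}: {coverage(term_size, overlap):.1%}")
```

The much smaller Wiki Pathways term has the largest share of its genes in the query (about 36%, versus roughly 13% for GO:BP and 11% for Reactome).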
4. Change g:profiler settings so that you limit the size of the returned genesets. Make sure the returned genesets are between 5 and 200 genes in size. Did that change the results?
Yes, it changed the results. With the size limit, the top hits became more specific terms. The first screenshot below shows the top hits without a geneset size limit, and the second shows the results with genesets limited to 5-200 genes. The top hits for the GO:BP and Reactome data sources differed from the original results, but the Wiki Pathways top hit stayed the same.
Figure 1: top hits for search without geneset size restrictions.
Figure 2: top hits for search with 5-200 geneset size restriction
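The effect of the size limit on the original top terms can be sketched offline with the geneset sizes from the table in this entry. This is an assumption-laden illustration, not how g:Profiler applies the setting internally; `term_sizes` and `within_size_limits` are hypothetical names.

```python
# Geneset sizes copied from the table in this entry.
term_sizes = {
    "immune response (GO:0006955)": 2008,
    "immune system (REAC:R-HSA-168256)": 2079,
    "allograft rejection (WP:WP2328)": 88,
}

# Size limits from question 4 (inclusive bounds are an assumption).
MIN_SIZE, MAX_SIZE = 5, 200


def within_size_limits(size, min_size=MIN_SIZE, max_size=MAX_SIZE):
    """True if a geneset of this size passes the size filter."""
    return min_size <= size <= max_size


kept = [name for name, size in term_sizes.items() if within_size_limits(size)]
print(kept)
```

Only the Wiki Pathways term passes the filter, which is consistent with the GO:BP and Reactome top hits changing while the Wiki Pathways top hit stayed the same.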
5. Which of the 4 ovarian cancer expression subtypes do you think this list represents?
The four ovarian cancer expression subtypes are differentiated (D), immunoreactive (I), mesenchymal (M), and proliferative (P). After scanning the top results for keywords related to each subtype, I think this list represents the immunoreactive subtype: multiple hits include words such as "immun~", "T cell", and "cytokine".
Source: The Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011). https://doi.org/10.1038/nature10166
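The keyword scan described in this answer was done by eye, but it could be approximated with a simple substring match. The sketch below uses the three keywords quoted in the answer and, as sample input, only the three top term names from the table earlier in this entry; `matching_terms` is a hypothetical helper.

```python
# Keywords quoted in the answer for the immunoreactive subtype (lowercased).
IMMUNE_KEYWORDS = ("immun", "t cell", "cytokine")


def matching_terms(term_names, keywords):
    """Return the term names that contain at least one keyword (case-insensitive)."""
    return [
        name
        for name in term_names
        if any(keyword in name.lower() for keyword in keywords)
    ]


# Sample input: top term names from the table in this entry.
top_term_names = ["immune response", "immune system", "allograft rejection"]

hits = matching_terms(top_term_names, IMMUNE_KEYWORDS)
print(f"{len(hits)} of {len(top_term_names)} terms match: {hits}")
```

Substring matching is crude: "t cell" would also match "mast cell", for example, so the matches still need a manual check.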
6. Bonus: The top gene returned for this comparison is TFEC (ensembl gene id:ENSG00000105967). Is it found annotated in any of the pathways returned by g:profiler for our query? What terms is it associated with in g:profiler?
On the first page of the results, there were 4 terms from GO:BP that included TFEC:
- response to stimulus (GO:0050896)
- response to stress (GO:0006950)
- cellular response to stimulus (GO:0051716)
- positive regulation of biological process (GO:0048518)
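If the genes overlapping each term were exported from the results, the same lookup could be done programmatically. The sketch below assumes a hypothetical mapping from term to the query genes annotated to it; the gene sets are placeholders that contain only TFEC for the four terms listed above, plus one made-up term without TFEC to show the negative case.

```python
# Hypothetical mapping: term -> query genes annotated to that term.
# Placeholder data, not real g:Profiler output.
term_to_query_genes = {
    "response to stimulus (GO:0050896)": {"TFEC"},
    "response to stress (GO:0006950)": {"TFEC"},
    "cellular response to stimulus (GO:0051716)": {"TFEC"},
    "positive regulation of biological process (GO:0048518)": {"TFEC"},
    "placeholder term without TFEC": {"PLACEHOLDER_GENE"},
}


def terms_containing(gene, term_to_genes):
    """Return the sorted names of all terms whose gene set includes the gene."""
    return sorted(name for name, genes in term_to_genes.items() if gene in genes)


tfec_terms = terms_containing("TFEC", term_to_query_genes)
for term in tfec_terms:
    print(term)
```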