5. G:Profiler - bcb420-2022/Yuzi_Li GitHub Wiki
Objective
Set up my student wiki and repo for BCB420 and get familiar with Docker.
Duration
- Expected duration: 1h
- Actual duration: 1h
Progress
Tasks
- Run G:Profiler on designated set of genes
- Answer questions
Running G:Profiler on genes
- Parameters:
- Selecting Ensembl IDs for genes that map to multiple Ensembl IDs:
Answering questions
What is the top term returned in each data source?
- GO: biological process: immune system process
- Wiki pathways: Allograft Rejection
- Reactome: Immune System
How many genes are in each of the above genesets returned? (Hint: in the Detailed results tab of the g:profiler results, click on the arrows next to the stats heading to see the number of genes in a term, the number of genes in your query, and the number of genes in your query that are also in your term.)
- Immune system process (GO: biological process): 2020 genes in term
- Allograft Rejection (Wiki pathways): 88 genes in term
- Immune System (Reactome): 2041 genes in term
How many genes from our query are found in the above genesets?
- Immune system process (GO: biological process): 409 genes from query
- Allograft Rejection (Wiki pathways): 287 genes from query
- Immune System (Reactome): 334 genes from query
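The three numbers from the stats columns (term size, query size, and their overlap), together with the size of the background gene set, are the inputs to a one-sided hypergeometric over-representation test, which is the kind of test g:Profiler applies before multiple-testing correction. Below is a minimal sketch of that calculation. The function name and the toy numbers are hypothetical; the query size and background size for this run are not recorded on this page, so the real values are not plugged in here.

```python
from math import comb


def enrichment_p_value(overlap, term_size, query_size, background_size):
    """One-sided hypergeometric tail probability P(X >= overlap).

    X is the number of term genes seen when drawing `query_size` genes
    without replacement from a background of `background_size` genes,
    of which `term_size` are annotated to the term.
    """
    max_overlap = min(term_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, query_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical, not taken from the g:Profiler run above):
# 20 background genes, 5 in the term, 5 in the query, 4 in both.
p = enrichment_p_value(overlap=4, term_size=5, query_size=5, background_size=20)
print(p)
```

This gives the uncorrected p-value only; g:Profiler additionally applies a multiple-testing correction, so its reported values will differ.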
Change g:profiler settings so that you limit the size of the returned genesets. Make sure the returned genesets are between 5 and 200 genes in size. Did that change the results?
- The top terms became more specific for GO and Reactome: positive regulation of leukocyte cell-cell adhesion (GO: biological process) and Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell (Reactome). The top term is unchanged for Wiki pathways.
- The number of genes in the top term decreased for Reactome and GO, but remains the same for Wiki pathways.
- The number of query genes found in the geneset is the same.
- Decreasing the maximum term size makes the top terms more specific, and each contains fewer genes.
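The effect of the 5 to 200 size limit can be illustrated with a small filter over the three original top terms and the term sizes reported above. This is only a sketch: the record layout (`source`, `name`, `term_size`) and the function name are assumptions made for illustration, and it filters an already-returned list rather than re-running g:Profiler.

```python
# Top terms and term sizes as reported in the answers above.
top_terms = [
    {"source": "GO:BP", "name": "immune system process", "term_size": 2020},
    {"source": "WP", "name": "Allograft Rejection", "term_size": 88},
    {"source": "REAC", "name": "Immune System", "term_size": 2041},
]


def filter_by_term_size(rows, min_size=5, max_size=200):
    """Keep only terms whose size lies within [min_size, max_size]."""
    return [row for row in rows if min_size <= row["term_size"] <= max_size]


kept = filter_by_term_size(top_terms)
print([row["name"] for row in kept])  # ['Allograft Rejection']
```

Only the Wiki pathways term survives the filter, which matches the observation that its top term did not change while the GO and Reactome top terms were replaced by smaller, more specific ones.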
Which of the 4 ovarian cancer expression subtypes do you think this list represents?
- This list most likely represents the immunoreactive subtype, because the top enriched terms in every data source are immune-related (immune system process, Allograft Rejection, Immune System).
Bonus: The top gene returned for this comparison is TFEC (ensembl gene id: ENSG00000105967). Is it found annotated in any of the pathways returned by g:profiler for our query? What terms is it associated with in g:profiler?
- TFEC is not annotated in any returned pathway.
- TFEC is not associated with any term.
Conclusions and Outlook
- G:Profiler is a useful tool for gene set enrichment analysis.
- We can find more specific pathways by reducing the maximum term size, and more general pathways by increasing it.