Journal Entry Assignment #1: Data Set Selection and Initial Processing

itsSabbir edited this page Apr 15, 2022 · 2 revisions

Objective

The objective of this assignment is to learn how to choose datasets from GEO2R for analysis and processing, as one would in the field of bioinformatics. After selecting an expression set, we retrieve it, map it, normalize it, and finally interpret it, based on what we learned in class and throughout the assignment.
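To make the normalization step concrete, here is a minimal sketch in Python (the assignment itself was done in R). It assumes counts-per-million (CPM) scaling, which is only one common choice and not necessarily the method used in this assignment; the gene names and counts are hypothetical toy values.

```python
def counts_per_million(counts):
    """Scale raw counts so every sample sums to one million.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size is the total count in each sample (a column sum).
    lib_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        gene: [row[i] * 1e6 / lib_sizes[i] for i in range(n_samples)]
        for gene, row in counts.items()
    }


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
raw = {"GENE_A": [500, 1000], "GENE_B": [499_500, 999_000]}
cpm = counts_per_million(raw)
print(cpm["GENE_A"])
```

After scaling, GENE_A has the same value in both samples, which shows that the twofold difference in the raw counts came from sequencing depth rather than from expression.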

Time est.: 5 h
Time used: 15 h
Date started: 2022/02/11
Date completed: 2022/03/01

Progress & Notes

I chose the dataset GSE193417 for this assignment. The accompanying paper is titled "Lower Levels of GABAergic Function Markers in Corticotropin-Releasing Hormone-Expressing Neurons in the sgACC of Human Subjects With Depression" (DOI: 10.3389/fpsyt.2022.827972).

I chose this dataset manually by browsing GEO for a topic that interested me. Alternatively, I could have used the R code provided in lecture 3 to search for a suitable dataset based on the parameters I wanted, but I decided to try the manual route first so that I could fully appreciate what automation offers. I will, however, include a rough script adapted from the one provided in the lecture to show that I can interpret, understand, and modify the code.
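As a rough illustration of what searching by parameters means, the sketch below filters a handful of metadata records in Python. It is not the lecture's R script and does not query GEO. The records, IDs, field names, and the `find_candidates` helper are all hypothetical.

```python
from datetime import date

# Hypothetical records standing in for GEO series metadata (IDs are made up).
SERIES = [
    {"id": "EXAMPLE_001", "title": "RNA-seq of cortex in depression",
     "organism": "Homo sapiens", "submitted": date(2021, 5, 3), "n_samples": 24},
    {"id": "EXAMPLE_002", "title": "Microarray of mouse liver",
     "organism": "Mus musculus", "submitted": date(2020, 1, 15), "n_samples": 8},
    {"id": "EXAMPLE_003", "title": "RNA-seq of depression blood samples",
     "organism": "Homo sapiens", "submitted": date(2012, 9, 30), "n_samples": 40},
    {"id": "EXAMPLE_004", "title": "RNA-seq pilot, depression",
     "organism": "Homo sapiens", "submitted": date(2022, 1, 10), "n_samples": 4},
]


def find_candidates(series, organism, keyword, earliest, min_samples):
    """Return the IDs of series that satisfy every search parameter."""
    return [
        s["id"] for s in series
        if s["organism"] == organism
        and keyword.lower() in s["title"].lower()
        and s["submitted"] >= earliest
        and s["n_samples"] >= min_samples
    ]


hits = find_candidates(SERIES, "Homo sapiens", "depression", date(2017, 1, 1), 6)
print(hits)  # ['EXAMPLE_001']
```

Each condition removes one kind of unsuitable dataset: the wrong organism, an off-topic title, an old submission, or too few samples.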

Throughout this assignment there were a few problems that took me a long time to solve. Most of my methods were adapted from the lectures, but I turned to a number of online resources when I could not find a method that worked for my dataset. References to all of these materials can be found at the end of this journal page, as well as in the R Notebook for this assignment and its associated HTML file.

By far the most trouble came from getting my dataset to cooperate. Because it was provided as an Excel file rather than a txt file, it was already in table form, so producing the final product was fairly straightforward once I could read it in. It took some time, but I worked out how to access the values I needed and did some troubleshooting for cases where the files were downloaded improperly or could not be found.
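The download problems described above can be guarded against with a small check before and after fetching. The sketch below is in Python rather than the R used for the assignment. The `fetch_once` helper and its `download` parameter are hypothetical names, and the URL in the usage comment is a placeholder.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_once(url, dest, download=urlretrieve):
    """Download url to dest unless a non-empty copy already exists.

    Returns (path, downloaded), where downloaded is False when the
    existing file was reused.
    """
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return dest, False  # reuse the earlier download
    dest.parent.mkdir(parents=True, exist_ok=True)
    download(url, str(dest))
    # Fail loudly instead of continuing with a missing or empty file.
    if not dest.exists() or dest.stat().st_size == 0:
        raise FileNotFoundError(f"no usable file at {dest} after fetching {url}")
    return dest, True


# Usage (placeholder URL, not run here):
# path, fresh = fetch_once("https://example.org/supplementary.xlsx", "data/supplementary.xlsx")
```

Checking the file size catches the case where a failed download leaves an empty file behind, which would otherwise be mistaken for a valid cached copy on the next run.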

Activities & Tasks

The assignment tasks and requirements can be found here

Conclusion, Outlook, & Discussion

Both the paper and the assignment were really interesting. I still have a lot to learn about subsetting in R and how to use it efficiently.

References

R programming for data science: https://bookdown.org/rdpeng/rprogdatascience/data-analysis-case-study-changes-in-fine-particle-air-pollution-in-the-u-s-.html

Lecture modules: https://q.utoronto.ca/courses/248455/modules

Oh, Hyunjung, Newton, Dwight, Lewis, David, & Sibille, Etienne. (2022). Lower Levels of GABAergic Function Markers in Corticotropin-Releasing Hormone-Expressing Neurons in the sgACC of Human Subjects With Depression. Frontiers in Psychiatry, 13. DOI: 10.3389/fpsyt.2022.827972.