5A. Day 5 : Gene Expression Analysis - bioinfokushwaha/Livestock_Genomics GitHub Wiki

R basics for Gene expression analysis

Arithmetic with R

In its most basic form, R can be used as a simple calculator. Consider the following arithmetic operators:

  • Addition: +
  • Subtraction: -
  • Multiplication: *
  • Division: /
  • Exponentiation: ^
  • Modulo: %%
# An addition
5 + 5 
# A subtraction
5 - 5 
# A multiplication
3 * 5
# A division
(5 + 5) / 2 
# Exponentiation
2^5
# Modulo
28%%6

Variable assignment

A variable allows you to store a value (e.g. 4) or an object (e.g. a function description) in R. You can then later use this variable's name to easily access the value or the object that is stored within this variable. You can assign a value 4 to a variable my_var with the command

my_var <- 4
my_var

Variable Mathematics

Ex-1
my_apples <- 5
my_orange <- 6
my_apples+my_orange

Ex-2
my_apples <- 5 
my_oranges <- "six" 
my_fruit <- my_apples + my_oranges 
my_fruit

Basic data types in R

R works with numerous data types. Some of the most basic types to get started are:

Decimal values like 4.5 are called numerics. i.e. my_numeric <- 42.5
Natural numbers like 4 are called integers. Integers are also numerics. i.e. my_numeric <- 42
Boolean values (TRUE or FALSE) are called logical. my_logical <- FALSE 
Text (or string) values are called characters. my_character <- "universe"
Ex1 Declare variables of different types
my_numeric <- 42
my_character <- "universe"
my_logical <- FALSE 
# Check class of my_numeric
class(my_numeric)
# Check class of my_character
class(my_character)
# Check class of my_logical
class(my_logical)

Note: to check these kind of error use class(my_numeric)

Create a vector

Vectors are one-dimension arrays that can hold numeric data, character data, or logical data. In other words, a vector is a simple tool to store data. In R, you create a vector with the combine function c(). You place the vector elements separated by a comma between the parentheses.

numeric_vector <- c(1, 10, 49)
character_vector <- c("a", "b", "c")

First analysis in R being in Las Vegas

# Poker winnings from Monday to Friday
poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday
roulette_vector <- c(-24, -50, 100, -350, 10)

Each vector element refers to a day of the week but it is hard to tell which element belongs to which day. It would be nice if you could show that in the vector itself. You can give a name to the elements of a vector with the names() function. Have a look at this example:

Ex1
some_vector <- c("John Doe", "poker player")
names(some_vector) <- c("Name", "Profession")
some_vector

# Assign days as names of poker_vector
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

Gene expression

For gene expression analysis, we need to install some R libraries which are given below. Copy and paste on the R console

A) Install R (any version) and RStudio

RIntroduction.pdf

working with Rstudio

B) Install the following R packages using install.packages()

  • install.packages("BiocManager")

  • install.packages("tidyverse")

  • install.packages("stringr")

  • install.packages("ggplot2")

C) Install the following R packages using BiocManager

  • BiocManager::install("biomaRt")

  • BiocManager::install("limma")

  • BiocManager::install("edgeR")

  • BiocManager::install("clusterProfiler")

  • BiocManager::install("enrichplot")

  • BiocManager::install("EnrichmentBrowser")

  • BiocManager::install("fgsea")

Gene expression demo and tutorial

Gene Expression Analysis