Intro Gene Level - ccsstudentmentors/tutorials GitHub Wiki
So you have chosen to work with your data at the gene level rather than the isoform level? Great choice! Your results will be far more straightforward to interpret, and validating them will be much less difficult than if you had ventured down to the isoform level. (Although we appreciate that sometimes you simply have to go down the isoform analysis path.)
There are many tools to perform Gene-Level analyses of RNA-Seq data, but I will just cover two of the most popular options here: edgeR and DESeq2.
Here is a basic overview of what Gene-Level analysis entails:
- Sort your BAM file of aligned reads (we will use samtools sort)
- Count how many reads were aligned to each gene in your genomic annotation (we will use HTSeq-count)
- Calculate the differentially expressed genes for your dataset using either edgeR or DESeq2 (or any of the many other programs that I won't discuss)
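To make the first two steps concrete, here is a minimal Python sketch of how they might be chained together. Treat it as an illustration rather than the exact commands used in the later sections. The file names, the helper function names, the choice to sort by read name, and the unstranded setting (`-s no`) are all assumptions, so adjust them to your own data and library prep. The third step (edgeR or DESeq2) runs in R and is not shown.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_sort_command(bam_in: str, bam_out: str) -> list[str]:
    """Build a samtools command that sorts a BAM file by read name."""
    # -n sorts by read name (an assumption here; position sorting is also possible).
    # -o names the output file (samtools 1.3 or later).
    return ["samtools", "sort", "-n", "-o", bam_out, bam_in]


def build_count_command(sorted_bam: str, gtf: str) -> list[str]:
    """Build an htseq-count command that counts reads per gene."""
    # -f bam  : the input is a BAM file
    # -r name : the input is sorted by read name (matches the sort above)
    # -s no   : assume an unstranded library; change this for stranded protocols
    return ["htseq-count", "-f", "bam", "-r", "name", "-s", "no", sorted_bam, gtf]


def parse_counts(text: str) -> dict[str, int]:
    """Parse htseq-count output (gene_id <TAB> count) into a dictionary."""
    counts: dict[str, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        gene_id, count = line.split("\t")
        # Lines such as __no_feature and __ambiguous are summary counters, not genes.
        if gene_id.startswith("__"):
            continue
        counts[gene_id] = int(count)
    return counts


def run_pipeline(bam_in: str, gtf: str, workdir: str = ".") -> dict[str, int]:
    """Sort the BAM file, count reads per gene, and return the counts."""
    # "sample.sorted.bam" is a hypothetical output name.
    sorted_bam = str(Path(workdir) / "sample.sorted.bam")
    subprocess.run(build_sort_command(bam_in, sorted_bam), check=True)
    # htseq-count writes the count table to standard output.
    result = subprocess.run(
        build_count_command(sorted_bam, gtf),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_counts(result.stdout)
```

Calling `run_pipeline("sample.bam", "annotation.gtf")` (both file names hypothetical) requires samtools and HTSeq to be installed and on your PATH. The resulting gene-to-count dictionary, one per sample, is the kind of table you would then assemble into a count matrix for edgeR or DESeq2.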
So, get started by heading to the section on Sorting Aligned Reads.