170103 Limma - npslindstrom/DE-analysis GitHub Wiki

Today I have written an R-script for running limma for differential expression on a RNA-seq counts-table. I have followed the instructions as detailed in the limma users guide, which was very informative. I am doing this since Linnea has managed to do the alignments on the data.

The script is detailed below and is run using the source command in R. Some details in the script have to be changed before running it on the actual data since I wrote it to run on the data generated by tophat since I had some trouble accessing the counts aligned with STAR.

library(limma)

library(edgeR)

data_orig = read.table("/glob/nilsl/lncRNA_proj/tophat_counts/count_table.txt", header = TRUE)

data = data_orig

data = data[,-5:-18]

group <- factor(c(1,1,2,2))

design = model.matrix(~ group)

dge <- DGEList(counts = data)

dge <- calcNormFactors(dge)

logCPM <- cpm(dge, log=TRUE, prior.count=3)

fit <- lmFit(logCPM, design)

fit <- eBayes(fit, trend=TRUE)

topTable(fit, coef=ncol(design))