Lab: Counting Reads - statonlab/EPP_575_RNASeq_Workshop Wiki

Counting Reads

Today we'll use the program HTSeq to count the number of reads mapped to individual genes in our reference genome.

Helpful links:

The general inputs for HTSeq are as follows:


HTSeq allows for highly customizable count approaches for many data types. Depending on the library, you may need to adjust these arguments. If in doubt, read the docs.


1. Load HTSeq using spack

spack load [email protected]%[email protected]

Make sure htseq-count is functional

htseq-count --help

2. Prepare a counts directory

Change to your analysis directory and run the following:

# go to your personal analysis directory

mkdir 4_featureCount
cd 4_featureCount

3. Run HTseq

htseq-count \
	--format=bam \
	--order=pos \
	--stranded=no \
	--type=gene \
	--idattr=ID \
	../3_Read_mapping/SRR17062759.Aligned.sortedByCoord.out.bam \
	/pickett_shared/teaching/EPP575_Jan2022/reference_genome/Athaliana_447_Araport11.gene_exons.gff3 \
	> SRR17062759.counts.txt \
	2> SRR17062759.out