Home - SabaLab/RNASeq_Scripts GitHub Wiki

Welcome to the RNASeq_Scripts wiki!

Here you will find documentation for the scripts used to process RNA-Seq data.

Most scripts assume the following directory structure:

/.../batch/  
           rawReads  
           trimmedReads  
               v1  
           cleanedReads  
               rn6.v1  
           alignedReads  
               rn6.v1  
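
As a convenience, the layout above could be created for a new batch with a short helper like the one below. This is only a sketch, not one of the scripts in this repository: the function name `make_batch_layout` is hypothetical, and the version folders (`v1`, `rn6.v1`) are copied from the example above and may differ for your batch.

```python
from pathlib import Path

# Subdirectories relative to the batch folder, copied from the layout above.
# The version labels (v1, rn6.v1) are the example values, not fixed names.
LAYOUT = [
    "rawReads",
    "trimmedReads/v1",
    "cleanedReads/rn6.v1",
    "alignedReads/rn6.v1",
]


def make_batch_layout(batch_dir):
    """Create the expected subdirectories under batch_dir and return their paths."""
    batch = Path(batch_dir)
    created = []
    for rel in LAYOUT:
        path = batch / rel
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_batch_layout("/path/to/batch")` on an existing batch folder leaves any folders that are already present untouched.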

TotalRNA

Outline TotalRNA Commands

  1. Trimming - Trim reads for quality and adapter sequence (optionally count raw reads) - cutadapt
  2. Cleaning - Align reads to rRNA and keep the unaligned reads, which removes rRNA - bowtie2
  3. Genome Alignment - Align either trimmed or cleaned reads to the genome, using either strain-specific genomes or a generic reference - hisat2
  4. RSEM - Align and quantitate a transcriptome - RSEM

SmallRNA

Outline SmallRNA Commands

  1. Small RNA Alignment - a series of steps that align reads to strain-specific smallRNA features, separated into miRNA, snoRNA, miscSmallRNA, and miscLargeRNA. The categories are aligned in that order, with each step aligning the reads left unaligned by the preceding step - bowtie2
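
The chaining of steps described above can be illustrated with a small planning function. It is a sketch only: it does not run bowtie2, and the function name `plan_cascade` and the `unaligned.<category>.fq.gz` file names are hypothetical, chosen just to show how each step's unaligned reads become the next step's input.

```python
# Feature categories in the order they are aligned, as listed above.
CATEGORIES = ["miRNA", "snoRNA", "miscSmallRNA", "miscLargeRNA"]


def plan_cascade(first_input, categories=CATEGORIES):
    """Return (category, input_file, unaligned_output) for each alignment step."""
    steps = []
    current = first_input
    for category in categories:
        # Hypothetical naming for the reads that did not align in this step.
        unaligned = f"unaligned.{category}.fq.gz"
        steps.append((category, current, unaligned))
        # The next category only sees reads that failed to align here.
        current = unaligned
    return steps


if __name__ == "__main__":
    for category, reads_in, reads_out in plan_cascade("trimmed.fq.gz"):
        print(f"{category}: align {reads_in}, write unaligned reads to {reads_out}")
```

Because each step consumes only the previous step's unaligned reads, a read is assigned to at most one category, and the ordering decides which category wins when a read could match more than one.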

Misc Scripts

  • Stranded bamToBigWig - convert bam files to stranded bigWig files
  • zip files - zip a folder of files with a specific suffix, running N files in parallel.
  • bamToFastQ - convert BAM to FASTQ while removing unpaired reads. Called during the cleaning step to generate unmapped.end1/2.fq.gz files for use in the rest of the pipeline and tools.
  • count zipped fastq - count lines in all zipped fastq files within a folder, leaving all the files zipped.
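
For reference, counting lines in zipped fastq files without unzipping them on disk might look like the sketch below. It is not the repository's script: the function names and the default `.fq.gz` suffix are assumptions, and the read count assumes standard four-line FASTQ records.

```python
import gzip
from pathlib import Path


def count_fastq_lines(path):
    """Count lines in a gzipped file by streaming it; the file stays zipped."""
    with gzip.open(path, "rb") as handle:
        return sum(1 for _ in handle)


def count_folder(folder, suffix=".fq.gz"):
    """Map each matching file name to (line_count, read_count).

    read_count assumes four lines per read (standard FASTQ layout).
    """
    counts = {}
    for path in sorted(Path(folder).glob(f"*{suffix}")):
        lines = count_fastq_lines(path)
        counts[path.name] = (lines, lines // 4)
    return counts
```

Streaming through `gzip.open` keeps memory use flat regardless of file size, since only one line is held at a time.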