PreAlignment QC - Bioinformatics-Institute/transcriptomics_WBC GitHub Wiki

RNA-seq Flowchart - Module 2

1-iv. Pre-Alignment QC

You can use FastQC to get a sense of your data quality before alignment:

Make an output directory, run FastQC on a fastq file, and view the outputs in Firefox:

    cd $RNAWORKING
    mkdir fastqc
    fastqc $RNARAW/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz --outdir fastqc/
    firefox fastqc/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1_fastqc.html
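
If no browser is available (for example on a remote server), the same pass/warn/fail verdicts can be read from the `summary.txt` file inside the `_fastqc.zip` archive that FastQC writes next to the HTML report. The following is a minimal sketch, assuming the command above was run from `$RNAWORKING`; the helper name `fastqc_flagged` is hypothetical and not part of FastQC:

```shell
# Hypothetical helper: read a FastQC summary.txt on stdin (tab-separated:
# status, module name, input file) and print every module that is not PASS.
fastqc_flagged() {
    awk -F'\t' '$1 != "PASS" { print $1 ": " $2 }'
}

# Archive written by the fastqc command above (relative to $RNAWORKING).
report=fastqc/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1_fastqc.zip

# Stream summary.txt out of the archive without unpacking it to disk.
if [ -f "$report" ]; then
    unzip -p "$report" '*/summary.txt' | fastqc_flagged
fi
```

Modules listed as WARN or FAIL are the ones worth inspecting in the full HTML report.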

Another useful QC program is SolexaQA++ (which can also do various trimming steps).

To see each program's usage and list of options on the command line, enter:

    # SolexaQA++ and subprograms help
    SolexaQA++
    SolexaQA++ analysis
    SolexaQA++ dynamictrim
    SolexaQA++ lengthsort

    # FastQC help
    fastqc --help

Make an output directory, run SolexaQA++ on a fastq file, and view some PDF outputs in a file browser window:

    mkdir solexaqa
    SolexaQA++ analysis $RNARAW/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz -d solexaqa
    gnome-open solexaqa
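
Both runs above process a single file. To QC every fastq file in a directory in one pass, a loop along the following lines could work. This is a sketch only: `fastqc_all` is a hypothetical helper, and it assumes the input files all end in `.fastq.gz`:

```shell
# Hypothetical helper (not part of FastQC): run FastQC on every
# compressed fastq file in a directory.
# Usage: fastqc_all INPUT_DIR OUTPUT_DIR
fastqc_all() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"               # make sure the output directory exists
    for fq in "$in_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue      # skip the literal pattern if nothing matched
        fastqc "$fq" --outdir "$out_dir"
    done
}
```

For example, `fastqc_all "$RNARAW" "$RNAWORKING/fastqc"` would write one report per input file into the `fastqc` directory created earlier.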

PRACTICAL EXERCISE 3

Assignment: Run FastQC or SolexaQA++ on one of the additional fastq files.

  • Hint: Remember that this data is stored as read-only compressed files in $RNARAW.
  • Hint: Both FastQC and SolexaQA++ can run on compressed (.gz) files.
  • Hint: To get help on using these programs, try
    SolexaQA++ help
    fastqc --help
  • Hint: You can also simply run:
    fastqc

This opens the GUI (graphical user interface) version of FastQC, where you can select files, run the analysis, and view the report.
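
Because the files in $RNARAW are read-only and compressed, it can help to inspect them by streaming rather than unpacking a copy. The sketch below assumes the standard four-lines-per-record fastq layout; both function names are hypothetical helpers, not part of FastQC or SolexaQA++:

```shell
# Count the reads in a gzip-compressed fastq file without writing a
# decompressed copy (assumes four lines per record).
fastq_read_count() {
    gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Print the first record: header, sequence, separator, and quality string.
fastq_first_record() {
    gzip -cd "$1" | head -n 4
}
```

For example, `fastq_read_count $RNARAW/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz` prints the number of reads in the file used earlier, which should agree with the total sequence count FastQC reports for it.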

