Lab 02: QC - ryandkuster/EPP_575_RNA_25 GitHub Wiki

Software

FastQC Website

MultiQC Website

Quality assessment of read files

Symbolic links

These read files are large, and if every student in the workshop copied them, we would quickly use up the storage available on our system. Rather than copying the files to your directory, create symbolic links to them.

Navigate back to /lustre/isaac24/proj/UTK0386/analysis/<your_username>.

Within this directory, create a sub-directory to hold the first step of our analysis:

cd $RNA
mkdir 02_fastqc
cd 02_fastqc

Now, run the command:

ln -s /lustre/isaac24/proj/UTK0386/data/raw/*fastq.gz .

This creates a symbolic link to each fastq file. Rather than duplicating the data, a symbolic link is a small file that points to the original file. Use ls -lh to see your current directory contents.
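To see how a link behaves without touching the course data, here is a minimal sketch that works entirely in a temporary directory. The directory layout and the file name sample_1.fastq.gz are made up for illustration, and the "read file" is just 1 MB of zeros.

```shell
# Sandbox demo: every path here is hypothetical and lives in a temp directory.
demo=$(mktemp -d)
mkdir "$demo/raw" "$demo/work"

# Stand-in for a raw read file (1 MB of zeros instead of real reads).
dd if=/dev/zero of="$demo/raw/sample_1.fastq.gz" bs=1024 count=1024 2>/dev/null

# Link rather than copy, using the same pattern as in the lab.
cd "$demo/work"
ln -s "$demo/raw/"*fastq.gz .

# ls -l marks the link with an arrow pointing at its target.
ls -lh

# readlink prints the path stored inside the link.
target=$(readlink sample_1.fastq.gz)
echo "link points to: $target"

# Removing the link leaves the original file untouched.
rm sample_1.fastq.gz
ls -lh "$demo/raw"
```

In the ls -lh listing, the link itself is only a few bytes in size, while the file it points to keeps its full size.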

FastQC

On ISAAC-NG, load FastQC with the module command:

module load fastqc/0.11.9

Test that fastqc loaded properly for you. What message pops up if you just run fastqc? How about fastqc -h?
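One more way to confirm that a tool is available is to ask the shell where it found it. The helper below is a small sketch, and check_tool is a name invented here, not part of FastQC or the module system.

```shell
# Report whether a command is on PATH; returns non-zero if it is missing.
# check_tool is a hypothetical helper written for this lab.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; did you run 'module load'?" >&2
        return 1
    fi
}

# Only ask for the version if the program is actually available.
if check_tool fastqc; then
    fastqc --version
fi
```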

To run fastqc on your data, run the following:

mkdir fastqc_output
fastqc -o fastqc_output -t 2 Col_0h_rep1_1.fastq.gz Col_0h_rep1_2.fastq.gz

For each input file, FastQC writes an HTML report (and a .zip archive) into fastqc_output. The HTML report cannot be viewed in the terminal, so you will need to copy it to your personal computer, either with Open OnDemand or with the scp command run from your own device, and open it in a web browser. (Hint: you may need to use pwd to find the path to your file.)
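Before copying anything, it can help to confirm that the reports were written. The sketch below derives the report name FastQC uses for each input (the input name with .fastq.gz removed and _fastqc.html added) and checks for it. expected_report is a helper name made up for this example, and it assumes inputs ending in .fastq.gz as in this lab.

```shell
# Derive the HTML report name for a given input file.
# Assumes the input ends in .fastq.gz, as the files in this lab do.
expected_report() {
    base=$(basename "$1" .fastq.gz)
    echo "${base}_fastqc.html"
}

outdir=fastqc_output
for fq in Col_0h_rep1_1.fastq.gz Col_0h_rep1_2.fastq.gz; do
    report="$outdir/$(expected_report "$fq")"
    if [ -f "$report" ]; then
        echo "OK       $report"
    else
        echo "MISSING  $report"
    fi
done
```

Run it from the directory that contains fastqc_output; any line marked MISSING means FastQC did not produce a report for that file.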

Viewing server files

There are two common ways to access files on the server: Open OnDemand and scp. For now we'll use Open OnDemand. Navigate to Open OnDemand again and click on Files, then Home Directory.

Now you can navigate to the directory where you have the fastqc output (use pwd in your terminal to see where you are). Click on Change Directory and enter this path. It should be /lustre/isaac24/proj/UTK0386/analysis/<your_username>/02_fastqc

Now download the file to your computer.

Copy files using scp

Run this command from a terminal on your own computer, not on the server:

scp -r <your_username>@dtn2.isaac.utk.edu:/lustre/isaac24/proj/UTK0386/analysis/<your_username>/02_fastqc/fastqc_output/*html .

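⚠️ If you're on a Mac, you may receive a no matches found error. This happens because your local shell (zsh) tries to expand the wildcard itself and finds no matching files on your computer. Put an escape character (\ backslash) before the wildcard (* asterisk) so that the pattern is passed to the server unchanged.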
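Quoting the remote path is another way to keep the local shell from touching the wildcard. The sketch below only prints the command (a dry run) so you can inspect it first. The username is a placeholder, and the variable names are made up for this example.

```shell
# Placeholder username; replace with your own.
user="your_username"
remote_dir="/lustre/isaac24/proj/UTK0386/analysis/${user}/02_fastqc/fastqc_output"

# Inside double quotes the asterisk is not expanded locally,
# so the pattern stays intact in the remote specification.
remote_spec="${user}@dtn2.isaac.utk.edu:${remote_dir}/*html"

# Print the command instead of running it (dry run).
printf 'scp "%s" .\n' "$remote_spec"
```

Once the printed command looks right, copy it into a terminal on your own computer to run it.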

Challenge

We have performed quality assessment on a pair of read files for sample Col_0h_rep1. Repeat this for another pair of files.
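If you later want to process many samples, typing each pair by hand gets tedious. The loop below is one possible sketch, not the required solution to the challenge. It assumes paired files are named with _1.fastq.gz and _2.fastq.gz suffixes, as the example files are, and it only prints the fastqc commands so you can review them first. mate_of is a helper name invented here.

```shell
# Given a forward-read file name, print the matching reverse-read name.
# Assumes the _1.fastq.gz / _2.fastq.gz naming used in this lab.
mate_of() {
    echo "${1%_1.fastq.gz}_2.fastq.gz"
}

for r1 in *_1.fastq.gz; do
    # If nothing matches, the pattern is left as literal text; skip it.
    [ -e "$r1" ] || continue
    r2=$(mate_of "$r1")
    if [ -e "$r2" ]; then
        # Dry run: print the command rather than executing it.
        echo "fastqc -o fastqc_output -t 2 $r1 $r2"
    else
        echo "no mate found for $r1" >&2
    fi
done
```

Removing the echo in front of fastqc would run the commands instead of printing them.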

MultiQC

Once you have FastQC output for both pairs of read files, we can run MultiQC to aggregate our results. Load it with the following commands:

module load anaconda3/2024.06
conda activate /lustre/isaac24/proj/UTK0386/conda/multiqc

Note

Using conda for the first time may require you to run conda init, followed by source ~/.bashrc to use it.

In the same directory you ran FastQC, run the following command:

multiqc ./fastqc_output

What is the importance of the . in this command?

Once it has finished running, you will have a file in your 02_fastqc directory named multiqc_report.html. This is the default file name for every run of MultiQC. If time permits, download the report to your computer with Open OnDemand or scp and open it.
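Because every run uses the same default name, it can be useful to pick a name yourself. The sketch below uses MultiQC's -n (report file name) and -o (output directory) options. The report name and the multiqc_output directory are made up for illustration, and the command only runs if multiqc is available.

```shell
# Hypothetical report name and output directory for illustration.
report_name="lab02_fastqc_multiqc"
report_dir="multiqc_output"

if command -v multiqc >/dev/null 2>&1; then
    # -n sets the report file name, -o the directory it is written to.
    multiqc ./fastqc_output -n "$report_name" -o "$report_dir"
else
    echo "multiqc not on PATH; activate the conda environment first"
fi
```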

To deactivate the multiqc conda environment, use:

conda deactivate