Quality Control - undiagnosed/metagenomics GitHub Wiki
Analysis
FastQC allows you to quickly assess the quality of the sequencing data you're working with.
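A minimal sketch of a FastQC run, assuming the same read file names used in the trimming examples on this page. The wrapper name `run_fastqc` and the `fastqc_reports` directory are illustrative, not part of FastQC. FastQC expects the output directory to exist already, so the wrapper creates it first.

```shell
# Run FastQC on one or more FASTQ files and collect the reports in one directory.
# run_fastqc is a hypothetical helper; only the fastqc call itself is FastQC's CLI.
run_fastqc() {
    outdir="$1"
    shift
    # FastQC does not create the output directory, so make it beforehand.
    mkdir -p "$outdir"
    fastqc --outdir "$outdir" "$@"
}

# Example call (assumed file names):
# run_fastqc fastqc_reports R1.fastq.gz R2.fastq.gz
```

Each input file should produce an HTML report and a zip archive in the output directory. Running the same check again on the trimmed files makes it easier to see what trimming changed.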
Trimming tools
Trimmomatic
There are many different trimming tools available, all with different strengths and weaknesses. The following example demonstrates usage of the Trimmomatic trimmer for paired-end reads. One nice feature of Trimmomatic is that it can work directly with compressed paired-end reads.
java -jar trimmomatic-0.36.jar PE R1.fastq.gz R2.fastq.gz R1_trimmed_paired.fastq.gz R1_trimmed_unpaired.fastq.gz R2_trimmed_paired.fastq.gz R2_trimmed_unpaired.fastq.gz LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:50
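One quick sanity check after a paired-end run is to count the reads in each output file: the two paired files should contain the same number of reads. The sketch below is one way to do that. `count_reads` is a hypothetical helper, and it assumes standard four-line FASTQ records (no wrapped sequence lines).

```shell
# Count reads in a gzip-compressed FASTQ file.
# Assumes every record is exactly four lines long.
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Example calls (file names taken from the Trimmomatic command above):
# count_reads R1_trimmed_paired.fastq.gz
# count_reads R2_trimmed_paired.fastq.gz
```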
Prinseq
Prinseq is a popular tool with more filtering options than Trimmomatic. One downside is that it does not work with compressed paired-end FASTQ files. The files must first be decompressed, which requires more storage space. The following example demonstrates usage.
prinseq-lite.pl -fastq R1.fastq -fastq2 R2.fastq -min_qual_mean 20 -ns_max_p 20 -derep 1 -trim_qual_right 20 -min_len 30 -out_format 3 -out_good filtered_data -out_bad discarded_data
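The decompression step that Prinseq needs beforehand can be sketched as follows, assuming the gzipped inputs are named `R1.fastq.gz` and `R2.fastq.gz`. `decompress_fastq` is a hypothetical helper. It keeps the compressed originals, so the uncompressed copies can be deleted once filtering is done.

```shell
# Write an uncompressed copy of a gzipped FASTQ file next to the original.
# The .gz file is left in place.
decompress_fastq() {
    gzip -dc "$1" > "${1%.gz}"
}

# Assumed input names; files that are not present are skipped.
for gz in R1.fastq.gz R2.fastq.gz; do
    if [ -f "$gz" ]; then
        decompress_fastq "$gz"
    fi
done
```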