importQC.sh - juanravm/MicroSeqProfiler GitHub Wiki

Importing data and quality control (importQC.sh)

The first script to run in this pipeline workflow is importQC.sh, which simplifies raw data import and quality control (QC) in QIIME2. Because the script calls QIIME2 commands internally, you must activate the QIIME2 environment with the conda activate qiime2 command before using it. You can run this script as shown below:

bash /file_path/importQC.sh \
--input_path /directory_path/to/Rawfastq \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--format CasavaOneEightSingleLanePerSampleDirFmt \
--metadata_fp /file_path/metadata.tsv \
--trim_left 0 \
--trunc_length 250 \
--cores 6
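
Since the script depends on QIIME2 commands, a quick pre-flight check can catch an inactive environment before any data is processed. The following is only a minimal sketch and is not part of importQC.sh; the function name require_cmd is hypothetical.

```shell
# Hypothetical helper (not part of importQC.sh):
# succeed only if the given command can be found on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "Error: '$1' not found. Did you run 'conda activate qiime2'?" >&2
    return 1
}

# Typical use before launching the pipeline:
# require_cmd qiime || exit 1
```

If the check fails, activate the environment and try again before calling importQC.sh.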

You must specify the following input parameters:

--input_path - Path to the directory containing raw fastq.gz files
--type - Type of the input sequences; must be one of the QIIME2 importable types
--format - Format of the input sequences; must be one of the QIIME2 importable formats
--metadata_fp - Metadata file path. Its first column should be called "Sample-id" and match the fastq.gz files
--trim_left - Number of nucleotides removed from the start of each read
--trunc_length - Length at which reads are truncated
--cores - Number of processor cores available to run this program
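
The metadata requirement above can be checked before running the script. This is a minimal sketch that assumes the metadata file is tab-separated (as the .tsv extension suggests) and that the header is on its first line; the function name check_metadata_header is hypothetical and not part of importQC.sh.

```shell
# Hypothetical helper (not part of importQC.sh):
# succeed only if the first tab-separated column of the header
# line is exactly "Sample-id".
check_metadata_header() {
    # Take the first line, keep its first field, drop any carriage return.
    first_col=$(head -n 1 "$1" 2>/dev/null | cut -f1 | tr -d '\r')
    [ "$first_col" = "Sample-id" ]
}

# Typical use:
# check_metadata_header /file_path/metadata.tsv || echo "Unexpected first column" >&2
```

The check only inspects the header; it does not confirm that each sample ID corresponds to a fastq.gz file.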

This script returns:

QC-seq.qza - The unique sequences that passed QC, in QIIME2 artifact format
QC-table.qza - The counts of each unique sequence, in QIIME2 artifact format
QC-stats.qza - The QC statistics, in QIIME2 artifact format

Before running the script, change to your desired working directory in the terminal, because the output files are written to the current directory.
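
Because the outputs land in the current directory, it can be useful to confirm afterwards that all three files were written there. The sketch below is an illustration only; the function name missing_outputs and the results path in the comments are hypothetical.

```shell
# Hypothetical helper (not part of importQC.sh):
# print the name of each expected output file that is absent from
# a directory (defaults to the current directory).
missing_outputs() {
    dir="${1:-.}"
    for f in QC-seq.qza QC-table.qza QC-stats.qza; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Typical use: move to the output directory, run the script, then check.
# cd /directory_path/to/results        # hypothetical path
# bash /file_path/importQC.sh ...      # arguments as in the example command
# missing_outputs                      # prints nothing if all three files exist
```

Once the files are present, qiime tools peek QC-table.qza prints the type and format of an artifact, which is a quick way to confirm it was written correctly.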