Normalisation - NBISweden/workshop-genome_assembly GitHub Wiki

BBNorm: Normalisation

Notes:

  • Normalisation down-samples reads from regions of high k-mer coverage, which flattens the coverage distribution. The default target is 100x k-mer coverage (target=100).
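
The target depth can be set explicitly on the command line when 100x is not the right threshold. The following is a minimal sketch, assuming BBNorm's target= option; the helper name bbnorm_cmd, the sample prefix, and the value 60 are hypothetical. The helper only prints the command so that it can be inspected before anything is run.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print a bbnorm.sh command line for one read pair.
# $1 = sample prefix (assumed naming: <prefix>_R1.fastq.gz / <prefix>_R2.fastq.gz)
# $2 = target k-mer depth (falls back to 100, the default noted above)
bbnorm_cmd () {
	prefix="$1"
	target="${2:-100}"
	echo "bbnorm.sh in=${prefix}_R1.fastq.gz in2=${prefix}_R2.fastq.gz" \
		"out=${prefix}_bbnormalised_R1.fastq.gz out2=${prefix}_bbnormalised_R2.fastq.gz" \
		"target=${target}"
}

# Example: print the command for a lower target of 60x (illustrative value only).
bbnorm_cmd sample 60
```

A lower target discards more reads; whether that helps depends on the data set and the assembler.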

Command:

#!/usr/bin/env bash

module load bioinfo-tools bbmap

CPUS="${SLURM_NPROCS:-8}"
# Zero-based index into FILES; falls back to 0 when not run as a SLURM array job
JOB="${SLURM_ARRAY_TASK_ID:-0}"

DATA_DIR=/path/to/reads
FILES=( "$DATA_DIR"/*_R1.fastq.gz )	# One entry per read pair, indexed from 0

apply_bbnorm () {
	local READ1="$1"	# Read 1 of the read pair to be normalised
	local READ2="$2"	# Read 2 of the read pair to be normalised
	local PREFIX
	PREFIX=$(basename "${READ1%_R1*}")	# Sample name without the _R1 suffix
	bbnorm.sh t="$CPUS" in="$READ1" in2="$READ2" out="${PREFIX}_bbnormalised_R1.fastq.gz" out2="${PREFIX}_bbnormalised_R2.fastq.gz" tempdir="${SNIC_TMP:-$TMP}"
}

FASTQ="${FILES[$JOB]}"
apply_bbnorm "$FASTQ" "${FASTQ/_R1./_R2.}"
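
The script processes one read pair per array task, so it has to be submitted as a SLURM job array whose zero-based indices cover every R1 file. The following is a minimal sketch of how that range could be derived. The helper array_range and the script name run_bbnorm.sh are hypothetical, and the sbatch command is only printed, not run.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the zero-based SLURM array range covering all
# *_R1.fastq.gz files in a directory.
array_range () {
	dir="$1"
	# Reuse the function's positional parameters as the list of R1 files.
	set -- "$dir"/*_R1.fastq.gz
	# An unmatched glob stays literal, so check that the first entry exists.
	if [ ! -e "$1" ]; then
		echo "No *_R1.fastq.gz files in $dir" >&2
		return 1
	fi
	echo "0-$(( $# - 1 ))"
}

DATA_DIR="${DATA_DIR:-/path/to/reads}"
# Print the submission command for review; run_bbnorm.sh is an assumed file name.
if RANGE=$(array_range "$DATA_DIR"); then
	echo "sbatch --array=$RANGE run_bbnorm.sh"
fi
```

With four read pairs in DATA_DIR, this would print sbatch --array=0-3 run_bbnorm.sh.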