Normalisation - NBISweden/workshop-genome_assembly GitHub Wiki
BBNorm: Normalisation
Notes:
- Normalisation down-samples reads whose k-mers have high coverage, so that coverage across the data set becomes more even. The default target is 100x k-mer coverage.
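If the default of 100x noted above is not suitable for a data set, the target depth can be set explicitly with BBNorm's `target=` parameter. The following is a minimal dry-run sketch: `build_bbnorm_cmd` is a hypothetical helper (not part of BBTools) that only prints the command it would run, and the sample paths and the value 40 are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print a bbnorm.sh command line with an explicit target depth.
# build_bbnorm_cmd is a hypothetical helper, not part of BBTools.
build_bbnorm_cmd () {
    local read1="$1" read2="$2"
    local target="${3:-100}"    # falls back to the 100x default noted above
    local prefix
    prefix=$(basename "${read1%_R1*}")    # e.g. sample_R1.fastq.gz -> sample
    echo bbnorm.sh in="$read1" in2="$read2" \
        out="${prefix}_bbnormalised_R1.fastq.gz" \
        out2="${prefix}_bbnormalised_R2.fastq.gz" \
        target="$target"
}

# Illustrative call: normalise to 40x instead of the default.
build_bbnorm_cmd /path/to/reads/sample_R1.fastq.gz /path/to/reads/sample_R2.fastq.gz 40
```

Removing the `echo` would run the command for real, which requires the bbmap module to be loaded.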
Command:
#!/usr/bin/env bash
module load bioinfo-tools bbmap
CPUS="${SLURM_NPROCS:-8}"
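JOB="${SLURM_ARRAY_TASK_ID:-0}" # 0-based index into FILES; defaults to the first read pair outside an array job
DATA_DIR=/path/to/reads
FILES=( "$DATA_DIR"/*_R1.fastq.gz ) # One entry per read pair, keyed on the R1 file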
apply_bbnorm () {
    READ1="$1" # Read 1 of the read pair to be normalised
    READ2="$2" # Read 2 of the read pair to be normalised
    PREFIX=$(basename "${READ1%_R1*}") # Sample name: file name without the _R1... suffix
    bbnorm.sh t="$CPUS" in="$READ1" in2="$READ2" \
        out="${PREFIX}_bbnormalised_R1.fastq.gz" out2="${PREFIX}_bbnormalised_R2.fastq.gz" \
        tempdir="${SNIC_TMP:-$TMP}"
}
FASTQ="${FILES[$JOB]}"
apply_bbnorm "$FASTQ" "${FASTQ/_R1./_R2.}"
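Because the script picks its input with `${FILES[$JOB]}`, and bash arrays are indexed from 0, the Slurm array range has to run from 0 to one less than the number of read pairs. A minimal sketch of how that range could be derived before submitting is given below; `array_range` is a hypothetical helper and `bbnorm_job.sh` is an assumed name for the script above.

```shell
#!/usr/bin/env bash
# array_range is a hypothetical helper: given the number of read pairs,
# print the 0-based index range to pass to sbatch --array.
array_range () {
    local n="$1"
    if (( n < 1 )); then
        echo "no read pairs found" >&2
        return 1
    fi
    echo "0-$(( n - 1 ))"
}

DATA_DIR=/path/to/reads
shopt -s nullglob    # an unmatched glob expands to nothing instead of itself
FILES=( "$DATA_DIR"/*_R1.fastq.gz )

# Print the submission command rather than running it (bbnorm_job.sh is an assumed name).
if range=$(array_range "${#FILES[@]}"); then
    echo "sbatch --array=$range bbnorm_job.sh"
fi
```

With four read pairs in `DATA_DIR` this prints a command using `--array=0-3`, so each array task normalises exactly one pair.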