Preprocessing WGS Data - Golob-Minot/geneshot GitHub Wiki
By default, `geneshot` will perform preprocessing on the raw paired-end FASTQ datasets. This consists of:
- Optionally running `barcodecop` to ensure that the samples were demultiplexed correctly (if index reads are provided in the `I1` and `I2` columns of the manifest)
- Trimming adapters using `cutadapt` (adapter sequences can be manually specified using the `--adapter_F` and `--adapter_R` flags)
- Removing reads which align to the human genome (defaults to the current human genome, but can be customized with `--hg_index_url`)
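Since the `barcodecop` step only applies when index reads are listed, it can be useful to check a manifest before launching a run. The following is a minimal sketch, not part of geneshot itself: the helper name `has_index_reads` is hypothetical, and it assumes the manifest is a comma-separated file with a header row. The `I1` and `I2` column names come from this page; the other column names in the usage example (`specimen`, `R1`, `R2`) and the file names are illustrative assumptions.

```python
import csv
import io


def has_index_reads(manifest_text):
    """Return True if every manifest row has non-empty I1 and I2 values.

    Hypothetical helper: assumes a CSV manifest with a header row.
    """
    rows = list(csv.DictReader(io.StringIO(manifest_text)))
    if not rows:
        # An empty manifest has no index reads to check.
        return False
    return all(
        (row.get("I1") or "").strip() and (row.get("I2") or "").strip()
        for row in rows
    )


# Illustrative manifest contents (column layout beyond I1/I2 is assumed).
example = (
    "specimen,R1,R2,I1,I2\n"
    "s1,s1_R1.fastq.gz,s1_R2.fastq.gz,s1_I1.fastq.gz,s1_I2.fastq.gz\n"
)
print(has_index_reads(example))
```

A manifest without the `I1` and `I2` columns, or with blank values in them, makes the helper return `False`, which under this page's description would correspond to the demultiplexing check not being run.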
The entire preprocessing suite of tasks can be skipped with the `--nopreprocess` flag.
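The preprocessing flags described on this page can be assembled into a command line programmatically. The sketch below is illustrative only: the function `build_geneshot_args` is hypothetical, and the `nextflow run Golob-Minot/geneshot` invocation with `--manifest` and `--output` is an assumption about how the workflow is typically launched. Only `--adapter_F`, `--adapter_R`, `--hg_index_url`, and `--nopreprocess` are taken from this page. The sketch also assumes that the other preprocessing options have no effect once preprocessing is skipped, so it omits them in that case.

```python
def build_geneshot_args(manifest, output, adapter_f=None, adapter_r=None,
                        hg_index_url=None, nopreprocess=False):
    """Build an argument list for a geneshot run (hypothetical helper)."""
    # Assumed base invocation; check the geneshot README for the real one.
    args = ["nextflow", "run", "Golob-Minot/geneshot",
            "--manifest", manifest, "--output", output]
    if nopreprocess:
        # Skip the whole preprocessing suite; other options are left out.
        args.append("--nopreprocess")
        return args
    if adapter_f:
        args += ["--adapter_F", adapter_f]
    if adapter_r:
        args += ["--adapter_R", adapter_r]
    if hg_index_url:
        args += ["--hg_index_url", hg_index_url]
    return args


# Placeholder file names; no adapter or index overrides, so defaults apply.
print(" ".join(build_geneshot_args("manifest.csv", "results")))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell-quoting problems with adapter sequences or URLs.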