Preprocessing WGS Data - Golob-Minot/geneshot GitHub Wiki
By default, geneshot will perform preprocessing on the raw paired-end FASTQ datasets. This consists of:
- Optionally running barcodecop to ensure that the samples were demultiplexed correctly (if index reads are provided in the I1 and I2 columns of the manifest)
- Trimming adapters using cutadapt (adapter sequences can be manually specified using the --adapter_F and --adapter_R flags)
- Removing reads which align to the human genome (defaults to the current human genome, but can be customized with --hg_index_url)
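
To see which samples would qualify for the optional barcodecop check, it can help to inspect the manifest for index reads before launching a run. The following is a minimal sketch, not part of geneshot itself. It assumes the manifest is a CSV file with a header row; the specimen, R1, and R2 column names and the file names are illustrative assumptions, and only the I1 and I2 column names come from the description above.

```python
import csv
import io


def samples_with_index_reads(manifest_text):
    """Return the specimen names of rows that have both I1 and I2 filled in.

    Assumes a CSV manifest with a header row. The 'specimen' column name
    is an assumption made for this example.
    """
    reader = csv.DictReader(io.StringIO(manifest_text))
    found = []
    for row in reader:
        # A missing column yields None, an empty cell yields ""
        i1 = (row.get("I1") or "").strip()
        i2 = (row.get("I2") or "").strip()
        if i1 and i2:
            found.append(row.get("specimen", ""))
    return found


# Hypothetical manifest: sample A has index reads, sample B does not
example = """specimen,R1,R2,I1,I2
A,a_R1.fq.gz,a_R2.fq.gz,a_I1.fq.gz,a_I2.fq.gz
B,b_R1.fq.gz,b_R2.fq.gz,,
"""

print(samples_with_index_reads(example))
```

In this sketch only sample A would be a candidate for the demultiplexing check, because sample B leaves both index columns empty.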
All of these preprocessing steps can be skipped with the --nopreprocess flag.
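
The preprocessing options described above can be collected programmatically when runs are scripted. The sketch below is a hypothetical helper, not part of geneshot: it only assembles the four flags named on this page (--adapter_F, --adapter_R, --hg_index_url, --nopreprocess) into an argument list. The adapter sequence and the URL are placeholder values, and the rest of the command line (how the workflow is launched, the manifest, the output location) is deliberately left out.

```python
def preprocessing_args(adapter_f=None, adapter_r=None,
                       hg_index_url=None, nopreprocess=False):
    """Build the list of preprocessing-related command-line arguments.

    Hypothetical helper for illustration. Only the flag names are taken
    from the documentation; everything else is an assumption.
    """
    if nopreprocess:
        # Skipping preprocessing makes the other options irrelevant
        return ["--nopreprocess"]
    args = []
    if adapter_f:
        args += ["--adapter_F", adapter_f]
    if adapter_r:
        args += ["--adapter_R", adapter_r]
    if hg_index_url:
        args += ["--hg_index_url", hg_index_url]
    return args


# Placeholder values, chosen only to show the shape of the result
custom = preprocessing_args(
    adapter_f="AGATCGGAAGAGC",
    hg_index_url="https://example.org/hg_index.tar.gz",
)
print(custom)
print(preprocessing_args(nopreprocess=True))
```

Leaving every option unset returns an empty list, which corresponds to running the default preprocessing.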