FCS adaptor input - ncbi/fcs GitHub Wiki
Required inputs for run_fcsadaptor.sh:
- Genome sequence file: --fasta
- Taxonomy flag corresponding to source organism: --prok (prokaryote) or --euk (eukaryote)
- Usage: run_fcsadaptor.sh --help
Required inputs for fcs.py clean genome:
- Genome sequence file: --fasta
- FCS-adaptor report fcs_adaptor_report.txt: Final contamination report with contaminant cleaning actions.
- Usage: python3 fcs.py clean genome --help
Genome sequence file
The genome sequence file should be provided in FASTA format, optionally compressed with gzip. There is currently no support for running FCS-adaptor on FASTQ-formatted reads directly.
Definition lines
Each sequence in the file must have a definition line that begins with '>' followed by a unique identifier (SeqID), e.g., >contig001 or >contig002. The SeqIDs should:
- Be less than 50 characters long
- Only include letters, digits, hyphens (-), underscores (_), periods (.), colons (:), asterisks (*), and number signs (#).
- Be unique within a genome
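The SeqID rules above can be checked before submitting a file. The following is a minimal sketch, not part of FCS-adaptor itself: the function name seqid_problems and the constants are hypothetical, and it assumes that "letters" and "digits" mean ASCII characters and that the 50-character limit is exclusive.

```python
import re

# Assumption: "letters" and "digits" are ASCII only.
# Allowed: letters, digits, hyphen, underscore, period, colon, asterisk, number sign.
SEQID_PATTERN = re.compile(r"[A-Za-z0-9_.:*#-]+")

# Assumption: "less than 50 characters" is read as a strict limit (49 is the longest valid ID).
MAX_SEQID_LENGTH = 50


def seqid_problems(seqids):
    """Return (seqid, problem) pairs for every SeqID that breaks a rule."""
    problems = []
    seen = set()
    for seqid in seqids:
        if len(seqid) >= MAX_SEQID_LENGTH:
            problems.append((seqid, "too long"))
        if not SEQID_PATTERN.fullmatch(seqid):
            problems.append((seqid, "invalid characters"))
        if seqid in seen:
            problems.append((seqid, "duplicate"))
        seen.add(seqid)
    return problems
```

An empty result means every SeqID passed the three checks. An ID can appear more than once in the result if it breaks several rules.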
Genome sequences
- All sequences must be more than 10 bp and less than 2 Gbp.
- No sequence should have >50% Ns.
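The sequence requirements above can be pre-checked in a similar way. This is a rough sketch under stated assumptions rather than the tool's own validation: read_fasta and sequence_problems are hypothetical helper names, 2 Gbp is taken to mean 2 x 10^9 bp, gzip compression is detected from a .gz file extension, and both upper- and lowercase N are counted. It loads each sequence fully into memory, which may be impractical for very large sequences.

```python
import gzip

MIN_LENGTH_BP = 10               # sequences must be longer than this
MAX_LENGTH_BP = 2_000_000_000    # assumption: 2 Gbp taken as 2 x 10^9 bp
MAX_N_FRACTION = 0.5             # no sequence should have more than 50% Ns


def read_fasta(path):
    """Yield (seqid, sequence) pairs from a plain or gzip-compressed FASTA file."""
    # Assumption: compressed files are recognized by their .gz extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    seqid, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seqid is not None:
                    yield seqid, "".join(chunks)
                # The SeqID is the first whitespace-delimited token after '>'.
                fields = line[1:].split()
                seqid = fields[0] if fields else ""
                chunks = []
            elif line:
                chunks.append(line)
    if seqid is not None:
        yield seqid, "".join(chunks)


def sequence_problems(sequence):
    """Return a list of rule violations for one sequence string."""
    problems = []
    length = len(sequence)
    if length <= MIN_LENGTH_BP:
        problems.append("too short")
    if length >= MAX_LENGTH_BP:
        problems.append("too long")
    n_count = sequence.count("N") + sequence.count("n")
    if length and n_count / length > MAX_N_FRACTION:
        problems.append("too many Ns")
    return problems
```

A typical use would loop over read_fasta(path) and report any SeqID whose sequence_problems list is not empty.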