De novo Assembly - Golob-Minot/geneshot GitHub Wiki
To identify microbial genes, geneshot will:
- perform de novo assembly of short reads with MEGAHIT,
- identify protein-coding sequences with Prodigal, and
- deduplicate similar gene sequences using MMseqs2.
Various flags include:
--phred_offset
: The PHRED offset used by MEGAHIT, default: 33
--min_identity
: Amino acid identity cutoff used by MMseqs2 to combine similar genes, default: 90
--min_coverage
: Length cutoff used by MMseqs2 to combine similar genes, default: 50
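As a rough illustration of how these flags and their defaults fit together, the sketch below assembles them into a command line. The helper `assembly_flags` is hypothetical and not part of geneshot, and the base command `nextflow run Golob-Minot/geneshot` is an assumption about how the workflow is launched; any other arguments a real run needs are omitted.

```python
import shlex


def assembly_flags(phred_offset=33, min_identity=90, min_coverage=50):
    """Hypothetical helper: return the assembly flags as CLI arguments.

    The default values mirror the defaults listed above.
    """
    return [
        "--phred_offset", str(phred_offset),
        "--min_identity", str(min_identity),
        "--min_coverage", str(min_coverage),
    ]


# Assumed base command; a real run will need further arguments not shown here.
base = ["nextflow", "run", "Golob-Minot/geneshot"]

# Override only the identity cutoff and keep the other defaults.
command = base + assembly_flags(min_identity=95)
print(shlex.join(command))
```

Keeping the defaults in one place makes it easy to see which values a given run overrides, here only `--min_identity`.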