Verticall pairwise - rrwick/Verticall GitHub Wiki
The verticall pairwise command is at the core of Verticall's functionality and is part of both the distance tree workflow and alignment tree workflow (see those pages for example commands). See the Pairwise assembly comparison page for a detailed explanation of how it works.
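As a rough illustration of the command's shape, the following Python sketch assembles an argument list from the two required options (-i and -o) plus the optional -r and -t. The helper name build_pairwise_command and the paths assemblies and verticall.tsv are hypothetical, chosen only for this example. The sketch builds and prints the command without running it.

```python
import shlex


def build_pairwise_command(in_dir, out_file, reference=None, threads=None):
    """Assemble a verticall pairwise argument list from documented options."""
    # -i and -o are the two required arguments.
    cmd = ["verticall", "pairwise", "-i", in_dir, "-o", out_file]
    if reference is not None:
        # -r is listed under "Reference-based analysis" in the help output.
        cmd += ["-r", reference]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_pairwise_command("assemblies", "verticall.tsv", threads=8)
print(shlex.join(cmd))
# To actually run it (requires Verticall to be installed and on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if a directory name contains spaces.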
Full help output
usage: verticall pairwise -i IN_DIR -o OUT_FILE [-r REFERENCE] [--window_count WINDOW_COUNT]
                          [--window_size WINDOW_SIZE] [--ignore_indels]
                          [--smoothing_factor SMOOTHING_FACTOR] [--secondary SECONDARY] [--verbose]
                          [--index_options INDEX_OPTIONS] [--align_options ALIGN_OPTIONS]
                          [--allowed_overlap ALLOWED_OVERLAP] [-t THREADS] [--part PART] [--index_only]
                          [--skip_check] [--existing_tsv EXISTING_TSV] [-h] [--version]

pairwise analysis of assemblies

Required arguments:
  -i IN_DIR, --in_dir IN_DIR    Directory containing assemblies in FASTA format
  -o OUT_FILE, --out_file OUT_FILE
                                Filename of TSV output

Reference-based analysis:
  -r REFERENCE, --reference REFERENCE
                                Reference assembly in FASTA format

Settings:
  --window_count WINDOW_COUNT   Aim to have at least this many comparison windows between assemblies
                                (default: 50000)
  --window_size WINDOW_SIZE     Use this defined window size for all pairwise comparisons (default:
                                dynamically choose window size for each pair)
  --ignore_indels               Only use mismatches to determine distance (default: use both
                                mismatches and gap-compressed indels)
  --smoothing_factor SMOOTHING_FACTOR
                                Degree to which the distance distribution is smoothed (default: 0.8)
  --secondary SECONDARY         Peaks with a mass of at least this fraction of the most massive peak
                                will be used to produce secondary distances (default: 0.7)
  --verbose                     Output more detail to stderr for debugging (default: only output
                                basic information)

Alignment:
  --index_options INDEX_OPTIONS Minimap2 options for assembly indexing (default: -k15 -w10)
  --align_options ALIGN_OPTIONS Minimap2 options for assembly-to-assembly alignment (default: -x
                                asm20)
  --allowed_overlap ALLOWED_OVERLAP
                                Allow this much overlap between alignments (default: 100)

Performance:
  -t THREADS, --threads THREADS CPU threads for parallel processing (default: 10)
  --part PART                   Fraction of the data to analyse (for parallelisation, default: 1/1)
  --index_only                  Quit after building indices (default: continue to pairwise analysis)
  --skip_check                  Do not carry out the assembly check for duplicate contig names and
                                ambiguous bases (default: perform the assembly check)
  --existing_tsv EXISTING_TSV   Verticall will skip any assembly pairs present in this existing TSV
                                file (default: do not skip any pairs)

Other:
  -h, --help                    Show this help message and exit
  --version                     Show program's version number and exit
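
The --part option is described as a fraction of the data to analyse, with a default of 1/1. The Python sketch below generates one command per part so that the parts could be submitted as separate jobs. It assumes PART takes the form i/n with a 1-based i, which is inferred from the default value and not confirmed by the help text. The helper name part_commands, the input directory and the output naming scheme are all hypothetical.

```python
import shlex


def part_commands(in_dir, out_prefix, parts, threads=None):
    """Return one verticall pairwise command per part, using --part i/n."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    commands = []
    for i in range(1, parts + 1):
        cmd = [
            "verticall", "pairwise",
            "-i", in_dir,
            "-o", f"{out_prefix}_{i}.tsv",  # hypothetical naming scheme
            "--part", f"{i}/{parts}",       # assumed i/n form, 1-based
        ]
        if threads is not None:
            cmd += ["-t", str(threads)]
        commands.append(cmd)
    return commands


# Print the commands without running them; each could go to its own job.
for cmd in part_commands("assemblies", "verticall", parts=4):
    print(shlex.join(cmd))
```

Each part writes to its own TSV file here so that parallel jobs do not write to the same output. Combining the per-part files afterwards is not covered by this sketch. The --index_only flag may be useful as a preliminary step before launching the parts, though the help text does not spell out that workflow.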