Verticall pairwise - rrwick/Verticall GitHub Wiki

The verticall pairwise command is the core of Verticall's functionality and is used in both the distance tree workflow and the alignment tree workflow (see those pages for example commands). See the Pairwise assembly comparison page for a detailed explanation of how it works.

Full help output

usage: verticall pairwise -i IN_DIR -o OUT_FILE [-r REFERENCE] [--window_count WINDOW_COUNT]
                          [--window_size WINDOW_SIZE] [--ignore_indels]
                          [--smoothing_factor SMOOTHING_FACTOR] [--secondary SECONDARY] [--verbose]
                          [--index_options INDEX_OPTIONS] [--align_options ALIGN_OPTIONS]
                          [--allowed_overlap ALLOWED_OVERLAP] [-t THREADS] [--part PART] [--index_only]
                          [--skip_check] [--existing_tsv EXISTING_TSV] [-h] [--version]

pairwise analysis of assemblies

Required arguments:
  -i IN_DIR, --in_dir IN_DIR       Directory containing assemblies in FASTA format
  -o OUT_FILE, --out_file OUT_FILE
                                   Filename of TSV output

Reference-based analysis:
  -r REFERENCE, --reference REFERENCE
                                   Reference assembly in FASTA format

Settings:
  --window_count WINDOW_COUNT      Aim to have at least this many comparison windows between assemblies
                                   (default: 50000)
  --window_size WINDOW_SIZE        Use this defined window size for all pairwise comparisons (default:
                                   dynamically choose window size for each pair)
  --ignore_indels                  Only use mismatches to determine distance (default: use both
                                   mismatches and gap-compressed indels)
  --smoothing_factor SMOOTHING_FACTOR
                                   Degree to which the distance distribution is smoothed (default: 0.8)
  --secondary SECONDARY            Peaks with a mass of at least this fraction of the most massive peak
                                   will be used to produce secondary distances (default: 0.7)
  --verbose                        Output more detail to stderr for debugging (default: only output
                                   basic information)

Alignment:
  --index_options INDEX_OPTIONS    Minimap2 options for assembly indexing (default: -k15 -w10)
  --align_options ALIGN_OPTIONS    Minimap2 options for assembly-to-assembly alignment (default: -x
                                   asm20)
  --allowed_overlap ALLOWED_OVERLAP
                                   Allow this much overlap between alignments (default: 100)

Performance:
  -t THREADS, --threads THREADS    CPU threads for parallel processing (default: 10)
  --part PART                      Fraction of the data to analyse (for parallelisation, default: 1/1)
  --index_only                     Quit after building indices (default: continue to pairwise analysis)
  --skip_check                     Do not carry out the assembly check for duplicate contig names and
                                   ambiguous bases (default: perform the assembly check)
  --existing_tsv EXISTING_TSV      Verticall will skip any assembly pairs present in this existing TSV
                                   file (default: do not skip any pairs)

Other:
  -h, --help                       Show this help message and exit
  --version                        Show program's version number and exit
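
Splitting a run with --part

The --part option above is described as being for parallelisation. The following is a minimal sketch of how a run might be split into several independent jobs. It assumes that --part takes a value of the form INDEX/TOTAL (suggested by the default of 1/1) and that each part should write to its own TSV file. The function name print_part_commands, the directory name assemblies and the output prefix verticall are hypothetical. The function only prints the commands and does not run Verticall.

```shell
#!/bin/sh
# Dry-run sketch: print one `verticall pairwise` command per part.
# Assumption: --part takes INDEX/TOTAL, as suggested by the default of 1/1.
# Only options listed in the help output above are used.
print_part_commands() {
    in_dir=$1      # directory of FASTA assemblies, passed to -i
    out_prefix=$2  # hypothetical prefix for the per-part TSV files, used for -o
    parts=$3       # total number of parts to split the analysis into
    i=1
    while [ "$i" -le "$parts" ]; do
        printf 'verticall pairwise -i %s -o %s_%s.tsv --part %s/%s\n' \
            "$in_dir" "$out_prefix" "$i" "$i" "$parts"
        i=$((i + 1))
    done
}

# Print the four commands for a four-way split.
print_part_commands assemblies verticall 4
```

Each printed line could then be submitted as a separate job, for example through a cluster scheduler. Printing rather than executing keeps the sketch safe to run on a machine where Verticall is not installed.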