Verticall view - rrwick/Verticall GitHub Wiki
The `verticall view` command is an optional step that visualises some of Verticall's inner workings using interactive plots. It operates on a single assembly pair. This step is not required for any workflow, but it can help you understand exactly why Verticall behaved the way it did for a particular pair of assemblies. When run, `verticall view` produces four plots, examples of which are shown below.
Note that if you are running Verticall on a remote server (e.g. connecting via ssh), `verticall view` might not be able to display the plots (read more here). The easiest option is to copy the assemblies to your computer and run `verticall view` locally.
`verticall view` must be given a directory of assemblies and the names of the two assemblies to compare:
verticall view -i assemblies --names a,b
Figure 1 shows the distance distribution coloured by the thresholds: vertical (t_low ≤ d ≤ t_high) is blue, ambiguous (t_v-low ≤ d < t_low or t_high < d ≤ t_v-high) is grey and horizontal (d < t_v-low or t_v-high < d) is red. It also shows the smoothed distribution (solid line), the mean of the full distribution (dotted vertical line) and the median of the full distribution (dashed vertical line). See the Pairwise assembly comparison page for more details.
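The colouring rule in Figure 1 can be sketched in a few lines of Python. This is only an illustration of the three intervals described above, not Verticall's own code: the function name `classify_distance`, the argument names (`t_v_low`, `t_low`, `t_high`, `t_v_high`, mirroring the four thresholds) and the numeric values in the usage note are all assumptions made for the example.

```python
def classify_distance(d, t_v_low, t_low, t_high, t_v_high):
    """Classify a distance d using the four thresholds from Figure 1.

    Hypothetical helper for illustration only; Verticall's internal
    implementation may differ.
    """
    # The intervals only make sense if the thresholds are nested.
    if not (t_v_low <= t_low <= t_high <= t_v_high):
        raise ValueError("expected t_v_low <= t_low <= t_high <= t_v_high")

    # Vertical: t_low <= d <= t_high (blue by default).
    if t_low <= d <= t_high:
        return "vertical"

    # Ambiguous: inside the outer thresholds but outside the inner
    # ones, i.e. t_v_low <= d < t_low or t_high < d <= t_v_high (grey).
    if t_v_low <= d <= t_v_high:
        return "ambiguous"

    # Horizontal: d < t_v_low or d > t_v_high (red by default).
    return "horizontal"
```

With made-up thresholds of 0.0005, 0.001, 0.003 and 0.004, a distance of 0.002 would be classified as vertical, 0.0035 as ambiguous and 0.005 as horizontal.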
Figure 2 shows the same distribution but after ambiguous regions have been resolved to either vertical or horizontal. The mean (dotted line) and median (dashed line) in this figure are for the vertical-only component of the distribution.
Figure 3 shows the painted alignments. Solid vertical lines indicate the boundaries between alignments, and the background colour indicates the classification (blue for vertical, red for horizontal).
Figure 4 shows the painted contigs. It is similar to Figure 3, except that each section (separated by solid vertical lines) represents a contig in the first of the two assemblies. There is also a third possible classification, unaligned (white), for regions where the two genomes did not align.
usage: verticall view -i IN_DIR -n NAMES [--window_count WINDOW_COUNT] [--window_size WINDOW_SIZE]
[--ignore_indels] [--smoothing_factor SMOOTHING_FACTOR] [--secondary SECONDARY]
[--verbose] [--index_options INDEX_OPTIONS] [--align_options ALIGN_OPTIONS]
[--allowed_overlap ALLOWED_OVERLAP] [--sqrt_distance] [--sqrt_mass]
[--result RESULT] [--vertical_colour VERTICAL_COLOUR]
[--horizontal_colour HORIZONTAL_COLOUR] [--ambiguous_colour AMBIGUOUS_COLOUR]
[-h] [--version]
view plots for a single assembly pair
Required arguments:
-i IN_DIR, --in_dir IN_DIR Directory containing assemblies in FASTA format
-n NAMES, --names NAMES Two sample names (comma-delimited) to be viewed
Settings:
--window_count WINDOW_COUNT Aim to have at least this many comparison windows between assemblies
(default: 50000)
--window_size WINDOW_SIZE Use this defined window size for all pairwise comparisons (default:
dynamically choose window size for each pair)
--ignore_indels Only use mismatches to determine distance (default: use both
mismatches and gap-compressed indels)
--smoothing_factor SMOOTHING_FACTOR
Degree to which the distance distribution is smoothed (default: 0.8)
--secondary SECONDARY Peaks with a mass of at least this fraction of the most massive peak
will be used to produce secondary distances (default: 0.7)
--verbose Output more detail to stderr for debugging (default: only output
basic information)
Alignment:
--index_options INDEX_OPTIONS Minimap2 options for assembly indexing (default: -k15 -w10)
--align_options ALIGN_OPTIONS Minimap2 options for assembly-to-assembly alignment (default: -x
asm20)
--allowed_overlap ALLOWED_OVERLAP
Allow this much overlap between alignments (default: 100)
Plot settings:
--sqrt_distance Use a square-root transform on the genomic distance axis (default:
no distance axis transform)
--sqrt_mass Use a square-root transform on the probability mass axis (default:
no mass axis transform)
--result RESULT Number of result to plot (used when there are multiple possible
results for the pair, default: 1)
Colours:
--vertical_colour VERTICAL_COLOUR
Hex colour for vertical inheritance (default: #4859a0)
--horizontal_colour HORIZONTAL_COLOUR
Hex colour for horizontal inheritance (default: #c47e7e)
--ambiguous_colour AMBIGUOUS_COLOUR
Hex colour for ambiguous inheritance (default: #c9c9c9)
Other:
-h, --help Show this help message and exit
--version Show program's version number and exit
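If you want to launch `verticall view` from a script (for example, to look at several pairs one after another), a thin wrapper like the one below may be convenient. It is a minimal sketch: the function names `build_view_command` and `run_view` are hypothetical, it only assembles flags that appear in the help text above, and it assumes the `verticall` executable is on your `PATH`.

```python
import shutil
import subprocess


def build_view_command(in_dir, name_a, name_b, sqrt_distance=False,
                       sqrt_mass=False, result=None, vertical_colour=None,
                       horizontal_colour=None, ambiguous_colour=None):
    """Return an argument list for `verticall view` (hypothetical helper)."""
    # --names takes the two sample names as one comma-delimited value,
    # so a comma inside a name would be misread as a separator.
    for name in (name_a, name_b):
        if not name or "," in name:
            raise ValueError(f"invalid sample name: {name!r}")

    cmd = ["verticall", "view", "-i", str(in_dir), "-n", f"{name_a},{name_b}"]

    # Optional plot settings, added only when requested.
    if sqrt_distance:
        cmd.append("--sqrt_distance")
    if sqrt_mass:
        cmd.append("--sqrt_mass")
    if result is not None:
        cmd += ["--result", str(result)]

    # Optional colour overrides (hex strings such as "#4859a0").
    if vertical_colour is not None:
        cmd += ["--vertical_colour", vertical_colour]
    if horizontal_colour is not None:
        cmd += ["--horizontal_colour", horizontal_colour]
    if ambiguous_colour is not None:
        cmd += ["--ambiguous_colour", ambiguous_colour]
    return cmd


def run_view(cmd):
    """Run the command, failing early if verticall is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    subprocess.run(cmd, check=True)
```

For example, `run_view(build_view_command("assemblies", "a", "b", sqrt_distance=True))` would be equivalent to running `verticall view -i assemblies -n a,b --sqrt_distance` in a shell. Passing the arguments as a list avoids shell quoting problems with unusual directory or sample names.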