BootScan analysis - Stephane-S/Simplot_PlusPlus GitHub Wiki

Methodology

Bootscanning is a four-step pipeline. As in the SimPlot analysis, every step is run inside a sliding window that moves along the alignment.

Steps

  1. Within each window, the subsequences extracted from the consensus groups are bootstrapped N times, giving N bootstrapped sub-MSAs.
  2. For each of the N bootstrapped sub-MSAs, a distance matrix is generated.
  3. A phylogenetic tree is inferred from each distance matrix (with either Neighbor-Joining or UPGMA).
  4. The conflicting phylogenetic signals are quantified and expressed as the percentage of trees in which each sequence is the nearest neighbor of the reference sequence.
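
The steps above can be sketched for a single window as follows. This is a minimal illustration, not the SimPlot++ implementation: the function names, the use of the plain p-distance, and the toy sequences are all assumptions. To stay short, the tree inference of step 3 is swapped for a simpler rule, picking the sequence with the smallest distance to the reference in each replicate.

```python
import random
from collections import Counter


def p_distance(a, b):
    """Proportion of differing sites between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_columns(window, rng):
    """Step 1: resample alignment columns with replacement (one replicate)."""
    length = len(next(iter(window.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in window.items()}


def bootscan_window(window, reference, n_replicates=100, seed=0):
    """Return {name: % of replicates in which `name` is closest to the reference}."""
    rng = random.Random(seed)
    others = [name for name in window if name != reference]
    hits = Counter()
    for _ in range(n_replicates):
        replicate = bootstrap_columns(window, rng)
        # Steps 2-3, simplified (assumption): the smallest distance to the
        # reference stands in for building a Neighbor-Joining or UPGMA tree.
        ref_seq = replicate[reference]
        closest = min(others, key=lambda n: p_distance(ref_seq, replicate[n]))
        hits[closest] += 1
    # Step 4: express the counts as percentages.
    return {name: 100.0 * hits[name] / n_replicates for name in others}


# Toy window (hypothetical data): "groupA" is identical to the query.
window = {
    "query": "ACGTACGTAC",
    "groupA": "ACGTACGTAC",
    "groupB": "TTTTTTTTTT",
}
support = bootscan_window(window, reference="query", n_replicates=50)
print(support)  # {'groupA': 100.0, 'groupB': 0.0}
```

Repeating this for every window position gives one percentage curve per sequence, which is what a bootscan plot displays.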

Main features

  • 43 DNA distance models are available for generating the distance matrices
  • Multiprocessing support, so that multiple windows can be analyzed simultaneously
  • Matplotlib-based plots with a toolbar to easily customize and save the outputs in multiple formats
  • Plots can be viewed in a pop-up window (with the toolbar)

Settings

Bootstrap: Number of replicates to be generated for each sub-MSA (each position of the sliding window)

Tree model: Neighbor-Joining or UPGMA
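
To make the UPGMA option concrete, here is a minimal sketch of the clustering in plain Python. It is written under assumptions: the function name, the nested-tuple tree representation, and the example distances are illustrative and are not taken from SimPlot++. Neighbor-Joining is not shown.

```python
from itertools import combinations


def upgma(names, matrix):
    """Cluster leaves with UPGMA and return the tree as nested tuples.

    names:  leaf labels, in the same order as the rows of `matrix`
    matrix: symmetric list-of-lists distance matrix
    """
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the two clusters separated by the smallest distance.
        i, j = min(dist, key=dist.get)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[(i, j)]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical distances: "query" is closest to "A", and "B" is closest to "C".
names = ["query", "A", "B", "C"]
matrix = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.4, 0.5],
    [0.4, 0.4, 0.0, 0.2],
    [0.5, 0.5, 0.2, 0.0],
]
tree = upgma(names, matrix)
print(tree)  # (('query', 'A'), ('B', 'C'))
```

In this toy tree the nearest neighbor of `query` is `A`, the sequence it is paired with.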

Window length: Size of the sliding window

Step: Number of positions the sliding window advances between consecutive windows
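
As an illustration of how the window length and step interact, the sketch below lists the window coordinates for an alignment. The function name is hypothetical, and the handling of a trailing partial window is an assumption (it is dropped here); the source does not say how SimPlot++ treats it.

```python
def window_bounds(alignment_length, window_length, step):
    """Yield (start, end) pairs, 0-based with end exclusive, for each full window."""
    if window_length <= 0 or step <= 0:
        raise ValueError("window_length and step must be positive")
    # Assumption: windows that would run past the end of the alignment are skipped.
    for start in range(0, alignment_length - window_length + 1, step):
        yield start, start + window_length


bounds = list(window_bounds(alignment_length=1000, window_length=200, step=100))
print(len(bounds), bounds[0], bounds[-1])  # 9 (0, 200) (800, 1000)
```

A smaller step gives more windows and a smoother curve, at the cost of more bootstrap replicates to compute.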

Distance Model: 43 DNA substitution models are available
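
The source does not list the 43 models. As an example of what a DNA distance model computes, the sketch below uses the Jukes-Cantor (JC69) correction, d = -3/4 · ln(1 - 4p/3), where p is the proportion of differing sites. JC69 is a standard model chosen here for illustration only; whether and how SimPlot++ implements it is an assumption, as are the function name and the gap handling.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor (JC69) distance between two aligned DNA sequences."""
    # Assumption: compare only columns where both sequences have A, C, G or T.
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 0.75:
        # The correction is undefined once p reaches the saturation limit.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
print(round(d, 4))  # 0.1073
```

The corrected distance (about 0.107) is slightly larger than the raw proportion of differences (0.1), because the model accounts for multiple substitutions at the same site.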

Multiprocessing: Allows multiple windows to be analyzed simultaneously (recommended for large datasets)
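
One way to analyze several windows at once is a process pool from Python's standard library. The sketch below only shows the general pattern and is not the SimPlot++ code: `analyze_window` is a hypothetical placeholder for the per-window bootstrap, distance, and tree work.

```python
from multiprocessing import Pool


def analyze_window(bounds):
    """Placeholder for the per-window work (hypothetical): report the window size."""
    start, end = bounds
    return start, end - start


def analyze_all(windows, processes=1):
    """Run analyze_window over every window, serially or in a process pool."""
    if processes <= 1:
        return [analyze_window(w) for w in windows]
    with Pool(processes=processes) as pool:
        # Pool.map preserves input order, so results line up with `windows`.
        return pool.map(analyze_window, windows)


# Usage: keep the parallel call under a main guard so that worker
# processes can import this module safely.
# if __name__ == "__main__":
#     results = analyze_all([(0, 200), (100, 300), (200, 400)], processes=2)
```

Because each window is independent of the others, the work splits cleanly across processes. For small datasets, the cost of starting the workers can outweigh the gain.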

BootScan analysis example

[Animated GIF: BootScan analysis example]