Input data - Pas-Kapli/mptp GitHub Wiki

Input tree

--tree_file tree_filename

The input file for mptp is a binary rooted phylogenetic tree in newick format.

The phylogenetic tree may be inferred with a Maximum Likelihood (e.g. RAxML, PhyML) or a Bayesian Inference (e.g. MrBayes, ExaBayes) approach. Maximum Likelihood trees produced by these programs are binary and can be used directly as input files in mptp. Bayesian Inference methods produce a collection of trees that are summarized into a single consensus tree after the analysis. To obtain a binary tree in the latter case, summarize the results with the option that forces all compatible splits of the tree to be present (e.g. in MrBayes use Contype = "allcompat"; in ExaBayes use the MRE consensus option).

If the input tree was not rooted during the phylogenetic inference step, root it in mptp by providing the outgroup sequence names with the --outgroup option.
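Before running mptp it can be useful to confirm that a tree is binary and rooted. The following is a minimal sketch, not part of mptp: it assumes a plain newick string without quoted labels or bracketed comments, and the function names child_counts and is_binary_rooted are hypothetical. It counts the children of every internal node; a rooted binary tree has exactly two everywhere, whereas an unrooted tree is usually written with three children at the outermost level.

```python
def child_counts(newick):
    """Return the number of children of each internal node (outermost node last)."""
    counts, stack = [], []
    for ch in newick:
        if ch == "(":
            stack.append(1)          # a new internal node has at least one child
        elif ch == "," and stack:
            stack[-1] += 1           # a comma adds a sibling to the current node
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced parentheses")
            counts.append(stack.pop())
    if stack:
        raise ValueError("unbalanced parentheses")
    return counts


def is_binary_rooted(newick):
    """True if every internal node, including the outermost one, has two children."""
    counts = child_counts(newick)
    return bool(counts) and all(c == 2 for c in counts)


if __name__ == "__main__":
    rooted = "((A:0.1,B:0.2):0.3,(C:0.1,D:0.1):0.2);"
    unrooted = "(A:0.1,B:0.2,(C:0.1,D:0.1):0.2);"
    print(is_binary_rooted(rooted), is_binary_rooted(unrooted))
```

A tree that fails this check because of a three-way split at the top level is a candidate for rooting with --outgroup; a tree with multifurcations elsewhere needs to be resolved in the inference or consensus step.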

example newick file

Estimation of minimum branch length

--minbr_auto, --minbr

In most common implementations of phylogenetic inference, the tree topology is forced to remain binary at every step of the optimization. As a result, very short non-zero branch lengths are assigned among identical sequences in order to keep the tree binary. These branch lengths are much smaller than the remaining ones, so the method may wrongly split the two groups of branch lengths between the speciation and coalescent processes. To avoid this type of error, you can automatically detect a suitable minimum branch length threshold with the --minbr_auto option before the delimitation inference. This step requires both the phylogenetic tree and the alignment (in fasta format). Subtrees consisting exclusively of branches with lengths smaller than or equal to this threshold are subsequently ignored in the delimitation step.
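To see why the alignment is needed, it helps to look at which samples carry identical sequences, since those are the tips joined by artificially short branches. The sketch below only illustrates that idea and is not the procedure mptp uses internally; read_fasta and identical_groups are hypothetical helper names, and sequences are compared as exact uppercase strings.

```python
from collections import defaultdict


def read_fasta(text):
    """Parse fasta-formatted text into a {name: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # keep the first word of the header
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def identical_groups(records):
    """Return lists of sample names that share exactly the same sequence."""
    groups = defaultdict(list)
    for name, seq in records.items():
        groups[seq].append(name)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    alignment = ">s1\nACGT\n>s2\nacgt\n>s3\nACGA\n"
    print(identical_groups(read_fasta(alignment)))
```

Branches that separate members of one such group carry no sequence differences, so any length they have in the tree is an artifact of keeping the topology binary.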

The command line for detecting the minimum branch length threshold is:

mptp --tree_file tree_filename --minbr_auto fasta_filename --output_file output_filename

The minimum branch length threshold is printed on the screen and also stored in the output file. You can then pass this value to the delimitation step with the --minbr switch. For example:

mptp --tree_file RAxML_bestTree.Clubiona --output_file Clubiona_delimitation --ml --multi --minbr 0.0009330519
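When the two steps are scripted, the command lines can be assembled programmatically. The sketch below is a hypothetical helper, not part of mptp: it only builds argument lists from the options shown on this page and does not run anything. The function names and the choice of ten decimal places for the threshold are assumptions.

```python
import shlex


def minbr_auto_cmd(tree_file, fasta_file, output_file):
    """Arguments for the threshold-detection step."""
    return ["mptp", "--tree_file", tree_file,
            "--minbr_auto", fasta_file,
            "--output_file", output_file]


def delimit_cmd(tree_file, output_file, minbr):
    """Arguments for the delimitation step with a fixed --minbr value."""
    return ["mptp", "--tree_file", tree_file,
            "--output_file", output_file,
            "--ml", "--multi",
            "--minbr", f"{minbr:.10f}"]   # fixed notation, 10 decimals (assumption)


if __name__ == "__main__":
    cmd = delimit_cmd("RAxML_bestTree.Clubiona", "Clubiona_delimitation", 0.0009330519)
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run once the threshold reported by the first step has been read off the screen or the output file.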

example fasta alignment file

Outgroup

--outgroup, --outgroup_crop

Phylogenetic inference programs output unrooted trees unless an outgroup was specified at the beginning of the analysis. For the delimitation inference it is essential to root the phylogeny with the correct outgroup sample(s). Therefore, if the input tree is unrooted, you can use the --outgroup option followed by a comma-separated list of taxa to root the phylogeny. With this option mptp roots the unrooted tree by splitting the branch leading to the most recent common ancestor (MRCA) of the listed taxa into two branches of equal length and introducing a new node (the root of the new rooted tree) that connects these two branches. For example:

mptp --tree_file tree_filename --minbr_auto fasta_filename --output_file output_filename --outgroup sample1,sample2,sample3
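The rooting step described above can be sketched on a small tree. This is an illustration under stated assumptions, not mptp's implementation: the unrooted tree is given as a list of (node, node, length) edges, the label "root" for the new node is hypothetical, and the outgroup taxa are assumed to sit together on one side of a single branch.

```python
from collections import defaultdict

ROOT = "root"  # hypothetical label for the newly introduced node


def root_on_outgroup(edges, outgroup):
    """Split the branch leading to the outgroup's MRCA in half and insert a root."""
    adj = defaultdict(dict)
    for u, v, length in edges:
        adj[u][v] = length
        adj[v][u] = length

    def leaves_behind(node, parent):
        # Leaves reachable from `node` without crossing back to `parent`.
        children = [n for n in adj[node] if n != parent]
        if not children:
            return {node}
        found = set()
        for child in children:
            found |= leaves_behind(child, node)
        return found

    target = set(outgroup)
    for u, v, length in edges:
        for mrca, rest in ((u, v), (v, u)):
            if leaves_behind(mrca, rest) == target:
                del adj[mrca][rest], adj[rest][mrca]
                adj[ROOT][mrca] = adj[mrca][ROOT] = length / 2
                adj[ROOT][rest] = adj[rest][ROOT] = length / 2
                return adj
    raise ValueError("outgroup taxa are not separated from the rest by one branch")


def to_newick(adj, node=ROOT, parent=None):
    """Write the tree below `node` as a newick string (without the final ';')."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return node
    parts = [f"{to_newick(adj, c, node)}:{adj[node][c]:g}" for c in children]
    return "(" + ",".join(parts) + ")"


if __name__ == "__main__":
    # Unrooted four-taxon tree with internal nodes x and y.
    edges = [("A", "x", 0.1), ("B", "x", 0.2), ("x", "y", 0.4),
             ("C", "y", 0.3), ("D", "y", 0.5)]
    rooted = root_on_outgroup(edges, ["C", "D"])
    print(to_newick(rooted) + ";")
```

In this example the branch of length 0.4 between x and y separates the outgroup (C, D) from the remaining taxa, so the new root is attached to both ends with branches of length 0.2.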

If you wish to remove the outgroup taxa from your phylogeny, you can add the --outgroup_crop option. With this option the subtree containing the outgroup taxa (or the single lineage, if the outgroup is only one taxon) is removed before the delimitation scheme is estimated. The command would look like this:

mptp --tree_file tree_filename --minbr_auto fasta_filename --output_file output_filename --outgroup sample1,sample2,sample3 --outgroup_crop
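The effect of cropping can be pictured on a rooted tree whose root has the outgroup on one side and everything else on the other. The following is a minimal sketch of that idea, not mptp's code: trees are represented as nested lists of (child, branch_length) pairs with leaves as strings, and leaf_names and crop_outgroup are hypothetical names.

```python
def leaf_names(tree):
    """Collect the names of all leaves below a node."""
    if isinstance(tree, str):
        return {tree}
    names = set()
    for child, _length in tree:
        names |= leaf_names(child)
    return names


def crop_outgroup(rooted_tree, outgroup):
    """Drop the root's outgroup child; the remaining child becomes the tree."""
    target = set(outgroup)
    kept = [child for child, _length in rooted_tree
            if leaf_names(child) != target]
    if len(rooted_tree) != 2 or len(kept) != 1:
        raise ValueError("outgroup is not one of the two subtrees below the root")
    return kept[0]


if __name__ == "__main__":
    # Root with the outgroup (C, D) on one side and the ingroup (A, B) on the other.
    rooted = [([("C", 0.3), ("D", 0.5)], 0.2),
              ([("A", 0.1), ("B", 0.2)], 0.2)]
    print(crop_outgroup(rooted, ["C", "D"]))
```

After cropping, only the ingroup subtree remains, so the outgroup branches no longer contribute to the delimitation.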