Species tree inference with Astral

Astral is a species tree inference program that takes as input a set of pre-computed, unrooted gene trees. Astral is statistically consistent under the multi-species coalescent model. It is therefore often used as an alternative to the concatenation approach, to assess whether the case under study might be affected by the Anomaly Zone.

cd lizard-exercise/phylo
mkdir astral-tree
cd astral-tree

Astral runs in two steps:

Step 1: Estimation of the gene-trees.

We saw how to do that in an earlier exercise. Here we only gather the resulting trees:

#Collect all the ML trees in a single file 
cat ../all-GTR/*treefile > all-GTR.trees
cat ../all-modeltest/*treefile > all-modeltest.trees
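
Before running Astral it can be worth checking that the combined file contains as many trees as there were input files. The following is a minimal sketch on toy data, not part of the exercise itself: the temporary directory, the file names `gene1.treefile` to `gene3.treefile` and the four-taxon trees are made up for illustration, and each treefile is assumed to hold a single Newick tree terminated by `;`.

```shell
#!/bin/sh
# Toy stand-in for ../all-GTR: three hypothetical single-tree files.
demo_dir=$(mktemp -d)
printf '((A,B),(C,D));\n' > "$demo_dir/gene1.treefile"
printf '((A,C),(B,D));\n' > "$demo_dir/gene2.treefile"
printf '((A,B),(C,D));\n' > "$demo_dir/gene3.treefile"

# Same pattern as above: concatenate all ML trees into one file.
cat "$demo_dir"/*treefile > "$demo_dir/all-demo.trees"

# Count the input files and the trees in the combined file
# (each Newick tree ends with exactly one ';').
n_files=$(ls "$demo_dir"/*treefile | wc -l | tr -d ' ')
n_trees=$(grep -o ';' "$demo_dir/all-demo.trees" | wc -l | tr -d ' ')
echo "files: $n_files, trees: $n_trees"
```

If the two numbers differ on real data, one of the gene tree runs probably failed or wrote an empty file.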

Step 2: Tree inference with Astral.

Input files

Astral primarily requires one input file: a plain-text file with all the gene trees in Newick format, like the one we created above. However, if the dataset contains multiple individuals from the same species, it is also helpful to include a "mapping file" with the following format:
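
As an illustration, and assuming ASTRAL-III conventions, a mapping file has one line per species of the form `species_name:individual_1,individual_2,...`. Other Astral implementations may expect a different layout, so check the documentation of the version you are running. The sketch below builds such a file from a list of tip labels. The labels and the naming convention `<species>_<individual-id>` are hypothetical and only serve the example.

```shell
#!/bin/sh
# Hypothetical tip labels following <species>_<individual-id>.
labels='speciesA_1
speciesA_2
speciesB_1'

# Group individuals by species (the label minus its last "_" field)
# and print one "species:ind1,ind2,..." line per species,
# keeping the order in which species first appear.
mapping=$(printf '%s\n' "$labels" | awk '
{
    species = $0
    sub(/_[^_]*$/, "", species)          # drop the individual id
    if (species in members) {
        members[species] = members[species] "," $0
    } else {
        order[++n] = species
        members[species] = $0
    }
}
END { for (i = 1; i <= n; i++) print order[i] ":" members[order[i]] }')

printf '%s\n' "$mapping"
```

In ASTRAL-III the resulting file is passed with the `-a` option in addition to `-i` and `-o`.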

Running Astral

Having the input gene-trees ("all-GTR.trees" or "all-modeltest.trees"), we can now run Astral:

astral -i all-GTR.trees -o astral-all-GTR.tree 2> astral-all-GTR.log
astral -i all-modeltest.trees -o astral-all-modeltest.tree 2> astral-all-modeltest.log

Astral Output

Newick tree

The output file of Astral is an unrooted Newick tree that can be viewed with any tree viewer, such as SeaView or FigTree.

Branch lengths

The branch lengths in the tree are in coalescent units, i.e., a direct measure of the amount of discordance in the gene trees. As such, they are prone to underestimation, because statistical noise in gene tree estimation adds to the observed discordance. They are meaningful only for internal branches and for those terminal branches that correspond to species with more than one individual sampled.
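
To give a feel for coalescent units: under the multi-species coalescent, for an internal branch of length T (in coalescent units) the probability that a gene tree quartet around that branch matches the species tree is 1 - (2/3)·exp(-T). The sketch below tabulates this relationship. The values of T are arbitrary and chosen only for illustration.

```shell
#!/bin/sh
# P(gene tree quartet matches the species tree) = 1 - (2/3) * exp(-T)
# The T values are arbitrary illustrative branch lengths in coalescent units.
table=$(awk 'BEGIN {
    n = split("0 0.5 1 2", t, " ")
    for (i = 1; i <= n; i++)
        printf "%.1f\t%.3f\n", t[i], 1 - (2 / 3) * exp(-t[i])
}')

printf 'T\tP(concordant)\n%s\n' "$table"
```

At T = 0 only about one third of the gene trees are expected to agree with the species tree, and the proportion approaches 1 as the branch gets longer. Any extra discordance caused by gene tree estimation error therefore makes a branch look shorter than it really is.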

Support values

Branch support values measure the support for a quadripartition (the four clusters around a branch) rather than for a bipartition, which is what branch support values usually measure.

Q: Is your Astral tree similar to the concatenation tree and to the published tree?