Species Tree Inference with Astral
Astral is a species tree inference program that takes a set of pre-computed, unrooted gene trees as input. Astral is statistically consistent under the multispecies coalescent model. It is therefore often used as an alternative to the concatenation approach, in particular to assess whether the case under study might be affected by the anomaly zone.
Download the data:
mkdir baobap-astral
cd baobap-astral
#Download the data in the new folder
wget https://github.com/bpp/bpp-tutorial-geneflow/raw/main/data/baobap-loci.tar.gz
tar -xvzf baobap-loci.tar.gz
rm baobap-loci.tar.gz
Astral runs in two steps:
Step 1: Estimation of the gene-trees.
We saw how to do that on the first day of the workshop:
# This will take about 2 minutes
for i in locus-*; do iqtree2 -m GTR+G -s "$i"; done
#Collect all the ML trees in a single file
cat *treefile > baobap-mltrees.txt
#Create a new folder
mkdir species-tree
#Move your trees into the new folder
mv baobap-mltrees.txt species-tree
cd species-tree
#Or, if you prefer, download the file with the trees into the "species-tree" folder:
wget https://raw.githubusercontent.com/bpp/bpp-tutorial-geneflow/main/data/baobap-mltrees.txt
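Before moving on to Step 2, it can be useful to check that the tree file contains one tree per locus. The following is a minimal sketch, assuming that every gene tree sits on its own line and ends with a semicolon; `count_trees` is a hypothetical helper name, not part of IQ-TREE or Astral.

```shell
# count_trees FILE: print how many lines of FILE end in ';'
# (assumption: one Newick tree per line).
count_trees() {
    grep -c ';[[:space:]]*$' "$1"
}

# Example use inside the "species-tree" folder; skipped if the file is absent.
if [ -f baobap-mltrees.txt ]; then
    echo "gene trees: $(count_trees baobap-mltrees.txt)"
fi
```

The number printed should match the number of loci you analysed in Step 1.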
Step 2: Tree inference with Astral.
Input files
Astral primarily requires one input file: a plain text file with all the gene trees in Newick format, like the one we created above. However, if the dataset contains multiple individuals from the same species, it is also helpful to include a "mapping file" that assigns individuals to species, with one line per species in either of the following two formats:
species_name [number of individuals] individual_1 individual_2 ...
species_name:individual_1,individual_2,...
For the baobabs, the map file looks like this:
Adig:Adi001,Adi002
Agra:Aga001,Aga002
Agre:Age001
Amad:Ama006,Ama018
Arub:Aru001,Aru127
Smic:Smi165
Download it as follows:
wget https://raw.githubusercontent.com/bpp/bpp-tutorial-geneflow/main/data/baobab.Astral.map.txt
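It is worth checking that every tip label in the gene trees is assigned to a species in the map file. The sketch below assumes the colon-separated map format shown above and tip labels without quotes or spaces; the function names `list_tips`, `list_mapped` and `unmapped_tips` are hypothetical helpers written for this tutorial, not Astral commands.

```shell
# list_tips TREEFILE: unique tip labels in a file of Newick trees.
# Step 1 removes whatever follows each ')' (internal labels, branch lengths),
# step 2 puts every remaining token on its own line,
# step 3 removes the ':length' suffix of the tips and surrounding blanks.
list_tips() {
    sed 's/)[^(),;]*/)/g' "$1" |
        tr '(),;' '\n\n\n\n' |
        sed 's/:.*//; s/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v '^$' |
        sort -u
}

# list_mapped MAPFILE: individuals listed in a "species:ind1,ind2" map file.
list_mapped() {
    cut -d: -f2 "$1" | tr ',' '\n' | sed 's/[[:space:]]//g' | grep -v '^$' | sort -u
}

# unmapped_tips TREEFILE MAPFILE: tips found in the trees but not in the map.
unmapped_tips() {
    mapped=$(mktemp)
    list_mapped "$2" > "$mapped"
    list_tips "$1" | grep -vxF -f "$mapped" || true
    rm -f "$mapped"
}

# Example use; prints nothing when every tip is assigned to a species.
if [ -f baobap-mltrees.txt ] && [ -f baobab.Astral.map.txt ]; then
    unmapped_tips baobap-mltrees.txt baobab.Astral.map.txt
fi
```

Any name printed here is a tip that the map file does not assign to a species, so it is worth resolving before running Astral.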
Running Astral
Having the map file ("baobab.Astral.map.txt") and the input gene-trees ("baobap-mltrees.txt"), we can now run Astral:
astral -i baobap-mltrees.txt -o baobap-astral.tre -a baobab.Astral.map.txt 2> baobap-astral.log
You can visualize the Astral tree on your computer using, e.g., SeaView or FigTree.
Astral Output
Newick tree
The output file of Astral is an unrooted Newick tree and can be viewed with any tree viewer, such as SeaView or FigTree.
Branch lengths
The branch lengths in the tree are in coalescent units, i.e., a direct measure of the amount of discordance in the gene trees. As such, they are prone to underestimation because of statistical noise in gene tree estimation. They are meaningful only for internal branches and for those terminal branches that lead to species with more than one individual sampled.
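To see why noise leads to underestimation, it helps to recall the standard multispecies coalescent result for the quartet around an internal branch of length $t$ (in coalescent units). This is a sketch of that relationship, not a description of Astral's exact estimator; $z_1$ is our own notation for the fraction of gene-tree quartets that agree with the species-tree branch.

$$
P(\text{gene-tree quartet matches the species tree}) = 1 - \tfrac{2}{3}\,e^{-t}
\quad\Longrightarrow\quad
\hat{t} = -\ln\!\left(\tfrac{3}{2}\,(1 - z_1)\right), \qquad z_1 > \tfrac{1}{3}.
$$

For example, $z_1 = 0.8$ gives $\hat{t} \approx 1.20$, whereas $z_1 = 0.5$ gives $\hat{t} \approx 0.29$. Errors in the estimated gene trees tend to lower $z_1$ toward $1/3$, which shortens the estimated branch.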
Support values
Branch support values measure the support for a quadripartition (the four clusters around a branch) and not the bipartition, as is commonly done.
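To get a quick overview of the support values without opening a tree viewer, you can pull them out of the Newick string. The following is a minimal sketch, assuming that support values are written as numeric internal node labels immediately after each closing parenthesis and are followed by a branch length; `list_support` and `low_support` are hypothetical helper names, and the 0.9 cutoff is only an illustration.

```shell
# list_support TREEFILE: print the numeric label written after each ')'
# and before the next ':' (assumed to be the branch support).
list_support() {
    grep -o ')[0-9.][0-9.]*:' "$1" | tr -d '):'
}

# low_support TREEFILE CUTOFF: print only the support values below CUTOFF.
low_support() {
    list_support "$1" | awk -v cutoff="$2" '$1 + 0 < cutoff + 0'
}

# Example use on the Astral output; skipped if the file is absent.
if [ -f baobap-astral.tre ]; then
    low_support baobap-astral.tre 0.9
fi
```

Branches listed by `low_support` are the ones to inspect first when comparing your tree with other analyses.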
Q: Is your Astral tree compatible with the tree here?