11 Historical Biogeography Analysis - PhyloAI/Ortho2Web GitHub Wiki
11.1 Dating Analysis
MCMCtree is a tool within the Phylogenetic Analysis by Maximum Likelihood (PAML) package designed for estimating divergence times on a phylogenetic tree using Bayesian methods. It allows for time calibration with fossil information, handles uncertainty in the molecular clock model, and provides posterior estimates for node ages, taking into account molecular evolution and fossil calibration constraints.
Step 1: Adding fossil calibration point information
The format for adding calibration point information should look like this:
- The first line should list the number of species and the number of trees, separated by spaces.
- The second line should contain the tree in Newick format, with the calibration point information attached to the relevant nodes, followed by a semicolon. Fossil constraints can be added as a fixed time point (e.g., @0.7 to specify a precise age) or as an interval, such as a 95% HPD, given as lower and upper bounds (e.g., '>.07<.08' for an age between 0.07 and 0.08 time units). For example, assuming there are 7 species, one tree, and a calibration point at a certain node, the file should look like:
7 1
((((A, (B, C)) '>.07<.08', D), (E, F)), G);
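The header line is easy to get out of sync with the tree when taxa are added or removed. The following is a minimal Python sketch, not part of MCMCtree, that counts the tips in a calibrated Newick string and builds the two-line file described above. The function names `count_tips` and `format_tree_file` are hypothetical, and the tip counting assumes tip names contain no whitespace, quotes, or parentheses.

```python
import re


def count_tips(newick: str) -> int:
    """Count tip labels in a Newick string, ignoring quoted calibration labels."""
    # Drop quoted calibrations such as '>.07<.08' so they are not mistaken for names.
    stripped = re.sub(r"'[^']*'", "", newick)
    # A tip name directly follows "(" or ","; an internal node label would follow ")".
    return len(re.findall(r"[(,]\s*([^\s(),:;']+)", stripped))


def format_tree_file(newick: str, n_trees: int = 1) -> str:
    """Return the tree file text: '<n_species> <n_trees>' followed by the tree."""
    tree = newick.strip()
    if not tree.endswith(";"):
        raise ValueError("the Newick tree must end with a semicolon")
    return f"{count_tips(tree)} {n_trees}\n{tree}\n"


calibrated = "((((A, (B, C)) '>.07<.08', D), (E, F)), G);"
tree_text = format_tree_file(calibrated)
print(tree_text, end="")
```

Writing `tree_text` to the file named by `treefile` in the configuration (input.tre in this tutorial) gives the input for the next step.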
Step 2: Running mcmctree with usedata = 3
Set up the mcmctree configuration file to specify usedata = 3.
The configuration file looks like this:
seed = -1
seqfile = input.phy
treefile = input.tre
mcmcfile = mcmc.txt
outfile = out.txt
ndata = 1
seqtype = 0 * 0: nucleotides; 1:codons; 2:AAs
usedata = 3 * 0: no data; 1:seq like; 2:normal approximation; 3:out.BV (in.BV)
clock = 2 * 1: global clock; 2: independent rates; 3: correlated rates
RootAge = '<1.0' * safe constraint on root age, used if no fossil for root.
model = 7 * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85, 5:T92, 6:TN93, 7:REV (GTR)
alpha = 0.5 * alpha for gamma rates at sites
ncatG = 5 * No. categories in discrete gamma
cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
BDparas = 1 1 0.1 * birth, death, sampling
kappa_gamma = 6 2 * gamma prior for kappa
alpha_gamma = 1 1 * gamma prior for alpha
rgene_gamma = 2 20 1 * gammaDir prior for rate for genes
sigma2_gamma = 1 10 1 * gammaDir prior for sigma^2 (for clock=2 or 3)
finetune = 1: .1 .1 .1 .1 .1 .1 * auto (0 or 1): times, musigma2, rates, mixing, paras, FossilErr
print = 1 * 0: no mcmc sample; 1: everything except branch rates 2: everything
burnin = 2000
sampfreq = 10
nsample = 20000
mcmctree
# This command runs mcmctree with the configuration where usedata = 3.
# This run generates a file named out.BV. After the run completes, rename out.BV to in.BV.
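The burnin, sampfreq, and nsample settings in the configuration determine how long the MCMC sampling run will be. As a rough sketch, assuming the chain discards `burnin` iterations and then keeps one sample every `sampfreq` iterations until `nsample` samples are collected, the total can be computed from the control file text. The helper names `parse_ctl` and `total_iterations` are hypothetical.

```python
def parse_ctl(text: str) -> dict:
    """Parse 'key = value * comment' lines of a control file into a dict of strings."""
    options = {}
    for line in text.splitlines():
        line = line.split("*", 1)[0]  # drop the trailing "* comment"
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def total_iterations(options: dict) -> int:
    """Burn-in iterations plus sampfreq iterations for each retained sample."""
    return int(options["burnin"]) + int(options["sampfreq"]) * int(options["nsample"])


ctl_excerpt = """
usedata = 3 * 0: no data; 1:seq like; 2:normal approximation; 3:out.BV (in.BV)
burnin = 2000
sampfreq = 10
nsample = 20000
"""
options = parse_ctl(ctl_excerpt)
print(total_iterations(options))  # 2000 + 10 * 20000 = 202000
```

If the effective sample sizes turn out to be low, increasing sampfreq or nsample lengthens the chain accordingly.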
Step 3: Running mcmctree with usedata = 2
Modify the mcmctree configuration file to specify usedata = 2.
mcmctree
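The hand-off between the two runs (renaming out.BV to in.BV and switching usedata from 3 to 2) can be scripted so that it is not forgotten. This is a minimal sketch under two assumptions: that both files sit in one working directory, and that the control file is named mcmctree.ctl, which this tutorial does not state. The function names are hypothetical.

```python
import re
import shutil
from pathlib import Path


def set_ctl_option(ctl_text: str, key: str, value) -> str:
    """Replace the value of `key` in control-file text, keeping any '* comment'."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)\S+", re.MULTILINE)
    new_text, n_changed = pattern.subn(
        lambda match: match.group(1) + str(value), ctl_text, count=1
    )
    if n_changed == 0:
        raise KeyError(f"option {key!r} not found in control file")
    return new_text


def prepare_second_run(workdir: str, ctl_name: str = "mcmctree.ctl") -> None:
    """Rename out.BV to in.BV and set usedata = 2 in the control file."""
    work = Path(workdir)
    shutil.move(str(work / "out.BV"), str(work / "in.BV"))
    ctl_path = work / ctl_name  # assumed file name; adjust to your own
    ctl_path.write_text(set_ctl_option(ctl_path.read_text(), "usedata", 2))
```

Calling `prepare_second_run(".")` in the analysis directory after the first run finishes leaves the directory ready for the second `mcmctree` call.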
Step 4: Checking convergence using Tracer
To check if the MCMC (Markov Chain Monte Carlo) process has converged, use the mcmc.txt file generated by mcmctree and analyze it using Tracer.
Tracer also reports the effective sample size (ESS) for each parameter. An ESS greater than 200 is typically considered adequate and is consistent with convergence; comparing the posterior estimates from two independent runs gives a stronger check.
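For a quick check without opening Tracer, the ESS can be approximated directly from the sampled values. The sketch below uses a simple autocorrelation-based estimator (sample count divided by one plus twice the sum of the leading positive autocorrelations). It is not Tracer's own algorithm, so the numbers will differ somewhat. It also assumes that mcmc.txt is tab-delimited with a header row. The function names are hypothetical, and the demonstration chain is synthetic.

```python
import csv
import random


def read_trace(handle) -> dict:
    """Read a tab-delimited trace with a header row into {column: [floats]}."""
    reader = csv.reader(handle, delimiter="\t")
    header = [name.strip() for name in next(reader)]
    columns = {name: [] for name in header if name}
    for row in reader:
        if len(row) != len(header):
            continue  # skip blank or truncated lines
        for name, cell in zip(header, row):
            if name:
                columns[name].append(float(cell))
    return columns


def effective_sample_size(values) -> float:
    """Approximate ESS as n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered) / n
    if variance == 0.0:
        return float("nan")  # a constant trace has no defined autocorrelation
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0.0:
            break  # stop at the first non-positive autocorrelation
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Synthetic autocorrelated chain, used only to demonstrate the estimator.
rng = random.Random(0)
chain = [0.0]
for _ in range(1999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))
print(f"ESS: {effective_sample_size(chain):.0f} of {len(chain)} samples")
```

For a real run, pass `open("mcmc.txt")` to `read_trace` and apply `effective_sample_size` to each parameter column. The sample index column is not a parameter and should be skipped.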
11.2 Ancestral Area Reconstruction
Reconstruct Ancestral State in Phylogenies (RASP) is a software tool used for reconstructing the geographic distribution or ancestral states of species on a phylogenetic tree. It is particularly popular in historical biogeography, as it allows researchers to analyze how species distributions and other traits have changed over evolutionary time, helping to infer the historical processes, such as dispersal, vicariance, and extinction, that shaped current biodiversity patterns. RASP is easy to operate on Windows, so no detailed method description is provided here.
11.3 Diversification Analysis
RevBayes is a powerful, flexible software platform designed for Bayesian inference in evolutionary biology, particularly for phylogenetic analysis. It is known for its flexibility and modularity, allowing users to create customized models that go beyond standard phylogenetic methods. RevBayes is particularly popular for complex evolutionary questions involving divergence time estimation, trait evolution, biogeography, and even ecological modeling.
Step 1: Installation
conda install -c conda-forge cmake
conda install -c conda-forge boost-cpp
conda install -c conda-forge git
git clone --branch development https://github.com/revbayes/revbayes.git
cd revbayes/projects/cmake
./build.sh
./build.sh -mpi true # For the MPI version
Step 2: Preparing the specific RevBayes configuration file
The configuration file can be downloaded from here.
Step 3: Running
rb mcmc_EBD_king.Rev
Step 4: Visualization
conda install -c conda-forge imagemagick
conda install -c conda-forge r-devtools
# Run the next two commands inside an R session.
install.packages("devtools")
devtools::install_github("cmt2/RevGadgets")
Rscript plot_EBD.R