How to get the average nLTT plot - thijsjanzen/nLTT GitHub Wiki
Calculating the average nLTT plot of multiple phylogenies
is not a trivial tasks. The function get_nltt_values
does exactly this.
Examples
For all examples, the following R code must be run:
library(ape)
library(Hmisc)
library(nLTT) # nolint
library(ggplot2)
library(testit)
Example: Easy trees
Create two easy trees:
newick1 <- "((A:1,B:1):2,C:3);"
newick2 <- "((A:2,B:2):1,C:3);"
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
There are very similar. phylogeny1
has short tips:
plot(phylogeny1)
add.scale.bar() #nolint
This can be observed in the nLTT plot:
nltt_plot(phylogeny1, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny1)
knitr::kable(t)
| time| N|
|---------:|---------:|
| 0.0000000| 0.3333333|
| 0.6666667| 0.6666667|
| 1.0000000| 1.0000000|
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny1))
qplot(
time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
main = "NLTT plot of phylogeny 1"
)
phylogeny2
has longer tips:
plot(phylogeny2)
add.scale.bar() #nolint
Also this can be observed in the nLTT plot:
nltt_plot(phylogeny2, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)
| time| N|
|---------:|---------:|
| 0.0000000| 0.3333333|
| 0.3333333| 0.6666667|
| 1.0000000| 1.0000000|
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny2))
qplot(
time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
main = "NLTT plot of phylogeny 2"
)
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)
Observe how the numbers get averaged.
The same, now shown as a plot:
nLTT::get_average_nltt(phylogenies, dt = 0.20, plot_nltts = TRUE)
Here a demo how the new function works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)
Plotting options, first create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
qplot(
t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")
Here we see an averaged nLTT plot, with the original nLTT values omitted:
qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")
Example: Harder trees
Create two harder trees:
newick1 <- "((A:1,B:1):1,(C:1,D:1):1);"
newick2 <- paste0("((((XD:1,ZD:1):1,CE:2):1,(FE:2,EE:2):1):4,((AE:1,BE:1):1,",
"(WD:1,YD:1):1):5);"
)
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
There are different. phylogeny1
is relatively simple, with two branching events happening at the same time:
plot(phylogeny1)
add.scale.bar() #nolint
This can be observed in the nLTT plot:
nltt_plot(phylogeny1, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)
phylogeny2
is more elaborate:
plot(phylogeny2)
add.scale.bar() #nolint
Also this can be observed in the nLTT plot:
nltt_plot(phylogeny2, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)
Observe how the numbers get averaged.
Here a demo how the new function works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)
Plotting options, first create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
qplot(
t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")
Here we see an averaged nLTT plot, with the original nLTT values omitted:
qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")
Example: Five random trees
Create three random trees:
set.seed(42)
phylogeny1 <- rcoal(10)
phylogeny2 <- rcoal(20)
phylogeny3 <- rcoal(30)
phylogeny4 <- rcoal(40)
phylogeny5 <- rcoal(50)
phylogeny6 <- rcoal(60)
phylogeny7 <- rcoal(70)
phylogenies <- c(phylogeny1, phylogeny2, phylogeny3,
phylogeny4, phylogeny5, phylogeny6, phylogeny7)
Here a demo how the new function works:
t <- nLTT::get_nltt_values(phylogenies, dt = 0.2)
knitr::kable(t)
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
qplot(t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")
Here we see an averaged nLTT plot, with the original nLTT values omitted:
qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) +
stat_summary(fun.data = "mean_cl_boot", color = "red", geom = "smooth")