Inference - sungsik-kong/PhyNEST.jl GitHub Wiki

Composite likelihood of a fixed topology

PhyNEST extends the composite likelihood (sometimes referred to as pseudolikelihood) function for trees proposed in Peng et al. (2022) and see Kong et al. (2022) for details. Below, we demonstrate obtaining the composite likelihood of a fixed topology by optimizing a set of relevant parameters (i.e., node ages $\tau$, inheritance probability $\gamma$, and population size parameter $\theta$).

We are going to assume the input sequence alignment data has been parsed following Parsing input data page. Now, the topology that we are interested in should be specified in a Newick (for trees) or an extended Newick (for networks) format using the function readTopology(args). Note the names of taxa in the tree must also be present in the input alignment. Below, we specify a simple five taxon network with a single hybridization and name it as topology_0:

julia> topology_0=readTopology("(outgroup,(species_4,((species_1,(species_2)#H6:::0.5),(species_3,#H6:::0.5))));")
Click here to see the output
julia> topology_0=readTopology("(outgroup,(species_4,((species_1,(species_2)#H6:::0.5),(species_3,#H6:::0.5))));")
PhyloNetworks.HybridNetwork, Rooted Network
11 edges
11 nodes: 5 tips, 1 hybrid nodes, 5 internal tree nodes.
tip labels: outgroup, species_4, species_1, species_2, ...
(outgroup,(species_4,((species_1,(species_2)#H6:::0.5),(species_3,#H6:::0.5))));

Now we are ready to compute the composite likelihood of the network topology_0 given the data phylip_data. We use the function do_optimization(topology_0, phylip_data), which returns two things: some specific result from the optimization procedure and the topology with updated parameters. For this tutorial, we are going to name these outputs as stats and topology_1, respectively. An example command is shown below.

julia> stats,topology_1=do_optimization(topology_0,phylip_data)
Click here to see the output
julia> stats,topology_1=do_optimization(topology_0,phylip_data)
( * Status: success

* Candidate solution
   Final objective value:     4.741345e+06

* Found with
   Algorithm:     BFGS

* Convergence measures
   |x - x'|               = 5.55e-16 ≰ 0.0e+00
   |x - x'|/|x'|          = 1.21e-16 ≰ 0.0e+00
   |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
   |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
   |g(x)|                 = 1.45e-02 ≰ 1.0e-08

* Work counters
   Seconds run:   0  (vs limit Inf)
   Iterations:    38
   f(x) calls:    120
   ∇f(x) calls:   39
, PhyloNetworks.HybridNetwork, Rooted Network
11 edges
11 nodes: 5 tips, 1 hybrid nodes, 5 internal tree nodes.
tip labels: outgroup, species_4, species_1, species_2, ...
(outgroup:9.873,(species_4:4.979,((species_1:1.43,(species_2:1.43)#H6:0.0::0.475):1.567,(species_3:1.57,#H6:0.14::0.525):1.427):1.981):4.894);
)

The stats portion of do_optimization outputs may not be interest for the majority of users. However, it contains some useful information. The composite likelihood score, for instance, is stored in stats.minimum.

julia> stats.minimum
4.7413446340573225e6

You can learn more here about other information stored in stats.

Next: Prerequisite

⚠️ **GitHub.com Fallback** ⚠️