Parameter estimation under the MSC‐I (Baobab dataset, model A) - bpp/bpp-tutorial-geneflow GitHub Wiki

In this demo we will infer the parameters for the MSC-I model with the following topology using the Baobabs dataset.

We will use the following folder structure:

.
|-- baobab
|   |-- A00
|   |   |-- baobab.A00.ctl
|   |   `-- baobab.A00.msci.ctl
|   |-- A01
|   |   `-- baobab.A01.ctl
|   |-- baobab.map.txt
|   `-- baobab.phy
`-- bpp
    `-- a00
        |-- r1
        |   `-- baobab.A00.msci.ctl
        `-- r2
            `-- baobab.A00.msci.ctl

8 directories, 7 files

Below are the steps to create the folder structure, download the data files, and copy control files to the folders where we will run BPP:

# create folder structure
mkdir -p bpp/a00/r{1,2}

# download and extract data files
wget https://github.com/bpp/bpp-tutorial-geneflow/raw/main/data/baobab.tar.gz
tar zxvf baobab.tar.gz
rm baobab.tar.gz

# copy control files to the right folders
cp baobab/a00/baobab.A00.msci.ctl bpp/a00/r1
cp baobab/a00/baobab.A00.msci.ctl bpp/a00/r2

You can check if your folder structure resembles the one described above using the commands:

tree --charset=ascii

Constructing the newick string for an MSC-I model

Create a file called baobab-msci-defs.txt and use the following specification:

tree (Smic,(Agre,(Adig,(Arub,(Agra,Amad)e)d)c)b)a;
hybridization d Arub, e Amad as y z tau=yes,yes phi=0.1

You can create the file with the following commands without the need of using an editor:

cd bpp/a00
echo 'tree (Smic,(Agre,(Adig,(Arub,(Agra,Amad)e)d)c)b)a;' > baobab-msci-defs.txt
echo 'hybridization d Arub, e Amad as y z tau=yes,yes phi=0.1' >> baobab-msci-defs.txt 

and then run the --msci-create switch:

bpp --msci-create baobab-msci-defs.txt

You should obtain the following newick string:

(Smic, (Agre,(Adig,((z[&phi=0.100000,tau-parent=yes],Arub)y,(Agra,(Amad)z[&phi=0.900000,tau-parent=yes])e)d)c)b)a;

For more information on the --msci-create option, see the documentation.

We will use the following control file for the inference (which already includes the correct MSC-I model):

          seed =  -1

       seqfile = ../../../baobab/baobab.phy
      Imapfile = ../../../baobab/baobab.map.txt
       jobname = model3

 speciesdelimitation = 0
         speciestree = 0

#  speciesmodelprior = 1  # 0: uniform LH; 1:uniform rooted trees; 2: uniformSLH; 3: uniformSRooted

  species&tree = 6  Adig Agra Agre Amad Arub Smic
                    3    2    1    2    2    1
                    (Smic, (Agre,(Adig,((z[&phi=0.200000,tau-parent=yes],Arub)y,(Agra,(Amad)z[&phi=0.800000,tau-parent=yes])e)d)c)b)a;
         phase = 0 0 0 0 0 0
                  
        sedata = 1    # 0: no data (prior); 1:seq like
         nloci = 100  # number of data sets in seqfile

     cleandata = 0    # remove sites with ambiguity data (1:yes, 0:no)?

    thetaprior = 2 0.01    # invgamma(a, b) for theta
      tauprior = 3 0.07    # invgamma(a, b) for root tau
      phiprior = 1 1       # Beta(a,b)

     locusrate = 0
         clock = 1 

      finetune = 1
         print = 1 0 0 0   # MCMC samples, locusrate, heredityscalars, Genetrees
        burnin = 50000
      sampfreq = 2
       nsample = 50000

    #  threads = 4 1 1

Prior distribution

Note that we have the new keyword phiprior in the control file, which is used to assign a Beta prior on the parameter $\varphi$ (introgression probability).

The syntax is

betaprior = alpha beta

where alpha and beta describe the Beta distribution $Beta(\alpha,\beta)$, with mean $\alpha / (\alpha + \beta)$. Below are some examples of a Beta distribution, and an Inverse Gamma distribution which we used to assign a prior on taus and thetas.

We typically use betaprior = 1 1 which is equivalent to the uniform distribution.

Running BPP

Check that the BPP control files in bpp/a00/r1 and bpp/a00/r2 point to the right PHYLIP (sequence alignment) and map files. They should have the following form:

 seqfile = ../../../baobab/baobab.phy
Imapfile = ../../../baobab/baobab.map.txt

You can check by:

cd bpp/a00
cat r1/baobab.A00.msci.ctl
cat r2/baobab.A00.msci.ctl

Finally, run bpp:

cd r1
bpp --cfile baobab.A00.msci.ctl

If you run into problems with core pinning, or BPP exits with the error: "Error while pinning thread to core." then try running BPP with the --no-pin option:

bpp --no-pin --cfile baobab.A00.msci.ctl

We should run the program at least twice and compare the outputs to check for convergence.

cd ../r2
bpp --cfile baobab.A00.msci.ctl

Synchronization point: Make sure everybody reached this stage


The first part of the output is similar to the output of the A00 run for the MSC model. In addition, it includes a table detailing the node relationships within the species tree, as well as information on nodes participating in hybridization or introgression events. See a sample output file for the MSC-I analysis.

Analysis started at: Tue Dec  3 14:49:45 2024
Using BPP version: 4.8.2
pver sha1: 1a053a3b61831dc4fb680235a5f6e63e8158ea54
Command: bpp --cfile baobab.A00.msci.ctl 

Seed: 183243873 (randomly generated)
Initial species tree: (Smic, (Agre,(Adig,((z[&phi=0.200000,tau-parent=yes],Arub)y,(Agra,(Amad)z[&phi=0.800000,tau-parent=yes])e)d)c)b)a;
Species tree contains hybridization/introgression events.

Hybridization events: 1
Bidirectional introgressions: 0
Label        Node  Child1  Child2  Parent
Adig            0     N/A     N/A       8
Agra            1     N/A     N/A      11
Agre            2     N/A     N/A       7
Amad            3     N/A     N/A      12
Arub            4     N/A     N/A      10
Smic            5     N/A     N/A       6
a               6       5       7     N/A
b               7       2       8       6
c               8       0       9       7
d               9      10      11       8
y              10      13       4       9
e              11       1      12       9
z              12       3     N/A      11   [tau = 1, phi = 0.800000, prop_tau = 1, has_phi = 1]  phi_z : e -> z
z              13     N/A     N/A      10   Mirrored hybridization node [Hybrid = z (12)]   [tau = 1, phi = 0.200000, prop_tau = 0, has_phi = 0]

 Locus | Model | Sequences | Length | Ambiguous sites | Compressed | Base freqs 
-------+-------+-----------+--------+-----------------+------------+------------
     1 |  JC69 |        10 |    960 |              12 |         19 |      Fixed 
     2 |  JC69 |        10 |    933 |              72 |         18 |      Fixed 
.
.
.

The last two lines of the table:

z              12       3     N/A      11   [tau = 1, phi = 0.800000, prop_tau = 1, has_phi = 1]  phi_z : e -> z
z              13     N/A     N/A      10   Mirrored hybridization node [Hybrid = z (12)]   [tau = 1, phi = 0.200000, prop_tau = 0, has_phi = 0]

indicate that node z is a so-called "hybridization" node, and has two parents (hence listed twice), node e and y with indices 11 and 10. Furthermore, it defines the parameter phi_z : e -> z ($\varphi_z$) as the proportion of genetic material in z originating from the ancestor e.

The next part provides a description of the values printed on screen during the MCMC run. BPP prints many values on screen which we can cluster into three categories:

  • The first column is a progress indicator with the percentage of the the completed MCMC run. Negative progress means we are still in the burn-in.
  • The next block of columns shows the acceptance proportions for each activated MCMC move. In our case we have seven active proposals.
  • The remaining columns indicate information about the chain at that point in the run. Those are running means for three thetas and three taus, the phi parameter, the sum of gene tree log-densities, and the mean log-likelihood.

Since finetune was set to 1, BPP optimizes step-length values during burn-in, such that acceptance proportions (Pjump) are around 30%. To achieve this, BPP typically does four rounds of optimization during the burn-in and prints on screen the Current Pjump (acceptance proportions for each enabled MCMC move at that step), Current finetune (step-length/tuning parameters at that step) and New finetune (the new optimized tuning parameter values).

-*- Terms index -*-

  Prgs: progress of MCMC run (negative progress means burnin)
  Gage: gene-tree age proposal
  Gspr: gene-tree SPR proposal
   th1: species tree theta proposal (tips)   (sliding window)
   th2: species tree theta proposal (inner)  (sliding window)
   thg: species tree theta proposal (inner)  (gibbs sampler)
   tau: species tree tau proposal
   mix: mixing proposal
  phis: MSCi phi parameter proposal (sliding window)
  phig: MSCi phi parameter proposal (gibbs sampler)
theta1: mean theta of node 1
theta2: mean theta of node 2
theta3: mean theta of node 4
  tau1: mean tau of node 6
  tau2: mean tau of node 7
  tau3: mean tau of node 8
  phi1: mean of phi_z : e -> z
log-PG: log-probability of gene trees (MSCi)
 log-L: mean log-L of observing data

     |            Acceptance proportions            |
Prgs | Gage Gspr  th1  th2  thg  tau  mix phis phig | theta1 theta2 theta3    tau1   tau2   tau3    phi1       log-PG          log-L
------------------------------------------------------------------------------------------------------------------------------------

                     Gage    Gspr     th1     th2     tau     mix    phis  
Current Pjump:     0.32210 0.41077 0.66919 0.61344 0.12646 0.00616 0.99418
Current finetune:  5.00000 0.00100 0.00050 0.00200 0.00100 0.30000 0.00100
New finetune:      5.43702 0.00148 0.00172 0.00565 0.00040 0.00570 0.21467

=> 'finetune = 1 Gage:5.437019 Gspr:0.001477 th1:0.001715 th2:0.005650 tau:0.000395 mix:0.005697 phis:0.214669'

  -5%  0.32 0.34 0.30 0.39 0.94 0.20 0.51 0.08 1.00   0.0061 0.0054 0.0047  0.0128 0.0049 0.0048  0.7186   3495.93925  -451077.39371  1:33

                     Gage    Gspr     th1     th2     tau     mix    phis  
Current Pjump:     0.32165 0.33510 0.29918 0.38715 0.20223 0.51328 0.08053
Current finetune:  5.43702 0.00148 0.00172 0.00565 0.00040 0.00570 0.21467
New finetune:      5.90246 0.00168 0.00171 0.00772 0.00025 0.01166 0.05358

=> 'finetune = 1 Gage:5.902462 Gspr:0.001685 th1:0.001710 th2:0.007719 tau:0.000255 mix:0.011658 phis:0.053581'


                     Gage    Gspr     th1     th2     tau     mix    phis  
Current Pjump:     0.32286 0.31174 0.26369 0.32104 0.31699 0.34000 0.37608
Current finetune:  5.90246 0.00168 0.00171 0.00772 0.00025 0.01166 0.05358
New finetune:      6.43645 0.00176 0.00148 0.00836 0.00027 0.01353 0.07052

=> 'finetune = 1 Gage:6.436446 Gspr:0.001762 th1:0.001475 th2:0.008361 tau:0.000272 mix:0.013531 phis:0.070522'

   0%  0.32 0.30 0.34 0.36 0.93 0.30 0.30 0.36 1.00   0.0060 0.0054 0.0043  0.0126 0.0049 0.0048  0.6638   3483.53752  -451074.91541  3:05

                     Gage    Gspr     th1     th2     tau     mix    phis  
Current Pjump:     0.32315 0.30354 0.33632 0.35849 0.29838 0.30304 0.35553
Current finetune:  6.43645 0.00176 0.00148 0.00836 0.00027 0.01353 0.07052
New finetune:      7.02621 0.00179 0.00169 0.01036 0.00027 0.01369 0.08648

=> 'finetune = 1 Gage:7.026211 Gspr:0.001787 th1:0.001690 th2:0.010359 tau:0.000270 mix:0.013692 phis:0.086478'

   5%  0.32 0.30 0.29 0.22 0.93 0.30 0.30 0.46 1.00   0.0061 0.0054 0.0052  0.0129 0.0049 0.0048  0.6155   3478.90683  -451078.48487  4:38
  10%  0.32 0.30 0.30 0.24 0.93 0.30 0.30 0.44 1.00   0.0060 0.0054 0.0050  0.0130 0.0049 0.0048  0.6773   3490.21027  -451079.30802  6:11
  15%  0.32 0.30 0.31 0.24 0.93 0.30 0.30 0.43 1.00   0.0060 0.0054 0.0050  0.0130 0.0049 0.0048  0.6627   3461.78647  -451079.37605  7:44
  20%  0.32 0.30 0.31 0.25 0.93 0.30 0.30 0.42 1.00   0.0060 0.0054 0.0049  0.0130 0.0049 0.0048  0.6783   3525.40702  -451078.93763  9:17
  25%  0.32 0.30 0.31 0.25 0.93 0.30 0.30 0.42 1.00   0.0060 0.0054 0.0048  0.0130 0.0049 0.0048  0.6882   3513.66774  -451079.14222  10:49
  30%  0.32 0.30 0.32 0.26 0.93 0.30 0.31 0.41 1.00   0.0060 0.0054 0.0049  0.0130 0.0049 0.0048  0.7008   3524.37237  -451079.29153  12:22
  35%  0.32 0.30 0.32 0.26 0.93 0.30 0.31 0.41 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7024   3546.28025  -451078.84802  13:55
  40%  0.32 0.30 0.32 0.26 0.93 0.30 0.31 0.41 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7064   3519.00525  -451078.61262  15:28
  45%  0.32 0.30 0.32 0.26 0.93 0.30 0.30 0.41 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7083   3530.47601  -451078.44390  17:00
  50%  0.32 0.30 0.32 0.26 0.93 0.30 0.30 0.41 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7082   3520.59248  -451078.30874  18:33
  55%  0.32 0.30 0.32 0.26 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7117   3521.83246  -451078.43689  20:06
  60%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7114   3535.56803  -451078.62761  21:39
  65%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0047  0.0129 0.0049 0.0048  0.7110   3512.17479  -451078.59613  23:11
  70%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0047  0.0129 0.0049 0.0048  0.7123   3514.93797  -451078.53296  24:44
  75%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0047  0.0130 0.0049 0.0048  0.7099   3495.25222  -451078.69395  26:17
  80%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0048  0.0130 0.0049 0.0048  0.7006   3484.34576  -451078.72453  27:50
  85%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0048  0.0130 0.0049 0.0048  0.7005   3548.81521  -451078.64646  29:22
  90%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.40 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7010   3525.24780  -451078.55554  30:55
  95%  0.32 0.30 0.32 0.27 0.93 0.30 0.30 0.39 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7038   3516.26405  -451078.52789  32:28
 100%  0.32 0.30 0.33 0.27 0.94 0.30 0.30 0.39 1.00   0.0060 0.0054 0.0048  0.0129 0.0049 0.0048  0.7057   3531.22064  -451078.50442  34:01

34:01 spent in MCMC

The last part provides a summary:

Node-Index  Node-Type  Node-Label
---------------------------------
1           Tip        Adig
2           Tip        Agra
3           Tip        Agre
4           Tip        Amad
5           Tip        Arub
6           Tip        Smic
7           Root       a
8           Inner      b
9           Inner      c
10          Inner      d
11          Inner      y
12          Inner      e
13          Hybrid     z (parental node index: 12)
14          Hybrid     z (parental node index: 11)


  param       mean     median      S.D       min       max       2.5%     97.5%    2.5%HPD   97.5%HPD      ESS*        Eff*      rho1  
---------------------------------------------------------------------------------------------------------------------------------------
   theta:1  0.006046  0.005992   0.000713  0.003682  0.010517  0.004805  0.007586  0.004708  0.007454  91709.650515  0.917097  0.011879
   theta:2  0.005427  0.005373   0.000696  0.003257  0.011574  0.004222  0.006948  0.004136  0.006826  81130.730751  0.811307  0.024243
   theta:4  0.004796  0.004374   0.001802  0.002407  0.057922  0.003331  0.009903  0.002971  0.007911    368.913065  0.003689  0.647308
   theta:5  0.006775  0.006599   0.001323  0.003823  0.039991  0.005017  0.009437  0.004803  0.008942   2619.808775  0.026198  0.359550
   theta:7  0.034701  0.034601   0.002837  0.024636  0.051332  0.029443  0.040541  0.029184  0.040249   3212.801087  0.032128  0.220521
   theta:8  0.022607  0.022554   0.001588  0.016439  0.030289  0.019650  0.025872  0.019580  0.025795  17414.283618  0.174143  0.186040
   theta:9  0.011312  0.010355   0.005697  0.000444  0.052990  0.003113  0.024960  0.002043  0.022583  63263.423686  0.632634  0.105424
  theta:10  0.010334  0.009946   0.004019  0.000643  0.039277  0.003676  0.019284  0.003099  0.018316   4247.798937  0.042478  0.402721
  theta:11  0.007608  0.006940   0.003135  0.000344  0.038886  0.003500  0.015557  0.002807  0.013932   1227.660414  0.012277  0.456575
  theta:12  0.009769  0.009076   0.004438  0.000278  0.047740  0.003167  0.020410  0.002233  0.018468  32069.801571  0.320698  0.239177
  theta:13  0.006919  0.005463   0.004992  0.000168  0.048492  0.001414  0.019961  0.000720  0.017097   4135.111121  0.041351  0.353669
  theta:14  0.009404  0.008289   0.005775  0.000052  0.052938  0.001719  0.023506  0.000865  0.020724   2150.289066  0.021503  0.094960
     tau:7  0.012937  0.012927   0.000740  0.010421  0.015464  0.011532  0.014361  0.011532  0.014361    657.256612  0.006573  0.948045
     tau:8  0.004929  0.004927   0.000169  0.004273  0.005760  0.004605  0.005267  0.004597  0.005259   5056.179222  0.050562  0.836954
     tau:9  0.004813  0.004813   0.000159  0.004094  0.005520  0.004505  0.005128  0.004502  0.005124   6868.096254  0.068681  0.773535
    tau:10  0.004208  0.004226   0.000250  0.002959  0.005030  0.003659  0.004645  0.003704  0.004679    851.046036  0.008510  0.723260
    tau:11  0.002800  0.002836   0.000408  0.000252  0.004076  0.001867  0.003465  0.002052  0.003545   1096.984698  0.010970  0.883066
    tau:12  0.003533  0.003550   0.000287  0.000719  0.004522  0.002957  0.004025  0.002998  0.004055    969.361546  0.009694  0.648226
    tau:13  0.002276  0.002435   0.000644  0.000008  0.003721  0.000642  0.003138  0.000709  0.003186    409.198900  0.004092  0.960513
phi:13<-12  0.705551  0.750541   0.191548  0.000002  0.999934  0.106836  0.952049  0.304632  0.999934    227.342841  0.002273  0.919602

       lnL  -451078.569687  -451078.259000  24.268703  -451189.626000  -450974.965000  -451126.941000  -451031.786000  -451125.916000  -451030.930000   7341.583387  0.073416  0.252027

List of nodes, taus and thetas:
Node (+1)       Tau      Theta    Label
0          0.000000   0.006046    Adig [ Adig ]
1          0.000000   0.005427    Agra [ Agra ]
2          0.000000  -1.000000    Agre [ Agre ]
3          0.000000   0.004796    Amad [ Amad ]
4          0.000000   0.006775    Arub [ Arub ]
5          0.000000  -1.000000    Smic [ Smic ]
6          0.012937   0.034701       a [ Adig Agra Agre Amad Arub Smic ]
7          0.004929   0.022607       b [ Adig Agra Agre Amad Arub ]
8          0.004813   0.011312       c [ Adig Agra Amad Arub ]
9          0.004208   0.010334       d [ Agra Amad Arub ]
10         0.002800   0.007608       y [ Amad Arub ]
11         0.003533   0.009769       e [ Agra Amad ]
12         0.002276   0.006919       z [ Amad ]
13         0.002276   0.009404       z [ Amad ]

Species tree network:
(Smic, (Agre,(Adig,((z[&phi=0.294449,tau-parent=yes],Arub)y,(Agra,(Amad)z[&phi=0.705551,tau-parent=yes])e)d)c)b)a;

Species tree network with attributes (thetas and tau 95% HPD), and branch lengths (tau difference):
(Smic: 0.012937, (Agre: 0.004929, (Adig: 0.004813, ((z: 0.000525, Arub: 0.002800)y[&height_95%_HPD={0.00205200, 0.00354500}, theta=0.0076080]: 0.001407, (Agra: 0.003533, (Amad: 0.002276)z[&height_95%_HPD={0.00070900, 0.00318600}, theta=0.0069192]: 0.001257)e[&height_95%_HPD={0.00299800, 0.00405500}, theta=0.0097689]: 0.000675)d[&height_95%_HPD={0.00370400, 0.00467900}, theta=0.0103340]: 0.000605)c[&height_95%_HPD={0.00450200, 0.00512400}, theta=0.0113123]: 0.000116)b[&height_95%_HPD={0.00459700, 0.00525900}, theta=0.0226065]: 0.008008)a[&height_95%_HPD={0.01153200, 0.01436100}, theta=0.0347010];

Species tree network with taus and thetas:
(Smic , (Agre , (Adig #0.006046, ((z[&phi=0.294449,tau-parent=yes]:0.002276 #0.009404, Arub #0.006775)y:0.002800 #0.007608, (Agra #0.005427, (Amad #0.004796)z[&phi=0.705551,tau-parent=yes]:0.002276 #0.006919)e:0.003533 #0.009769)d:0.004208 #0.010334)c:0.004813 #0.011312)b:0.004929 #0.022607)a:0.012937 #0.034701;

Summarizing parameter estimates using file baobab-msci.conditional_a1b1.txt ...

  param       mean      S.D       2.5%     97.5%    2.5%HPD   97.5%HPD    Effu      Effy       c    
----------------------------------------------------------------------------------------------------
   theta:1  0.006038  0.000687  0.004837  0.007521  0.004754  0.007411  3.721156  0.862468  9.151633
   theta:2  0.005413  0.000669  0.004249  0.006860  0.004131  0.006715  2.592236  0.819976  6.016131
   theta:4  0.004771  0.001707  0.003358  0.009691  0.002952  0.007742  0.003508  0.003505  1.282471
   theta:5  0.006752  0.001277  0.005051  0.009301  0.004796  0.008791  0.025967  0.025644  1.938617
   theta:7  0.034696  0.002804  0.029502  0.040473  0.029143  0.040092  0.033224  0.032448  3.576019
   theta:8  0.022604  0.001579  0.019655  0.025843  0.019554  0.025717  0.172468  0.155356  2.767326
   theta:9  0.011051  0.005409  0.003154  0.023879  0.002010  0.021558  0.789998  0.647090  1.388027
  theta:10  0.010302  0.003967  0.003677  0.019063  0.003022  0.018069  0.040915  0.040676  1.167315
  theta:11  0.007530  0.003029  0.003527  0.015157  0.002819  0.013586  0.011881  0.011849  1.296254
  theta:12  0.009600  0.004234  0.003198  0.019543  0.002295  0.017872  0.335981  0.308929  1.352500
  theta:13  0.006488  0.004404  0.001434  0.017534  0.000711  0.014979  0.036359  0.035837  1.668170
  theta:14  0.008859  0.005126  0.002129  0.021449  0.000984  0.018307  0.020153  0.019832  5.099156
phi:13<-12  0.705658  0.191303  0.109500  0.951500  0.306500  0.999500  0.002275  0.002274  1.036804

Once the run is finished, there will be five output files:

  1. baobab-msci.txt: main outpu file containing the above information
  2. baobab-msci.mcmc.txt: MCMC sample of model parameters
  3. baobab-msci.conditional_a1b1.txt: MCMC sample of $a$ and $b$ parameters of the gamma distributions in Gibbs sampling, used for generating final posterior summaries.
  4. baobab-msci.FakeTree.tre: fitted model, which can be visualized in FigTree; see figure below
  5. baobab-msci.SeedUsed: a random seed used by BPP

Convergence

Try completing two runs and assess the convergence. A helpful tool is Tracer, but BPP provides some helpful summary statistics too. Here, we provide a python script (scatter-params.py) to help plot posteriors for theta and tau parameters. If we are satisfied, we can use one MCMC analysis from BPP or combine them to report parameter estimates. The plotting script can be executed from the command line with

# download script
wget https://raw.githubusercontent.com/bpp/bpp-tutorial-geneflow/main/third-day/scripts/scatter-params.py

# add execute permissions
chmod +x scatter-params.py

# run script on your two mcmc files
./scatter-params.py <mcmc1.txt> <mcmc2.txt>

The script will produce three plots, one for thetas (thetas.pdf), one for taus (taus.pdf) and one for phis (phis.pdf).

The below plot shows a longer run

Sample result files

Below are some results from our runs.

# Long run Short run
Output Tarball Output Tarball
1 out.long.A00.msci.r1.txt MSCI-long.r1.tar.gz out.short.A00.msci.r1.txt MSCI-short-r1.tar.gz
2 out.long.A00.msci.r2.txt MSCI-long-r2.tar.gz out.short.msci.A00.r2.txt MSCI-short-r2.tar.gz

Additional option: Theta models

It is possible to constrain theta parameters in the model using the keyword thetamodel. For example, we can force theta to remain unchanged after introgression. See section 5.3 in the BPP manual

⚠️ **GitHub.com Fallback** ⚠️