PartitionFinder on OpenStack - cdoorenweerd/PhylOStack GitHub Wiki

This HOWTO explains how to install and use the latest version of PartitionFinder from GitHub (v2.0.0pre13 at the time of writing) on OpenStack with Ubuntu 14.04 LTS. For more information, see http://www.robertlanfear.com/partitionfinder/

Note: This HOWTO assumes you have installed the PhylOStack and know how to connect via SSH, transfer files and use screen sessions.

Preparing input data

PartitionFinder requires two files to run:

  1. an alignment file in relaxed PHYLIP format, named alignment.phy in the example
  2. a configuration text file that must be named partition_finder.cfg
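Before copying the files to the instance, it can help to check that the alignment header agrees with the sequences. The following is a minimal sketch, not part of PartitionFinder: the function name check_phylip is hypothetical, and it assumes a relaxed PHYLIP file with one taxon per line, the name and the sequence separated by whitespace, and no spaces inside the sequence.

```shell
#!/bin/sh
# check_phylip FILE (hypothetical helper): compare the "ntax nchar" header of a
# one-line-per-taxon relaxed PHYLIP file with the sequence lines that follow.
# Returns 0 if everything agrees, 1 otherwise.
check_phylip() {
    awk '
        NR == 1 { ntax = $1 + 0; nchar = $2 + 0; next }   # header line
        NF == 0 { next }                                  # skip blank lines
        {
            seen++
            if (length($2) != nchar) {
                printf "%s: %d sites, expected %d\n", $1, length($2), nchar
                bad = 1
            }
        }
        END {
            if (seen + 0 != ntax) {
                printf "%d sequences, expected %d\n", seen, ntax
                bad = 1
            }
            exit bad + 0
        }
    ' "$1"
}

# Demo on a tiny made-up alignment (3 taxa, 8 sites).
demo_dir=$(mktemp -d)
printf '3 8\ntaxon_A ACGTACGT\ntaxon_B ACGTAC-T\ntaxon_C ACGTNCGT\n' > "$demo_dir/alignment.phy"
check_phylip "$demo_dir/alignment.phy" && echo "alignment.phy looks consistent"
rm -rf "$demo_dir"
```

On a real dataset, the call would simply be check_phylip alignment.phy from inside the input folder.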

The config file should look something like the example below. Make sure that the alignment name defined in it exactly matches your alignment file name. If you have detailed information on the genes in your dataset, you probably want to define the codon positions as separate data blocks and let PartitionFinder calculate the best combinations. For genomic datasets you may not have such information, in which case you can use a k-means search, which needs no prior information on the dataset. For more details on how to set everything correctly, refer to the manual for version 2. It is available in .pdf and .docx format in the partitionfinder/docs folder.

# ALIGNMENT FILE #
alignment = alignment.phy;

# BRANCHLENGTHS: linked | unlinked #
branchlengths = linked;

# MODELS OF EVOLUTION: all | allx | raxml | mrbayes | beast | <list> #
models = GTR+G;

# MODEL SELECTION: aicc | bic #
model_selection = aicc;

# DATA BLOCKS #
# Put all data in one block for a kmeans search scheme
[data_blocks]
28SNep_921 = 1-921;
CAD2_415 = 922-1336;
COI-5P_codon12 = 1338-2001\3 1339-2001\3;
COI-5P_codon3 = 1337-2001\3;
COII_636 = 2002-2637;
EF1-alpha_Nep = 2638-3119;
Histon3 = 3120-3447;
IDH_723 = 3448-4170;
MDH1 = 4171-4576;

# SCHEMES: all | greedy | rcluster | hcluster | kmeans #
# min-subset-size does not work with greedy or all
[schemes]
search = rcluster;
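Because the alignment name in the config file must match the alignment file exactly, a quick check before starting a long run can save time. This is a minimal sketch under stated assumptions: the function names cfg_alignment and check_inputs are hypothetical, the config is assumed to contain a single "alignment = name;" line as in the example above, and file names with spaces are not handled.

```shell
#!/bin/sh
# cfg_alignment DIR (hypothetical helper): print the file name declared on the
# "alignment = ...;" line of DIR/partition_finder.cfg.
cfg_alignment() {
    sed -n 's/^[[:space:]]*alignment[[:space:]]*=[[:space:]]*\([^;[:space:]]*\)[[:space:]]*;.*$/\1/p' \
        "$1/partition_finder.cfg" | head -n 1
}

# check_inputs DIR (hypothetical helper): succeed only if the config file
# exists and the alignment it names is present in the same folder.
check_inputs() {
    [ -f "$1/partition_finder.cfg" ] || { echo "missing partition_finder.cfg"; return 1; }
    aln=$(cfg_alignment "$1")
    [ -n "$aln" ] && [ -f "$1/$aln" ] || { echo "alignment '$aln' not found in $1"; return 1; }
    echo "ok: $aln"
}

# Demo in a throwaway folder with made-up input files.
demo_dir=$(mktemp -d)
printf '# ALIGNMENT FILE #\nalignment = alignment.phy;\n' > "$demo_dir/partition_finder.cfg"
touch "$demo_dir/alignment.phy"
check_inputs "$demo_dir"
rm -rf "$demo_dir"
```

Run from the input folder, check_inputs . reports the alignment name it found or the reason the check failed.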

Running PartitionFinder

Place both input files in a folder and copy this folder to the instance. SSH into the instance, navigate to the folder, start a screen session (if desired) and start PartitionFinder. By default, PartitionFinder uses PhyML to build trees. Alternatively, you can use RAxML, which is faster and can cope with phylogenomic datasets, but only tests the GTR, GTR+G and GTR+I+G models. The --min-subset-size option sets a minimum number of sites per subset, which prevents overly fine partitioning, but it is ignored with the 'greedy' and 'all' search schemes.

  • with PhyML
partitionfinder . --min-subset-size 250 && grep 'DNA, Subset' analysis/best_scheme.txt > partitions.txt && tr '\\' '/' < partitions.txt > exabayespartitions.txt
  • with RAxML
partitionfinder . --raxml --min-subset-size 250 && grep 'DNA, Subset' analysis/best_scheme.txt > partitions.txt && tr '\\' '/' < partitions.txt > exabayespartitions.txt

A log.txt file is created automatically in the folder with the input files; you can follow it with tail -f log.txt while the run is in progress. When the run is finished, the best partitioning scheme is in inputfilesfolder/analysis/best_scheme.txt. The grep part of the command writes the partitions in RAxML format to partitions.txt in the input files folder, and the tr part writes a copy with the backslashes replaced by forward slashes to exabayespartitions.txt.
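To see what the post-processing steps do without waiting for a full run, they can be tried on a small stand-in file. This is only an illustrative sketch: the excerpt below is made up and is not real PartitionFinder output, and the function names extract_partitions and to_forward_slashes are hypothetical wrappers around the same grep and tr calls used in the commands above.

```shell
#!/bin/sh
# extract_partitions FILE (hypothetical helper): print only the RAxML-style
# lines, i.e. those containing "DNA, Subset".
extract_partitions() {
    grep 'DNA, Subset' "$1"
}

# to_forward_slashes (hypothetical helper): read stdin and replace every
# backslash with a forward slash. The '\\' operand is tr's escape for one
# literal backslash.
to_forward_slashes() {
    tr '\\' '/'
}

# Made-up stand-in for analysis/best_scheme.txt (not real program output).
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
some other line that grep should skip
DNA, Subset1 = 1-921
DNA, Subset2 = 1337-2001\3
EOF

echo "partitions.txt would contain:"
extract_partitions "$demo_file"
echo "exabayespartitions.txt would contain:"
extract_partitions "$demo_file" | to_forward_slashes
rm -f "$demo_file"
```

Because the steps in the original command are chained with &&, the grep and tr parts only run if PartitionFinder itself exits successfully.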
