PartitionFinder on OpenStack - cdoorenweerd/PhylOStack GitHub Wiki
This HOWTO explains how to install and use the latest version of PartitionFinder from GitHub (v2.0.0pre13 at the time of writing) on OpenStack with Ubuntu 14.04 LTS. For more information, see http://www.robertlanfear.com/partitionfinder/
Note: This HOWTO assumes you have installed the PhylOStack and know how to connect via SSH, transfer files and use screen sessions.
PartitionFinder requires two files to run:
- an alignment file in relaxed PHYLIP format, named alignment.phy in the example
- a configuration text file that must be named partition_finder.cfg
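Before copying the folder to the instance, it can help to confirm that both files are present. The sketch below is one possible check, not part of PartitionFinder: check_pf_inputs is a hypothetical helper name, and it assumes the config contains a line of the form `alignment = <file name>;`.

```shell
# Sketch: check a PartitionFinder input folder before starting a run.
# check_pf_inputs is a hypothetical helper, not shipped with PartitionFinder.
check_pf_inputs() {
    dir="$1"
    cfg="$dir/partition_finder.cfg"

    # The config file must have exactly this name.
    if [ ! -f "$cfg" ]; then
        echo "missing partition_finder.cfg in $dir" >&2
        return 1
    fi

    # Read the file name from the first 'alignment = <name>;' line.
    aln=$(sed -n 's/^[[:space:]]*alignment[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$cfg" | head -n 1)

    # The alignment named in the config must exist next to it.
    if [ -z "$aln" ] || [ ! -f "$dir/$aln" ]; then
        echo "alignment '$aln' named in the config was not found in $dir" >&2
        return 1
    fi

    echo "ok: $aln"
}
```

Calling `check_pf_inputs .` from inside the input folder prints the alignment name when both files are in place, and returns a non-zero status otherwise.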
The config file should look something like this. Be sure that the defined name of the alignment exactly matches your alignment file name. If you have detailed information on the genes in your dataset, you probably want to define the separate codon positions and let PF calculate the best combinations. For genomic datasets you may not have such information, and you can use a k-means search without any prior information on the dataset. For more details on how to set everything correctly, refer to the manual for version 2. It is available in .pdf and .docx format in the partitionfinder/docs folder.
# ALIGNMENT FILE #
alignment = alignment.phy;
# BRANCHLENGTHS: linked | unlinked #
branchlengths = linked;
# MODELS OF EVOLUTION: all | allx | raxml | mrbayes | beast | <list> #
models = GTR+G;
# MODEL SELECTION: aicc | bic #
model_selection = aicc;
# DATA BLOCKS #
# Put all data in one block for a kmeans search scheme
[data_blocks]
28SNep_921 = 1-921;
CAD2_415 = 922-1336;
COI-5P_codon12 = 1338-2001\3 1339-2001\3;
COI-5P_codon3 = 1337-2001\3;
COII_636 = 2002-2637;
EF1-alpha_Nep = 2638-3119;
Histon3 = 3120-3447;
IDH_723 = 3448-4170;
MDH1 = 4171-4576;
# SCHEMES: all | greedy | rcluster | hcluster | kmeans #
# min-subset-size does not work with greedy or all
[schemes]
search = rcluster;
Place both input files in a folder and copy this folder to the instance. SSH into the instance, navigate to the folder, start a Screen session (if desired) and start partitionfinder. By default PartitionFinder uses PhyML to build trees. Alternatively, you can use RAxML. This is faster and can cope with phylogenomic datasets, but only tests the GTR, GTR+G and GTR+I+G models. The min-subset-size setting can be used to set a minimum number of sites per subset, preventing overly fine partitioning, but it is ignored with the 'greedy' or 'all' search schemes.
- with PhyML
partitionfinder . --min-subset-size 250 && grep 'DNA, Subset' analysis/best_scheme.txt > partitions.txt && tr '\\' '/' < partitions.txt > exabayespartitions.txt
- with RAxML
partitionfinder . --raxml --min-subset-size 250 && grep 'DNA, Subset' analysis/best_scheme.txt > partitions.txt && tr '\\' '/' < partitions.txt > exabayespartitions.txt
A log.txt file is created automatically in the folder with the input files, and you can follow it with tail. When the run is finished, the best partitioning scheme will be in inputfilesfolder/analysis/best_scheme.txt. The grep part of the command writes the partitions in RAxML format to a file named partitions.txt in the input files directory. The tr part then replaces the backslashes in that file with forward slashes and writes the result to exabayespartitions.txt.
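To see what the grep and tr steps of the command do, the sketch below applies the same two commands to a small made-up file in place of analysis/best_scheme.txt. The subset names and ranges are invented for illustration, and the real file contains more surrounding text than shown here.

```shell
# Made-up excerpt standing in for analysis/best_scheme.txt (illustration only).
scheme=$(mktemp)
cat > "$scheme" <<'EOF'
some other line of the report (ignored by grep)
DNA, Subset1 = 1-921
DNA, Subset2 = 1338-2001\3, 1339-2001\3
EOF

# Same filter as in the run command: keep only the partition lines.
partitions=$(grep 'DNA, Subset' "$scheme")

# Same conversion as in the run command: backslashes become forward slashes.
exabayes=$(printf '%s\n' "$partitions" | tr '\\' '/')

printf '%s\n' "$exabayes"
rm -f "$scheme"
```

In the converted output the codon-position stride is written as `1338-2001/3` instead of `1338-2001\3`. Lines without a backslash pass through unchanged.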