Gene tree species tree reconciliation - Tancata/phylo GitHub Wiki

Gene tree/species tree reconciliation with ALE

Align and trim many files at once (the muscle flags below are MUSCLE v3 syntax; MUSCLE v5 uses -align and -output instead):

ls *fa | parallel muscle -in {} -out {.}.aln

ls *aln | parallel trimal -in {} -automated1 -phylip -out {.}.phy

Build ML trees for all gene family alignments. The -wbtl flag writes the ultrafast bootstrap trees, with branch lengths, to *.ufboot files; use these files for reconciliation.

iqtree -s alignmentFile -m TEST -bb 1000 -wbtl

Or run many at once:

ls *phy | parallel iqtree -s {} -m TEST -bb 1000 -wbtl
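
Because the reconciliation step consumes the *.ufboot files, it can be worth confirming that each one holds the expected number of bootstrap trees (1000 with -bb 1000). The following is a minimal Python sketch, assuming one Newick tree per line; the function names are hypothetical and not part of IQ-TREE or ALE.

```python
from pathlib import Path


def count_newick_trees(text):
    """Count Newick trees in a string, assuming one tree per line ending in ';'."""
    return sum(1 for line in text.splitlines() if line.strip().endswith(";"))


def find_incomplete_samples(directory, expected=1000, pattern="*.ufboot"):
    """Return {file name: tree count} for files whose count differs from `expected`."""
    incomplete = {}
    for path in sorted(Path(directory).glob(pattern)):
        n_trees = count_newick_trees(path.read_text())
        if n_trees != expected:
            incomplete[path.name] = n_trees
    return incomplete
```

Calling find_incomplete_samples(".") in the directory holding the trees returns an empty dict when every file has the expected count.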

Convert the Newick bootstrap tree samples to ALE files, then run undated reconciliation against the rooted species tree (speciesTree is the species tree file):

ls *ufboot | parallel ALEobserve {}
ls *ale | parallel ALEml_undated speciesTree {}

To compare gene family likelihoods under a set of rooted species trees, place the reconciliation files for each rooting hypothesis in its own directory, named after that hypothesis. Then:

write_consel_file.py dir1 dir2 dir3 (etc) > myfile.mt

The output can be used as a .mt file for CONSEL. Note that the order of the rooting hypotheses in myfile.mt is arbitrary and may differ from the order of the arguments on the command line, so check the file and reorder the lines if you wish. For example:

makermt myfile

consel myfile

catpv myfile
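
To make the structure of the .mt input more concrete, here is a minimal Python sketch of how per-family log-likelihoods could be arranged into a matrix. It is not the code of write_consel_file.py. It rests on two assumptions: that each ALE reconciliation file reports its likelihood on a line starting with ">logl:", and that the .mt layout is a first line giving the number of hypotheses and the number of families, followed by one row of log-likelihoods per hypothesis. The function names are hypothetical.

```python
import re

# Assumed form of the likelihood line in an ALE reconciliation file.
LOGL_RE = re.compile(r"^>logl:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)", re.MULTILINE)


def read_logl(text):
    """Return the log-likelihood from reconciliation file text, or None if absent."""
    match = LOGL_RE.search(text)
    return float(match.group(1)) if match else None


def format_mt(logl_by_hypothesis):
    """Build .mt text from {hypothesis: {family: logl}}.

    Hypotheses are written in sorted order so the row order is predictable,
    and only families present under every hypothesis are kept.
    """
    names = sorted(logl_by_hypothesis)
    if not names:
        raise ValueError("need at least one hypothesis")
    shared = set.intersection(*(set(logl_by_hypothesis[n]) for n in names))
    families = sorted(shared)
    lines = [f"{len(names)} {len(families)}"]
    for name in names:
        lines.append(" ".join(f"{logl_by_hypothesis[name][f]:.6f}" for f in families))
    return names, "\n".join(lines) + "\n"
```

Returning the hypothesis names alongside the text records which row belongs to which rooting, which is the thing to check before reading the catpv output.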

Pull out relevant data from the reconciliation files for branch-wise analysis:

build_DTL_table.py geneTreeDir/ > table.txt
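
As a rough illustration of what branch-wise extraction involves, the Python sketch below sums event counts per species tree branch across gene families. It is not the code of build_DTL_table.py. It assumes that each reconciliation file contains rows beginning with S_terminal_branch or S_internal_branch, followed by the branch name and then the numbers of duplications, transfers, losses, originations and copies; check this against your own files, since the layout may differ between ALE versions. The function names are hypothetical.

```python
from collections import defaultdict

EVENTS = ("duplications", "transfers", "losses", "originations", "copies")
ROW_LABELS = ("S_terminal_branch", "S_internal_branch")


def parse_branch_events(text):
    """Return one dict per branch row found in reconciliation file text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[0] not in ROW_LABELS:
            continue
        try:
            values = [float(x) for x in fields[2:7]]
        except ValueError:
            continue  # skip rows that do not match the assumed numeric layout
        row = {"branch": fields[1]}
        row.update(zip(EVENTS, values))
        rows.append(row)
    return rows


def sum_by_branch(texts):
    """Sum event counts per branch over the text of many reconciliation files."""
    totals = defaultdict(lambda: dict.fromkeys(EVENTS, 0.0))
    for text in texts:
        for row in parse_branch_events(text):
            for key in EVENTS:
                totals[row["branch"]][key] += row[key]
    return dict(totals)
```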

Do some analysis :-)

Useful things

Check gene trees contain only species tree names:

python check_valid_species_names.py speciesTreeNewick geneTreeDir
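
The idea behind this check can be sketched in a few lines of Python. This is not the code of check_valid_species_names.py. It assumes that gene names follow the pattern species_geneID, with the species name being everything before the first underscore, and that leaf labels are unquoted. The function names are hypothetical.

```python
import re

# A leaf label directly follows "(" or ","; internal labels follow ")" and are skipped.
LEAF_RE = re.compile(r"(?<=[(,])\s*([^(),:;\s]+)")


def leaf_names(newick):
    """Return the leaf labels of a Newick string (unquoted labels only)."""
    return LEAF_RE.findall(newick)


def unknown_species(species_newick, gene_newick, separator="_"):
    """Return species prefixes in the gene tree that are absent from the species tree."""
    species = set(leaf_names(species_newick))
    found = {name.split(separator, 1)[0] for name in leaf_names(gene_newick)}
    return sorted(found - species)
```

An empty list means every gene tree leaf maps to a species tree name under the assumed naming pattern.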

Edit names if necessary (e.g. with edit_tree_names.py):

python edit_tree_names.py
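
For simple cases, renaming can also be done directly on the Newick string. The Python sketch below is not the code of edit_tree_names.py. It assumes unquoted leaf labels and takes a hypothetical old-to-new name mapping that you supply.

```python
import re

# Capture optional whitespace, then a leaf label that follows "(" or ",".
LEAF_RE = re.compile(r"(?<=[(,])(\s*)([^(),:;\s]+)")


def rename_leaves(newick, mapping):
    """Replace leaf labels found in `mapping`; all other text is left unchanged."""
    def swap(match):
        old = match.group(2)
        return match.group(1) + mapping.get(old, old)
    return LEAF_RE.sub(swap, newick)
```

Branch lengths and support values are untouched because only labels that directly follow an opening parenthesis or a comma are considered.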