Dataset - Pas-Kapli/CoME-Tutorials GitHub Wiki

Data

The Lacertidae are the family of wall lizards, also known as true lizards or sometimes simply lacertas, and are native to Afro-Eurasia. It is a diverse family with about 360 species in 39 genera, and it is the most common reptile group in Europe.

There have been several attempts to resolve the relationships among the lacertid genera using both genetic and morphological markers. However, most phylogenetic inference efforts to date have yielded unresolved or conflicting topologies. The most problematic relationships in the family are those among the 19 genera of Lacertini, which mostly occur across Eurasia. All attempts to resolve them have produced topologies with short internal branches and long external branches, thus resembling a “bush”.

For our tutorial, we will use a subset of the data from the Garcia-Porta et al. (2019) publication on the Lacertini relationships. The authors performed RNA-seq and identified orthologous sequences based on a previously compiled set of markers across vertebrates.

The total number of loci was 6269, and the relevant phylogeny is shown in panel a of the figure above. For practical reasons, we will use a subset of 50 loci for 20 samples corresponding to 18 species and 17 genera. Compared to the original data shown in panel a, we have removed the two most distant outgroups, Gallotia and Psammodromus, and two problematic taxa, Hellenolacerta and one of the Podarcis liolepis samples.

Pre-processing

Prepare the folders that we will be using throughout the tutorial:

mkdir lizard-exercise
cd lizard-exercise
mkdir data alignments phylo filtered_alignments


# Enter the "data" folder, download the compressed FASTA files, and uncompress them
cd data
wget https://github.com/Pas-Kapli/CoME-Tutorials/raw/refs/heads/main/tutorial1/fasta-files.tar.gz
tar -xvzf fasta-files.tar.gz
rm fasta-files.tar.gz
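
Before moving on, it can be worth checking that the download and extraction worked. The sketch below is only a suggestion and not part of the original tutorial: `count_seqs` is a hypothetical helper name, and it assumes the FASTA files were extracted directly into the current "data" folder with names ending in `.fasta` (as `locus_1.fasta` suggests). If the archive matches the description above, you would expect about 50 files, each with up to 20 sequences.

```shell
# Hypothetical helper: print "<file> <number of sequences>" for each FASTA file given.
count_seqs() {
    for f in "$@"; do
        # Each sequence record starts with a ">" header line.
        printf '%s %s\n' "$f" "$(grep -c '^>' "$f")"
    done
}

# Assumed usage inside the "data" folder:
# ls *.fasta | wc -l        # number of loci extracted
# count_seqs *.fasta        # number of sequences per locus
```

A locus with fewer sequences than the others is not necessarily an error; some samples may simply be missing for that marker.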

Use alan to display one of the alignments in the terminal:

alan locus_1.fasta

Alternatively, use any other alignment viewer (e.g. seaview) if you are working on your laptop.

You will notice that the sequences are not aligned! We need to align them before moving on to phylogenetic inference.

Type q to exit the alan viewer.
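
The same observation can be made without a viewer. In an aligned FASTA file every sequence has the same length, so a file containing sequences of different lengths cannot be aligned. The following is a minimal sketch of that check; `seq_lengths` is a hypothetical helper name, not a tool used elsewhere in this tutorial, and it assumes plain FASTA input in which sequences may wrap over several lines.

```shell
# Hypothetical helper: print the distinct sequence lengths found in one FASTA file.
seq_lengths() {
    awk '
        /^>/ { if (seen) print len; len = 0; seen = 1; next }  # new record: emit the previous length
             { len += length($0) }                              # sequence lines may wrap
        END  { if (seen) print len }                            # emit the last record
    ' "$1" | sort -n | uniq
}

# Assumed usage in the "data" folder; more than one line of output means
# the sequences differ in length and therefore are not aligned:
# seq_lengths locus_1.fasta
```

Note that the reverse does not hold: a single length in the output is necessary for an alignment but does not prove that the sequences are actually aligned.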

Next: alignment and alignment filtering

Dataset | Alignment | Gene-Tree ML Inference