SNP_Phylo - BGIGPD/BestPractices4Pathogenomics GitHub Wiki

Workshop: Using Malaria VCF Files to Construct an SNP Phylogenetic Tree

Objectives

This workshop aims to guide participants through the process of constructing a single nucleotide polymorphism (SNP) phylogenetic tree using VCF (Variant Call Format) files derived from malaria pathogen data. By the end of this session, participants will:

  • Generate an SNP phylogenetic tree to study genetic relationships among different malaria samples.

Steps

1. Prepare Your Environment

Install the required software using conda

conda install -c conda-forge -c bioconda -n WGS_analysis iqtree
conda activate WGS_analysis
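
The later steps also call `vcf-to-tab` (a Perl utility distributed with VCFtools) and `perl`, so it can help to confirm that every tool is on your `PATH` before continuing. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not part of the workshop materials.

```shell
# Hypothetical helper (not part of the workshop materials):
# report which of the named commands can be found on PATH.
check_tools() {
  status=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      status=1
    fi
  done
  # Non-zero exit status if at least one tool is missing.
  return "$status"
}
```

For this workshop it could be called as `check_tools iqtree vcf-to-tab perl`. If a tool is reported missing, install it into the `WGS_analysis` environment before moving on.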

2. Create your working directory and link the filtered VCF into it

Create a new directory for the SNP phylogeny construction

mkdir workshop_SNP_PhyloConstruction
cd workshop_SNP_PhyloConstruction

Create a symlink in this directory that points to the VCF you filtered

ln -s ~/workshop_WGS_upstream/VCF_filtering/Pf7_practice_chr1.filtered.snps.vcf ./

3. Convert VCF to TAB Format

conda activate /home/renzirui/micromamba/envs/WGS_analysis
vcf-to-tab < Pf7_practice_chr1.filtered.snps.vcf > Pf7_practice_chr1.filtered.snps.vcf.tab
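
Before converting further, it may be worth checking that the TAB file has the shape you expect. The sketch below assumes the usual `vcf-to-tab` layout: a tab-separated header line starting with `#CHROM`, `POS`, `REF`, followed by one column per sample, and then one line per SNP site. `tab_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: summarise a vcf-to-tab file.
# Assumes a tab-separated header "#CHROM POS REF <sample>..." followed by
# one line per SNP site.
tab_summary() {
  awk -F'\t' '
    NR == 1 { samples = NF - 3; next }   # columns 4..NF are samples
    { sites++ }                          # every other line is one site
    END { printf "samples=%d sites=%d\n", samples, sites }
  ' "$1"
}
```

Running `tab_summary Pf7_practice_chr1.filtered.snps.vcf.tab` prints the number of samples and sites, which should match the sample and record counts of the filtered VCF.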

4. Use a custom Perl script to convert the TAB format into an SNP alignment FASTA

perl /home/renzirui/workshop_SNP_PhyloConstruction/vcf_tab_to_fasta_alignment.pl -i Pf7_practice_chr1.filtered.snps.vcf.tab > Pf7_practice_chr1.filtered.snps.afa
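
To make the conversion less of a black box, here is a rough sketch of the idea in shell and awk. It is not the workshop's Perl script and may treat ambiguous calls differently: this version assumes genotypes are written as `A/A` (or `A|A`), keeps the base only for homozygous single-nucleotide calls, and writes `N` for anything else (heterozygous, missing, or multi-base calls). `tab_to_fasta` is a hypothetical helper name.

```shell
# Hypothetical sketch, NOT the workshop's vcf_tab_to_fasta_alignment.pl:
# turn vcf-to-tab output into a FASTA alignment with one sequence per sample.
# Assumes columns "#CHROM POS REF <sample>..." and genotypes such as "A/A".
tab_to_fasta() {
  awk -F'\t' '
    NR == 1 {                       # header: remember the sample names
      n = NF
      for (i = 4; i <= n; i++) name[i] = $i
      next
    }
    {
      for (i = 4; i <= n; i++) {
        parts = split($i, a, "[/|]")
        if (parts == 1) a[2] = a[1] # haploid call written as a single allele
        # Keep homozygous single-base calls; everything else becomes N.
        if (a[1] == a[2] && a[1] ~ /^[ACGT]$/) seq[i] = seq[i] a[1]
        else seq[i] = seq[i] "N"
      }
    }
    END {
      for (i = 4; i <= n; i++) printf ">%s\n%s\n", name[i], seq[i]
    }
  ' "$1"
}
```

Because every sample receives exactly one character per site, all sequences in the output have the same length, which is what IQ-TREE expects from an alignment.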

5. Construct the Phylogenetic Tree

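We will use IQ-TREE to build the SNP phylogenetic tree. In the command below, -m MFP runs ModelFinder Plus to select the best-fitting substitution model before tree inference, -bb 1000 requests 1,000 ultrafast bootstrap replicates, -nt 4 uses four CPU threads, and -pre sets the prefix for all output files.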

mkdir -p Pf7_SNP_Phylo
iqtree -s Pf7_practice_chr1.filtered.snps.afa -m MFP -bb 1000 -nt 4 -pre Pf7_SNP_Phylo/tree
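
Once IQ-TREE finishes, a quick check that the main result files exist can catch a failed run early. The sketch below assumes the standard IQ-TREE output names for a given prefix: `.treefile` (maximum-likelihood tree), `.contree` (bootstrap consensus tree, written when `-bb` is used), `.iqtree` (report) and `.log`. `check_iqtree_outputs` is a hypothetical helper name.

```shell
# Hypothetical helper: verify that the expected IQ-TREE result files exist
# and are non-empty for a given output prefix.
check_iqtree_outputs() {
  prefix=$1
  missing=0
  for ext in treefile contree iqtree log; do
    if [ ! -s "$prefix.$ext" ]; then
      echo "missing or empty: $prefix.$ext"
      missing=1
    fi
  done
  # Non-zero exit status if any file is missing or empty.
  return "$missing"
}
```

With the prefix used above, the call would be `check_iqtree_outputs Pf7_SNP_Phylo/tree`. The `.treefile` is in Newick format and is the file to load in the visualization step.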

6. Visualize the Tree